
This skill guides you in applying TDD and invariant-based testing practices to write robust tests, fix bugs, and verify code changes.

npx playbooks add skill gigaverse-app/skillet --skill testing

Review the files below or copy the command above to add this skill to your agents.

Files (7)
SKILL.md
---
name: testing
description: Use when writing or modifying tests, fixing bugs with TDD, reviewing test code, or when user mentions "test", "tests", "testing", "TDD", "test-driven", "pytest", "add tests", "write tests", "unit test", "integration test", "test coverage", "bug fix", "fix bug", "verify", "edge case", "mocking", "patch.object".
---

# Testing Best Practices

## Core Philosophy

**Write invariant-based tests that verify what SHOULD be true, not bug-affirming tests that prove bugs existed.**

## Test-Driven Development (TDD)

**ALWAYS use TDD when fixing bugs:**

1. Find existing tests for the broken functionality
2. Run them to verify they pass (they won't yet catch the bug)
3. Improve the tests until they fail, exposing the bug (see the sketch after these steps)
4. Fix the code to make tests pass
5. Verify all tests pass
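
For example, here is a minimal, self-contained sketch of steps 2-3, built around a hypothetical `slugify` helper whose bug is an illustrative assumption:

```python
import re

def slugify(text: str) -> str:
    """Hypothetical helper with a bug: non-ASCII characters are dropped
    instead of transliterated, so "Ünïcode" becomes "ncode"."""
    return re.sub(r"[^a-z0-9]+", "", text.lower())

# Step 2: the existing test passes -- it is too loose to catch the bug.
def test_slugify_returns_string():
    assert isinstance(slugify("Ünïcode"), str)

# Step 3: strengthen the assertion until it fails, exposing the bug.
def test_slugify_transliterates_unicode():
    assert slugify("Ünïcode") == "unicode"  # fails until slugify is fixed
```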

## Quick Start

```bash
# Look for existing fixtures in conftest.py
grep -r "@pytest.fixture" tests/conftest.py

# Look at sibling test files for patterns
ls tests/test_<module_name>/

# Run tests with coverage
pytest --cov=src --cov-report=term-missing
```

## Critical Rules Quick Reference

### Mocking

```python
from unittest.mock import AsyncMock, patch

# ALWAYS use patch.object
@patch.object(MyClass, 'method_name')
def test_method(mock_method):
    ...

# ALWAYS mock dependencies, NEVER the system under test
generator = NewsPostGenerator()
generator._queries_chain = AsyncMock()  # Mock the dependency
await generator.generate_news_post(...)  # Exercise the real code under test
```
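
A slightly fuller sketch of the same pattern, assuming pytest-asyncio is installed and a hypothetical `NewsPostGenerator` that awaits its `_queries_chain` internally (the method signature here is an illustrative assumption):

```python
import pytest
from unittest.mock import AsyncMock

@pytest.mark.asyncio
async def test_generate_news_post_awaits_queries_chain():
    generator = NewsPostGenerator()  # real system under test, never mocked
    generator._queries_chain = AsyncMock(return_value=["query-1"])  # faked dependency

    await generator.generate_news_post(topic="ai")  # hypothetical signature

    generator._queries_chain.assert_awaited_once()
```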

### Test Data

```python
# ALWAYS use Pydantic models
return UserResult(user_id=user_id, name="Test User", ...)

# NEVER use naked literals - extract constants
DEFAULT_TIMEOUT = 60
STALE_THRESHOLD = DEFAULT_TIMEOUT + 10
```
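
Putting both rules together, a minimal sketch of a typed factory (the `UserResult` model and its field set are illustrative assumptions, not the real schema):

```python
from pydantic import BaseModel

DEFAULT_USER_ID = "user-1"
DEFAULT_USER_NAME = "Test User"

class UserResult(BaseModel):
    user_id: str
    name: str

def make_user_result(user_id: str = DEFAULT_USER_ID,
                     name: str = DEFAULT_USER_NAME) -> UserResult:
    """Factory keeps test data typed, validated, and defined in one place."""
    return UserResult(user_id=user_id, name=name)
```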

### Invariant Testing

```python
# Test what SHOULD be true
def test_selector_populated_with_all_names():
    """INVARIANT: Selector contains all names from config."""
    config = make_config_with_items(["item1", "item2", "item3"])
    page = setup_page_with_config(config)
    assert page.item_selector.options == ["item1", "item2", "item3"]
```
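
For contrast, a bug-affirming test of the same page (using the same hypothetical helpers) freezes the broken output instead of the invariant:

```python
# ANTI-PATTERN: bug-affirming test -- it passes while the bug exists
# and starts failing the moment the bug is fixed.
def test_selector_drops_last_item():
    config = make_config_with_items(["item1", "item2", "item3"])
    page = setup_page_with_config(config)
    assert page.item_selector.options == ["item1", "item2"]
```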

### E2E Testing

```python
# Call actual production code
await service.process_request(request_input)
published_event = mock_queue.publish.call_args.kwargs["output"]
assert published_event.data is not None
```
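
A fuller sketch of that flow, assuming a hypothetical `OrderService` whose only faked collaborator is the outbound queue; everything else runs as production code:

```python
import asyncio
from unittest.mock import AsyncMock

async def e2e_check():
    mock_queue = AsyncMock()
    service = OrderService(queue=mock_queue)  # hypothetical constructor
    await service.process_request({"order_id": "o-1"})  # hypothetical input

    # Assert on what production code actually published, not on internals.
    published_event = mock_queue.publish.call_args.kwargs["output"]
    assert published_event.data is not None

asyncio.run(e2e_check())
```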

### Intent & Testability

```python
import pytest

# Extract pure functions from infrastructure code
def apply_suffix(user_id: str, name: str, is_special: bool) -> tuple[str, str]:
    """Pure function - easily testable without mocks."""
    if is_special:
        return f"{user_id}_special", f"{name}_special"
    return user_id, name

# Test behavior (inputs -> outputs), not implementation
@pytest.mark.parametrize("user_id,name,is_special,expected_id,expected_name", [
    ("alice", "Alice", True, "alice_special", "Alice_special"),
    ("alice", "Alice", False, "alice", "Alice"),
])
def test_suffix_behavior(user_id, name, is_special, expected_id, expected_name):
    result_id, result_name = apply_suffix(user_id, name, is_special)
    assert result_id == expected_id
    assert result_name == expected_name

# NEVER write sham tests that mock __init__ and set internal state
# They test the mock setup, not production code
```

## Testing Checklist

Before committing:
- [ ] All imports at top of file
- [ ] Using `patch.object`, not `patch`
- [ ] No mocking of the system under test
- [ ] No mocking of model classes
- [ ] Test data uses Pydantic models
- [ ] Checked conftest.py for existing fixtures
- [ ] No naked literals - all values are constants
- [ ] Using correct fixture decorators
- [ ] Asserting invariants early
- [ ] Mock args accessed with `.args[N]` / `.kwargs["name"]` (sketch after this checklist)
- [ ] E2E tests call actual production code
- [ ] Tests verify invariants, not bug existence
- [ ] 100% coverage of new code
- [ ] All tests pass
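
For the mock-argument item above, a minimal runnable sketch of the access pattern (the `publish` call and its arguments are purely illustrative):

```python
from unittest.mock import Mock

queue = Mock()
queue.publish("events", output={"id": 1})

call = queue.publish.call_args
assert call.args[0] == "events"            # positional args via .args[N]
assert call.kwargs["output"] == {"id": 1}  # keyword args via .kwargs["name"]
```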

## Reference Files

For detailed patterns and examples:
- [references/mocking.md](references/mocking.md) - Mocking strategies and when to mock
- [references/test-data.md](references/test-data.md) - Test data creation patterns
- [references/e2e-testing.md](references/e2e-testing.md) - E2E and user journey patterns
- [references/intent-and-testability.md](references/intent-and-testability.md) - Intent documentation, pure functions, testability
- [references/concurrency.md](references/concurrency.md) - Concurrency testing patterns
- [references/fixtures.md](references/fixtures.md) - Common fixtures and decorators

**Remember**: When a user reports a bug, use TDD: find tests -> run -> improve until they fail -> fix code -> verify all pass.

Overview

This skill helps you write, review, and fix tests using test-driven development and invariant-based practices. It focuses on clear test intent, proper mocking strategies, consistent test data, and end-to-end verification. Use it when adding tests, diagnosing failing suites, or applying TDD to bug fixes.

How this skill works

The skill inspects test intent and structure, ensuring tests assert invariants (what should be true) instead of affirming past bugs. It enforces patch.object-based mocking of dependencies only, Pydantic-backed test data, and extraction of pure functions for easy unit testing. It also guides running tests with coverage and turning bug reports into failing tests before fixing code.

When to use it

  • Writing new unit or integration tests
  • Fixing bugs using TDD (make a failing test first)
  • Reviewing or refactoring existing tests for correctness
  • Adding end-to-end tests that call production code
  • Improving test data and mocking strategies to reduce brittle tests

Best practices

  • Always prefer invariant-based assertions: test expected behavior, not error history.
  • When fixing bugs: find or add tests that fail first, then implement the fix and verify all tests pass.
  • Mock dependencies only; never mock the system under test. Use patch.object for reliable patches.
  • Model test data with Pydantic models or typed factories; avoid naked literals by extracting named constants.
  • Extract pure functions from infrastructure-heavy code to simplify unit tests and parameterized cases.
  • Cover new code paths with tests and include E2E checks that exercise production code where appropriate.

Example use cases

  • User reports a bug: write a failing test that exposes the bug, fix code, then run the suite to confirm regression is closed.
  • Adding a new feature: write TDD-style unit tests first, implement minimal code, then expand tests for edge cases and integration.
  • Refactoring: extract a pure function from complex code and add focused unit tests to keep behavior stable.
  • Improving flaky tests: replace ad-hoc mocks with patch.object on dependencies and switch to Pydantic fixtures for consistent test data.
  • Writing E2E tests: call production services directly in the test and assert resulting published events and outputs.

FAQ

What if existing tests already pass but the bug still exists?

Locate or add a targeted test that exercises the buggy behavior and modify it until it fails; that failing test is your TDD safety net before fixing the code.

When should I mock vs call real code?

Mock external dependencies and side-effecting integrations. Do not mock the system under test; prefer calling real production code for E2E scenarios and critical behavior verification.