
test-driven-development skill

/skills/paulpete/test-driven-development

This skill enforces test-first development across spec, task, or description inputs, guiding red-green-refactor cycles with repository patterns.

npx playbooks add skill openclaw/skills --skill test-driven-development

Review the files below or copy the command above to add this skill to your agents.

Files (2)
SKILL.md (4.4 KB)
---
name: test-driven-development
description: Unified TDD skill with three input modes — from spec, from task, or from description. Enforces test-first development using repository patterns, with proptest guidance and backpressure integration.
type: anthropic-skill
version: "1.0"
---

# Test-Driven Development

## Overview

One skill for all TDD workflows. Enforces test-first development using existing repository patterns. Three input modes handle different entry points — specs, task files, or ad-hoc descriptions — but the core cycle is always RED → GREEN → REFACTOR.

## Input Modes

Detect the input type and follow the corresponding mode:

### Mode A: From Spec (`.spec.md`)

Use when the input references a `.spec.md` file with Given/When/Then acceptance criteria.

1. **Locate and parse** the spec file — extract all Given/When/Then triples
2. **Generate one test stub per criterion** with `todo!()` bodies:
   ```rust
   /// Spec: <spec-file> — Criterion #<N>
   /// Given <given text>
   /// When <when text>
   /// Then <then text>
   #[test]
   fn <spec_name>_criterion_<N>_<slug>() {
       todo!("Implement: <then text>");
   }
   ```
3. **Verify the stubs compile** (`cargo test --no-run -p <crate>`), then run them to confirm each fails via its `todo!()` panic
4. Proceed to the [TDD Cycle](#tdd-cycle) to make stubs pass

**Programmatic support:** `ralph_core::preflight::{extract_acceptance_criteria, extract_criteria_from_file, extract_all_criteria}` can parse criteria from spec files.

### Mode B: From Task (`.code-task.md`)

Use when the input references a `.code-task.md` file or a specific implementation task.

1. **Read the task** and identify acceptance criteria or requirements
2. **Discover patterns** (see [Pattern Discovery](#pattern-discovery))
3. **Design test scenarios** covering normal operation, edge cases, and error conditions
4. **Write failing tests** for all requirements before any implementation
5. Proceed to the [TDD Cycle](#tdd-cycle)
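The scenario design in step 3 can be sketched for a hypothetical `parse_port` helper (not part of this repository). The three tests were written first and failed; the implementation shown is the minimal GREEN-step result included so the sketch compiles:

```rust
// Hypothetical: `parse_port` and its tests are illustrative, not from the repo.
// The tests define the contract and were written before the function body.
fn parse_port(s: &str) -> Result<u16, String> {
    match s.parse::<u16>() {
        Ok(0) => Err("port 0 is reserved".to_string()),
        Ok(p) => Ok(p),
        Err(_) => Err(format!("not a port number: {s}")),
    }
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn accepts_a_valid_port() {
        // normal operation
        assert_eq!(parse_port("8080"), Ok(8080));
    }

    #[test]
    fn rejects_port_zero() {
        // edge case
        assert!(parse_port("0").is_err());
    }

    #[test]
    fn rejects_non_numeric_input() {
        // error condition
        assert!(parse_port("http").is_err());
    }
}
```

One test per scenario kind keeps failures diagnosable: a red `rejects_port_zero` points directly at boundary handling rather than at parsing in general.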

### Mode C: From Description

Use for ad-hoc tasks without a spec or task file.

1. **Clarify requirements** from the description
2. **Discover patterns** (see [Pattern Discovery](#pattern-discovery))
3. **Write failing tests** targeting the described behavior
4. Proceed to the [TDD Cycle](#tdd-cycle)

## Pattern Discovery

Before writing tests, discover existing conventions:

```bash
rg --files -g "crates/*/tests/*.rs"
rg -n "#\[cfg\(test\)\]" crates/
```

Read 2-3 relevant test files near the target code. Mirror:
- Test module layout, naming, and assertion style
- Fixture helpers and test utilities
- Use of `tempfile`, scenarios, or harnesses

## TDD Cycle

### 1) RED — Failing Tests

- Write tests for the exact behavior required
- Run tests to confirm failure **for the right reason**
- If tests pass without implementation, the test is wrong

### 2) GREEN — Minimal Implementation

- Write the minimum code to make tests pass
- No extra features or refactoring during this step

### 3) REFACTOR — Clean Up

- Improve implementation and tests while keeping tests green
- Align with surrounding codebase conventions
- Re-run tests after every change
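One full cycle can be sketched with a hypothetical `slugify` helper (illustrative only, not an API from this repository):

```rust
// RED: this test was written first and failed, since `slugify` did not exist.
#[test]
fn slugify_lowercases_and_hyphenates() {
    assert_eq!(slugify("Hello World"), "hello-world");
}

// GREEN: the minimal implementation -- just enough to pass the test above.
fn slugify(input: &str) -> String {
    input
        .to_lowercase()
        .split_whitespace()
        .collect::<Vec<_>>()
        .join("-")
}

// REFACTOR: with the test green, rename locals, extract helpers, or align
// with neighboring code style, re-running the test after each change.
```

Resist adding behavior (say, stripping punctuation) during GREEN; if it is needed, it gets its own failing test first.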

## Proptest Guidance

Use `proptest` only when ALL of:
- Function is pure (no I/O, no time, no globals)
- Deterministic output for given input
- Non-trivial input space or edge cases

```rust
proptest! {
    #[test]
    fn round_trip(input in "[a-z0-9]{0,32}") {
        let encoded = encode(input.as_str());
        let decoded = decode(&encoded).expect("should decode");
        prop_assert_eq!(decoded, input);
    }
}
```

Don't introduce proptest as a new dependency without strong justification.

## Backpressure Integration

Include coverage evidence in completion events:

```bash
ralph emit "build.done" "tests: pass, lint: pass, typecheck: pass, audit: pass, coverage: pass (82%)"
```

Run `cargo tarpaulin --out Html --output-dir coverage --skip-clean` when feasible. If coverage cannot be run, state why and include targeted test evidence instead.

## Test Location Rules

- Spec maps to a single module → inline `#[cfg(test)]` tests
- Spec spans multiple modules → integration test in `crates/<crate>/tests/`
- CLI behavior → `crates/ralph-cli/tests/`
- Follow existing patterns in the target crate
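The rules above map to files roughly as follows (crate names are placeholders):

```
crates/<crate>/src/lib.rs        # single-module spec: inline #[cfg(test)] tests
crates/<crate>/tests/spec_x.rs   # multi-module spec: integration tests
crates/ralph-cli/tests/cli.rs    # CLI behavior tests
```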

## Anti-Patterns

- Writing implementation before tests
- Generating tests that pass without implementation
- Copying tests from other crates without adapting to local patterns
- Adding proptest when a simple example test suffices
- Emitting completion events without coverage evidence

Overview

This skill provides a unified Test-Driven Development (TDD) workflow with three input modes—from spec, from task, or from ad-hoc description—enforcing test-first development and the RED → GREEN → REFACTOR cycle. It adapts tests to existing repository patterns, integrates proptest-style property testing guidance, and emits backpressure signals with coverage evidence. The goal is predictable, low-risk changes that match project conventions.

How this skill works

The skill detects the input mode and extracts acceptance criteria or requirements. It generates failing test stubs or targeted tests that mirror local patterns, runs tests to confirm correct failures, implements the minimal code to pass them, then refactors while keeping tests green. It advises when to use property tests, how to place tests in the codebase, and how to include coverage and completion evidence in build events.

When to use it

  • You have a .spec.md file with Given/When/Then acceptance criteria.
  • You are handed a .code-task.md or a specific implementation task for a crate or module.
  • You receive an ad-hoc description and need to clarify behavior before coding.
  • You must align new tests with existing repository conventions and fixtures.
  • You need to include test coverage evidence in completion events for CI/backpressure.

Best practices

  • Always write a failing test for each acceptance criterion before implementing code.
  • Read 2–3 nearby tests to mirror naming, layout, and assertion styles.
  • Use property tests only for pure, deterministic functions with non-trivial input spaces.
  • Keep the GREEN step minimal: implement just enough to make tests pass.
  • Emit coverage or targeted test evidence when signaling completion to CI.

Example use cases

  • Parse a spec file, generate one failing test per Given/When/Then criterion, then implement features iteratively.
  • Take a task file, design edge-case and error-condition tests, run them failing, implement, and refactor.
  • Turn an informal feature description into clarified requirements and test stubs before coding.
  • Add integration tests to crates/<crate>/tests/ when a spec spans multiple modules.
  • Use targeted coverage runs or tarpaulin output to include coverage percentages in build events.

FAQ

When should I add proptest-based tests?

Only when the function is pure, deterministic, and the input space or edge cases are large enough to justify a new dependency.

What if generated tests pass without implementation?

Revise the test: it likely asserts a default behavior or duplicates existing code; failing for the right reason is essential.

How do I report completion if coverage tools are unavailable?

State why coverage couldn't run and include targeted test evidence and test counts in the completion event.