
This skill helps you validate test effectiveness through mutation testing with Stryker and mutmut, revealing weak tests and improving quality.

npx playbooks add skill secondsky/claude-skills --skill mutation-testing

---
name: mutation-testing
description: Validate test effectiveness with mutation testing using Stryker (TypeScript/JavaScript) and mutmut (Python). Find weak tests that pass despite code mutations. Use to improve test quality.
allowed-tools: Bash, Read, Edit, Write, Grep, Glob, TodoWrite
---

# Mutation Testing

Expert knowledge for mutation testing - validating that your tests actually catch bugs by introducing deliberate code mutations.

## Core Concept

- **Mutants**: Small code changes introduced automatically
- **Killed**: Test fails with mutation (good - test caught the bug)
- **Survived**: Test passes with mutation (bad - weak test)
- **Score**: Percentage of mutants killed (aim for 80%+)
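As a rough illustration of how the score is derived, the sketch below computes it from mutant counts. The `MutantCounts` shape and the `mutationScore` name are hypothetical, and the formula is simplified: real tools track additional states (such as timeouts and mutants with no test coverage) that are ignored here.

```typescript
// Hypothetical shape for illustration; not a Stryker or mutmut type.
interface MutantCounts {
  killed: number
  survived: number
}

// Simplified score: killed mutants as a percentage of all tested mutants.
function mutationScore({ killed, survived }: MutantCounts): number {
  const total = killed + survived
  if (total === 0) return 0 // no mutants were tested, so there is nothing to score
  return (killed * 100) / total
}

const score = mutationScore({ killed: 42, survived: 8 })
console.log(`${score.toFixed(1)}%`) // 84.0%
```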

## TypeScript/JavaScript (Stryker)

### Installation

```bash
# Using Bun
bun add -d @stryker-mutator/core @stryker-mutator/vitest-runner

# Using npm
npm install -D @stryker-mutator/core @stryker-mutator/vitest-runner
```

### Configuration

```typescript
// stryker.config.mjs
export default {
  // Stryker's documented values are 'npm', 'yarn', and 'pnpm'; confirm your version accepts 'bun'
  packageManager: 'bun',
  reporters: ['html', 'clear-text', 'progress'],
  testRunner: 'vitest',
  coverageAnalysis: 'perTest',
  mutate: ['src/**/*.ts', '!src/**/*.test.ts'],
  thresholds: { high: 80, low: 60, break: 60 },
  incremental: true,
}
```

### Running Stryker

```bash
# Run mutation testing
bunx stryker run

# Incremental mode (reuses prior results; re-tests only mutants affected by changes)
bunx stryker run --incremental

# Specific files
bunx stryker run --mutate "src/utils/**/*.ts"

# Open HTML report (default path in recent Stryker versions; `open` is macOS-specific)
open reports/mutation/mutation.html
```

### Example: Weak Test

```typescript
// Source code
function calculateDiscount(price: number, percentage: number): number {
  return price - (price * percentage / 100)
}

// ❌ WEAK: Test passes even if we mutate calculation
test('applies discount', () => {
  expect(calculateDiscount(100, 10)).toBeDefined() // Too weak!
})

// ✅ STRONG: Test catches mutation
test('applies discount correctly', () => {
  expect(calculateDiscount(100, 10)).toBe(90)
  expect(calculateDiscount(100, 20)).toBe(80)
  expect(calculateDiscount(50, 10)).toBe(45)
})
```

## Python (mutmut)

### Installation

```bash
uv add --dev mutmut
```

### Running mutmut

```bash
# Run mutation testing
uv run mutmut run

# Show results
uv run mutmut results

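# Show a specific mutant (numeric ID in mutmut 2.x; mutmut 3.x uses the names listed by `mutmut results`)
uv run mutmut show 1

# Generate HTML report (mutmut 2.x; mutmut 3.x provides `mutmut browse` instead)
uv run mutmut html
open html/index.html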
```

## Common Mutation Types

```typescript
// Arithmetic Operator
// Original: a + b → a - b, a * b, a / b

// Relational Operator
// Original: a > b → a >= b, a < b, a <= b

// Logical Operator
// Original: a && b → a || b

// Boolean Literal
// Original: true → false
```

## Mutation Score Targets

| Score | Quality | Action |
|-------|---------|--------|
| 90%+ | Excellent | Maintain quality |
| 80-89% | Good | Small improvements |
| 60-79% | Acceptable | Focus on weak areas |
| < 60% | Poor | Major improvements needed |

## Improving Weak Tests

### Pattern: Insufficient Assertions

```typescript
// Before: Mutation survives
test('calculates sum', () => {
  expect(sum([1, 2, 3])).toBeGreaterThan(0) // Weak!
})

// After: Mutation killed
test('calculates sum correctly', () => {
  expect(sum([1, 2, 3])).toBe(6)
  expect(sum([0, 0, 0])).toBe(0)
  expect(sum([])).toBe(0)
})
```

### Pattern: Boundary Conditions

```typescript
// Assert on both sides of each boundary so relational-operator mutants are killed
test('validates age boundaries', () => {
  expect(isValidAge(18)).toBe(true)   // Min valid
  expect(isValidAge(17)).toBe(false)  // Just below
  expect(isValidAge(100)).toBe(true)  // Max valid
  expect(isValidAge(101)).toBe(false) // Just above
})
```
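The boundary test above relies on an `isValidAge` function that is not shown. A minimal sketch, assuming the valid range is 18 to 100 inclusive, illustrates why each boundary value matters. The hand-written mutants are only stand-ins for the relational-operator mutations a tool might generate.

```typescript
// Assumed implementation: valid ages are 18 through 100 inclusive.
function isValidAge(age: number): boolean {
  return age >= 18 && age <= 100
}

// Hand-written stand-ins for relational-operator mutants (illustrative only).
const mutants: Record<string, (age: number) => boolean> = {
  'age > 18 (was >=)': (age) => age > 18 && age <= 100,
  'age < 100 (was <=)': (age) => age >= 18 && age < 100,
}

// The four boundary cases from the test above: [input, expected result].
const cases: Array<[number, boolean]> = [
  [18, true],
  [17, false],
  [100, true],
  [101, false],
]

// A mutant is killed if at least one case disagrees with the expected result.
for (const [name, mutant] of Object.entries(mutants)) {
  const killed = cases.some(([age, expected]) => mutant(age) !== expected)
  console.log(`${name}: ${killed ? 'killed' : 'survived'}`)
}
```

With only the 17 and 101 cases, both mutants would survive, because those inputs never reach the exact boundary where `>=` and `>` (or `<=` and `<`) differ.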

## Best Practices

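- Start with core business logic modules and other high-value files
- Ensure 80%+ coverage before mutation testing, since mutants in uncovered code always survive
- Run incrementally so only mutants affected by changes are re-tested
- Don't expect 100% mutation score (equivalent mutants change the code without changing its behavior, so no test can kill them)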

## Workflow

```bash
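# 1. Ensure good coverage first (Vitest, to match the Stryker runner; needs a coverage provider such as @vitest/coverage-v8)
bunx vitest run --coverage
# Target: 80%+ coverage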

# 2. Run mutation testing
bunx stryker run

# 3. Check report (default path in recent Stryker versions; `open` is macOS-specific)
open reports/mutation/mutation.html

# 4. Strengthen tests to kill surviving mutants
# 5. Re-run incrementally
bunx stryker run --incremental
# or: npx stryker run --incremental
```

## See Also

- `vitest-testing` - Unit testing framework
- `test-quality-analysis` - Detecting test smells

Overview

This skill validates test effectiveness by running mutation testing with Stryker for TypeScript/JavaScript and mutmut for Python. It injects small code mutations, reports which mutants survive, and produces HTML reports to guide test improvements. Use it to find weak tests and raise your mutation score toward recommended targets (80%+).

How this skill works

The skill runs mutation engines (Stryker or mutmut) against your codebase, generating mutants by applying common mutation operators (arithmetic, relational, logical, boolean). It executes your test suite against each mutant, marks mutants as killed or survived, and calculates a mutation score. Reports highlight surviving mutants and link back to source so you can strengthen assertions and cover boundary cases.
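The kill/survive loop described above can be sketched without any tooling. Everything here (`Mutant`, `Suite`, `runMutationAnalysis`, and the hand-written mutants) is hypothetical and only mimics what Stryker or mutmut automate; real engines rewrite source code and run your actual test suite.

```typescript
type Sum = (values: number[]) => number

// Original implementation under test.
const sum: Sum = (values) => values.reduce((acc, v) => acc + v, 0)

// Hand-written mutants standing in for what a mutation engine would generate.
interface Mutant {
  description: string
  impl: Sum
}

const mutants: Mutant[] = [
  { description: 'acc + v → acc - v', impl: (values) => values.reduce((acc, v) => acc - v, 0) },
  { description: 'acc + v → acc * v', impl: (values) => values.reduce((acc, v) => acc * v, 0) },
  { description: 'initial 0 → 1', impl: (values) => values.reduce((acc, v) => acc + v, 1) },
]

// A "test suite" here is a function that returns true when all assertions pass.
type Suite = (impl: Sum) => boolean

const weakSuite: Suite = (impl) => impl([1, 2, 3]) > 0
const strongSuite: Suite = (impl) => impl([1, 2, 3]) === 6 && impl([]) === 0

// Run the suite against every mutant: a passing suite means the mutant survived.
function runMutationAnalysis(suite: Suite) {
  const survived = mutants.filter((m) => suite(m.impl)).map((m) => m.description)
  const killed = mutants.length - survived.length
  return { killed, survived, score: (killed * 100) / mutants.length }
}

console.log('weak:', runMutationAnalysis(weakSuite))
console.log('strong:', runMutationAnalysis(strongSuite))
```

In this sketch the weak suite lets the `initial 0 → 1` mutant survive, because 7 is still greater than 0, while the exact-value assertions in the strong suite kill all three mutants.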

When to use it

  • After you have decent unit coverage (roughly 80% or more).
  • When you want to measure actual fault-detection ability of tests, not just coverage.
  • Before releases for high-risk modules or core business logic.
  • During CI for incremental checks on changed files.
  • When improving test quality or onboarding new testing guidelines.

Best practices

  • Start mutation testing on core business logic and utilities before expanding to UI code.
  • Ensure unit tests are deterministic and isolate external dependencies (mocks/stubs) so mutants are killed reliably rather than by flaky failures.
  • Target 80%+ mutation score as a practical goal; don’t chase 100%, because equivalent mutants cannot be killed.
  • Run mutation tests incrementally (only on changed files) to save CI time.
  • Write specific assertions, and test multiple inputs and boundary conditions, to kill subtle mutants.

Example use cases

  • Run Stryker for a TypeScript library to detect weak numeric and branching tests; open the HTML report to find survived mutants.
  • Use mutmut on a Python data-processing module to see which logical operator mutations survive and add stronger assertions.
  • Add mutation testing to CI for critical services and fail the build when score drops below team thresholds.
  • Use incremental mutation runs during development to validate tests only for files you changed.
  • Convert weak smoke tests into precise behavior tests (value assertions and edge cases) after inspecting surviving mutants.

FAQ

Will mutation testing replace coverage tools?

No. Coverage measures which code is exercised; mutation testing measures whether tests detect faults. Use both: coverage first, then mutation testing to validate test quality.

How long do mutation runs take, and how can I speed them up?

Mutation testing is slower than a plain unit-test run because the tests are re-executed against many mutant versions of the code, so run time grows with the number of mutants. Speed it up by running incrementally, focusing on critical files, using per-test coverage analysis, and parallelizing runners where supported.
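
Several of these speed-ups map onto Stryker configuration options. The fragment below is a sketch rather than a recommended setup: the `concurrency` value and the `src/core` glob are assumptions to adapt to your project and CI hardware.

```typescript
// stryker.config.mjs (sketch; values are assumptions, tune for your project)
export default {
  testRunner: 'vitest',
  // Narrow the scope to critical files first (hypothetical path).
  mutate: ['src/core/**/*.ts', '!src/**/*.test.ts'],
  // Run only the tests that cover each mutant.
  coverageAnalysis: 'perTest',
  // Reuse results from the previous run where code and tests are unchanged.
  incremental: true,
  // Number of parallel test-runner workers (assumed value).
  concurrency: 4,
}
```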