home / skills / madappgang / claude-code / tester-detective

tester-detective skill

needs review

/plugins/code-analysis/skills/tester-detective

This skill helps you audit test coverage using AST callers analysis to identify gaps, edge cases, and undocumented functions.

npx playbooks add skill madappgang/claude-code --skill tester-detective

Review the files below or copy the command above to add this skill to your agents.

Files (1)

SKILL.md

17.1 KB

---
name: tester-detective
description: "⚡ Test analysis skill. Best for: 'what's tested', 'find test coverage', 'audit test quality', 'missing tests', 'edge cases'. Uses claudemem AST with callers analysis for efficient test discovery."
allowed-tools: Bash, Task, Read, AskUserQuestion
---

# Tester Detective Skill

This skill uses claudemem's callers analysis for test coverage investigation.

## Why Claudemem Works Better for Test Analysis

| Task | claudemem | Native Tools |
|------|-----------|--------------|
| Find tests for function | `callers` shows test files | Grep *.test.ts |
| Coverage gaps | No test callers = untested | Manual checking |
| Test distribution | Caller analysis | Unknown |
| Test relationships | AST direct mapping | Text search |

**Primary commands:**
- `claudemem --agent callers <name>` - Find tests that call this function
- `claudemem --agent map "test spec"` - Map test infrastructure

# Tester Detective Skill

**Version:** 3.3.0
**Role:** QA Engineer / Test Specialist
**Purpose:** Test coverage investigation using AST callers analysis and automated test-gaps detection

## Role Context

You are investigating this codebase as a **QA Engineer**. Your focus is on:
- **Test coverage** - What is tested vs. untested
- **Test callers** - Which tests call each function
- **Edge cases** - Boundary conditions in tests
- **Test quality** - Are tests meaningful or superficial
- **Coverage gaps** - Functions without test callers

## Why `callers` is Perfect for Test Analysis

The `callers` command shows you:
- **Test callers** = Tests appear as callers of the function
- **Coverage gaps** = No test callers = untested code
- **Test distribution** = Which tests cover which code
- **Direct relationships** = Exact test-to-code mapping

## Tester-Focused Commands (v0.3.0)

### Find Tests for a Function

```bash
# Who calls this function? (tests will appear as callers)
claudemem --agent callers processPayment
# Filter: callers from test files are your tests
# src/services/payment.test.ts:45 → This is a test!
```

### Map Test Infrastructure

```bash
# Find all test files
claudemem --agent map "test spec describe it"
# Find test utilities
claudemem --agent map "test helper mock stub"
# Find fixtures
claudemem --agent map "fixture factory builder"```

### Test Coverage Gaps (v0.4.0+ Required)

```bash
# Find high-importance untested code automatically
claudemem --agent test-gaps
# Output:
# file: src/services/payment.ts
# line: 45-89
# name: processPayment
# pagerank: 0.034
# production_callers: 4
# test_callers: 0
# ---
# This is CRITICAL - high PageRank but no tests!
```

**Why test-gaps is better than manual analysis**:
- Automatically finds high-PageRank symbols
- Automatically counts test vs production callers
- Prioritized list of coverage gaps

**Handling Empty Results:**
```bash
GAPS=$(claudemem --agent test-gaps)
if [ -z "$GAPS" ] || echo "$GAPS" | grep -q "No test gaps"; then
  echo "Excellent test coverage! All high-importance code has tests."
  echo ""
  echo "Optional: Check lower-importance code:"
  echo "  claudemem --agent test-gaps --min-pagerank 0.005"
else
  echo "Test Coverage Gaps Found:"
  echo "$GAPS"
fi
```

**Limitations Note:**
Test detection relies on file naming patterns:
- `*.test.ts`, `*.spec.ts`, `*_test.go`, etc.
- Integration tests in non-standard locations may not be detected
- Manual test files require naming convention updates

### Find Untested Code

**Method 1: Automated (v0.4.0+ Required - Recommended)**

```bash
# Let claudemem find all gaps automatically
GAPS=$(claudemem --agent test-gaps)

if [ -z "$GAPS" ]; then
  echo "No high-importance untested code found!"
else
  echo "$GAPS"
fi

# Focus on critical gaps only
claudemem --agent test-gaps --min-pagerank 0.05```

**Method 2: Manual (for specific functions, v0.3.0 compatible)**

```bash
# Get callers for a function
claudemem --agent callers importantFunction
# If NO callers from *.test.ts or *.spec.ts files:
# This function has NO tests!
```

### Test Coverage Analysis

```bash
# For each critical function, check callers
claudemem --agent callers authenticateUserclaudemem --agent callers processPaymentclaudemem --agent callers saveToDatabase
# Note which have test callers and which don't
```

## PHASE 0: MANDATORY SETUP

### Step 1: Verify claudemem v0.3.0

```bash
which claudemem && claudemem --version
# Must be 0.3.0+
```

### Step 2: If Not Installed → STOP

Use AskUserQuestion (see ultrathink-detective for template)

### Step 3: Check Index Status

```bash
# Check claudemem installation and index
claudemem --version && ls -la .claudemem/index.db 2>/dev/null
```

### Step 3.5: Check Index Freshness

Before proceeding with investigation, verify the index is current:

```bash
# First check if index exists
if [ ! -d ".claudemem" ] || [ ! -f ".claudemem/index.db" ]; then
  # Use AskUserQuestion to prompt for index creation
  # Options: [1] Create index now (Recommended), [2] Cancel investigation
  exit 1
fi

# Count files modified since last index
STALE_COUNT=$(find . -type f \( -name "*.ts" -o -name "*.tsx" -o -name "*.js" -o -name "*.jsx" -o -name "*.py" -o -name "*.go" -o -name "*.rs" \) \
  -newer .claudemem/index.db 2>/dev/null | grep -v "node_modules" | grep -v ".git" | grep -v "dist" | grep -v "build" | wc -l)
STALE_COUNT=$((STALE_COUNT + 0))  # Normalize to integer

if [ "$STALE_COUNT" -gt 0 ]; then
  # Get index time with explicit platform detection
  if [[ "$OSTYPE" == "darwin"* ]]; then
    INDEX_TIME=$(stat -f "%Sm" -t "%Y-%m-%d %H:%M" .claudemem/index.db 2>/dev/null)
  else
    INDEX_TIME=$(stat -c "%y" .claudemem/index.db 2>/dev/null | cut -d'.' -f1)
  fi
  INDEX_TIME=${INDEX_TIME:-"unknown time"}

  # Get sample of stale files
  STALE_SAMPLE=$(find . -type f \( -name "*.ts" -o -name "*.tsx" \) \
    -newer .claudemem/index.db 2>/dev/null | grep -v "node_modules" | grep -v ".git" | head -5)

  # Use AskUserQuestion (see template in ultrathink-detective)
fi
```

### Step 4: Index if Needed

```bash
claudemem index
```

---

## Workflow: Test Coverage Analysis (v0.3.0)

### Phase 0: Automated Gap Detection (v0.4.0+ Required)

```bash
# Run test-gaps FIRST - it does the work for you
GAPS=$(claudemem --agent test-gaps)

if [ -z "$GAPS" ]; then
  echo "No gaps found at default threshold"
  echo "Optionally check with lower threshold:"
  claudemem --agent test-gaps --min-pagerank 0.005else
  # This gives you a prioritized list of:
  # - High-PageRank symbols
  # - With 0 test callers
  # - Sorted by importance
  echo "$GAPS"
fi
```

### Phase 1: Map Test Infrastructure

```bash
# Find test configuration
claudemem --agent map "jest vitest mocha config"
# Find test utilities and mocks
claudemem --agent map "mock stub spy helper"```

### Phase 2: Identify Critical Functions

```bash
# Map the feature area
claudemem --agent map "payment processing"
# High-PageRank functions are most critical to test
```

### Phase 3: Check Test Coverage via Callers

```bash
# For each critical function, check callers
claudemem --agent callers PaymentService
# Look for callers from test files:
# src/services/payment.test.ts:23 ← TEST CALLER
# src/controllers/checkout.ts:45 ← NOT A TEST
```

### Phase 4: Find Coverage Gaps

```bash
# Functions with NO test callers = untested
# Make a list of untested critical functions
```

### Phase 5: Analyze Test Quality

```bash
# For functions with test callers, read the tests
# Check: Are they testing edge cases? Error paths?
```

## Output Format: Test Coverage Report

### 1. Test Infrastructure Summary

```
┌─────────────────────────────────────────────────────────┐
│                   TEST INFRASTRUCTURE                    │
├─────────────────────────────────────────────────────────┤
│  Framework: Vitest 2.x                                  │
│  Test Files: 156 files (*.spec.ts, *.test.ts)          │
│  Test Utils: src/__tests__/utils/                       │
│  Search Method: claudemem v0.3.0 (callers analysis)    │
└─────────────────────────────────────────────────────────┘
```

### 2. Coverage by Function (via callers)

```
| Function            | Test Callers | Coverage |
|---------------------|--------------|----------|
| authenticateUser    | 5 tests      | ✅ Good   |
| processPayment      | 3 tests      | ✅ Good   |
| calculateDiscount   | 0 tests      | ❌ None   |
| sendEmail           | 1 test       | ⚠️ Low    |
| updateUserProfile   | 0 tests      | ❌ None   |
```

### 3. Untested Critical Functions

```
🔴 HIGH PRIORITY - No Test Callers:
   └── calculateDiscount (PageRank: 0.034)
       └── callers show: 4 production callers, 0 test callers
   └── updateUserProfile (PageRank: 0.028)
       └── callers show: 3 production callers, 0 test callers

⚠️ MEDIUM PRIORITY - Few Test Callers:
   └── sendEmail (PageRank: 0.021)
       └── callers show: 1 test, no edge case tests
```

### 4. Test Quality Notes

```
📝 OBSERVATIONS:

1. calculateDiscount has 4 production callers but 0 test callers
   → Critical business logic untested!

2. sendEmail has 1 test caller
   → Only happy path tested, no error scenarios

3. authenticateUser has 5 test callers
   → Good coverage including edge cases
```

## Scenarios

### Scenario: "What's tested?"

```bash
# Step 1: Map the feature
claudemem --agent map "payment"
# Step 2: For each function, check callers
claudemem --agent callers processPaymentclaudemem --agent callers validateCardclaudemem --agent callers chargeCustomer
# Step 3: Count test callers vs production callers
```

### Scenario: Finding Coverage Gaps

```bash
# Step 1: Find high-PageRank (important) functions
claudemem --agent map
# Step 2: Check callers for each
claudemem --agent callers importantFunc1claudemem --agent callers importantFunc2
# Step 3: Functions with 0 test callers = gap
```

### Scenario: Test Quality Audit

```bash
# Step 1: Find test callers
claudemem --agent callers targetFunction
# Step 2: Read each test file at the caller line
# Step 3: Check: Does test cover edge cases? Errors?
```

## Result Validation Pattern

After EVERY claudemem command, validate results:

### Callers Validation for Tests

When checking test coverage:

```bash
CALLERS=$(claudemem --agent callers processPayment)
EXIT_CODE=$?

# Check for command failure
if [ "$EXIT_CODE" -ne 0 ]; then
  DIAGNOSIS=$(claudemem status 2>&1)
  # Use AskUserQuestion for recovery
fi

# Validate we got callers, not an error
if echo "$CALLERS" | grep -qi "error\|failed"; then
  # Actual error, not 0 callers
  # Use AskUserQuestion
fi

# Count test vs production callers
TEST_CALLERS=$(echo "$CALLERS" | grep -E "\.test\.|\.spec\.|_test\." | wc -l)
PROD_CALLERS=$(echo "$CALLERS" | grep -v -E "\.test\.|\.spec\.|_test\." | wc -l)

# Report coverage ratio
if [ "$TEST_CALLERS" -eq 0 ]; then
  echo "WARNING: No test coverage found for this function"
fi
```

### Empty Results Validation

```bash
RESULTS=$(claudemem --agent map "test spec describe")

if [ -z "$RESULTS" ]; then
  echo "WARNING: No test infrastructure found"
  # May indicate:
  # 1. Tests in non-standard locations
  # 2. Index doesn't include test files
  # 3. Wrong query terms
  # Use AskUserQuestion
fi
```

---

## FALLBACK PROTOCOL

**CRITICAL: Never use grep/find/Glob without explicit user approval.**

If claudemem fails or returns irrelevant results:

1. **STOP** - Do not silently switch tools
2. **DIAGNOSE** - Run `claudemem status`
3. **REPORT** - Tell user what happened
4. **ASK** - Use AskUserQuestion for next steps

```typescript
// Fallback options (in order of preference)
AskUserQuestion({
  questions: [{
    question: "claudemem test coverage analysis failed or found no tests. How should I proceed?",
    header: "Test Coverage Issue",
    multiSelect: false,
    options: [
      { label: "Reindex codebase", description: "Run claudemem index (~1-2 min)" },
      { label: "Try different query", description: "Search for different test patterns" },
      { label: "Use grep (not recommended)", description: "Traditional search - loses caller analysis" },
      { label: "Cancel", description: "Stop investigation" }
    ]
  }]
})
```

**See ultrathink-detective skill for complete Fallback Protocol documentation.**

---

## Anti-Patterns

| Anti-Pattern | Why Wrong | Correct Approach |
|--------------|-----------|------------------|
| `grep "test"` | No caller relationships | `claudemem --agent callers func` |
| Assume tests exist | Miss coverage gaps | Verify with callers analysis |
| Count test files | Doesn't show what's tested | Check callers per function |
| Skip PageRank | Miss critical gaps | Focus on high-PageRank untested |
| `cmd \| head/tail` | Hides test coverage gaps | Use full output |

### Output Truncation Warning

╔══════════════════════════════════════════════════════════════════════════════╗
║                                                                              ║
║  ❌ Anti-Pattern 7: Truncating Claudemem Output                              ║
║                                                                              ║
║     FORBIDDEN (any form of output truncation):                               ║
║     → BAD: claudemem --agent map "query" | head -80                         ║
║     → BAD: claudemem --agent callers X | tail -50                           ║
║     → BAD: claudemem --agent search "x" | grep -m 10 "y"                    ║
║     → BAD: claudemem --agent map "q" | awk 'NR <= 50'                       ║
║     → BAD: claudemem --agent callers X | sed '50q'                          ║
║     → BAD: claudemem --agent search "x" | sort | head -20                   ║
║     → BAD: claudemem --agent map "q" | grep "pattern" | head -20            ║
║                                                                              ║
║     CORRECT (use full output or built-in limits):                            ║
║     → GOOD: claudemem --agent map "query"                                   ║
║     → GOOD: claudemem --agent search "x" -n 10                              ║
║     → GOOD: claudemem --agent map "q" --tokens 2000                         ║
║     → GOOD: claudemem --agent search "x" --page-size 20 --page 1            ║
║     → GOOD: claudemem --agent context Func --max-depth 3                    ║
║                                                                              ║
║     WHY: Output is pre-optimized; truncation hides critical results         ║
║                                                                              ║
╚══════════════════════════════════════════════════════════════════════════════╝

---

## Feedback Reporting (v0.8.0+)

After completing investigation, report search feedback to improve future results.

### When to Report

Report feedback ONLY if you used the `search` command during investigation:

| Result Type | Mark As | Reason |
|-------------|---------|--------|
| Read and used | Helpful | Contributed to investigation |
| Read but irrelevant | Unhelpful | False positive |
| Skipped after preview | Unhelpful | Not relevant to query |
| Never read | (Don't track) | Can't evaluate |

### Feedback Pattern

```bash
# Track during investigation
SEARCH_QUERY="your original query"
HELPFUL_IDS=""
UNHELPFUL_IDS=""

# When reading a helpful result
HELPFUL_IDS="$HELPFUL_IDS,$result_id"

# When reading an unhelpful result
UNHELPFUL_IDS="$UNHELPFUL_IDS,$result_id"

# Report at end of investigation (v0.8.0+ only)
if claudemem feedback --help 2>&1 | grep -qi "feedback"; then
  timeout 5 claudemem feedback \
    --query "$SEARCH_QUERY" \
    --helpful "${HELPFUL_IDS#,}" \
    --unhelpful "${UNHELPFUL_IDS#,}" 2>/dev/null || true
fi
```

### Output Update

Include in investigation report:

```
Search Feedback: [X helpful, Y unhelpful] - Submitted (v0.8.0+)
```

---

## Testing Tips

1. **Use callers to find tests** - Tests appear as callers of functions
2. **No test callers = no tests** - Coverage gap identified
3. **High PageRank + no tests = critical gap** - Prioritize these
4. **Read test callers** - Verify quality, not just existence
5. **Check edge cases** - Are error paths tested?

## Notes

- **`callers` reveals test coverage** - Tests are just callers from test files
- **High-PageRank untested = critical gap** - Most impactful coverage issues
- **Production callers vs test callers** - Ratio shows coverage health
- Filter callers by file path (*.test.ts, *.spec.ts) to find tests
- Works best with TypeScript, Go, Python, Rust codebases

---

**Maintained by:** MadAppGang
**Plugin:** code-analysis v2.7.0
**Last Updated:** December 2025 (v3.3.0 - Cross-platform compatibility, inline templates, improved validation)

Overview

This skill is a test analysis assistant that discovers what is tested, finds coverage gaps, and audits test quality using claudemem's AST-based callers analysis. It surfaces which tests call each function, highlights high‑importance untested symbols, and maps test infrastructure for focused QA work. The goal is fast, reliable test discovery and prioritized gap detection for engineering and QA teams.

How this skill works

The skill runs claudemem callers and mapping agents to locate direct AST caller relationships between tests and production code. It flags functions with zero test callers, computes prioritized gaps using PageRank or caller counts, and maps test files, utilities, and fixtures. When claudemem is unavailable or results look empty, it provides a strict fallback protocol to diagnose and ask the user before switching tools.

When to use it

Determine exactly which tests exercise a given function or module
Find untested critical code (high PageRank, no test callers)
Audit test quality and check for missing edge case coverage
Map test infrastructure, helpers, and fixtures across a repo
Prioritize testing work for releases or refactors

Best practices

Run claudemem index and verify index freshness before analysis
Start with automated test-gaps (v0.4.0+) to get prioritized issues
Use callers output to treat tests as first-class callers, not file counts
Validate claudemem command exit codes and search results before acting
Do not fallback to grep/find without explicit user approval

Example use cases

"What's tested?" — map a feature area and inspect callers for each function
Coverage audit — run test-gaps to list high-PageRank functions with zero tests
Test quality review — open each test caller and verify edge case and error-path coverage
Refactor safety check — confirm critical functions have test callers before changing signatures
CI triage — produce a concise list of urgent untested functions for the release checklist

FAQ

What if claudemem returns no results?

First validate index freshness and run claudemem status. If index is missing or stale, prompt to reindex. Only consider alternative searches or grep after asking the user.

How are coverage gaps prioritized?

Gaps are prioritized by symbol importance (PageRank), production caller counts, and absence of test callers to surface high‑impact untested code first.