systematic-debugging skill

This skill guides systematic debugging across four phases to reproduce, isolate, understand root causes, and verify fixes, reducing guesswork.

npx playbooks add skill vudovn/antigravity-kit --skill systematic-debugging

SKILL.md

---
name: systematic-debugging
description: 4-phase systematic debugging methodology with root cause analysis and evidence-based verification. Use when debugging complex issues.
allowed-tools: Read, Glob, Grep
---

# Systematic Debugging

> Source: obra/superpowers

## Overview
This skill provides a structured approach to debugging that replaces random guessing with evidence and ensures a problem is understood before a fix is attempted.

## 4-Phase Debugging Process

### Phase 1: Reproduce
Before fixing, reliably reproduce the issue.

```markdown
## Reproduction Steps
1. [Exact step to reproduce]
2. [Next step]
3. [Expected vs actual result]

## Reproduction Rate
- [ ] Always (100%)
- [ ] Often (50-99%)
- [ ] Sometimes (10-49%)
- [ ] Rare (<10%)
```
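
The reproduction rate can be measured rather than estimated. The sketch below assumes the bug can be triggered by a command whose non-zero exit status means the bug appeared. `repro_rate` is a hypothetical helper name, not part of the original skill.

```bash
# repro_rate RUNS COMMAND [ARGS...]
# Hypothetical helper: run COMMAND RUNS times and print the percentage
# of runs that failed (a non-zero exit status counts as a reproduction).
repro_rate() {
  local runs failures i
  runs=$1
  shift
  if [ "$runs" -le 0 ]; then
    echo "repro_rate: RUNS must be a positive integer" >&2
    return 2
  fi
  failures=0
  i=0
  while [ "$i" -lt "$runs" ]; do
    # Discard output; only the exit status matters here.
    if ! "$@" >/dev/null 2>&1; then
      failures=$((failures + 1))
    fi
    i=$((i + 1))
  done
  # Integer percentage, rounded down.
  echo $((failures * 100 / runs))
}
```

For example, `repro_rate 20 ./repro.sh` prints the percentage of failing runs out of 20, where `./repro.sh` is a hypothetical script that exits non-zero when the bug appears. The result can be recorded under Reproduction Rate in the template.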

### Phase 2: Isolate
Narrow down the source.

```markdown
## Isolation Questions
- When did this start happening?
- What changed recently?
- Does it happen in all environments?
- Can we reproduce with minimal code?
- What's the smallest change that triggers it?
```
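
When the answer to "What changed recently?" is buried among many commits, `git bisect` can find the first bad commit by binary search. This is a minimal sketch, not the skill's prescribed method. It assumes a ref that is known to be good and a reproduction script that exits 0 when the bug is absent and with a status from 1 to 127 (except 125) when it is present. `find_first_bad` and its arguments are hypothetical names.

```bash
# find_first_bad GOOD_REF REPRO_SCRIPT
# Hypothetical wrapper around `git bisect`. Run from the repository root.
#   GOOD_REF      commit or tag known not to have the bug (assumption)
#   REPRO_SCRIPT  exits 0 when the bug is absent, 1-127 (except 125) when present
find_first_bad() {
  local good_ref repro first_bad
  good_ref=$1
  repro=$2
  # Mark the current commit as bad and GOOD_REF as good.
  git bisect start HEAD "$good_ref" >/dev/null || return 1
  # Let git check out midpoints and classify each one with the script.
  if ! git bisect run "$repro" >/dev/null 2>&1; then
    git bisect reset >/dev/null 2>&1
    return 1
  fi
  # After a successful run, refs/bisect/bad names the first bad commit.
  first_bad=$(git rev-parse refs/bisect/bad)
  # Return the working tree to where it was before bisecting.
  git bisect reset >/dev/null 2>&1
  echo "$first_bad"
}
```

Keep the reproduction script outside the repository, or untracked, so that the checkouts made during bisection do not change it, and pass it by absolute path.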

### Phase 3: Understand
Find the root cause, not just symptoms.

```markdown
## Root Cause Analysis
### The 5 Whys
1. Why: [First observation]
2. Why: [Deeper reason]
3. Why: [Still deeper]
4. Why: [Getting closer]
5. Why: [Root cause]
```

### Phase 4: Fix & Verify
Apply the fix, then verify that the bug is actually gone.

```markdown
## Fix Verification
- [ ] Bug no longer reproduces
- [ ] Related functionality still works
- [ ] No new issues introduced
- [ ] Test added to prevent regression
```

## Debugging Checklist

```markdown
## Before Starting
- [ ] Can reproduce consistently
- [ ] Have minimal reproduction case
- [ ] Understand expected behavior

## During Investigation
- [ ] Check recent changes (git log)
- [ ] Check logs for errors
- [ ] Add logging if needed
- [ ] Use debugger/breakpoints

## After Fix
- [ ] Root cause documented
- [ ] Fix verified
- [ ] Regression test added
- [ ] Similar code checked
```

## Common Debugging Commands

```bash
# Recent changes
git log --oneline -20
git diff HEAD~5

# Search for pattern
grep -r "errorPattern" --include="*.ts" .

# Check logs
pm2 logs app-name --err --lines 100
```

## Anti-Patterns

- ❌ **Random changes** - "Maybe if I change this..."
- ❌ **Ignoring evidence** - "That can't be the cause"
- ❌ **Assuming** - "It must be X" without proof
- ❌ **Not reproducing first** - Fixing blindly
- ❌ **Stopping at symptoms** - Not finding root cause

Overview

This skill teaches a 4-phase systematic debugging methodology that prevents guesswork and ensures durable fixes. It focuses on reproducing issues, isolating their source, discovering root causes, and verifying fixes with evidence. The workflow emphasizes documentation, repeatable tests, and regression prevention to reduce wasted time.

How this skill works

Start by creating a reliable reproduction and measuring how often it occurs. Narrow the problem with isolation questions and minimal test cases, then perform root cause analysis (for example, the 5 Whys) to move past symptoms. Implement a fix, verify it against the reproduction case and related functionality, and add tests or checks to prevent regressions. The skill includes checklists, commands, and anti-patterns to keep investigations disciplined.

When to use it

- When investigating intermittent or complex bugs that resist quick fixes
- When multiple systems or teams are involved and you need a reproducible scope
- Before rolling out a patch, to avoid regressions and hidden side effects
- When symptoms recur after apparent fixes
- When you need to document cause and rationale for future maintenance

Best practices

- Always reproduce the bug before making changes, and record the exact steps
- Create a minimal reproduction to isolate variables and reduce noise
- Ask isolation questions: what changed, when did it start, and does it occur across environments?
- Use evidence (logs, debugger, git history) to support hypotheses rather than assumptions
- Document the root cause and add regression tests or monitoring after verification

Example use cases

- A production service occasionally returns 500s: reproduce locally with a minimal request and trace recent deploys
- A frontend UI displays wrong data only for certain users: isolate by reproducing with a minimal user record and checking API responses
- A performance regression appears after a dependency upgrade: use bisection and small repros to find the offending change
- CI tests are flaky: reproduce locally, add logging, and make the tests deterministic to prevent future flakiness

FAQ

What if I can't reliably reproduce the issue?

Build instrumentation and logging to capture the failing state in production, then create a reduced test case from that evidence; aim for at least intermittent reproduction before changing code.
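
For failures that can be triggered from a command line, a local or CI analogue of this advice is to rerun the trigger until it fails and keep that run's output as evidence. This is a minimal sketch under that assumption. `capture_failure` is a hypothetical helper, and the log format is illustrative only.

```bash
# capture_failure MAX_RUNS LOG_FILE COMMAND [ARGS...]
# Hypothetical helper: rerun COMMAND until it fails, then save the failing
# run's output and exit status to LOG_FILE as evidence.
# Returns 0 if a failure was captured, 1 if every run passed.
capture_failure() {
  local max_runs log_file i output exit_code
  max_runs=$1
  log_file=$2
  shift 2
  i=1
  while [ "$i" -le "$max_runs" ]; do
    # Capture stdout and stderr together and remember the exit status.
    if output=$("$@" 2>&1); then
      exit_code=0
    else
      exit_code=$?
    fi
    if [ "$exit_code" -ne 0 ]; then
      {
        echo "run=$i exit=$exit_code time=$(date -u +%Y-%m-%dT%H:%M:%SZ)"
        echo "$output"
      } >"$log_file"
      return 0
    fi
    i=$((i + 1))
  done
  return 1
}
```

The saved log records which attempt failed and what it printed, which gives a starting point for building a reduced test case.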

How many 'Whys' should I ask?

Use the 5 Whys as a guideline, not a quota. Continue until you reach an actionable root cause, and stop when you can implement a fix that prevents recurrence.