home / skills / sjungling / claude-plugins / git-bisect-debugging
git-bisect-debugging skill

not checked
/plugins/workflow/skills/git-bisect-debugging
npx playbooks add skill sjungling/claude-plugins --skill git-bisect-debugging
Review the files below or copy the command above to add this skill to your agents.
Files (1)
SKILL.md
22.3 KB
---
name: git-bisect-debugging
description: Use when debugging regressions or identifying which commit introduced a bug - provides systematic workflow for git bisect with automated test scripts, manual verification, or hybrid approaches. Can be invoked from systematic-debugging as a debugging technique, or used standalone when you know the issue is historical.
---

# Git Bisect Debugging

## Overview

Systematically identify which commit introduced a bug or regression using git bisect. This skill provides a structured workflow for automated, manual, and hybrid bisect approaches.

**Core principle:** Binary search through commit history to find the exact commit that introduced the issue. Main agent orchestrates, subagents execute verification at each step.

**Announce at start:** "I'm using git-bisect-debugging to find which commit introduced this issue."

## Quick Reference

| Phase | Key Activities | Output |
|-------|---------------|--------|
| **1. Setup & Verification** | Identify good/bad commits, verify clean state | Confirmed commit range |
| **2. Strategy Selection** | Choose automated/manual/hybrid approach | Test script or verification steps |
| **3. Execution** | Run bisect with subagents | First bad commit hash |
| **4. Analysis & Handoff** | Show commit details, analyze root cause | Root cause understanding |

## MANDATORY Requirements

These are **non-negotiable**. No exceptions for time pressure, production incidents, or "simple" cases:

1. **✅ ANNOUNCE skill usage** at start:
   ```
   "I'm using git-bisect-debugging to find which commit introduced this issue."
   ```

2. **✅ CREATE TodoWrite checklist** immediately (before Phase 1):
   - Copy the exact checklist from "The Process" section below
   - Update status as you progress through phases
   - Mark phases complete ONLY when finished

3. **✅ VERIFY safety checks** (Phase 1 - no skipping):
   - Working directory MUST be clean (`git status`)
   - Good commit MUST be verified (actually good)
   - Bad commit MUST be verified (actually bad)
   - If ANY check fails → abort and fix before proceeding

4. **✅ USE AskUserQuestion** for strategy selection (Phase 2):
   - Present all 3 approaches (automated, manual, hybrid)
   - Don't default to automated without asking
   - User must explicitly choose

5. **✅ LAUNCH subagents** for verification (Phase 3):
   - Main agent: orchestrates git bisect state
   - Subagents: execute verification at each commit (via Task tool)
   - NEVER run verification in main context
   - Each commit tested in isolated subagent

6. **✅ HANDOFF to systematic-debugging** (Phase 4):
   - After finding bad commit, announce handoff
   - Use superpowers:systematic-debugging skill
   - Investigate root cause, not just WHAT changed

## Red Flags - STOP and Follow the Skill

If you catch yourself thinking ANY of these, you're about to violate the skill:

- "User is in a hurry, I'll skip safety checks" → NO. Run all safety checks.
- "This is simple, no need for TodoWrite" → NO. Create the checklist.
- "I'll just use automated approach" → NO. Use AskUserQuestion.
- "I'll run the test in my context" → NO. Launch subagent.
- "Found the commit, that's enough" → NO. Handoff to systematic-debugging.
- "Working directory looks clean" → NO. Run `git status` to verify.
- "I'll verify good/bad commits later" → NO. Verify BEFORE starting bisect.

**All of these mean: STOP. Follow the 4-phase workflow exactly.**

## The Process

Copy this checklist to track progress:

```
Git Bisect Progress:
- [ ] Phase 1: Setup & Verification (good/bad commits identified)
- [ ] Phase 2: Strategy Selection (approach chosen, script ready)
- [ ] Phase 3: Execution (first bad commit found)
- [ ] Phase 4: Analysis & Handoff (root cause investigation complete)
```

### Phase 1: Setup & Verification

**Purpose:** Ensure git bisect is appropriate and safe to run.

**Steps:**

1. **Verify prerequisites:**
   - Check we're in a git repository
   - Verify working directory is clean (`git status`)
   - If uncommitted changes exist, abort and ask user to commit or stash

2. **Identify commit range:**
   - Ask user for **good commit** (where it worked)
     - Suggestions: last release tag, last passing CI, commit from when it worked
     - Commands to help: `git log --oneline`, `git tag`, `git log --since="last week"`
   - Ask user for **bad commit** (where it's broken)
     - Usually: `HEAD` or a specific commit where issue confirmed
   - Calculate estimated steps: ~log2(commits between good and bad)

3. **Verify the range:**
   - Checkout bad commit and verify issue exists
   - Checkout good commit and verify issue doesn't exist
   - If reversed, offer to swap them
   - Return to original branch/commit

4. **Safety checks:**
   - Warn if range is >1000 commits (ask for confirmation)
   - Verify good commit is ancestor of bad commit
   - Note current branch/commit for cleanup later

**Output:** Confirmed good commit hash, bad commit hash, estimated steps

### Phase 2: Strategy Selection

**Purpose:** Choose the most efficient bisect approach.

**Assessment:** Can we write an automated test script that deterministically identifies good vs bad?

**MANDATORY: Use AskUserQuestion tool** to present these three approaches (do NOT default to automated):

```javascript
AskUserQuestion({
  questions: [{
    question: "Which git bisect approach should we use?",
    header: "Strategy",
    multiSelect: false,
    options: [
      {
        label: "Automated - test script runs automatically",
        description: "Fast, no manual intervention. Best for: test failures, crashes, deterministic behavior. Requires: working test script."
      },
      {
        label: "Manual - you verify each commit",
        description: "Handles subjective issues. Best for: UI/UX changes, complex scenarios. Requires: you can manually check each commit."
      },
      {
        label: "Hybrid - script + manual confirmation",
        description: "Efficient with reliability. Best for: mostly automated but needs judgment. Requires: script for most cases, manual for edge cases."
      }
    ]
  }]
})
```

**Three approaches to present:**

**Approach 1: Automated Bisect**
- **When to use:** Test failure, crash, deterministic behavior
- **How it works:** Script returns exit 0 (good) or 1 (bad), fully automatic
- **Benefits:** Fast, no manual intervention, reproducible
- **Requirements:** Can write a script that runs the test/check

**Approach 2: Manual Bisect**
- **When to use:** UI/UX changes, subjective behavior, complex scenarios
- **How it works:** User verifies at each commit, Claude guides
- **Benefits:** Handles non-deterministic or subjective issues
- **Requirements:** User can manually verify each commit

**Approach 3: Hybrid Bisect**
- **When to use:** Mostly automatable but needs human judgment
- **How it works:** Script narrows range, manual verification for final confirmation
- **Benefits:** Efficiency of automation with reliability of manual check
- **Requirements:** Can automate most checks, manual for edge cases

**If automated or hybrid selected:**

Write test script following this template:

```bash
#!/bin/bash
# Exit codes: 0 = good, 1 = bad, 125 = skip (can't test)

# Setup/build (required for each commit)
npm install --silent 2>/dev/null || exit 125

# Run the actual test
npm test -- path/to/specific-test.js
exit $?
```

**Script guidelines:**
- Make it **deterministic** (no random data, use fixed seeds)
- Make it **fast** (runs ~log2(N) times)
- **Exit codes:** 0 = good, 1 = bad, 125 = skip
- Include build/setup (each commit might need different deps)
- Test ONE specific thing, not entire suite
- Make it **read-only** (no data modification)

**If manual selected:**

Write specific verification steps for subagent:

**Good example:**
```
1. Run `npm start`
2. Open browser to http://localhost:3000
3. Click the "Login" button
4. Check if it redirects to /dashboard
5. Respond 'good' if redirect happens, 'bad' if it doesn't
```

**Bad example:**
```
See if the login works
```

**Output:** Selected approach, test script (if automated/hybrid), or verification steps (if manual)

### Phase 3: Execution

**Architecture:** Main agent orchestrates bisect, subagents verify each commit in isolated context.

**Main agent responsibilities:**
- Manage git bisect state (`start`, `good`, `bad`, `reset`)
- Track progress and communicate remaining steps
- Launch subagents for verification
- Handle errors and cleanup

**Subagent responsibilities:**
- Execute verification in clean context (no bleeding between commits)
- Report result: "good", "bad", or "skip"
- Provide brief reasoning for result

**Execution flow:**

1. **Main agent:** Run `git bisect start <bad> <good>`

2. **Loop until bisect completes:**

   a. Git checks out a commit to test

   b. **Main agent launches subagent** using Task tool:

   **For automated:**
   ```
   Prompt: "Run this test script and report the result:

   <script content>

   Report 'good' if exit code is 0, 'bad' if exit code is 1, 'skip' if exit code is 125.
   Include the output of the script in your response."
   ```

   **For manual:**
   ```
   Prompt: "We're testing commit <hash> (<message>).

   Follow these verification steps:
   <verification steps>

   Report 'good' if the issue doesn't exist, 'bad' if it does exist.
   Explain what you observed."
   ```

   **For hybrid:**
   ```
   Prompt: "Run this test script:

   <script content>

   If exit code is 0 or 1, report that result.
   If exit code is 125 or script is ambiguous, perform manual verification:
   <verification steps>

   Report 'good', 'bad', or 'skip' with explanation."
   ```

   c. **Subagent returns:** Result ("good", "bad", or "skip") with explanation

   d. **Main agent:** Run `git bisect good|bad|skip` based on result

   e. **Main agent:** Update progress
      - Show commit that was tested and result
      - Calculate remaining steps: `git bisect log | grep "# .*step" | tail -1`
      - Example: "Tested commit abc123 (bad). ~4 steps remaining."

   f. Repeat until git bisect identifies first bad commit

3. **Main agent:** Run `git bisect reset` to cleanup

4. **Main agent:** Return to original branch/commit

**Error handling during execution:**

- **Subagent timeout/error:** Allow user to manually mark as "skip"
- **Build failures:** Use `git bisect skip`
- **Too many skips (>5):** Suggest manual investigation, show untestable commits
- **Bisect interrupted:** Ensure `git bisect reset` runs in cleanup

**Output:** First bad commit hash, bisect log showing the path taken

### Phase 4: Analysis & Handoff

**Purpose:** Present findings and analyze root cause.

**Steps:**

1. **Present the identified commit:**
   ```
   Found first bad commit: <hash>

   Author: <author>
   Date: <date>

   <commit message>

   Files changed:
   <list of files from git show --stat>
   ```

2. **Show how to view details:**
   ```
   View full diff: git show <hash>
   View file at that commit: git show <hash>:<file>
   ```

3. **Handoff to root cause analysis:**
   - Announce: "Now that we've found the breaking commit at `<hash>`, I'm using systematic-debugging to analyze why this change caused the issue."
   - Use superpowers:systematic-debugging skill to investigate
   - Focus analysis on the changes in the bad commit
   - Identify the specific line/change that caused the issue
   - Explain WHY it broke (not just WHAT changed)

**Output:** Root cause understanding of why the commit broke functionality

## Safety & Error Handling

### Pre-flight Checks (Phase 1)
- ✅ Working directory is clean
- ✅ In a git repository
- ✅ Good/bad commits exist and are valid
- ✅ Good commit is actually good (issue doesn't exist)
- ✅ Bad commit is actually bad (issue exists)
- ✅ Good is ancestor of bad
- ⚠️ Warn if >1000 commits in range

### During Execution (Phase 3)
- **Subagent fails:** Log error, allow skip or abort
- **Build fails:** Use `git bisect skip`, continue
- **Ambiguous result:** Use `git bisect skip`, max 5 skips
- **Can't determine good/bad:** Ask user for guidance

### Cleanup & Recovery
- **Always** run `git bisect reset` when done (success or failure)
- If interrupted, prompt user to run `git bisect reset`
- Return to original branch/commit
- If bisect is running and skill exits, warn user to cleanup

### Failure Modes
- **Too many skips:** Report untestable commits, suggest narrower range or manual review
- **Good/bad reversed:** Detect pattern (all results opposite), offer to restart with swapped inputs
- **No bad commit found:** Verify bad commit is actually bad, check if issue is environmental

## Best Practices

### Optimizing Commit Range
- **Narrow the range first** if possible:
  - Issue appeared last week? Start from last week, not 6 months ago
  - Use `git log --since="2 weeks ago"` to find starting point
  - Use tags/releases as good commits when possible

### Writing Good Test Scripts
**Do:**
- ✅ Test ONE specific thing
- ✅ Make it deterministic (fixed seeds, no random data)
- ✅ Make it fast (runs log2(N) times)
- ✅ Include setup/build in script
- ✅ Use proper exit codes (0=good, 1=bad, 125=skip)

**Don't:**
- ❌ Run entire test suite (too slow)
- ❌ Depend on external state (databases, APIs)
- ❌ Use random data or timestamps
- ❌ Modify production data

### Manual Verification
**Be specific:**
- ✅ "API returns 200 for GET /health"
- ✅ "Login button redirects to /dashboard"
- ❌ "See if it works"
- ❌ "Check if login is broken"

**Give exact steps:**
1. Run server with `npm start`
2. Open browser to http://localhost:3000
3. Click element with id="login-btn"
4. Verify URL changes to /dashboard

### Common Patterns

| Issue Type | Recommended Approach | Script/Steps Example |
|------------|---------------------|---------------------|
| Test failure | Automated | `npm test -- failing-test.spec.js` |
| Crash/error | Automated | `node app.js 2>&1 | grep -q ERROR && exit 1 || exit 0` |
| Performance | Automated | `time npm run benchmark \| awk '{if ($1 > 5.0) exit 1}'` |
| UI/UX change | Manual | "Click X, verify Y appears" |
| Behavior change | Manual or Hybrid | Script to check, manual to confirm subjective aspects |

### Progress Communication
- After each step: "Tested commit abc123 (<result>). ~X steps remaining."
- Show bisect log periodically: `git bisect log`
- Estimate remaining steps: log2(commits in range)
- Example: 100 commits → ~7 steps, 1000 commits → ~10 steps

## Common Rationalizations (Resist These!)

| Rationalization | Reality | What to Do Instead |
|----------------|---------|-------------------|
| "User is in a hurry, skip safety checks" | Broken bisect from dirty state wastes MORE time | Run all Phase 1 checks. Always. |
| "This is simple, no need for TodoWrite" | You'll skip phases without tracking | Create checklist immediately |
| "I'll just use automated approach" | User might prefer manual for vague issues | Use AskUserQuestion tool |
| "I'll run the test in my context" | Context bleeding between commits breaks bisect | Launch subagent for each verification |
| "Working directory looks clean" | Assumptions cause failures | Run `git status` to verify |
| "I'll verify good/bad commits later" | Starting with wrong good/bad wastes all steps | Verify BEFORE `git bisect start` |
| "Found the commit, user knows why" | User asked to FIND it, not debug it | Hand off to systematic-debugging |
| "Production incident, no time for process" | Skipping process in incidents causes MORE incidents | Follow workflow. It's faster. |
| "I remember from baseline, no need to test" | Skills evolve, baseline was different session | Test at each commit with subagent |

**If you catch yourself rationalizing, STOP. Go back to MANDATORY Requirements section.**

## Integration with Other Skills

### Called BY systematic-debugging
When systematic-debugging determines an issue is historical:

```
systematic-debugging detects:
- Issue doesn't exist in commit from 2 weeks ago
- Issue exists now
→ Suggests: "This appears to be a regression. I'm using git-bisect-debugging to find when it was introduced."
→ Invokes: git-bisect-debugging skill
→ Returns: First bad commit for analysis
→ Resumes: systematic-debugging analyzes the breaking change
```

### Calls systematic-debugging
In Phase 4, after finding the bad commit:

```
git-bisect-debugging completes:
→ Announces: "Found commit abc123. Using systematic-debugging to analyze root cause."
→ Invokes: superpowers:systematic-debugging
→ Context: "Focus on changes in commit abc123"
→ Goal: Understand WHY the change broke functionality
```

## Limitations (By Design)

This skill focuses on straightforward scenarios. It does NOT handle:

- ❌ Complex merge commit issues (would need `--first-parent`)
- ❌ Flaky/intermittent test failures (would need statistical approaches)
- ❌ Build system failures across many commits (would need advanced skip strategies)

For these scenarios, manual git bisect with user guidance is recommended.

## Example Workflows

### Example 1: Automated Test Failure

```
User: "The login test started failing sometime in the last 50 commits."

Claude: "I'm using git-bisect-debugging to find which commit introduced this issue."

[Phase 1: Setup]
- git status → clean
- Good commit: v1.2.0 tag (last release)
- Bad commit: HEAD
- Verify: checkout v1.2.0, run test → passes
- Verify: checkout HEAD, run test → fails
- Range: 47 commits, ~6 steps estimated

[Phase 2: Strategy Selection]
AskUserQuestion: Which approach?
User selects: Automated

Script created:
#!/bin/bash
npm install --silent 2>/dev/null || exit 125
npm test -- tests/login.spec.js
exit $?

[Phase 3: Execution]
Step 1: Launch subagent for commit abc123
  Subagent: "Test failed. Exit code 1. bad"
  Main: git bisect bad → ~3 steps remaining

Step 2: Launch subagent for commit def456
  Subagent: "Test passed. Exit code 0. good"
  Main: git bisect good → ~2 steps remaining

Step 3: Launch subagent for commit ghi789
  Subagent: "Test failed. Exit code 1. bad"
  Main: git bisect bad → ~1 step remaining

Step 4: Launch subagent for commit jkl012
  Subagent: "Test passed. Exit code 0. good"
  Main: git bisect good

Git bisect identifies: ghi789 is first bad commit

[Phase 4: Analysis]
Found first bad commit: ghi789

Author: John Doe
Date: 2025-11-10

feat: update authentication middleware

Files changed:
  src/auth/middleware.js | 12 ++++---
  tests/login.spec.js    |  3 +-

Using systematic-debugging to analyze why this change broke the test...
[Hands off to systematic-debugging skill]
```

### Example 2: Manual UI Regression

```
User: "The dashboard layout looks wrong, but I'm not sure when it broke."

Claude: "I'm using git-bisect-debugging to find which commit introduced this issue."

[Phase 1: Setup]
- git status → clean
- Good commit: 2 weeks ago (user remembers it working)
- Bad commit: HEAD
- Range: 89 commits, ~7 steps estimated

[Phase 2: Strategy Selection]
AskUserQuestion: Which approach?
User selects: Manual

Verification steps created:
1. Run `npm run dev`
2. Open browser to http://localhost:3000/dashboard
3. Check if the sidebar and main content are side-by-side
4. Respond 'good' if layout is correct, 'bad' if sidebar overlaps content

[Phase 3: Execution]
Step 1: Checkout commit abc123
  Launch subagent with verification steps
  User verifies: "Layout is correct. good"
  Main: git bisect good → ~3 steps remaining

Step 2: Checkout commit def456
  Launch subagent with verification steps
  User verifies: "Sidebar overlaps content. bad"
  Main: git bisect bad → ~2 steps remaining

... continues until first bad commit found

[Phase 4: Analysis]
Found first bad commit: mno345

Author: Jane Smith
Date: 2025-11-08

refactor: migrate to CSS Grid layout

Files changed:
  src/components/Dashboard.css | 25 +++++++++++----------

Using systematic-debugging to analyze the CSS changes...
[Hands off to systematic-debugging skill]
```

### Example 3: Hybrid Performance Regression

```
User: "API response times got slower recently."

Claude: "I'm using git-bisect-debugging to find which commit introduced this issue."

[Phase 1: Setup]
- git status → clean
- Good commit: last release tag v2.1.0
- Bad commit: HEAD
- Range: 120 commits, ~7 steps estimated

[Phase 2: Strategy Selection]
AskUserQuestion: Which approach?
User selects: Hybrid

Script created:
#!/bin/bash
npm install --silent 2>/dev/null || exit 125

# Run benchmark 3 times, take average
total=0
for i in 1 2 3; do
  time=$(npm run benchmark:api 2>/dev/null | grep "response_time" | awk '{print $2}')
  [ -z "$time" ] && exit 125  # Can't test
  total=$(echo "$total + $time" | bc)
done
avg=$(echo "scale=2; $total / 3" | bc)

# Threshold: 500ms is acceptable
if (( $(echo "$avg > 500" | bc -l) )); then
  exit 1  # bad (too slow)
else
  exit 0  # good (fast enough)
fi

Manual fallback steps:
"If script is ambiguous, manually test API and verify response time is <500ms"

[Phase 3: Execution]
Steps proceed with script automation...
If script returns 125 (can't test), subagent asks user to manually verify

[Phase 4: Analysis]
Found first bad commit: pqr678

Author: Bob Johnson
Date: 2025-11-11

feat: add caching layer for user preferences

Files changed:
  src/api/middleware/cache.js | 45 ++++++++++++++++++++++++++++++++

Using systematic-debugging to analyze the caching implementation...
[Reveals: Cache lookup is synchronous and blocking, causing slowdown]
```

## Troubleshooting

### "Good and bad are reversed"
If early results suggest good/bad are swapped:
- Stop bisect
- Verify issue description is correct
- Swap good/bad commits and restart

### "Too many skips, can't narrow down"
If >5 commits skipped:
- Review skipped commits manually
- Check if builds are broken in that range
- Consider narrowing the range or manual investigation

### "Bisect is stuck/interrupted"
If bisect state is corrupted or interrupted:
```bash
git bisect reset  # Clean up bisect state
git checkout main  # Return to main branch
# Restart bisect with better range/script
```

### "Subagent is taking too long"
- Set reasonable timeout for verification
- If automated: optimize test script
- If manual: simplify verification steps
- Consider marking commit as 'skip'

## Summary

**When to use:** Historical bugs, regressions, "when did this break" questions

**Key strengths:**
- ✅ Systematic binary search (efficient)
- ✅ Subagent isolation (clean context)
- ✅ Automated + manual + hybrid approaches
- ✅ Integrates with systematic-debugging

**Remember:**
1. Always verify good is good, bad is bad
2. Keep test scripts deterministic and fast
3. Use subagents for each verification step
4. Clean up with `git bisect reset`
5. Hand off to systematic-debugging for root cause