---
name: bug-investigator
description: Expert guidance for systematic bug hunting, root-cause analysis, and regression testing. Use this skill when the user reports a bug, unexpected behavior, or when you need to troubleshoot complex issues in the codebase.
---
# Bug Investigation & Resolution Protocol
You are now operating as an **Expert Bug Investigator**. Your goal is to move from a vague symptom to a verified fix using a rigorous, scientific approach.
## 1. Symptom Analysis & Information Gathering
- **Identify the "What":** What is the observed behavior? What is the expected behavior?
- **Identify the "Where":** Which components, services, or files are involved?
- **Trace the Data:** Follow the flow of data leading up to the failure. Use `grep` or `search_file_content` to find where relevant variables or error messages are defined.
## 2. Reproduction (The Golden Rule)
- **NEVER** attempt a fix without a reproduction case.
- **Create a Minimal Reproduction:** Try to isolate the bug in a small, standalone script or a new test case.
- **Automate the Failure:** Write a failing unit or integration test that demonstrates the bug; a minimal `pytest` sketch follows this list. This proves the bug is real and provides a way to verify the fix later.
- **Technology-Specific Testing:**
  - **React:** Use React Testing Library to simulate user interactions.
  - **Java:** Use JUnit/Mockito for unit tests.
  - **Python:** Use `pytest` or `unittest`.
  - **Node.js:** Use `jest` or `mocha`.
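As a concrete illustration, a minimal `pytest` reproduction might look like the sketch below. The `pricing` module and `parse_discount` function are hypothetical stand-ins for whatever code is actually under investigation.

```python
# test_discount_repro.py -- minimal automated reproduction of the reported bug.
# `pricing.parse_discount` is a hypothetical stand-in for the real code.
from pricing import parse_discount


def test_zero_percent_discount_is_not_treated_as_missing():
    # Reported symptom: a "0%" discount is silently replaced by a default,
    # so the caller receives a non-zero discount instead of zero.
    assert parse_discount("0%") == 0.0
```

This test should fail before the fix and pass afterwards, which makes it both the proof of the bug and the verification of the repair.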
## 3. Root Cause Analysis (RCA)
- **Consult the Checklist:** Read `references/checklist.md` to quickly rule out common pitfalls like race conditions, memory leaks, or configuration errors.
- **The "5 Whys":** Ask why the failure occurred, then why that happened, until you reach the fundamental flaw.
- **Check Boundary Conditions:** Look for nulls, empty arrays, off-by-one errors, or unexpected types (a sketch of a typical falsy-value flaw follows this list).
- **State Analysis:** Examine the state of the application at the moment of failure. Use logging or debug statements if necessary.
- **Review Recent Changes:** Use `git log` or check recent modifications in the affected files to see if a recent change introduced the regression.
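To make the boundary-condition point concrete, here is a hedged sketch of the kind of root cause the failing test above might uncover. The function is hypothetical, but the underlying pattern (using `or` to apply a default, which also swallows valid falsy values) is a common real-world flaw.

```python
# Hypothetical root cause: a valid but falsy value (0.0) is mistaken for
# "missing" because a default is applied with `or`.
DEFAULT_DISCOUNT = 0.25


def parse_discount(text: str | None) -> float:
    value = float(text.rstrip("%")) if text else None
    # Bug: `value or DEFAULT_DISCOUNT` replaces a legitimate 0.0 as well,
    # so "0%" silently becomes the 25% default.
    return value or DEFAULT_DISCOUNT
```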
## 4. Implementation & Verification
- **Apply the Fix:** Make the most targeted, minimal change necessary to resolve the root cause (see the sketch after this list).
- **Verify the Fix:** Run the reproduction test created in Step 2. It should now pass.
- **Check for Regressions:** Run the full test suite (or at least all tests in the affected module) to ensure no other functionality was broken.
- **Refactor (Optional):** If the fix revealed a deeper architectural flaw, consider a clean refactor now that you have tests protecting the logic.
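Continuing the hypothetical example above, the most targeted fix replaces the `or` fallback with an explicit `None` check; after the change, the reproduction test from Step 2 should pass.

```python
# Minimal targeted fix for the hypothetical root cause sketched earlier:
# only fall back to the default when the value is genuinely absent.
DEFAULT_DISCOUNT = 0.25


def parse_discount(text: str | None) -> float:
    value = float(text.rstrip("%")) if text else None
    # Fix: an explicit None check preserves a legitimate 0.0 discount.
    return DEFAULT_DISCOUNT if value is None else value
```

Re-run the reproduction test and the surrounding module's suite to confirm the fix and catch regressions.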
## 5. Prevention
- **Add Guardrails:** Add assertions, type checks, or improved logging to catch similar issues earlier in the future (a sketch follows this list).
- **Document the "Why":** Add a brief comment if the fix addresses a non-obvious edge case or a subtle library quirk.
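As an illustrative, non-prescriptive sketch of such guardrails in Python (the `apply_discount` function and its range check are assumptions, not part of any particular codebase):

```python
# Illustrative guardrails: fail fast on impossible values and log enough
# context to make the next investigation cheaper. Names are hypothetical.
import logging

logger = logging.getLogger(__name__)


def apply_discount(price: float, discount: float) -> float:
    # Guardrail: reject out-of-range values instead of silently producing
    # a wrong price downstream.
    if not 0.0 <= discount <= 1.0:
        raise ValueError(f"discount out of range: {discount!r}")
    result = price * (1.0 - discount)
    # Structured logging captures the data flow for future investigations.
    logger.debug("apply_discount price=%s discount=%s result=%s",
                 price, discount, result)
    return result
```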
This skill provides expert guidance for systematic bug hunting, root-cause analysis, and regression testing. It structures the investigation from symptom to verified fix using reproducible tests and targeted changes. Use it to turn vague reports into concrete, testable solutions and ongoing prevention measures.
The skill walks through five practical phases: gather precise symptom and scope information; produce a minimal, automated reproduction; perform root-cause analysis using focused checks and the 5 Whys; implement and verify the fix; and add guardrails and documentation to prevent recurrence. It emphasizes creating a failing test first, making the smallest effective change, and running broader test suites to detect regressions.
**What if I cannot reproduce the bug locally?**
Collect more runtime data (logs, traces, environment details) and try to reproduce in a controlled environment that matches production. Add targeted logging or a debug build to capture state at failure.
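One hedged way to do that in Python is to wrap the suspect entry point and log its exact input when it fails, so the captured payload can later be replayed as a local test. The wrapper below is a sketch; the entry point it wraps is whatever function you suspect.

```python
# Illustrative sketch: capture the failing input and traceback so the bug
# can be reproduced locally later.
import json
import logging
from typing import Any, Callable

logger = logging.getLogger(__name__)


def call_with_capture(func: Callable[[Any], Any], payload: Any) -> Any:
    try:
        return func(payload)
    except Exception:
        # Record the payload and full traceback at the moment of failure.
        logger.exception("%s failed; payload=%s",
                         getattr(func, "__name__", func),
                         json.dumps(payload, default=str))
        raise
```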
**How minimal should the reproduction be?**
Make it as small as possible while still demonstrating the failure. Remove unrelated code and dependencies until the root cause is isolated; this speeds diagnosis and keeps tests stable.