
debugging skill

/skills/common/debugging

This skill helps you debug effectively using the scientific method, guiding you through the observe, hypothesize, experiment, fix, and verify steps to root out issues.

npx playbooks add skill hoangnguyen0403/agent-skills-standard --skill debugging

Review the files below or copy the command above to add this skill to your agents.

Files (2)
SKILL.md
1.6 KB
---
name: Debugging Expert
description: Systematic troubleshooting using the Scientific Method (Observe, Hypothesize, Experiment, Fix).
metadata:
  labels: [debugging, troubleshooting, bug-fixing, root-cause]
  triggers:
    keywords: [debug, fix bug, crash, error, exception, troubleshooting]
---

# Debugging Expert

## **Priority: P1 (OPERATIONAL)**

Systematic, evidence-based troubleshooting. Do not guess; prove.

## 🔬 The Scientific Method

1. **OBSERVE**: Gather data. What exactly is happening?
    - Logs, Stack Traces, Screenshots, Steps to Reproduce.
2. **HYPOTHESIZE**: Formulate a theory. "I think X is causing Y because Z."
3. **EXPERIMENT**: Test the theory.
    - Create a reproduction case.
    - Change _one variable at a time_ to validate the hypothesis.
4. **FIX**: Implement the solution once the root cause is proven.
5. **VERIFY**: Ensure the fix works and doesn't introduce regressions.
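
Steps 3–5 can be sketched as a minimal reproduction test. The `parse_price` function and its bug are hypothetical, invented here for illustration; the pattern is what matters: the smallest failing input becomes the experiment, and it stays behind as a regression test.

```python
# Hypothetical bug: parse_price("1,234.50") raised ValueError because
# thousands separators were not stripped before conversion.

def parse_price(text):
    # FIX: strip the thousands separator (root cause proven by the
    # experiment below; before the fix, float("1,234.50") raised).
    return float(text.replace(",", ""))

def test_parse_price_with_thousands_separator():
    # EXPERIMENT: the minimal repro that failed before the fix.
    assert parse_price("1,234.50") == 1234.50
    # VERIFY: the fix introduces no regression for plain inputs.
    assert parse_price("42.00") == 42.00

test_parse_price_with_thousands_separator()
```

Keeping the repro as a test means the VERIFY step runs automatically on every future change.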

## 🚫 Anti-Patterns

- **Shotgun Debugging**: Randomly changing things hoping it works.
- **Console Log Spam**: Leaving `print`/`console.log` in production code.
- **Fixing Symptoms**: Masking the error (e.g., `try-catch` without handling) instead of fixing the root cause.

## 🛠 Best Practices

- **Diff Diagnosis**: What changed since it last worked?
- **Minimal Repro**: Create the smallest possible code snippet that reproduces the issue.
- **Rubber Ducking**: Explain the code line-by-line to an inanimate object (or the agent).
- **Binary Search**: Comment out half the code to isolate the failing section.
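
The binary-search idea applies to inputs as well as code. As a minimal sketch (all names here are illustrative, and it assumes failures are deterministic and depend only on whether the offending record is included):

```python
# Binary-search isolation over inputs: find the index of the first
# record that makes `process` fail.

def find_first_failure(records, process):
    lo, hi = 0, len(records)
    while lo < hi:
        mid = (lo + hi) // 2
        try:
            for record in records[:mid + 1]:
                process(record)
        except Exception:
            hi = mid        # a failure lies within the first mid+1 records
        else:
            lo = mid + 1    # the first mid+1 records are fine; look later
    return lo if lo < len(records) else None

def process(record):
    if record < 0:
        raise ValueError("negative record")  # stand-in for the real crash

first_bad = find_first_failure([7, 3, 9, -1, 5], process)  # index 3
```

Each iteration halves the suspect range, so even thousands of records need only a handful of runs to isolate the culprit.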

## 📚 References

- [Bug Report Template](references/bug-report-template.md)

Overview

This skill provides a disciplined, evidence-first approach to diagnosing and resolving software problems using the Scientific Method. It prevents guesswork by guiding the agent to collect data, form testable hypotheses, and verify fixes. The goal is reliable, minimal-impact fixes and clear, reproducible reports.

How this skill works

The skill inspects runtime data like logs, stack traces, screenshots, and reproduction steps to precisely describe the failure. It guides forming a single hypothesis and designing controlled experiments—typically a minimal reproducible case and one-variable changes. After verifying the root cause, it prescribes a fix and runs regression checks to confirm correctness.

When to use it

  • When failures lack a clear cause and require structured investigation
  • When a bug must be reproduced and proven before fixing
  • Before applying changes to production or when hotfixes are risky
  • When multiple contributors disagree on root cause or proposed solutions
  • When tracking regressions across versions or environments

Best practices

  • Collect comprehensive evidence first: logs, stack traces, steps to reproduce, and environment details
  • Create a minimal reproducible example to isolate the issue
  • Change only one variable per experiment to attribute effects reliably
  • Use diff diagnosis: identify what changed since the last known-good state
  • Avoid shotgun debugging and console-log spam; remove debug instrumentation before committing
  • Verify fixes with tests and regression checks before merging to main
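
The one-variable-per-experiment rule above can be sketched as a small loop that toggles a single suspect setting at a time against a baseline. The settings and the `run_repro(config)` callback (returns True when the bug reproduces) are hypothetical, chosen only to illustrate the pattern:

```python
# Controlled experiment: flip exactly one setting per trial so any
# change in behavior can be attributed to that setting alone.

BASELINE = {"cache": True, "retries": 3, "compression": True}
SUSPECTS = {"cache": False, "retries": 0, "compression": False}

def isolate_setting(run_repro):
    for key, alternative in SUSPECTS.items():
        trial = dict(BASELINE, **{key: alternative})  # one variable changed
        if not run_repro(trial):
            return key        # bug vanished when this setting changed
    return None               # no single setting explains the bug

# Example: a fake repro that only fails while retries are enabled.
culprit = isolate_setting(lambda cfg: cfg["retries"] == 3)  # "retries"
```

Changing several settings at once would leave the result ambiguous; this loop guarantees each trial differs from the baseline in exactly one dimension.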

Example use cases

  • Reproduce a mobile crash by building a stripped-down app that triggers the bug and inspecting device logs
  • Investigate a performance regression by comparing recent commits and running benchmarks on a minimal case
  • Isolate a web UI rendering bug by toggling components and CSS to binary-search the failing element
  • Diagnose intermittent backend failures by capturing stack traces, increasing log detail, and creating a test harness
  • Validate a proposed fix by running unit/integration tests and a quick smoke test in a staging environment

FAQ

What if I can’t reproduce the bug?

Collect as much evidence as possible (logs, environment, timing) and try to create a minimal scenario; add deterministic inputs or mocks to force the state. If still unreproducible, capture runtime traces or increase logging temporarily in staging.
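
One common source of "unreproducible" bugs is nondeterministic state such as the clock. A minimal sketch of forcing that state with the standard library's `unittest.mock.patch` (the `expire_soon` function is a hypothetical example, not part of this skill):

```python
# Pin the clock so a timing-dependent check becomes deterministic.
from unittest.mock import patch
import time

def expire_soon(created_at, ttl=60):
    # Hypothetical function under test: nondeterministic via time.time().
    return time.time() - created_at > ttl

with patch("time.time", return_value=1_000_000.0):
    assert expire_soon(created_at=999_000.0)        # 1000s elapsed > ttl
    assert not expire_soon(created_at=999_990.0)    # 10s elapsed <= ttl
```

The same technique applies to randomness (patch `random.random`) or network responses (mock the client), turning an intermittent failure into a repeatable experiment.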

When is a workaround acceptable?

A documented workaround is acceptable for immediate mitigation when impact is high and a root-cause fix requires time. Prefer short-lived, low-risk workarounds and keep a ticket for the verified fix.