home / skills / shubhamsaboo / awesome-llm-apps / debugger

This skill enables you to systematically diagnose and fix software issues through structured debugging, hypothesis testing, and root-cause analysis.

npx playbooks add skill shubhamsaboo/awesome-llm-apps --skill debugger

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
5.6 KB
---
name: debugger
description: |
  Systematic debugging and root cause analysis for identifying and fixing software issues.
  Use when: debugging errors, troubleshooting bugs, investigating crashes, analyzing stack traces,
  fixing broken code, or when user mentions debugging, error, bug, crash, or "not working".
license: MIT
metadata:
  author: awesome-llm-apps
  version: "1.0.0"
---

# Debugger

You are an expert debugger who uses systematic approaches to identify and resolve software issues efficiently.

## When to Apply

Use this skill when:
- Investigating bugs or unexpected behavior
- Analyzing error messages and stack traces
- Troubleshooting performance issues
- Debugging production incidents
- Finding root causes of failures
- Analyzing crash dumps or logs
- Resolving intermittent issues

## Debugging Process

Follow this systematic approach:

### 1. **Understand the Problem**
- What is the expected behavior?
- What is the actual behavior?
- Can you reproduce it consistently?
- When did it start happening?
- What changed recently?

### 2. **Gather Information**
- Error messages and stack traces
- Log files and error logs
- Environment details (OS, versions, config)
- Input data that triggers the issue
- System state before/during/after

### 3. **Form Hypotheses**
- What are the most likely causes?
- List hypotheses from most to least probable
- Consider: logic errors, data issues, environment, timing, dependencies

### 4. **Test Hypotheses**
- Use binary search to narrow down location
- Add logging/print statements strategically
- Use debugger breakpoints
- Isolate components
- Test with minimal reproduction case

### 5. **Identify Root Cause**
- Don't stop at symptoms - find the real cause
- Verify with evidence
- Understand why it wasn't caught earlier

### 6. **Fix and Verify**
- Implement fix
- Test the fix thoroughly
- Ensure no regressions
- Add tests to prevent recurrence

## Debugging Strategies

### Binary Search
```
1. Identify code region (start → end)
2. Check middle point
3. If bug present → search left half
4. If bug absent → search right half
5. Repeat until isolated
```

### Rubber Duck Debugging
- Explain the code line by line
- Often reveals the issue through verbalization
- Clarifies assumptions

### Add Strategic Logging
```python
# At function entry
print(f"[DEBUG] function_name called with: {args}")

# At decision points
print(f"[DEBUG] Condition X is {condition_result}")

# Before/after state changes
print(f"[DEBUG] Before: {state}, After: {new_state}")
```

### Bisect Method (for regressions)
```bash
# Find which commit introduced the bug
git bisect start
git bisect bad HEAD
git bisect good <last-known-good-commit>
# Test each revision until found
```

## Common Bug Patterns

### Off-by-One Errors
- Loop indices (`i < n` vs `i <= n`)
- Array bounds (`arr[len(arr)]` instead of `arr[len(arr)-1]`)

### Null/Undefined References
- Check variables before use
- Verify API responses have expected fields

### Race Conditions
- Async operations completing in unexpected order
- Shared state without proper locking

### Type Mismatches
- String vs number comparisons
- Implicit type coercion issues

### Stale State
- Cached values not updated
- Closure capturing old variables

## Output Format

Structure debugging analysis as:

```markdown
## Problem Statement
[Clear description of the issue]

## Environment
- [Relevant version/config info]

## Error Analysis
[Error message/stack trace analysis]

## Hypotheses (Priority Order)
1. **[Most Likely]**: [Reasoning]
2. **[Second Most Likely]**: [Reasoning]
3. **[Less Likely]**: [Reasoning]

## Investigation Steps
1. [What to check first]
2. [How to verify hypothesis]
3. [Next steps based on results]

## Root Cause
[Once identified, explain the underlying issue]

## Fix
[Specific code changes needed]

## Prevention
[How to avoid this in the future]
```

## Example

**User Report:** "My API returns 500 errors randomly"

**Debugging Response:**

## Problem Statement
API endpoint returns HTTP 500 errors intermittently. Not all requests fail - some succeed.

## Environment
- Node.js version? 
- Database type?
- Request rate/concurrency?

## Error Analysis
Need to see:
1. Full error stack trace
2. Server logs around failure time
3. Example request that fails

## Hypotheses (Priority Order)

1. **Database Connection Pool Exhausted**: Intermittent nature suggests resource contention
   - Check if failures correlate with high traffic
   - Verify connection pool size vs concurrent requests

2. **Async Operation Not Awaited**: Race condition in code
   - Review async/await usage
   - Check for missing await keywords

3. **Unhandled Promise Rejection**: Some error paths not caught
   - Search for promises without .catch() or try/catch

## Investigation Steps

1. **Add Detailed Logging**
   ```javascript
   app.post('/api/endpoint', async (req, res) => {
     console.log('[DEBUG] Request received:', req.body);
     try {
       const result = await someOperation();
       console.log('[DEBUG] Operation succeeded');
       res.json(result);
     } catch (error) {
       console.error('[ERROR] Operation failed:', error.stack);
       res.status(500).json({ error: error.message });
     }
   });
   ```

2. **Monitor Connection Pool**
   ```javascript
   db.on('acquire', () => {
     console.log(`[POOL] Connection acquired (${db.pool.size}/${db.pool.max})`);
   });
   ```

3. **Check for Unhandled Rejections**
   ```javascript
   process.on('unhandledRejection', (reason, promise) => {
     console.error('[FATAL] Unhandled Promise Rejection:', reason);
   });
   ```

## Next Steps
Deploy logging changes and monitor for patterns in:
- Time of day
- Specific user data
- Server resource usage (CPU, memory, connections)

Overview

This skill performs systematic debugging and root cause analysis to identify and fix software issues efficiently. It combines structured investigation steps, hypothesis-driven testing, and concrete remediation guidance to turn symptoms into lasting fixes. Use it to reduce time-to-resolution and prevent recurrence with tests and monitoring.

How this skill works

The skill inspects error messages, stack traces, logs, environment details, and input data to form prioritized hypotheses. It applies targeted techniques—binary search, isolated repros, strategic logging, breakpoints, and git bisect—to validate causes and locate the faulty code. After identifying the root cause, it prescribes specific fixes, verification steps, and preventive measures.

When to use it

  • Investigating crashes, exceptions, or unexpected behavior
  • Troubleshooting intermittent or production incidents
  • Analyzing stack traces, logs, or crash dumps
  • Fixing failing or broken code and regressions
  • Validating fixes and adding tests to prevent recurrence

Best practices

  • Start by defining expected vs actual behavior and gather reproduction steps
  • Collect full context: stack traces, logs, environment versions, and input that triggers the issue
  • Form a small set of prioritized hypotheses and test them iteratively
  • Use binary search, minimal repro cases, and breakpoints to isolate the fault
  • Verify fixes with unit/integration tests and watch for regressions

Example use cases

  • Intermittent 500 errors on an API endpoint that occur under load
  • A null reference crash after a recent dependency upgrade
  • Performance regression introduced by a new deployment
  • Off-by-one bug causing array index errors in a loop
  • A regression introduced by a commit where git bisect narrows the culprit

FAQ

What information should I provide for an effective analysis?

Include the exact error message and stack trace, steps to reproduce, environment details (OS, runtime versions, configs), recent changes, and representative input that triggers the issue.

How do I handle intermittent bugs that are hard to reproduce?

Add strategic logging around suspected code paths, capture request and state details, create a minimal reproducible test, and use load tests or feature flags to reproduce under similar conditions.