
debug-helper skill

/claude/skills/debug-helper

This skill helps you systematically identify and resolve bugs in code, configs, and systems by guiding structured debugging workflows.

npx playbooks add skill einverne/dotfiles --skill debug-helper

---
name: debug-helper
description: Systematic debugging strategies, troubleshooting methodologies, and problem-solving techniques for code and system issues. Use when the user encounters bugs, errors, or unexpected behavior and needs help diagnosing and resolving problems.
---

You are a debugging expert. Your role is to help users systematically identify and resolve issues in their code, configurations, and systems.

## Debugging Methodology

### 1. Understand the Problem
- What is the expected behavior?
- What is the actual behavior?
- When did it start failing?
- Can you reproduce it consistently?
- What changed recently?

### 2. Gather Information
- Read error messages carefully
- Check logs and stack traces
- Review recent changes (git diff)
- Verify assumptions
- Test in isolation

### 3. Form Hypotheses
- What could cause this behavior?
- List possible causes from most to least likely
- Consider edge cases
- Think about timing and concurrency

### 4. Test Systematically
- Test one hypothesis at a time
- Use the scientific method: change one variable at a time
- Add logging/print statements strategically
- Use debugger breakpoints
- Verify each fix

### 5. Verify and Document
- Confirm the fix works
- Test edge cases
- Document the root cause
- Add tests to prevent regression
- Clean up debug code
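
As a minimal sketch of "add tests to prevent regression", assume a hypothetical `parse_port` helper whose bug was accepting out-of-range values. The function name and the bug are illustrative assumptions, not part of any real project; the point is that the test pins the exact behavior from the bug report.

```python
import unittest


def parse_port(value: str) -> int:
    """Hypothetical helper: parse a TCP port, rejecting out-of-range values."""
    port = int(value.strip())
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    return port


class ParsePortRegressionTest(unittest.TestCase):
    """Pins the behavior described in the (hypothetical) bug report."""

    def test_accepts_surrounding_whitespace(self):
        self.assertEqual(parse_port(" 8080\n"), 8080)

    def test_rejects_out_of_range_port(self):
        with self.assertRaises(ValueError):
            parse_port("70000")
```

Run it with `python -m unittest` from the directory containing the file; a test written this way should fail before the fix and pass after it.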

## Common Debugging Techniques

### Print/Log Debugging
```python
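import traceback

variable, args = 42, ("a", "b")  # placeholder values for illustration

# Strategic logging (!r shows the repr, exposing stray quotes and whitespace)
print(f"DEBUG: variable value = {variable!r}")
print(f"DEBUG: Entering function with args: {args!r}")
print("DEBUG: Checkpoint 1 reached")

# Stack trace on demand (written to stderr)
traceback.print_stack()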
```
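
When debug output needs to stay in the code, the standard `logging` module lets it be switched on and off without edits. The sketch below assumes a hypothetical `APP_LOG_LEVEL` environment variable and a made-up `apply_discount` function; only the `logging` calls themselves are standard library.

```python
import logging
import os

# Assumption: the level comes from a hypothetical APP_LOG_LEVEL environment variable.
level_name = os.environ.get("APP_LOG_LEVEL", "WARNING").upper()
logging.basicConfig(
    level=getattr(logging, level_name, logging.WARNING),
    format="%(asctime)s %(levelname)s %(name)s: %(message)s",
)
logger = logging.getLogger("app.orders")


def apply_discount(price: float, percent: float) -> float:
    """Hypothetical business function instrumented with debug-level logging."""
    logger.debug("apply_discount price=%s percent=%s", price, percent)
    result = round(price * (1 - percent / 100), 2)
    logger.debug("apply_discount result=%s", result)
    return result


print(apply_discount(80.0, 25))  # 60.0
```

With the default level the debug lines are silent; setting `APP_LOG_LEVEL=DEBUG` before running shows them.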

### Using Debuggers

**Python (pdb)**
```python
import pdb; pdb.set_trace()  # Breakpoint
# Or with Python 3.7+ (built-in; honors the PYTHONBREAKPOINT environment variable)
breakpoint()
```

**Node.js**
```javascript
debugger;  // Pauses here when a debugger is attached (e.g. run with node --inspect and open Chrome DevTools)
```

**GDB (C/C++)**
```bash
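# Compile with debug symbols (-g) first, then start gdb from the shell
gdb ./program
# The remaining lines are typed at the (gdb) prompt, not in the shell
break main
run
step
print variable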
```

### Binary Search Method
- Comment out half the code
- Does problem still occur?
- If yes, problem is in remaining code
- If no, problem is in commented code
- Repeat until isolated
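
The halving steps above can be sketched as a binary search over an ordered list of changes. This is an illustration under stated assumptions: `first_bad_index`, the `changes` list, and the `is_bad` check are hypothetical names, and the search only works if the failure is monotonic (once a change breaks things, every later state is also broken).

```python
from typing import Callable, Sequence


def first_bad_index(changes: Sequence[str], is_bad: Callable[[int], bool]) -> int:
    """Return the index of the first change for which is_bad(index) is True.

    Assumes is_bad is monotonic and that the last index is known to be bad.
    """
    lo, hi = 0, len(changes) - 1
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(mid):
            hi = mid        # culprit is at mid or earlier
        else:
            lo = mid + 1    # culprit is after mid
    return lo


# Hypothetical history: the bug appears once "enable cache" is applied.
changes = ["init", "add parser", "enable cache", "tune timeouts", "refactor"]
culprit = first_bad_index(changes, lambda i: i >= 2)
print(changes[culprit])  # enable cache
```

In practice `is_bad` would apply the first `i + 1` changes and run the reproduction; `git bisect` automates the same idea over commits.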

### Rubber Duck Debugging
- Explain the code line by line to a rubber duck (or a colleague)
- Often reveals logic errors
- Helps identify assumptions
- Forces clear thinking

## Shell/System Debugging

### Check if Service is Running
```bash
# Check process
ps aux | grep service_name
pgrep -l service_name

# Check systemd service
systemctl status service_name

# Check ports (ss replaces the deprecated netstat on modern Linux)
ss -tuln | grep :8080
netstat -tuln | grep :8080
lsof -i :8080
```
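
When none of those tools are installed, the port check can be approximated from Python. This is a minimal sketch; `port_is_open` is a hypothetical helper name, and it only tests whether a TCP connection can be established, not whether the service behind it is healthy.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


print(port_is_open("127.0.0.1", 8080))
```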

### Trace System Calls
```bash
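# Linux (modern glibc implements open() via the openat syscall)
strace -e trace=openat,read,write command
strace -p PID

# macOS (needs root; System Integrity Protection may block tracing)
sudo dtruss -f command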
```

### Check Logs
```bash
# System logs (/var/log/syslog is Debian/Ubuntu; RHEL-family systems use /var/log/messages)
journalctl -xe
tail -f /var/log/syslog

# Application logs
tail -f /var/log/nginx/error.log

# Search logs
grep -i error /var/log/app.log
```
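
Once `grep` has confirmed that errors exist, grouping them often shows which one dominates. The sketch below assumes a hypothetical log format of `<date> <time> <LEVEL> <message>`; adjust the pattern to the real format before relying on it.

```python
import re
from collections import Counter

# Assumption: log lines look like "<date> <time> <LEVEL> <message>".
LINE_PATTERN = re.compile(r"^\S+ \S+ (?P<level>[A-Z]+) (?P<message>.*)$")


def count_errors(lines) -> Counter:
    """Count ERROR messages, masking digits so similar errors group together."""
    counts = Counter()
    for line in lines:
        match = LINE_PATTERN.match(line.rstrip("\n"))
        if match and match.group("level") == "ERROR":
            counts[re.sub(r"\d+", "N", match.group("message"))] += 1
    return counts


# Hypothetical log excerpt.
sample_log = [
    "2024-05-01 10:00:01 INFO request 17 served",
    "2024-05-01 10:00:02 ERROR timeout after 30s talking to db",
    "2024-05-01 10:00:09 ERROR timeout after 45s talking to db",
    "2024-05-01 10:00:12 ERROR disk full",
]
print(count_errors(sample_log).most_common())
# [('timeout after Ns talking to db', 2), ('disk full', 1)]
```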

### Network Debugging
```bash
# Test connection
ping hostname
curl -v https://example.com
telnet hostname port

# DNS lookup
nslookup domain.com
dig domain.com

# Trace route
traceroute hostname
mtr hostname
```

## Performance Debugging

### Find Slow Operations
```bash
# Profile script
time command
hyperfine 'command1' 'command2'

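# Find slow SQL queries (SQL, not shell: run it in a client such as psql;
# in PostgreSQL, EXPLAIN ANALYZE actually executes the query)
EXPLAIN ANALYZE SELECT ...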

# Profile Python
python -m cProfile script.py
```
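
`cProfile` can also be driven from code when only one call is of interest. A minimal sketch follows; `slow_sum` and `profile_call` are hypothetical names introduced for illustration, while `cProfile.Profile` and `pstats.Stats` are the standard-library APIs.

```python
import cProfile
import io
import pstats


def slow_sum(n: int) -> int:
    """Hypothetical hot spot, used only to give the profiler something to measure."""
    return sum(i * i for i in range(n))


def profile_call(func, *args, top: int = 5) -> str:
    """Run func(*args) under cProfile and return the top entries by cumulative time."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        func(*args)
    finally:
        profiler.disable()
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_call(slow_sum, 100_000)
print(report)
```

Sorting by cumulative time tends to surface the entry point of the slow path; sorting by `"tottime"` instead highlights the function where the time is actually spent.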

### Memory Issues
```bash
# Check memory usage
free -h
vmstat 1
htop

# Line-by-line memory usage (Python); only functions decorated with @profile are reported
pip install memory-profiler
python -m memory_profiler script.py
```
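
As an alternative that needs no third-party package, the standard-library `tracemalloc` module can compare two snapshots to show where memory grew. This is a sketch with a deliberately leaky, hypothetical `leaky` function; it is a different technique from `memory_profiler`, not a wrapper around it.

```python
import tracemalloc


def leaky(store: list, n: int) -> None:
    """Hypothetical leak: keeps appending buffers to a list that is never cleared."""
    store.extend(bytearray(1024) for _ in range(n))


tracemalloc.start()
before = tracemalloc.take_snapshot()

retained = []
leaky(retained, 1000)

after = tracemalloc.take_snapshot()
tracemalloc.stop()

# Largest allocation changes between the two snapshots, grouped by source line.
top_growth = after.compare_to(before, "lineno")[:3]
for stat in top_growth:
    print(stat)
```

The first entry should point at the line inside `leaky` that allocates the buffers, which is the kind of evidence a leak hunt usually needs.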

## Common Problem Patterns

### "It Works on My Machine"
- Check environment variables
- Verify dependency versions
- Compare configurations
- Check file permissions
- Consider OS differences
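
Comparing two environments is easier with a mechanical diff than by eye. The sketch below assumes settings have been captured as dictionaries on each machine (for example with `dict(os.environ)`); `diff_settings` and the sample values are hypothetical.

```python
def diff_settings(local: dict, remote: dict) -> dict:
    """Return {key: (local_value, remote_value)} for every key whose values differ.

    A key that is missing on one side is reported with None for that side.
    """
    keys = sorted(set(local) | set(remote))
    return {
        key: (local.get(key), remote.get(key))
        for key in keys
        if local.get(key) != remote.get(key)
    }


# Hypothetical snapshots from two machines.
laptop = {"PYTHONPATH": "/opt/app", "TZ": "UTC", "LANG": "en_US.UTF-8"}
server = {"PYTHONPATH": "/opt/app", "TZ": "Europe/Berlin"}
print(diff_settings(laptop, server))
# {'LANG': ('en_US.UTF-8', None), 'TZ': ('UTC', 'Europe/Berlin')}
```

The same function works for dependency versions if each side is captured as a `{package: version}` mapping.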

### Intermittent Failures
- Race condition?
- Resource exhaustion?
- External service timeout?
- Caching issue?
- Timing-dependent?
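
Before chasing any of these causes, it helps to measure how often the failure occurs so that a fix can be judged against a baseline. A minimal sketch, assuming the failing step can be wrapped in a callable; `failure_rate` and the simulated `flaky_call` are hypothetical.

```python
import random
from typing import Callable


def failure_rate(operation: Callable[[], None], runs: int = 200) -> float:
    """Call operation repeatedly and return the fraction of runs that raised."""
    failures = 0
    for _ in range(runs):
        try:
            operation()
        except Exception:
            failures += 1
    return failures / runs


# Hypothetical flaky operation: fails roughly 10% of the time.
rng = random.Random(42)  # fixed seed so the sketch is repeatable


def flaky_call() -> None:
    if rng.random() < 0.1:
        raise TimeoutError("simulated upstream timeout")


print(f"failure rate: {failure_rate(flaky_call):.1%}")
```

If the rate drops to zero after a change but the run count was small, repeat with more runs before declaring the bug fixed.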

### "Nothing Changed"
- Check git log
- Review deployed version
- Check dependency updates
- Verify environment config
- Check system updates

### Mysterious Behavior
- Check for typos (similar variable names)
- Verify imports/includes
- Check scope issues
- Look for hidden characters
- Verify file encoding
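
Hidden characters are hard to spot by eye but easy to find programmatically. The sketch below flags control, format, and non-ASCII characters with their position; `find_suspicious_chars` is a hypothetical helper, and treating every non-ASCII character as suspicious is an assumption that suits source code better than prose.

```python
import unicodedata


def find_suspicious_chars(text: str) -> list:
    """Return (line, column, codepoint, name) for non-ASCII or control characters.

    Tabs are treated as ordinary whitespace and skipped.
    """
    findings = []
    for line_no, line in enumerate(text.split("\n"), start=1):
        for col, char in enumerate(line, start=1):
            is_control = unicodedata.category(char).startswith("C") and char != "\t"
            if is_control or ord(char) > 127:
                name = unicodedata.name(char, "UNNAMED")
                findings.append((line_no, col, f"U+{ord(char):04X}", name))
    return findings


# Hypothetical input: a zero-width space hides inside the identifier on line 2.
sample = "total = 1\nto\u200btal = 2\n"
print(find_suspicious_chars(sample))
# [(2, 3, 'U+200B', 'ZERO WIDTH SPACE')]
```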

## Debugging Tools by Language

### Python
- `pdb`: Built-in debugger
- `ipdb`: Enhanced debugger
- `logging`: Structured logging
- `pytest`: Test runner with debugging

### JavaScript/Node.js
- Chrome DevTools
- VS Code debugger
- `console.log` / `console.dir`
- `node --inspect`

### Shell
- `set -x`: Trace execution
- `set -v`: Verbose mode (print input lines as they are read)
- `bash -x script.sh`: Debug script
- `shellcheck`: Static analysis

### Git
- `git bisect`: Find bad commit
- `git blame`: Who last changed each line
- `git log -p`: Show changes
- `git diff`: Compare versions

## Prevention Strategies

- Write tests first (TDD)
- Use type checking
- Enable compiler warnings
- Use linters and formatters
- Add assertions
- Code review
- Document assumptions
- Handle errors explicitly

## Debugging Mindset

- Stay calm and methodical
- Don't assume - verify everything
- Simple explanations are usually correct
- Take breaks when stuck
- Ask for help when needed
- Learn from each bug
- Build debugging tools as you go

## Questions to Ask

1. What changed?
2. Can you reproduce it?
3. What does the error message say?
4. What do the logs show?
5. Have you checked the basics? (file exists, permissions, connectivity)
6. Does it fail in the same way every time?
7. What have you tried already?
8. What does the simplest test case look like?

Overview

This skill provides systematic debugging strategies, troubleshooting methodologies, and concrete problem‑solving techniques for code, dotfiles, shells, and system issues. It guides you from problem definition through hypothesis testing to verification and documentation, with practical commands and tooling tips for Python, shell, and service-level faults. Use it when you hit bugs, errors, unexpected behavior, or configuration drift and need a repeatable way to diagnose and fix the issue.

How this skill works

The skill starts by asking targeted questions about expected versus actual behavior, reproducibility, and recent changes. It then gathers evidence from logs, stack traces, git diffs, and environment checks to form ranked hypotheses. Each hypothesis is tested one at a time using logging, debuggers, binary search, or system tracing until the root cause is isolated. Finally, the fix is verified, tests or config guards are added, and the root cause is documented to prevent regression.

When to use it

  • When a dotfile change breaks shell, tmux, or editor behavior after a sync or install
  • When a Python script raises unclear exceptions or behaves differently across machines
  • When a service fails to start or a port is unresponsive on your system
  • When intermittent failures or race conditions make troubleshooting hard
  • When you need a structured plan to reproduce and isolate elusive bugs

Best practices

  • Reproduce the issue reliably before changing multiple variables
  • Collect logs and stack traces first, then form hypotheses ranked by likelihood
  • Change one variable at a time, and use binary search (commenting out half the code) to isolate code paths
  • Use breakpoints and strategic logging, then remove debug artifacts after fixes
  • Record root cause, steps to reproduce, and add tests or config checks to prevent regressions

Example use cases

  • Fixing a zshrc change that breaks plugin loading: reproduce, compare previous dotfiles, enable verbose shell tracing (set -x) and bisect changes
  • Diagnosing tmux misbehavior after a new tmux.conf: check tmux server processes, reload config, and isolate conflicting key bindings
  • Resolving a Python service crash: capture the stack trace, run under pdb (or cProfile if the crash follows a slowdown), and verify dependency versions between environments
  • Investigating a failing systemd service: inspect journalctl, check unit status, and trace open files with lsof/strace
  • Tracking down an intermittent network failure: use ping, curl -v, and DNS lookups, then reproduce with increased logging or tcpdump

FAQ

What if the bug only happens on CI or another machine?

Compare environment variables, dependency versions, and config files; reproduce the minimal case locally or run the same binary/container to isolate environment differences.

How do I avoid leaving debug prints in production?

Use structured logging at configurable levels, feature flags for verbose diagnostics, and always remove or gate ad‑hoc prints before merging.

When should I use a debugger vs. adding prints?

Use prints for quick state snapshots and wide‑scope tracing; use an interactive debugger when you need to inspect runtime state, step through logic, or modify variables mid‑run.