This skill helps you systematically debug Python issues by tracking hypotheses, tests, and learnings to quickly identify root causes.
To add this skill to your agents, run `npx playbooks add skill simhacker/moollm --skill debugging`.
---
name: debugging
description: Systematic bug investigation with hypothesis tracking
license: MIT
tier: 2
allowed-tools:
  - read_file
  - write_file
  - list_dir
  - search_replace
  - run_terminal_cmd
  - grep
related: [adventure, sniffable-python, scratchpad, research-notebook, sister-script, session-log, self-repair, play-learn-lift, constructionism]
tags: [moollm, development, investigation, hypothesis, testing]
inputs:
  symptom:
    type: string
    required: true
    description: "What's the observable problem?"
  context:
    type: string
    required: false
    description: "When/where does it happen?"
  expected:
    type: string
    required: false
    description: "What should happen instead?"
outputs:
  - DEBUG.yml
  - HYPOTHESES.md
  - TESTS.md
  - ROOT_CAUSE.md
templates:
  - DEBUG.yml.tmpl
---
# Debugging Skill
> **"Hypothesize, test, learn, repeat."**
Debug problems methodically. Track hypotheses, test systematically, converge on root causes.
## Purpose
Debug problems methodically. Track hypotheses, document tests, record what you learn, and converge on root causes.
## When to Use
- Something isn't working as expected
- Mysterious behavior needs explanation
- Performance problems need diagnosis
- "Works on my machine" situations
## The Debugging Loop
```
OBSERVE → HYPOTHESIZE → TEST → LEARN → (repeat or) → FIX
```
### Terminal States
- `FIX`: Bug resolved
- `WONTFIX`: Intentional behavior
- `DEFER`: Not addressing now
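
The loop and its terminal states can be read as a small state machine. The following is a minimal sketch, not part of the skill itself: the `Phase` enum, the `TRANSITIONS` table, and the `advance` helper are hypothetical names, and it assumes that "repeat" may return to either OBSERVE or HYPOTHESIZE and that terminal states are reached only from LEARN.

```python
from enum import Enum


class Phase(Enum):
    OBSERVE = "observe"
    HYPOTHESIZE = "hypothesize"
    TEST = "test"
    LEARN = "learn"
    FIX = "fix"
    WONTFIX = "wontfix"
    DEFER = "defer"


TERMINAL = {Phase.FIX, Phase.WONTFIX, Phase.DEFER}

# Assumed transitions: LEARN either loops back or ends the session.
TRANSITIONS = {
    Phase.OBSERVE: {Phase.HYPOTHESIZE},
    Phase.HYPOTHESIZE: {Phase.TEST},
    Phase.TEST: {Phase.LEARN},
    Phase.LEARN: {Phase.OBSERVE, Phase.HYPOTHESIZE} | TERMINAL,
}


def advance(current: Phase, nxt: Phase) -> Phase:
    """Return nxt if the move is allowed, otherwise raise ValueError."""
    if current in TERMINAL:
        raise ValueError(f"{current.name} is terminal; start a new session")
    if nxt not in TRANSITIONS[current]:
        raise ValueError(f"cannot go from {current.name} to {nxt.name}")
    return nxt
```

Rejecting a jump such as TEST to FIX keeps the LEARN step from being skipped, which is the point of the loop.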
## Protocol
### Observation Phase
Before guessing, gather facts:
```yaml
observation:
  symptom: "What's the observable problem?"
  context: "When does it happen?"
  expected: "What should happen instead?"
  evidence:
    - "Error message (exact text)"
    - "Logs showing the issue"
    - "Steps to reproduce"
  constraints:
    - "What we know for sure"
    - "What we've already ruled out"
```
### Hypothesis Tracking
```yaml
hypothesis:
  id: "hyp-001"
  claim: "The bug is caused by X"
  confidence: "high|medium|low"
  if_true:
    - "We would expect to see..."
    - "Changing X should fix it"
  test:
    action: "What to try"
    expected: "What we expect if hypothesis is correct"
  result:
    status: "confirmed|refuted|inconclusive"
    observation: "What actually happened"
  learned: "What this tells us"
```
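
If hypotheses are tracked in code rather than by hand, the record above might be modelled roughly as follows. This is an illustrative sketch only: the `Hypothesis` class, the flattened `test_action` and `status` fields, and the `open_hypotheses` helper are assumptions, not something the skill defines.

```python
from dataclasses import dataclass, field
from typing import Optional

VALID_STATUSES = {"confirmed", "refuted", "inconclusive"}
VALID_CONFIDENCE = {"high", "medium", "low"}


@dataclass
class Hypothesis:
    id: str
    claim: str
    test_action: str                      # what to try (test.action above)
    confidence: str = "medium"
    if_true: list[str] = field(default_factory=list)
    status: Optional[str] = None          # unset until a test has run
    learned: Optional[str] = None

    def __post_init__(self) -> None:
        if self.confidence not in VALID_CONFIDENCE:
            raise ValueError(f"invalid confidence: {self.confidence!r}")

    def record_result(self, status: str, learned: str) -> None:
        """Store the test outcome together with what it taught us."""
        if status not in VALID_STATUSES:
            raise ValueError(f"invalid status: {status!r}")
        self.status = status
        self.learned = learned


def open_hypotheses(hyps: list[Hypothesis]) -> list[Hypothesis]:
    """Hypotheses still worth testing: untested or inconclusive."""
    return [h for h in hyps if h.status in (None, "inconclusive")]
```

Requiring `learned` whenever a result is recorded mirrors the protocol: a refuted hypothesis is still progress if it narrows the search.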
### Test Documentation
```yaml
test:
  id: "test-001"
  hypothesis: "hyp-001"
  action: "What we did"
  before:
    state: "System state before test"
  after:
    state: "System state after test"
  result: "confirmed|refuted|inconclusive"
  learned: "What we now know"
```
## Schemas
### Observation Schema
| Field | Required | Purpose |
|-------|----------|---------|
| `symptom` | ✓ | Observable problem |
| `expected` | ✓ | What should happen |
| `error_message` | | Exact error text |
| `logs` | | Relevant log entries |
| `steps_to_reproduce` | | How to trigger |
| `constraints` | | Known facts |
| `ruled_out` | | Eliminated possibilities |
### Hypothesis Schema
| Field | Required | Purpose |
|-------|----------|---------|
| `id` | ✓ | Unique identifier |
| `claim` | ✓ | What you think is wrong |
| `test` | ✓ | How to validate |
| `confidence` | | high/medium/low |
| `if_true` | | Expected observations |
| `result` | | Test outcome |
| `learned` | | Insight gained |
### Test Schema
| Field | Required | Purpose |
|-------|----------|---------|
| `id` | ✓ | Unique identifier |
| `hypothesis` | ✓ | Which hypothesis |
| `action` | ✓ | What was tried |
| `result` | ✓ | confirmed/refuted/inconclusive |
| `before` | | State before |
| `after` | | State after |
| `learned` | | Insight |
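
The required columns in these tables lend themselves to a simple completeness check. The sketch below assumes records are plain dictionaries and that the marked fields are the required ones; `REQUIRED_FIELDS` and `missing_fields` are hypothetical names introduced for illustration.

```python
# Required fields per record kind, taken from the schema tables above.
REQUIRED_FIELDS = {
    "observation": {"symptom", "expected"},
    "hypothesis": {"id", "claim", "test"},
    "test": {"id", "hypothesis", "action", "result"},
}


def missing_fields(kind: str, record: dict) -> list[str]:
    """Return required fields that are absent or empty, sorted by name."""
    required = REQUIRED_FIELDS[kind]
    return sorted(name for name in required if not record.get(name))
```

For example, `missing_fields("test", {"id": "test-001", "hypothesis": "hyp-001"})` reports that `action` and `result` still need to be filled in.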
## Core Files
| File | Purpose |
|------|---------|
| `DEBUG.yml` | Current debugging session |
| `HYPOTHESES.md` | All hypotheses and their status |
| `TESTS.md` | Test log |
| `ROOT_CAUSE.md` | Final analysis |
## Commands
| Command | Action |
|---------|--------|
| `DEBUG [symptom]` | Start debugging session |
| `OBSERVE [fact]` | Record observation |
| `HYPOTHESIZE [claim]` | Propose hypothesis |
| `TEST [action]` | Document test |
| `LEARN [insight]` | Record what you learned |
| `ROOT-CAUSE [explanation]` | Document root cause |
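
The commands above are prompts to the agent rather than a real CLI, but if a tool needed to route them, parsing could look something like this. It is a sketch under assumptions: `parse_command` is a hypothetical helper, and treating commands as case-insensitive is a choice made here, not something the skill specifies.

```python
COMMANDS = {"DEBUG", "OBSERVE", "HYPOTHESIZE", "TEST", "LEARN", "ROOT-CAUSE"}


def parse_command(line: str) -> tuple[str, str]:
    """Split 'COMMAND argument text' into (command, argument)."""
    name, _, argument = line.strip().partition(" ")
    name = name.upper()  # assumption: accept lower-case input as well
    if name not in COMMANDS:
        raise ValueError(f"unknown command: {name!r}")
    return name, argument.strip()
```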
## The Scientific Method for Bugs
1. **Observe**: What exactly is happening?
2. **Question**: Why might this be happening?
3. **Hypothesize**: Form testable explanation
4. **Predict**: What would we see if hypothesis is true?
5. **Test**: Try to confirm or refute
6. **Analyze**: What did we learn?
7. **Iterate**: New hypothesis or fix
## Debugging Techniques
### Binary Search
Narrow down where the bug lives. Use when the bug is somewhere in a large space.
```yaml
technique: binary_search
steps:
  - "Find a known good state"
  - "Find a known bad state"
  - "Check the middle"
  - "Repeat until found"
```
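
The four steps can be sketched as an ordinary bisection over an ordered list of candidates (versions, config revisions, input sizes). The function name `first_bad` and the `is_bad` predicate are hypothetical; the sketch assumes the first candidate is known good, the last is known bad, and that every candidate after the first bad one is also bad.

```python
from typing import Callable, Sequence, TypeVar

T = TypeVar("T")


def first_bad(candidates: Sequence[T], is_bad: Callable[[T], bool]) -> T:
    """Return the first bad candidate.

    Assumes candidates[0] is good, candidates[-1] is bad, and that once a
    candidate is bad every later one is bad too.
    """
    good, bad = 0, len(candidates) - 1
    while bad - good > 1:
        mid = (good + bad) // 2          # check the middle
        if is_bad(candidates[mid]):
            bad = mid                    # bug is at or before mid
        else:
            good = mid                   # bug is after mid
    return candidates[bad]
```

Each check halves the remaining range, so 100 candidates need at most 7 checks instead of up to 98 with a linear scan.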
### Rubber Duck
Explain the problem in detail. Use when you are stuck and need a fresh perspective. Writing a detailed observation in `DEBUG.yml` forces you to articulate your assumptions.
### Minimal Reproduction
Simplify until the bug is isolated. Use when a complex system misbehaves and the cause is unclear.
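
One way to simplify mechanically is to drop pieces of a failing input one at a time and keep each removal that still reproduces the failure. The sketch below is a simple greedy reducer, not a full delta-debugging implementation; `minimize` and the `still_fails` predicate are hypothetical names, and the result is smaller but not guaranteed to be the smallest possible reproduction.

```python
from typing import Callable, Sequence


def minimize(items: Sequence, still_fails: Callable[[list], bool]) -> list:
    """Greedily drop items while the failure still reproduces."""
    current = list(items)
    if not still_fails(current):
        raise ValueError("the full input does not reproduce the failure")
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if still_fails(candidate):
            current = candidate   # item i was irrelevant; drop it
        else:
            i += 1                # item i is needed; keep it
    return current
```

Here `items` could be lines of a config file, steps in a script, or rows of input data, with `still_fails` rerunning the reproduction on the reduced set.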
### Git Bisect
Find the commit that introduced the bug. Use when the bug is a regression.
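
`git bisect run` can automate the search by running a script at each commit and reading its exit code: 0 marks the commit good, 125 tells git to skip it, and other codes from 1 to 127 mark it bad. The helper below is a sketch of such a script's core logic; `bisect_exit_code` and `check_commit` are hypothetical names, and the build and test commands are whatever your project uses.

```python
import subprocess

GOOD, BAD, SKIP = 0, 1, 125  # exit codes understood by `git bisect run`


def bisect_exit_code(build_ok: bool, test_ok: bool) -> int:
    """Map the outcome at one commit to a `git bisect run` exit code."""
    if not build_ok:
        return SKIP  # commit cannot be judged, e.g. it does not build
    return GOOD if test_ok else BAD


def check_commit(build_cmd: list[str], test_cmd: list[str]) -> int:
    """Run the build, then the test, and return the bisect exit code."""
    build_ok = subprocess.run(build_cmd).returncode == 0
    # The test only runs when the build succeeded.
    test_ok = build_ok and subprocess.run(test_cmd).returncode == 0
    return bisect_exit_code(build_ok, test_ok)
```

A script that ends with `sys.exit(check_commit(...))` could then be handed to `git bisect run` after `git bisect start <bad> <good>`. Returning 125 for unbuildable commits keeps a broken build from being blamed for the regression.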
### Print Debugging
Add logging to trace execution. Use when you need to understand flow.
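
A decorator is one low-friction way to add such tracing without editing function bodies. This is a minimal sketch using the standard `logging` module; the `traced` decorator, the `debug_trace` logger name, and the `parse_port` example function are hypothetical.

```python
import functools
import logging

logger = logging.getLogger("debug_trace")


def traced(func):
    """Log arguments, return value, and exceptions for each call."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.debug("-> %s args=%r kwargs=%r", func.__name__, args, kwargs)
        try:
            result = func(*args, **kwargs)
        except Exception:
            logger.exception("!! %s raised", func.__name__)
            raise  # tracing must not change behavior
        logger.debug("<- %s returned %r", func.__name__, result)
        return result
    return wrapper


@traced
def parse_port(value: str) -> int:  # hypothetical function under investigation
    return int(value)
```

Calling `logging.basicConfig(level=logging.DEBUG)` once at startup makes the trace visible; removing the decorator restores the original function when the investigation is over.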
## Working Set
Always include in context:
- `DEBUG.yml`
- `HYPOTHESES.md`
## Integration
| Skill | Relationship |
|-------|--------------|
| [play-learn-lift](../play-learn-lift/) | Debugging IS learning |
| [session-log](../session-log/) | Log all debugging activities |
| [research-notebook](../research-notebook/) | Complex bugs need research |
| [honest-forget](../honest-forget/) | Compress debugging wisdom |
| [adventure](../adventure/) | Debugging IS adventure |
| [room](../room/) | Debug sessions are rooms |
| [card](../card/) | Git Goblin, Index Owl companions |
This skill provides a systematic debugging workflow focused on hypothesis tracking, test documentation, and iterative learning. It guides you to observe facts, form testable hypotheses, run targeted tests, and converge on root causes. The goal is reproducible investigations and clear records that support fixes or intentional decisions to defer or not fix.
Start by recording observable symptoms, context, and concrete evidence. For each suspected cause, create a hypothesis with expected observations, run a defined test, and record the result as confirmed, refuted, or inconclusive. Repeat the observe-hypothesize-test-learn loop until you reach a terminal state: FIX, WONTFIX, or DEFER. Maintain simple YAML-like records for observations, hypotheses, and tests to ensure traceability.
## FAQ
### What is the first thing to record when debugging?
Record the observable symptom, the exact error message, steps to reproduce, and the expected behavior.
### How do I know when to stop iterating?
Stop when a test confirms the root cause and you can apply a fix, or when you explicitly mark the issue WONTFIX or DEFER with documented rationale.
### What if a test is inconclusive?
Treat it as learning: update your constraints and hypotheses, refine the test, or try a different diagnostic technique such as binary search or minimal reproduction.