fix_bug skill

/.agent/skills/fix_bug

This skill guides you through investigating and fixing bugs in Beluga AI with root cause analysis and verification.

npx playbooks add skill lookatitude/beluga-ai --skill fix_bug

---
name: Fix Bug
description: Bug investigation and fix workflow with root cause analysis
personas:
  - backend-developer
---

# Fix Bug

This skill guides you through investigating and fixing bugs in Beluga AI with proper root cause analysis and verification.

## Prerequisites

- Bug report or description of unexpected behavior
- Access to relevant logs/error messages
- Ability to reproduce the issue

## Steps

### 1. Understand the Bug

Gather information:

1. **Symptoms**: What is the observed behavior?
2. **Expected**: What should happen instead?
3. **Reproduction**: Steps to trigger the bug
4. **Frequency**: Always, intermittently, or only under specific conditions?
5. **Impact**: Severity and affected functionality

### 2. Reproduce the Issue

Create a minimal reproduction:

```go
func TestBugReproduction(t *testing.T) {
    // Setup that triggers the bug (config, ctx, and input are placeholders)
    component := NewComponent(config)

    // Action that causes the issue; the result is discarded so the test compiles
    _, err := component.Process(ctx, input)

    // This assertion fails until the bug is fixed, which demonstrates the bug
    require.NoError(t, err)
}
```

### 3. Investigate Root Cause

#### Trace the Code Path

1. Start from the entry point
2. Follow the execution flow
3. Identify where behavior diverges from expected

#### Check Common Causes

- [ ] Nil pointer dereference
- [ ] Race condition (concurrent access)
- [ ] Resource leak (goroutine, file handle)
- [ ] Incorrect error handling
- [ ] Missing validation
- [ ] Configuration issue
- [ ] State mutation
- [ ] Context cancellation not respected

#### Use Debugging Techniques

```bash
# Run with race detection
go test -race -v ./pkg/affected/...

# Add verbose logging temporarily (this line goes in the Go source, not the shell):
#   log.Printf("DEBUG: value=%v, state=%v", value, state)

# Check OTEL traces if available
```

### 4. Document Root Cause

Before fixing, document:

```markdown
## Root Cause Analysis

**Bug**: [Brief description]

**Root Cause**: [Technical explanation]

**Location**: `pkg/example/file.go:42`

**Why It Happened**: [Context]

**Fix Approach**: [How to fix]
```

### 5. Implement the Fix

#### Write Test First (TDD)

```go
func TestBugFix_IssueXYZ(t *testing.T) {
    // Setup
    component := NewComponent(config)

    // Action
    result, err := component.Process(ctx, input)

    // Assertions that should pass after fix
    require.NoError(t, err)
    assert.Equal(t, expectedResult, result)
}
```

#### Apply Minimal Fix

- Fix only what's broken
- Don't refactor unrelated code
- Preserve existing behavior for non-bug cases
- Add defensive checks if appropriate

#### Example Fix Patterns

**Nil Check**:
```go
// Before
value := obj.Field.Method()

// After
if obj == nil || obj.Field == nil {
    return nil, &Error{Op: "Process", Code: ErrCodeInvalidInput}
}
value := obj.Field.Method()
```

**Race Condition**:
```go
// Before
c.state = newState

// After
c.mu.Lock()
c.state = newState
c.mu.Unlock()
```
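
A self-contained sketch of the same pattern follows. `Counter` and `runConcurrently` are hypothetical names chosen for illustration. Reads take the lock too, since an unguarded read races with a guarded write just as an unguarded write does.

```go
package main

import (
	"fmt"
	"sync"
)

// Counter is a hypothetical component whose state is shared across goroutines.
type Counter struct {
	mu    sync.Mutex
	state int
}

// Inc holds the lock for the whole read-modify-write.
func (c *Counter) Inc() {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.state++
}

// Value locks as well, so reads do not race with Inc.
func (c *Counter) Value() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.state
}

// runConcurrently calls Inc from n goroutines and waits for all of them.
func runConcurrently(c *Counter, n int) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			c.Inc()
		}()
	}
	wg.Wait()
}

func main() {
	c := &Counter{}
	runConcurrently(c, 100)
	fmt.Println(c.Value()) // 100
}
```

Running a harness like this under `go test -race` (or `go run -race`) is what confirms the fix; without the mutex the race detector would be expected to flag `state`.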

**Context Cancellation**:
```go
// Before (items is a slice; a single-variable range would yield the index)
for _, item := range items {
    process(item)
}

// After
for _, item := range items {
    select {
    case <-ctx.Done():
        return ctx.Err()
    default:
        process(item)
    }
}
```

### 6. Add Regression Test

Ensure the bug cannot recur:

```go
func TestRegression_IssueXYZ(t *testing.T) {
    // Specific conditions that caused the bug
    input := createBugTriggerInput()

    component := NewComponent(config)
    result, err := component.Process(ctx, input)

    // Bug is now fixed
    require.NoError(t, err)
    assert.Equal(t, expected, result)
}
```

### 7. Run Quality Checks

```bash
# Format and lint
make fmt
make lint

# Run all tests with race detection
make test-race

# Run affected package tests
go test -v -race ./pkg/affected/...

# Run integration tests if applicable
make test-integration
```

### 8. Verify Fix

- [ ] Original bug is fixed
- [ ] Regression test passes
- [ ] No new test failures
- [ ] Race detection passes
- [ ] Related functionality still works

### 9. Document the Fix

Add to commit message:

```
fix(package): brief description of fix

Root cause: [explanation]
Fix: [what was changed]

Fixes #123
```

## Validation Checklist

- [ ] Bug is reproducible before fix
- [ ] Root cause is understood and documented
- [ ] Fix is minimal and focused
- [ ] Regression test added
- [ ] All existing tests pass
- [ ] Race detection passes
- [ ] Related functionality verified
- [ ] Commit message explains the fix

## Common Bug Categories

| Category | Symptoms | Typical Fix |
|----------|----------|-------------|
| Nil pointer | Panic | Add nil checks |
| Race condition | Intermittent failures | Add mutex/atomic |
| Resource leak | Memory growth | Add defer cleanup |
| Error handling | Silent failures | Check and propagate errors |
| Context | Hanging operations | Respect ctx.Done() |
| Validation | Invalid state | Add input validation |

## Output

A verified bug fix with:
- Root cause documentation
- Regression test
- Minimal code change
- Passing quality checks

Overview

This skill guides developers through a disciplined bug investigation and fix workflow with root cause analysis, verification, and regression prevention. It focuses on Go projects and emphasizes reproducible tests, minimal fixes, and quality checks to prevent regressions. The outcome is a verified bug fix with documentation and tests.

How this skill works

The workflow starts by collecting symptoms, reproduction steps, frequency, and impact, then creates a minimal reproducible test that demonstrates the failure. It traces the code path, checks common causes (nil pointers, races, resource leaks, incorrect error handling, context issues), and applies a minimal, focused fix. After the fix is in place, it adds a regression test, runs formatting, linting, and race-detection tests, and documents the root cause in the commit message.

When to use it

- When a bug report includes unexpected behavior or crashes in a Go service
- When you can reproduce the failure, or can gather logs and traces to reproduce it
- When a change requires root cause analysis before broad refactoring
- When intermittent or race-like failures occur and need systematic diagnosis

Best practices

- Write a minimal reproduction test before coding the fix (TDD)
- Document root cause, location, why it happened, and the planned fix before changing code
- Make the fix minimal and focused; avoid unrelated refactors in the same change
- Add regression tests that capture the specific conditions that triggered the bug
- Run `go test -race` and CI checks; include linting and formatting steps

Example use cases

- Fix a nil pointer panic by adding defensive nil checks and a test that reproduces the panic
- Resolve an intermittent race by protecting shared state with a mutex and adding a deterministic regression test
- Address a hang caused by context cancellation not being respected and add tests that simulate `ctx.Done()`
- Patch incorrect error handling where errors were dropped, add propagation, and include a test asserting the propagated error

FAQ

What if I can't reproduce the bug locally?

Gather logs, traces, and exact inputs; try to create a minimal environment that mimics production conditions or add instrumentation to capture state when the bug occurs.

How large should the fix be?

Keep changes minimal and focused on the root cause. Avoid unrelated refactors in the same commit; add defensive checks or locking patterns only where necessary and justify them in the commit message.