hud-manual-testing skill

This skill guides you through a full manual testing workflow for Claude HUD, including reset, onboarding, core checks, and debugging steps.

npx playbooks add skill petekp/claude-code-setup --skill hud-manual-testing

---
name: hud-manual-testing
description: Manual testing workflow for Claude HUD to verify core functionality. Use when asked to "test the app", "verify the app works", "run manual tests", "test after changes", or after implementing significant features. Performs full reset, launches app, and guides through verification checklist.
---

# Claude HUD Manual Testing

Verify core functionality works after code changes. Uses the project's reset script and debugging tools.

## When to Use

- After implementing significant features
- After fixing state detection bugs
- Before commits affecting session tracking
- When user asks to "test the app" or "verify it works"

## Testing Workflow

### 1. Full Reset

Reset to clean state to test first-run experience and verify nothing depends on stale data:

```bash
./scripts/dev/reset-for-testing.sh
```

The script clears:
- App UserDefaults (layout, setup state)
- `~/.capacitor/` (sessions, locks, activity)
- Hook script and registrations

It then rebuilds the app and launches it.

### 2. First-Run Verification

After reset, verify the onboarding flow:

1. **Setup screen appears** - App should detect missing hooks
2. **Hook installation works** - Click install, verify success
3. **Empty project list** - No projects pinned yet

### 3. Core Functionality Checklist

Test each feature manually:

| Feature | How to Test | Expected |
|---------|-------------|----------|
| **Add project** | Click +, select a project folder | Project appears in list |
| **Session detection** | Start Claude in that project | State shows Ready/Working |
| **State transitions** | Type prompt, wait for response | State changes to Working, then back to Ready |
| **Lock detection** | Run `ls ~/.capacitor/sessions/` | Lock dir exists while Claude runs |
| **Session end** | Exit Claude session | State returns to Idle |
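
The lock detection row can be checked without repeatedly running `ls` by hand. The following is a minimal sketch, assuming locks are directories named `*.lock` under `~/.capacitor/sessions/` (as the lock check under Common Issues implies). `wait_for_lock` and its default timeout are hypothetical and not part of the project.

```bash
# Sketch: poll until a lock directory appears, or give up after a timeout.
# Assumption: locks are directories named *.lock inside the sessions directory.
# wait_for_lock is a hypothetical helper, not a project script.
wait_for_lock() (
  sessions_dir="${1:-$HOME/.capacitor/sessions}"
  timeout_s="${2:-30}"
  elapsed=0
  while [ "$elapsed" -lt "$timeout_s" ]; do
    for lock in "$sessions_dir"/*.lock; do
      # An unmatched glob stays literal and fails the -d test
      if [ -d "$lock" ]; then
        echo "$lock"
        exit 0
      fi
    done
    sleep 1
    elapsed=$((elapsed + 1))
  done
  echo "no lock found in $sessions_dir after ${timeout_s}s" >&2
  exit 1
)
```

Run `wait_for_lock` in one terminal, then start Claude in the project from another. The function prints the first lock it finds, or exits non-zero once the timeout elapses.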

### 4. Debug Commands

If issues occur, check these:

```bash
# Watch hook events in real-time
tail -f ~/.capacitor/hud-hook-debug.log

# View current session states
jq . ~/.capacitor/sessions.json

# Check active locks
ls -la ~/.capacitor/sessions/

# Check for running Claude and hook processes
# (the bracketed first letters keep grep from matching its own command line)
ps aux | grep -E "[c]laude|[h]ud-hook"
```
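
A reset wipes `~/.capacitor/`, so it can help to save this evidence first. The following is a rough sketch that copies the files named above into a timestamped folder. `capture_debug` and the `~/hud-debug` destination are hypothetical choices, not part of the project.

```bash
# Sketch: snapshot the debug log, session state, and lock listing
# before a reset removes them.
# Assumption: file names match the debug commands in this skill.
# capture_debug and the default destination are hypothetical.
capture_debug() (
  src="${1:-$HOME/.capacitor}"
  dest_root="${2:-$HOME/hud-debug}"
  dest="$dest_root/$(date +%Y%m%d-%H%M%S)"
  mkdir -p "$dest" || exit 1
  for f in hud-hook-debug.log sessions.json; do
    if [ -f "$src/$f" ]; then
      cp "$src/$f" "$dest/"
    fi
  done
  # Record which locks existed at capture time (a missing directory is not fatal)
  ls -la "$src/sessions" > "$dest/locks.txt" 2>&1 || true
  echo "$dest"
)
```

Calling `capture_debug` with no arguments reads from `~/.capacitor` and prints the folder it created.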

### 5. Quick Restart (No Reset)

For iterative testing without full reset:

```bash
./scripts/dev/restart-app.sh
```

This rebuilds the Rust and Swift code and restarts the app while keeping existing state intact.

## Common Issues

### State stuck on Working/Waiting

Check if lock holder is alive:
```bash
cat ~/.capacitor/sessions/*.lock/pid | xargs -I {} ps -p {}
```

If the PID is dead but the lock still exists, cleanup should remove the lock on the next app launch.
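
With several locks present, the one-liner above does not show which lock belongs to which PID. A minimal sketch that reports only the stale ones follows, assuming each lock is a `*.lock` directory containing a `pid` file. `find_stale_locks` is a hypothetical helper and does not delete anything.

```bash
# Sketch: list lock directories whose recorded PID is no longer running.
# Assumption: each lock is a directory named *.lock holding a `pid` file.
# find_stale_locks is a hypothetical helper; it only reports, never removes.
find_stale_locks() (
  sessions_dir="${1:-$HOME/.capacitor/sessions}"
  for lock in "$sessions_dir"/*.lock; do
    # Skips locks without a pid file, and the literal glob when nothing matches
    [ -f "$lock/pid" ] || continue
    pid="$(tr -d '[:space:]' < "$lock/pid")"
    case "$pid" in
      ''|*[!0-9]*) echo "$lock (unreadable pid)"; continue ;;
    esac
    # kill -0 sends no signal; it only tests whether the process can be signalled.
    # A process owned by another user also fails this test.
    if ! kill -0 "$pid" 2>/dev/null; then
      echo "$lock (pid $pid not running)"
    fi
  done
)
```

Running `find_stale_locks` with no arguments checks `~/.capacitor/sessions`. Empty output means every lock holder is still alive.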

### Hook events not firing

Verify hooks are registered:
```bash
jq '.hooks' ~/.claude/settings.json
```

The output should show entries for SessionStart, UserPromptSubmit, etc., each pointing to `hud-state-tracker.sh`.
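
For a quick pass/fail answer, the same check can be scripted. This sketch uses a plain text search rather than parsing the JSON, so it confirms only that the names appear somewhere in the file. The event list contains just the two events named above and should be extended to match what your build registers. `check_hooks` is a hypothetical helper.

```bash
# Sketch: confirm the settings file mentions the hook script and each expected event.
# Assumption: only the two events named in this skill are listed; extend as needed.
# check_hooks is a hypothetical helper and does a text search, not a JSON parse.
check_hooks() (
  settings="${1:-$HOME/.claude/settings.json}"
  status=0
  if [ ! -f "$settings" ]; then
    echo "missing: $settings"
    exit 1
  fi
  if ! grep -qF 'hud-state-tracker.sh' "$settings"; then
    echo "hook script not referenced"
    status=1
  fi
  for event in SessionStart UserPromptSubmit; do
    if ! grep -qF "\"$event\"" "$settings"; then
      echo "event not registered: $event"
      status=1
    fi
  done
  exit "$status"
)
```

`check_hooks` exits with status 0 when everything is found and prints one line per missing item otherwise.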

### UniFFI checksum mismatch

Regenerate bindings:
```bash
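# Build the Rust core so the library matches the current sources
cargo build -p hud-core --release
# Regenerate the Swift bindings from the built library
cd core/hud-core && cargo run --bin uniffi-bindgen generate --library ../../target/release/libhud_core.dylib --language swift --out-dir ../../apps/swift/bindings/
# Copy the generated bindings into the Swift bridge
cp ../../apps/swift/bindings/hud_core.swift ../../apps/swift/Sources/ClaudeHUD/Bridge/
# Remove the Swift build cache so the next build picks up the new bindings
rm -rf ../../apps/swift/.build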
```

## Verification Summary

After testing, confirm:

- [ ] App launches without crash
- [ ] Hooks install successfully on first run
- [ ] Projects can be added/removed
- [ ] Session states reflect actual Claude activity
- [ ] States transition correctly (Idle → Ready → Working → Ready → Idle)
- [ ] No orphaned locks accumulate

Overview

This skill provides a manual testing workflow for the Claude HUD app to verify core functionality after changes. It automates a full reset, launches the app, and walks a tester through a concise verification checklist to confirm onboarding, session tracking, hooks, and state transitions work correctly.

How this skill works

The workflow runs a reset script to create a clean first-run environment, rebuilds the app, and launches it. It then guides the tester through onboarding and a core checklist: add/remove projects, detect Claude sessions, observe state transitions, and inspect lock/session files. Debug commands are included for live logs, session state inspection, and process checks.

When to use it

  • After implementing significant features
  • After fixing state detection or session-tracking bugs
  • Before commits that touch onboarding or hook registration
  • When asked to “test the app” or “verify the app works”
  • After changes to native bindings or build artifacts

Best practices

  • Always start with the full reset when testing first-run or hook installation flows
  • Record steps and observed states for any failing item to reproduce reliably
  • Use the debug commands to capture logs before restarting or resetting
  • Use the quick restart during iterative development when you want to keep existing state
  • Confirm that stale locks and orphaned processes are cleaned up before concluding tests

Example use cases

  • Verify first-run onboarding and hook installation after refactoring the hook installer
  • Test session detection and state transitions after modifying the HUD state tracker
  • Reproduce and diagnose a stuck Working/Waiting state using lock and PID checks
  • Validate UniFFI binding updates by rebuilding and confirming the Swift bindings work
  • Quickly smoke-test the app after merging platform-specific fixes with a restart script

FAQ

What does the full reset do?

The reset clears app UserDefaults, removes the sessions, locks, and activity data under ~/.capacitor/, removes the hook script and its registrations, then rebuilds and launches the app so you start from a true first-run environment.

When should I use restart instead of reset?

Use restart for iterative testing when you want to preserve state between runs; use reset when you need to validate first-run behavior or clear stale locks and registrations.