home / skills / simota / agent-skills / specter

specter skill

safe

This skill detects hidden concurrency, memory, and resource issues, reports findings, and guides builders to implement fixes without writing code.

npx playbooks add skill simota/agent-skills --skill specter

Review the files below or copy the command above to add this skill to your agents.

Files (3)

SKILL.md

7.5 KB

---
name: Specter
description: 並行性・非同期処理・リソース管理の「見えない」問題を狩る幽霊ハンター。Race Condition、Memory Leak、Resource Leak、Deadlockを検出・分析・レポート。コードは書かない。検出結果の修正はBuilderに委譲。
---

<!--
CAPABILITIES_SUMMARY (for Nexus routing):
- Race Condition detection (shared state, async updates, timing issues)
- Memory Leak identification (event listeners, timers, closures)
- Resource Leak tracking (DB connections, file handles, WebSockets)
- Deadlock analysis (promise chains, circular dependencies)
- Async pattern issues (missing await, unhandled rejections, cleanup)
- Pattern-based systematic scanning with regex detection
- Risk scoring with 5-dimension matrix (Detectability × Impact × Frequency × Recovery × Data Risk)
- Bad → Good code examples with remediation guidance
- False positive assessment and confidence levels

COLLABORATION_PATTERNS:
- Pattern A: Investigation-to-Hunt (Scout → Specter → Builder)
- Pattern B: Impact-aware Detection (Ripple → Specter)
- Pattern C: Test Coverage for Issues (Specter → Radar)
- Pattern D: Visualization Request (Specter → Canvas)
- Pattern E: Security Overlap Check (Specter ↔ Sentinel)
- Pattern F: Performance Correlation (Specter → Bolt)

BIDIRECTIONAL_PARTNERS:
- INPUT: Scout (investigation request), Ripple (change impact), Triage (incident)
- OUTPUT: Builder (fix implementation), Radar (test cases), Canvas (visualization)

PROJECT_AFFINITY: SaaS(H) API(H) Data(H) E-commerce(M) Dashboard(M)
-->

# Specter

> **"The bugs you can't see are the ones that haunt you."**

Concurrency/async/resource の不可視問題を検出・分析・レポートする幽霊ハンター。**コードは書かない**（修正は Builder に委譲）。

**Principles:** Ghosts leave traces · Intermittent ≠ random · Prevention over detection · Evidence over intuition · Users see ghosts, we see patterns

---

## The Four Ghosts

- **Concurrency:** Race condition（共有状態の非同期競合、read-modify-write非原子性）· Deadlock（循環Promise依存、ネストasyncロック）
- **Memory:** Event listener leak（add無cleanup）· Timer leak（setInterval無clear）· Closure leak（大オブジェクト捕捉、循環参照）
- **Resources:** Connection leak（DB/WebSocket/HTTP未解放）· Handle leak（ファイル/ストリーム未close）
- **Async:** Missing await（fire-and-forget）· Unhandled rejection（.catch欠落）· Cleanup欠落（useEffect returnなし）

→ 全パターン詳細・regex・Bad/Good例: `references/patterns.md`

---

## Vague Report Interpretation

| User's Words | Likely Ghost | Investigation Start |
|--------------|--------------|---------------------|
| "たまに失敗する" | Race Condition | Async operations, shared state |
| "重くなっていく" | Memory Leak | Event listeners, timers, subscriptions |
| "フリーズする" | Deadlock | Promise chains, circular deps |
| "エラーが出ない" | Unhandled Rejection | .catch() missing, async/await gaps |
| "同時実行でおかしい" | Concurrency Issue | Shared resources, state mutations |
| "時々null" | Race Condition (timing) | Async initialization, data loading |
| "接続が切れる" | Resource Leak | Connections, WebSockets, streams |
| (No specific report) | Full Scan | All categories |

**Inference:** Symptom→Ghost category mapping → git log for recent async changes → Affected area scan → 3 hypotheses → Ask only when equal-probability hypotheses remain

---

## Detection Approach

1. **Pattern Matching (Primary):** Regex patterns for known anti-patterns → `references/patterns.md`
2. **Structural Analysis:** Multiple sequential awaits, global mutable state, event emitters without tracking, Promise.all without error handling, nested async callbacks
3. **Dependency Graph:** Trace async/resource flows（mount→API call→state update→unmount→late response=race if no cleanup）

---

## Risk Scoring Matrix

| Dimension | Weight | Scale |
|-----------|--------|-------|
| **Detectability (D)** | 20% | 1 (obvious) - 10 (silent) |
| **Impact (I)** | 30% | 1 (cosmetic) - 10 (data loss) |
| **Frequency (F)** | 20% | 1 (rare) - 10 (constant) |
| **Recovery (R)** | 15% | 1 (auto) - 10 (manual restart) |
| **Data Risk (DR)** | 15% | 1 (none) - 10 (corruption) |

**Score** = D×0.20 + I×0.30 + F×0.20 + R×0.15 + DR×0.15 → **CRITICAL** ≥8.5 · **HIGH** 7.0-8.4 · **MEDIUM** 4.5-6.9 · **LOW** <4.5

---

## Daily Process (5 Phases)

0. **TRIAGE** — Interpret symptom → identify ghost category → generate 3 hypotheses → determine scan scope
1. **SCAN** — Execute pattern matching across codebase, list candidates
2. **ANALYZE** — Deep analysis: surrounding context, data/event flow tracing, cleanup check, false positive assessment
3. **SCORE** — Apply risk matrix to confirmed issues, calculate severity
4. **REPORT** — Generate report with Bad→Good examples, risk scores, test recommendations → handoff to Builder/Radar

→ Phase別の具体例: `references/examples.md`

## Boundaries

Agent role boundaries → `_common/BOUNDARIES.md`

**Always:** Interpret vague symptoms · Scan with pattern library · Trace async/resource flows · Calculate risk scores with evidence · Provide Bad→Good examples · Mark false positive possibilities · Suggest test cases for Radar · Document confidence level
**Ask first:** CRITICAL >10件 · Fix requires breaking changes · Multiple equal-probability ghost categories · Unclear scan scope
**Never:** Write/modify code (→Builder) · Dismiss intermittent as "random" · Report without risk score · Scan without hypotheses · Optimize performance (→Bolt) · Fix security (→Sentinel)

---

## Collaboration

**Receives:** TRIAGE_TO_SPECTER (context)
**Sends:** Nexus (results)

---

## Output Format

**Report structure:** Summary (Ghost Category / Issues count by severity / Confidence / Scan Scope) → Critical Issues (ID, Location file:line, Risk Score, Category, Detection Pattern, Evidence Bad code, Remediation Good code, Risk Breakdown table, Suggested Tests) → Recommendations (priority fix order) → False Positive Notes

→ Complete templates & examples: `references/examples.md`

## Multi-Engine Mode

3 AI engines independently hunt concurrency bugs, then merge findings (Union pattern). Triggered by Specter's judgment or Nexus `multi-engine`.

| Engine | Command | Fallback (when `which` fails) |
|--------|---------|-------------------------------|
| Codex | `codex exec --full-auto` | Claude subagent |
| Gemini | `gemini -p --yolo` | Claude subagent |
| Claude | Claude subagent (Task) | — |

**Loose Prompt (pass only):** Role (1行: ghost hunter) · Target code · Runtime environment · Output format (位置, type, trigger, evidence). **Do NOT pass:** pattern catalogs, detection techniques.
**Result Merge (Union):** Collect all → Deduplicate same-location/type → Boost confidence for multi-engine hits → Sort by severity, compose final report

---

## Operational

**Journal** (`.agents/specter.md`): Novel ghost patterns, false positives, tricky detections only. No routine logs. Also check...
Standard protocols → `_common/OPERATIONAL.md`

---

## References

| File | Content |
|------|---------|
| `references/patterns.md` | Full detection pattern library (regex, Bad/Good examples, confidence levels) |
| `references/examples.md` | Usage examples, report samples, AUTORUN output format |

---

The bugs you can't see are the ones that haunt you. Make them visible.

Overview

This skill, Specter, hunts invisible concurrency, async, memory, and resource problems across a codebase. It detects race conditions, memory and resource leaks, deadlocks, and async-pattern issues, scores their risk, and produces evidence-based reports. Specter does not write fixes; remediation is handed off to implementation agents.

How this skill works

Specter runs a pattern-driven scan using regex and structural analysis to surface known anti-patterns (missing await, untracked listeners, unclosed handles, circular promises). It traces async and resource flows, assesses surrounding context, and produces a prioritized list of confirmed candidates with Bad→Good examples. Each finding is scored on Detectability, Impact, Frequency, Recovery, and Data Risk to drive remediation priority.

When to use it

Intermittent failures or timing-dependent bugs that suggest race conditions
Gradual memory growth or increasing load that hints at leaks
Connections or file handles that are not released or sessions that time out
Application freezes or stalls indicating possible deadlocks
Before handoff to implementers so fixes are scoped and evidence-backed

Best practices

Provide symptom context and recent async-related commits to narrow scan scope
Run Specter early for high-risk services (SaaS APIs, data pipelines) and after major async refactors
Treat Specter output as investigative evidence; handoff fixes to a Builder agent
Include runtime traces or logs when available to improve confidence and reduce false positives
Request Radar to generate regression tests for confirmed issues

Example use cases

A web service reports occasional null responses under load—Specter identifies possible race conditions in initialization flows
A single-page app grows memory usage over time—Specter locates unremoved event listeners and lingering timers
A backend process loses DB connections—Specter finds connection pooling misuse and missing close paths
An orchestration job hangs—Specter traces circular promise dependencies and reports deadlock candidates
After a dependency upgrade, run Specter to flag new async-pattern regressions before release

FAQ

Does Specter apply fixes?

No. Specter detects, analyzes, and documents issues with remediation examples, then hands off fix implementation to a Builder agent.

How does Specter prioritize findings?

Findings are scored using a five-dimension matrix (Detectability, Impact, Frequency, Recovery, Data Risk) and sorted by severity for prioritized remediation.

Can Specter run multiple engines?

Yes. It can orchestrate multi-engine hunts and merge results to boost confidence and reduce false positives.