This skill performs multi-source production triage by auditing Sentry, Vercel logs, health endpoints, and CI, guiding investigation and fixes.
Install: `npx playbooks add skill phrazzld/claude-config --skill triage`
---
name: triage
description: |
Multi-source observability triage. Checks Sentry, Vercel logs, health endpoints, GitHub CI/CD.
Drives: investigate -> fix -> PR -> postmortem workflow.
Invoke for: production issues, error spikes, CI failures, user reports, incident response.
argument-hint: "[action: status | investigate ISSUE-ID | investigate-ci RUN-ID | fix | postmortem ISSUE-ID]"
---
# /triage
Fix production issues. Run audit, investigate, fix, postmortem.
**This is a fixer.** It uses `/check-production` as its primitive. To record issues without fixing them, use `/log-production-issues` instead.
## Usage
```bash
/triage # Audit and fix highest priority (default)
/triage investigate VOL-456 # Deep dive on specific Sentry issue
/triage investigate-ci 12345 # Deep dive on specific CI run failure
/triage fix # Create PR for current fix
/triage postmortem VOL-456 # Generate postmortem after merge
```
## Stage 1: Production Audit
**Command:** `/triage` or `/triage status`
Invoke the `/check-production` primitive to run four checks in parallel:
1. **Sentry** - Unresolved issues via triage scripts
2. **Vercel logs** - Recent errors in stream
3. **Health endpoints** - `/api/health` response
4. **GitHub CI/CD** - Failed workflow runs
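One way to run checks like these concurrently is with shell background jobs. The sketch below is only an illustration: `run_parallel` is a hypothetical helper, not part of the skill, and it assumes each check is a command that exits non-zero when it finds a problem (the source does not say whether the scripts under `~/.claude/skills/triage/scripts/` behave that way).

```bash
# Hypothetical helper: start every command in the background, then wait for
# each one and count how many exited non-zero.
# Returns the number of failed commands as its exit status.
run_parallel() {
  local pids="" failures=0 cmd pid
  for cmd in "$@"; do
    sh -c "$cmd" &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || failures=$((failures + 1))
  done
  return "$failures"
}

# Usage (not executed here; exit-code behavior of these scripts is assumed):
#   run_parallel \
#     "$HOME/.claude/skills/triage/scripts/check_sentry.sh" \
#     "$HOME/.claude/skills/triage/scripts/check_vercel_logs.sh" \
#     "$HOME/.claude/skills/triage/scripts/check_health_endpoints.sh" \
#     "gh run list --branch main --status failure --limit 10"
```

Because the jobs share one terminal, their output may interleave; redirecting each command to its own file is a reasonable refinement.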
**Output format:**
```
TRIAGE STATUS - 2026-01-23 15:30
================================

SENTRY (volume-fitness)
  [P0] 3 unresolved issues
       Top: VOL-456 "PaymentIntent failed" (Score: 147, 23 users)

GITHUB CI/CD
  [P1] Main branch failing: "CI" workflow (run #1234)
       Failed: Type check - 2h ago
  [P2] 2 feature branches blocked

VERCEL LOGS
  [OK] No errors in last 10 minutes

HEALTH ENDPOINTS
  [OK] volume.fitness/api/health (200, 45ms)

RECOMMENDATION:
  1. Investigate VOL-456 immediately - 23 users affected
     Run: /triage investigate VOL-456
  2. Fix main branch CI - blocking all deploys
     Run: /triage investigate-ci 1234
```
If all clean: "All systems nominal. No action required."
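As a rough illustration of how a health-endpoint line in the report could be produced, the sketch below probes a URL with `curl` and formats the result. `health_line` and `check_health` are hypothetical names, and tagging any non-2xx response as `[P0]` is an assumption; the sample report only shows the `[OK]` case.

```bash
# Format one status line in the style of the sample report.
# Arguments: URL, HTTP status code, response time in seconds (as curl reports it).
health_line() {
  local url="$1" code="$2" seconds="$3" ms tag
  ms=$(awk -v t="$seconds" 'BEGIN { printf "%.0f", t * 1000 }')
  case "$code" in
    2??) tag="OK" ;;
    *)   tag="P0" ;;  # assumption: any non-2xx health response is top priority
  esac
  printf '[%s] %s (%s, %sms)\n' "$tag" "$url" "$code" "$ms"
}

# Probe a URL and print its status line. If curl itself fails, report 000.
check_health() {
  local url="$1" result
  result=$(curl -s -o /dev/null --max-time 10 \
    -w '%{http_code} %{time_total}' "$url") || result="000 0"
  # Intentional word splitting: result holds "<code> <seconds>".
  health_line "$url" $result
}

# Usage (not executed here): check_health "https://volume.fitness/api/health"
```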
## Stage 2: Investigate
### Delegation Pattern
For complex issues, delegate investigation to agentic tools (see `/delegate`):
- **Codex** - Code archaeology, stack trace analysis, debugging
- **Gemini** - Research current patterns, check for known issues
- **Thinktank** - Validate proposed fix before implementing
### Sentry Issues
**Command:** `/triage investigate ISSUE-ID`
Actions:
1. Fetch full issue context from Sentry
2. Create branch: `fix/ISSUE-ID-description`
3. Load affected files from stack trace
4. Check git history for related changes
5. Form root cause hypothesis (delegate to Codex for complex traces)
**Output:** Investigation summary with hypothesis and next steps.
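The branch-naming step can be sketched as a small helper. `fix_branch_name` is a hypothetical function, and the slug rules (lowercase, non-alphanumerics collapsed to hyphens) are an assumption about what `description` should look like, not something the skill specifies.

```bash
# Build a branch name of the form fix/ISSUE-ID-description.
# Arguments: Sentry issue ID, free-text description (for example the issue title).
fix_branch_name() {
  local issue_id="$1" description="$2" slug
  slug=$(printf '%s' "$description" \
    | tr '[:upper:]' '[:lower:]' \
    | sed -E 's/[^a-z0-9]+/-/g; s/^-+//; s/-+$//')
  printf 'fix/%s-%s\n' "$issue_id" "$slug"
}

# Usage (not executed here):
#   git switch -c "$(fix_branch_name VOL-456 'PaymentIntent failed')"
# Checking history for a file named in the stack trace; the time window and
# the path are placeholders:
#   git log --oneline --since="2 weeks ago" -- path/to/affected/file
```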
### CI/CD Failures
**Command:** `/triage investigate-ci RUN-ID`
Actions:
1. Fetch failed workflow run details
   ```bash
   gh run view RUN-ID --log-failed
   ```
2. Identify failed step and error message
3. Create branch: `fix/ci-[workflow-name]-[date]`
4. Load affected files based on error
5. Check recent commits that may have caused regression
**Common CI failure patterns:**
| Failure Type | Typical Cause | Fix Approach |
|--------------|---------------|--------------|
| Type check | New code with type errors | Fix types locally, push |
| Lint | Style violations | Run `pnpm lint --fix` |
| Test | Broken/flaky tests | Run tests locally, fix or skip flaky |
| Build | Missing deps, config issues | Check package.json, build config |
| Deploy | Env vars, permissions | Check Vercel/platform settings |
**Output:** CI investigation summary with specific error and fix approach.
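The failure-pattern table can be read as a simple lookup from the failed step to a fix approach. The sketch below encodes it as a `case` statement; `suggest_fix` is a hypothetical helper, and matching on a substring of the step name is an assumption about how CI steps are labeled.

```bash
# Map a failed CI step name to the fix approach from the table above.
# Matching is a case-insensitive substring test on the step name.
suggest_fix() {
  local step
  step=$(printf '%s' "$1" | tr '[:upper:]' '[:lower:]')
  case "$step" in
    *type*)   echo "Fix types locally, push" ;;
    *lint*)   echo 'Run `pnpm lint --fix`' ;;
    *test*)   echo "Run tests locally, fix or skip flaky" ;;
    *build*)  echo "Check package.json, build config" ;;
    *deploy*) echo "Check Vercel/platform settings" ;;
    *)        echo "No known pattern; read the failed step log" ;;
  esac
}

# Usage: suggest_fix "Type check"
```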
## Stage 3: Fix
**Command:** `/triage fix`
Prerequisites: On `fix/` branch with changes.
Actions:
1. Run tests to verify fix
2. Create PR with standard format
3. Link Sentry issue in PR description
**PR format:**
```markdown
## Summary
[Fix description]
## Sentry Issue
- ID: ISSUE-ID
- Users affected: N
- First seen: DATE
## Test Plan
- [ ] Test case 1
- [ ] Test case 2
```
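A minimal sketch of filling in this template and handing it to the GitHub CLI follows. `render_pr_body` is a hypothetical helper, and the values in the usage comment are placeholders for illustration; the skill's own implementation may differ.

```bash
# Print a PR body that follows the template above.
# Arguments: summary text, Sentry issue ID, users affected, first-seen date.
render_pr_body() {
  local summary="$1" issue_id="$2" users="$3" first_seen="$4"
  cat <<EOF
## Summary
$summary

## Sentry Issue
- ID: $issue_id
- Users affected: $users
- First seen: $first_seen

## Test Plan
- [ ] Test case 1
- [ ] Test case 2
EOF
}

# Usage (not executed here; title and values are placeholders):
#   render_pr_body "Describe the fix" ISSUE-ID N DATE > /tmp/pr-body.md
#   gh pr create --title "fix: short description (ISSUE-ID)" --body-file /tmp/pr-body.md
```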
## Stage 4: Postmortem
**Command:** `/triage postmortem ISSUE-ID`
Prerequisites: Fix deployed (PR merged).
Actions:
1. Verify no new errors in Sentry
2. Generate postmortem document from template
3. Resolve Sentry issue
4. Create `docs/postmortems/YYYY-MM-DD-ISSUE-ID.md`
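The file-naming convention in the last step can be sketched as follows. `postmortem_path` is a hypothetical helper; defaulting to today's date is an assumption, since the source does not say which date (incident, fix, or writing) the filename should carry.

```bash
# Build the postmortem path docs/postmortems/YYYY-MM-DD-ISSUE-ID.md.
# Arguments: Sentry issue ID, optional date (defaults to today).
postmortem_path() {
  local issue_id="$1" day="${2:-$(date +%Y-%m-%d)}"
  printf 'docs/postmortems/%s-%s.md\n' "$day" "$issue_id"
}

# Usage (not executed here):
#   path=$(postmortem_path VOL-456)
#   mkdir -p "$(dirname "$path")"
```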
## Scripts
### Via Sentry MCP (Preferred)
When Sentry MCP is configured, use direct queries:
- "Show me unresolved errors in production"
- "What's the triage score for issue VOL-456?"
- "Get full context for the top error"
### Via CLI Scripts
```bash
# Multi-source orchestrator
~/.claude/skills/triage/scripts/check_all_sources.sh
# Individual checks
~/.claude/skills/triage/scripts/check_sentry.sh
~/.claude/skills/triage/scripts/check_vercel_logs.sh
~/.claude/skills/triage/scripts/check_health_endpoints.sh
# Sentry CLI directly
sentry-cli issues list --project="$SENTRY_PROJECT" --status=unresolved
sentry-cli issues describe ISSUE-ID
# Postmortem generator
~/.claude/skills/triage/scripts/generate_postmortem.sh ISSUE-ID
```
### Via GitHub CLI
```bash
# List failed runs on main branch
gh run list --branch main --status failure --limit 10
# List all recent failures
gh run list --status failure --limit 10
# View failed run details
gh run view RUN-ID
# View only failed step logs
gh run view RUN-ID --log-failed
# Re-run failed jobs (after fix pushed)
gh run rerun RUN-ID --failed
# Watch a run in progress
gh run watch RUN-ID
```
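The list and view commands can be chained so the most recent failure is opened without copying a run ID by hand. This is a sketch only: `latest_failed_run` is a hypothetical helper, and it relies on the `--json` and `--jq` options of `gh run list` with the `databaseId` field.

```bash
# Print the run ID of the most recent failed workflow run on a branch.
# Argument: branch name (defaults to main).
latest_failed_run() {
  gh run list --branch "${1:-main}" --status failure --limit 1 \
    --json databaseId --jq '.[0].databaseId'
}

# Usage (not executed here):
#   gh run view "$(latest_failed_run)" --log-failed
```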
## Workflow
```
/triage
   |
   v
[Issues found?]
   |
   +-- Sentry issue --> /triage investigate ISSUE-ID
   |                         |
   |                         v
   |                    [Fix locally]
   |                         |
   |                         v
   |                    /triage fix (creates PR)
   |                         |
   |                         v
   |                    [PR merged & deployed]
   |                         |
   |                         v
   |                    /triage postmortem ISSUE-ID
   |
   +-- CI failure --> /triage investigate-ci RUN-ID
   |                         |
   |                         v
   |                    [Fix locally, push]
   |                         |
   |                         v
   |                    [CI re-runs automatically]
   |                         |
   |                         v
   |                    [Verify CI green]
   |
   +-- No issues --> "All systems nominal"
```
## Environment Variables
```bash
# Required for Sentry
SENTRY_AUTH_TOKEN # or SENTRY_MASTER_TOKEN
SENTRY_ORG # Organization slug
# Auto-detected per project
SENTRY_PROJECT # From .sentryclirc or .env.local
# Optional for Vercel
VERCEL_TOKEN # For `vercel logs` access
```
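A preflight check for the required Sentry variables might look like the sketch below. `missing_sentry_env` is a hypothetical helper; it treats `SENTRY_MASTER_TOKEN` as an accepted substitute for `SENTRY_AUTH_TOKEN`, following the comment in the block above.

```bash
# Print the names of required Sentry variables that are unset or empty.
# Exit status is 0 when nothing is missing, 1 otherwise.
missing_sentry_env() {
  local missing=""
  if [ -z "${SENTRY_AUTH_TOKEN:-}" ] && [ -z "${SENTRY_MASTER_TOKEN:-}" ]; then
    missing="SENTRY_AUTH_TOKEN"
  fi
  if [ -z "${SENTRY_ORG:-}" ]; then
    missing="${missing:+$missing }SENTRY_ORG"
  fi
  printf '%s\n' "$missing"
  [ -z "$missing" ]
}

# Usage:
#   missing_sentry_env >/dev/null || echo "Set the Sentry variables before running /triage"
```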
## MCP Configuration (Recommended)
For AI-assisted triage, configure Sentry MCP:
```json
{
  "mcpServers": {
    "sentry": {
      "url": "https://mcp.sentry.dev/mcp",
      "transport": "http"
    }
  }
}
```
Or local with token:
```json
{
  "mcpServers": {
    "sentry": {
      "command": "npx",
      "args": ["-y", "@sentry/mcp-server"],
      "env": {
        "SENTRY_AUTH_TOKEN": "your-token",
        "SENTRY_ORG": "your-org"
      }
    }
  }
}
```
## Reuses
- `~/.claude/skills/sentry-observability/scripts/triage_score.sh`
- `~/.claude/skills/sentry-observability/scripts/issue_detail.sh`
- `~/.claude/skills/sentry-observability/scripts/resolve_issue.sh`
## Related
- `/check-production` - The primitive (audit only)
- `/log-production-issues` - Create GitHub issues from findings
- `/observability` - Full observability setup
- `/sentry-observability` - Sentry-specific operations
- `/verify-fix` - Verification checklist
- `/delegate` - Multi-AI orchestration pattern
## Summary
This skill provides multi-source observability triage for production incidents. It audits Sentry, Vercel logs, health endpoints, and GitHub CI/CD, then drives an investigate -> fix -> PR -> postmortem workflow. Use it to rapidly surface, diagnose, and remediate high-priority production problems.

The skill runs a parallel production audit (Sentry unresolved issues, Vercel error streams, `/api/health` checks, and failed GitHub workflow runs) and ranks findings by priority. For a selected issue it creates a fix branch, loads stack traces and related files, checks git history, and produces an investigation summary and hypothesis. For CI failures it inspects run logs, identifies the failed step, and proposes targeted fixes. After changes, it helps run tests, open a PR with a standard template, and generate postmortems once the fix is deployed.
## FAQ
**What does the initial `/triage` audit include?**
It queries unresolved Sentry issues, recent Vercel logs, health endpoint responses, and recent failed GitHub workflow runs, then ranks findings with recommendations.

**When should I create a postmortem?**
Create a postmortem after the fix is merged and deployed, once you verify the error is no longer occurring in Sentry; the skill generates a document and resolves the issue.