
fix-observability skill


This skill fixes the highest-priority observability gap: it runs the checks, applies one fix per invocation, and verifies the result.

npx playbooks add skill phrazzld/claude-config --skill fix-observability

---
name: fix-observability
description: |
  Run /check-observability, then fix the highest priority observability issue.
  Creates one fix per invocation. Invoke again for next issue.
  Use /log-observability-issues to create issues without fixing.
---

# /fix-observability

Fix the highest priority observability gap.

## What This Does

1. Invoke `/check-observability` to audit monitoring
2. Identify highest priority gap
3. Fix that one issue
4. Verify the fix
5. Report what was done

**This is a fixer.** It fixes one issue at a time. Run it again for the next issue. Use `/observability` for full setup.

## Process

### 1. Run Primitive

Invoke the `/check-observability` skill to get prioritized findings.

### 2. Fix Priority Order

Fix in this order:
1. **P0**: No error tracking, no health endpoint
2. **P1**: Sentry config, structured logging, alerting
3. **P2**: Analytics, console cleanup
4. **P3**: Performance monitoring
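
The selection rule can be sketched as follows. This is an illustration only: the `Finding` shape and the `pickHighestPriority` name are hypothetical, since the output format of `/check-observability` is not specified here.

```typescript
// Hypothetical shape of one audit finding; the real /check-observability
// output format is not defined in this document.
type Priority = 'P0' | 'P1' | 'P2' | 'P3';

interface Finding {
  priority: Priority;
  title: string;
}

// Lower rank means more urgent.
const PRIORITY_RANK: Record<Priority, number> = { P0: 0, P1: 1, P2: 2, P3: 3 };

// Return the single most urgent finding, or undefined if the audit is clean.
// Ties keep the earlier finding, so the audit's own ordering is respected.
function pickHighestPriority(findings: Finding[]): Finding | undefined {
  let best: Finding | undefined;
  for (const finding of findings) {
    if (
      best === undefined ||
      PRIORITY_RANK[finding.priority] < PRIORITY_RANK[best.priority]
    ) {
      best = finding;
    }
  }
  return best;
}
```

Returning a single finding rather than a sorted list mirrors the one-fix-per-invocation rule: everything else waits for the next run.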

### 3. Execute Fix

**No error tracking (P0):**
```bash
pnpm add @sentry/nextjs
npx @sentry/wizard@latest -i nextjs
```

Or manual setup:
```bash
~/.claude/skills/sentry-observability/scripts/init_sentry.sh
```

**No health endpoint (P0):**
Create `app/api/health/route.ts`:
```typescript
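// Opt out of static caching so each request gets a fresh timestamp.
// (Older Next.js versions may cache GET handlers that do not read the request.)
export const dynamic = 'force-dynamic';

export async function GET() {
  const checks = {
    status: 'ok',
    timestamp: new Date().toISOString(),
    // Add service checks as needed
  };
  return Response.json(checks);
}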
```
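
The `// Add service checks as needed` placeholder could be filled in along these lines. This is a sketch only, assuming each dependency can be probed with an async function. `Probe`, `summarize`, `runChecks`, and the `degraded` status are illustrative names, not part of the skill.

```typescript
// A probe for one dependency (database, cache, ...). Hypothetical shape:
// it resolves when the dependency is reachable and rejects otherwise.
type Probe = () => Promise<void>;

type CheckResult = 'ok' | 'fail';

interface HealthReport {
  status: 'ok' | 'degraded';
  timestamp: string;
  checks: Record<string, CheckResult>;
}

// Fold individual results into one report: any failure marks it degraded.
function summarize(
  checks: Record<string, CheckResult>,
  now: Date = new Date(),
): HealthReport {
  const failed = Object.values(checks).some((result) => result === 'fail');
  return { status: failed ? 'degraded' : 'ok', timestamp: now.toISOString(), checks };
}

// Run every probe concurrently. A probe that throws or rejects is recorded
// as a failure instead of crashing the endpoint.
async function runChecks(probes: Record<string, Probe>): Promise<HealthReport> {
  const entries = Object.entries(probes);
  const settled = await Promise.allSettled(
    entries.map(async ([, probe]) => probe()),
  );
  const checks: Record<string, CheckResult> = {};
  settled.forEach((outcome, i) => {
    checks[entries[i][0]] = outcome.status === 'fulfilled' ? 'ok' : 'fail';
  });
  return summarize(checks);
}
```

Inside the route handler the report could then be returned with `Response.json(report, { status: report.status === 'ok' ? 200 : 503 })`. Using 503 for a degraded service is a common convention, not something this skill mandates.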

**Sentry misconfigured (P1):**
Add to `.env.local`:
```
NEXT_PUBLIC_SENTRY_DSN=your-dsn
SENTRY_AUTH_TOKEN=your-token
SENTRY_ORG=your-org
SENTRY_PROJECT=your-project
```
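
A startup check can catch a missing value before it shows up as silently dropped errors. A minimal sketch, assuming the four variable names listed above; `missingEnv` is a hypothetical helper, not part of the Sentry SDK.

```typescript
// Variable names taken from the .env.local listing above.
const REQUIRED_SENTRY_ENV = [
  'NEXT_PUBLIC_SENTRY_DSN',
  'SENTRY_AUTH_TOKEN',
  'SENTRY_ORG',
  'SENTRY_PROJECT',
];

// Return the names that are unset or blank in the given environment,
// in the same order as the `required` list.
function missingEnv(
  env: Record<string, string | undefined>,
  required: string[] = REQUIRED_SENTRY_ENV,
): string[] {
  return required.filter((name) => (env[name] ?? '').trim() === '');
}
```

In server-side code, `missingEnv(process.env)` returns an empty array when everything is set. It only checks presence, so a leftover placeholder such as `your-dsn` would still pass.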

**No structured logging (P1):**
```bash
pnpm add pino
```

Create `lib/logger.ts`:
```typescript
import pino from 'pino';

export const logger = pino({
  level: process.env.LOG_LEVEL || 'info',
});
```
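
`LOG_LEVEL` comes straight from the environment, so a typo such as `LOG_LEVEL=verbose` reaches pino unchecked. A small guard is sketched below, under the assumption that falling back to `info` is preferable to a startup failure. `resolveLogLevel` is a hypothetical helper, not a pino API.

```typescript
// pino's standard level names, plus 'silent' to disable output.
const KNOWN_LEVELS = ['trace', 'debug', 'info', 'warn', 'error', 'fatal', 'silent'] as const;
type LogLevel = (typeof KNOWN_LEVELS)[number];

// Normalize a raw LOG_LEVEL value (trim, lowercase);
// anything unrecognized or unset falls back to 'info'.
function resolveLogLevel(raw: string | undefined): LogLevel {
  const candidate = (raw ?? '').trim().toLowerCase();
  return (KNOWN_LEVELS as readonly string[]).includes(candidate)
    ? (candidate as LogLevel)
    : 'info';
}
```

In `lib/logger.ts` this would replace the inline default: `pino({ level: resolveLogLevel(process.env.LOG_LEVEL) })`.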

**No alerting (P1):**
Create alert via Sentry CLI or scripts:
```bash
~/.claude/skills/sentry-observability/scripts/create_alert.sh --name "New Errors" --type issue
```

**No PostHog analytics (P2):**
1. Install dependency:
```bash
pnpm add posthog-js
```

2. Create analytics module from template:
   - Source: `~/.claude/skills/observability/references/posthog-patterns.md`
   - Target: `lib/analytics/posthog.ts`

3. Create PostHogProvider:
   - Target: `components/providers/PostHogProvider.tsx`
   - If Clerk detected, include user identification integration

4. Update `app/layout.tsx`:
   - Wrap children with `<PostHogProvider>`
   - Place inside existing providers (ClerkProvider, ConvexClientProvider)

5. Add env vars to `.env.example`:
```bash
# PostHog [REQUIRED] - Product analytics
NEXT_PUBLIC_POSTHOG_KEY=
NEXT_PUBLIC_POSTHOG_HOST=https://us.i.posthog.com
```

6. Verify setup:
```bash
pnpm dev
# Open browser, check PostHog debug mode shows events
# Check PostHog dashboard for incoming events
```
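
Step 2 points at a template that is not reproduced here. As a rough idea of what such a module might contain, the sketch below guards initialization on the env vars from step 5. `AnalyticsClient` and `initAnalytics` are hypothetical names, and the only PostHog call assumed is `posthog.init(key, { api_host })`.

```typescript
// Minimal slice of the posthog-js client this sketch relies on.
// In the app, the real `posthog` import would be passed in here.
interface AnalyticsClient {
  init(key: string, config: { api_host: string }): void;
}

// Same default host as the .env.example entry in step 5.
const DEFAULT_HOST = 'https://us.i.posthog.com';

// Initialize analytics only when a key is configured, so environments
// without NEXT_PUBLIC_POSTHOG_KEY (local, CI) send nothing.
// Returns true when the client was initialized.
function initAnalytics(
  client: AnalyticsClient,
  key: string | undefined,
  host: string | undefined,
): boolean {
  const trimmedKey = (key ?? '').trim();
  if (trimmedKey === '') return false;
  const apiHost = (host ?? '').trim() || DEFAULT_HOST;
  client.init(trimmedKey, { api_host: apiHost });
  return true;
}
```

A client component would call it as `initAnalytics(posthog, process.env.NEXT_PUBLIC_POSTHOG_KEY, process.env.NEXT_PUBLIC_POSTHOG_HOST)`. The values are passed individually because Next.js inlines `NEXT_PUBLIC_` variables only where they are referenced literally, so handing over the whole `process.env` object would not work in the browser.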

**PostHog installed but not configured (P2):**
Add to `.env.local`:
```
# From PostHog project settings
NEXT_PUBLIC_POSTHOG_KEY=phc_xxx
NEXT_PUBLIC_POSTHOG_HOST=https://us.i.posthog.com
```

### 4. Verify

After fix:
```bash
# Sentry works
~/.claude/skills/sentry-observability/scripts/verify_setup.sh

# Health endpoint works
curl -s http://localhost:3000/api/health | jq
```
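
The `curl` check only shows that the route responds. For a scripted check, the body can also be validated against the shape the health route above returns. A sketch: `isHealthyBody` is a hypothetical helper.

```typescript
// True when the body matches what the health route returns:
// status 'ok' and a timestamp string that parses as a date.
function isHealthyBody(body: unknown): boolean {
  if (typeof body !== 'object' || body === null) return false;
  const { status, timestamp } = body as { status?: unknown; timestamp?: unknown };
  return (
    status === 'ok' &&
    typeof timestamp === 'string' &&
    !Number.isNaN(Date.parse(timestamp))
  );
}
```

With Node 18 or later this could be driven by the global `fetch`, for example `isHealthyBody(await (await fetch('http://localhost:3000/api/health')).json())` inside an async function.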

### 5. Report

```
Fixed: [P0] No error tracking

Installed: @sentry/nextjs
Configured: sentry.client.config.ts, sentry.server.config.ts
Added: NEXT_PUBLIC_SENTRY_DSN to .env.local

Verified: Sentry SDK initialized

Next highest priority: [P0] No health endpoint
Run /fix-observability again to continue.
```

## Branching

Before making changes:
```bash
git checkout -b infra/observability-$(date +%Y%m%d)
```

## Single-Issue Focus

This skill fixes **one issue at a time**. Benefits:
- Test each monitoring component independently
- Easy to troubleshoot if something fails
- Clear audit trail

Run `/fix-observability` repeatedly to work through the backlog.

## Related

- `/check-observability` - The primitive (audit only)
- `/log-observability-issues` - Create issues without fixing
- `/observability` - Full observability setup
- `/triage` - Production incident response

Overview

This skill runs an observability audit and fixes the single highest-priority gap it finds. It performs one targeted remediation per invocation, verifies the change, and reports what was done so you can run it again for the next item.

How this skill works

The skill first invokes the observability checker to get prioritized findings. It chooses the top-priority gap (P0 → P1 → P2 → P3), applies a focused fix, verifies the result, and returns a concise report. To review findings without changing anything, run /check-observability on its own; to create tickets without applying fixes, use /log-observability-issues.

When to use it

  • You need to quickly remediate the most severe observability gap without doing a full overhaul.
  • You want incremental, auditable fixes that are easy to test and roll back.
  • You are triaging observability debt and want to tackle one item at a time.
  • You need a scripted path for adding error tracking, health checks, or basic analytics.
  • You prefer automated verification after each change before proceeding.

Best practices

  • Run the observability check first to get the prioritized findings.
  • Create a feature branch before changes (e.g., infra/observability-YYYYMMDD).
  • Fix only the highest-priority item per run to keep changes small and testable.
  • Verify each fix locally or with included verification scripts (curl or SDK checks).
  • Put real values for required env vars in .env.local, list the variable names with empty placeholders in .env.example, and store the secrets securely.

Example use cases

  • No error tracking detected: install and initialize Sentry, add DSN and verify initialization.
  • Missing health endpoint: create an /api/health route that returns status and timestamp, then curl it to verify.
  • Sentry misconfigured: add the required env vars and run the verification script to confirm errors are captured.
  • No structured logging: install a logger (pino), add a small lib/logger module, and test logs in dev.
  • PostHog analytics missing or unconfigured: install client, add provider, add env keys, and check events in debug mode.

FAQ

How many issues will this fix in one run?

One. The skill intentionally fixes only the single highest-priority observability gap per invocation.

What priority order does it follow?

It fixes in this order: P0 (critical: no error tracking or no health endpoint), P1 (Sentry config, structured logging, alerting), P2 (analytics, console cleanup), P3 (performance monitoring).

Can I audit without fixing?

Yes. Use /check-observability to review findings, or /log-observability-issues to create issues without applying fixes.