home / skills / alekspetrov / navigator / nav-diagnose

nav-diagnose skill

safe

This skill detects quality drops in human-AI collaboration and re-anchors goals to restore effective Claude Code dialogue.

npx playbooks add skill alekspetrov/navigator --skill nav-diagnose

Review the files below or copy the command above to add this skill to your agents.

Files (2)

SKILL.md

9.6 KB

---
name: nav-diagnose
description: Detect quality drops in AI output and prompt re-anchoring. Auto-triggers after repeated corrections, context confusion, or when user says "something seems off", "you're not getting this".
allowed-tools: Read, Write, Bash
version: 1.0.0
---

# Navigator Diagnose Skill

Detect when human-AI collaboration quality drops and prompt re-anchoring to restore effective communication.

## Why This Exists (Theory of Mind)

Based on Riedl & Weidmann 2025 research on Human-AI Synergy:
- Theory of Mind varies dynamically within users (moment-to-moment)
- Quality drops occur when ToM alignment degrades
- Early detection and re-anchoring restores collaboration effectiveness
- Both user ToM (understanding Claude) and Claude's model of user can drift

This skill detects when collaboration is degrading and prompts corrective action.

## When to Invoke

**Auto-invoke when**:
- 2+ corrections on the same topic detected
- User says "something seems off", "you're not getting this"
- User says "wrong again", "still not right"
- Context usage exceeds 75% and quality signals degrade
- User expresses frustration ("ugh", "sigh", explicit frustration)
- Loop mode stagnation detected (3+ same-state iterations)

**DO NOT invoke if**:
- Single correction (normal collaboration)
- User is providing new requirements (not correcting)
- Fresh session (insufficient data to diagnose)
- User explicitly says "it's fine" or "close enough"

## Quality Drop Indicators

### 1. Repeated Corrections (High Severity)
```
Trigger: Same correction given 2+ times
Signal: "No, I said users plural, not user" (2nd time)
Issue: Not incorporating user feedback
```

### 2. Hallucination Signals (High Severity)
```
Trigger: References to non-existent files, functions, or packages
Signal: "That file doesn't exist", "There's no such function"
Issue: Generating from incorrect mental model
```

### 3. Context Confusion (Medium Severity)
```
Trigger: Mixing details from unrelated tasks
Signal: "That's from the other project", "Wrong feature"
Issue: Context window pollution or misattribution
```

### 4. Unaddressed Feedback (Medium Severity)
```
Trigger: User correction not reflected in next output
Signal: Generates same pattern after being told not to
Issue: Not properly updating internal model
```

### 5. Goal Drift (Low Severity)
```
Trigger: Output increasingly diverges from original goal
Signal: "We're getting off track", "Not what I asked for"
Issue: Lost sight of user's actual objective
```

### 6. Loop Stagnation (High Severity)
```
Trigger: 3+ consecutive iterations with same state hash (loop mode only)
Signal: nav-loop detects stagnation, triggers nav-diagnose
Issue: Stuck on same step, unable to progress
```

## Execution Steps

### Step 1: Assess Quality State

**Analyze recent exchanges** (last 10-15 messages):

```
Quality Indicators:
- [ ] Corrections given: {count}
- [ ] Same-topic corrections: {count}
- [ ] User frustration signals: {count}
- [ ] Hallucination reports: {count}
- [ ] "Not what I meant" phrases: {count}
```

**Calculate severity**:
```python
severity = "critical" if same_topic_corrections >= 2 or hallucinations >= 1
severity = "high" if corrections >= 3 or frustration_signals >= 2
severity = "medium" if corrections >= 2 or goal_drift_detected
severity = "low" if corrections == 1  # Normal, don't trigger
```

### Step 2: Identify Root Cause

**Analyze correction patterns**:

| Pattern | Likely Cause | Re-anchoring Focus |
|---------|--------------|-------------------|
| Same correction repeated | Not incorporating feedback | Explicitly acknowledge and confirm understanding |
| Increasing corrections | Drifting from user intent | Re-establish goals |
| Technical mismatches | Wrong assumptions | Clarify technical context |
| Frustration without specifics | Communication mismatch | Ask what's wrong |

### Step 3: Display Diagnostic

**Show quality check alert**:
```
⚠️  QUALITY CHECK
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Detected Issue: {ISSUE_TYPE}
Severity: {SEVERITY}

What I noticed:
- {OBSERVATION_1}
- {OBSERVATION_2}

Possible causes:
- {CAUSE_1}
- {CAUSE_2}
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Let me re-anchor our collaboration:

1. Your goal: {RECONSTRUCTED_GOAL}
2. Current state: {STATE_SUMMARY}
3. What you want: {CORRECTED_UNDERSTANDING}

Is this understanding correct? [Y/n]
```

### Step 4: Re-anchor Collaboration

**Based on user confirmation**:

**If correct** (Y):
```
✅ Re-anchored!

I'll proceed with this understanding:
- {KEY_POINT_1}
- {KEY_POINT_2}

Continuing with: {NEXT_ACTION}
```

**If incorrect** (n):
```
Help me understand better:

1. What is your actual goal?
2. What am I getting wrong?
3. What constraints should I know?

[Open-ended response welcome]
```

### Step 5: Log Diagnostic (Optional)

**If nav-profile exists**, save diagnostic:
```json
{
  "date": "{YYYY-MM-DD}",
  "issue_type": "{ISSUE_TYPE}",
  "severity": "{SEVERITY}",
  "resolution": "re-anchored|user-corrected|escalated",
  "learnings": ["{WHAT_TO_AVOID}"]
}
```

### Step 6: Suggest Preventive Actions

**Based on severity and pattern**:

**For context overload**:
```
💡 Suggestion: Consider running nav-compact to clear context.
Current context usage is high, which can cause confusion.
```

**For repeated corrections**:
```
💡 Suggestion: Let me save your preference to avoid this in future.
"Remember I always want {X}" - This will persist across sessions.
```

**For communication mismatch**:
```
💡 Suggestion: Consider adjusting your profile preferences.
- Current verbosity: {VERBOSITY}
- Current confirmation: {CONFIRMATION}

Update with: "Remember I prefer {SUGGESTED_STYLE}"
```

---

## Re-anchoring Templates

### Template 1: Goal Re-alignment
```
Let me verify I understand your goal:

You want to: {GOAL_STATEMENT}

Not: {COMMON_MISUNDERSTANDING}

Key constraints:
- {CONSTRAINT_1}
- {CONSTRAINT_2}

Is this right?
```

### Template 2: Technical Re-alignment
```
Let me verify the technical context:

Framework: {FRAMEWORK}
Patterns: {PATTERNS}
Conventions: {CONVENTIONS}

What I should be using:
- {TOOL_1}: for {PURPOSE_1}
- {TOOL_2}: for {PURPOSE_2}

Corrections to my assumptions?
```

### Template 3: Communication Re-alignment
```
I may be mismatching your communication style:

You seem to prefer:
- {INFERRED_STYLE_1}
- {INFERRED_STYLE_2}

I've been:
- {MY_STYLE_1}
- {MY_STYLE_2}

Should I adjust my approach?
```

---

## Integration with Other Skills

### With nav-profile
- Log diagnostics for pattern analysis
- Suggest preference updates after repeated issues
- Load profile preferences for baseline comparison

### With nav-marker
- Suggest marker before major re-anchoring
- Include diagnostic state in marker

### With nav-compact
- Recommend compact if context overload detected
- Track if compaction resolves issues

---

## Quality Signals Reference

### Positive Signals (Good Collaboration)
```
- "Perfect, exactly what I needed"
- "Yes, continue"
- "Good, now..."
- No corrections for 5+ exchanges
- User providing new requirements (not corrections)
```

### Negative Signals (Quality Drop)
```
- "No", "Wrong", "Not that"
- "I already said..."
- "Again, please..."
- "Sigh", "Ugh", explicit frustration
- "You're not understanding"
- Same correction twice
```

### Neutral Signals (Normal Iteration)
```
- "Actually, let's try..."
- "Can we also..."
- "What about..."
- Single correction with explanation
```

---

## Example Scenarios

### Scenario 1: Repeated REST Convention Correction
```
Exchange 1:
User: "Create endpoint for users"
Claude: Creates /user endpoint
User: "Should be /users (plural)"

Exchange 2:
User: "Now create endpoint for posts"
Claude: Creates /post endpoint
User: "Again, plural! /posts"

→ Trigger: Same correction (plural naming) given twice
→ Action: Re-anchor on REST conventions
→ Outcome: "I understand now - always use plural nouns for REST resources"
```

### Scenario 2: Context Confusion
```
User working on: OAuth feature (Feature A)
Claude references: Stripe integration (Feature B from earlier)

User: "That's from the payment feature, not auth"

→ Trigger: Context confusion detected
→ Action: Re-anchor on current feature
→ Suggestion: Consider nav-compact to clear old context
```

### Scenario 3: User Frustration
```
User: "Ugh, still not right"
User: "This is frustrating"

→ Trigger: Frustration signals detected
→ Action: Pause and diagnose
→ Response: Open-ended question about what's wrong
```

---

## Success Criteria

Diagnostic is successful when:
- [ ] Quality drops detected before user escalates
- [ ] Root cause correctly identified
- [ ] Re-anchoring restores collaboration quality
- [ ] Preventive suggestions are actionable
- [ ] User confirms understanding after re-anchor
- [ ] Same issue doesn't recur immediately

---

## Limitations

**Cannot detect**:
- Silent user frustration (no signals in text)
- Issues outside conversation context
- Problems with external systems
- User preferences not yet expressed

**Should not**:
- Over-trigger on normal corrections
- Interrupt productive flow
- Make user feel blamed
- Require lengthy re-explanation

---

## Best Practices

**When diagnosing**:
- Be humble about AI limitations
- Don't blame user for miscommunication
- Offer concrete next steps
- Keep re-anchoring brief

**When re-anchoring**:
- Focus on understanding, not apologizing
- Confirm specific points, not general "I understand"
- Let user correct if wrong
- Thank user for patience

---

**This skill catches collaboration quality drops early, enabling quick recovery through Theory of Mind re-alignment** 🔍

Overview

This skill detects drops in human–AI collaboration quality and triggers a short diagnostic and re-anchoring flow to restore alignment. It auto-invokes after repeated corrections, context mix-ups, loop stagnation, or explicit user signals like "something seems off." The goal is to catch misalignment early and re-establish a clear, shared understanding before productivity degrades.

How this skill works

The skill monitors recent exchanges (typically the last 10–15 messages) for quality signals: repeated corrections, hallucination reports, context confusion, frustration language, and loop stagnation. When thresholds are met, it calculates a severity level, displays a concise diagnostic summary, and prompts the user to confirm a reconstructed goal or correct assumptions. After confirmation it applies the re-anchored instructions and can optionally log diagnostics for adaptive preferences.

When to use it

Auto-trigger after 2+ identical corrections on the same topic
When user says phrases like "something seems off" or "you're not getting this"
When hallucination or references to non-existent artifacts appear
If loop mode shows 3+ identical state iterations (stagnation)
When context usage is high and quality indicators start degrading

Best practices

Do not trigger on a single correction or when user provides new requirements
Keep diagnostics brief and non-blaming; focus on concrete observations
Confirm specific points (goal, constraints, technical context) rather than general agreement
Offer immediate corrective steps and preventive suggestions (e.g., compact context)
Log recurrent issues to user profile to reduce repeat friction

Example use cases

User corrects plural REST naming twice; skill re-anchors on naming convention
AI mixes features from two projects; skill prompts technical re-alignment and suggests clearing context
User shows frustration with repeated outputs; skill pauses, summarizes issues, and asks for correction
Looped debugging step repeats; skill detects stagnation and offers next-step options
Hallucinated file or function is reported; skill flags high severity and requests clarification

FAQ

Will it interrupt normal iteration?

No. It avoids firing on single corrections or when the user is adding new requirements; it only triggers when quality indicators cross configured thresholds.

Can it remember preferences to avoid repeat issues?

Yes. When a nav-profile exists, diagnostics can be logged and recurring preferences suggested to persist expected behaviors.