auditing-claude-instructions skill

/.claude/skills/auditing-claude-instructions

This skill audits CLAUDE.md and agents.md files against a research-backed rubric, scoring each file and recommending concrete improvements.

npx playbooks add skill tdhopper/dotfiles2.0 --skill auditing-claude-instructions

Review the files below or copy the command above to add this skill to your agents.

SKILL.md

---
name: auditing-claude-instructions
description: Use this skill when evaluating, auditing, reviewing, or optimizing CLAUDE.md files (or agents.md files) for effectiveness. Triggers on "review my CLAUDE.md", "optimize my claude instructions", "is my CLAUDE.md effective", "audit my claude config", or when users share their CLAUDE.md content for feedback. Evaluates files against a research-backed rubric covering minimality, tooling, codebase overviews, novelty, and authorship.
---

# Auditing Claude Instructions

Evaluate CLAUDE.md and agents.md files against a research-backed rubric. Score each file on 5 criteria (4 points each, 20 max) and provide actionable recommendations.

## Evaluation Process

### 1. Locate the File

```bash
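# Search from the current directory, including hidden files and directories (-H).
fd -H "CLAUDE.md" .
# An all-lowercase pattern is matched case-insensitively (fd smart case), so AGENTS.md is found too.
fd -H "agents.md" .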
```

### 2. Gather Context

Before scoring, check what other documentation exists in the repo:

```bash
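# Find README files anywhere in the repo, including hidden directories.
fd -H "README.md" .
# List the top-level entries of docs/; stay quiet if the directory does not exist.
fd -d 1 . docs/ 2>/dev/null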
```

This informs the Novelty vs. Redundancy criterion—instructions that duplicate README content are actively harmful.
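
One rough way to surface duplication is to list the lines that appear verbatim in both files. The sketch below assumes exact-line matching is a good enough first pass; `duplicated_lines` is a hypothetical helper, not part of the skill, and paraphrased duplication still needs a manual read.

```bash
# Hypothetical helper: print lines of the instructions file that also
# appear verbatim in the README. Blank lines are ignored.
duplicated_lines() {
    instructions_file="$1"
    readme_file="$2"
    # First pass (NR==FNR): remember every non-blank README line.
    # Second pass: print instruction lines that were remembered.
    awk 'NR == FNR { if ($0 !~ /^[[:space:]]*$/) seen[$0] = 1; next }
         ($0 in seen)' "$readme_file" "$instructions_file"
}

# Usage (paths are examples): duplicated_lines CLAUDE.md README.md
```

Any line this prints is a candidate for removal from the instructions file.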

### 3. Score Against Rubric

Apply each criterion from `./scoring-rubric.md`. For every criterion, assign a score of 4 (Excellent), 3 (Satisfactory), or 1 (Needs Improvement).

**The 5 criteria:**

1. **Minimality of Requirements** — Only what's needed to interact with the repo. Extra instructions increase exploration time and raise inference costs by 20%+.
2. **Specification of Tooling and Environment** — Explicit tool names and commands. Agents strictly adhere to specified tools.
3. **Absence of Codebase Overviews** — No enumerated directories or file summaries. Research shows overviews don't help agents find files faster and can cause models to waste steps re-reading context.
4. **Novelty vs. Redundancy** — Unique operational context not found elsewhere in the repo. Redundancy with README/docs only helps if all other docs are deleted.
5. **Authorship and Curation** — Human-written or heavily human-edited. Purely LLM-generated files reduce task success rates by 3% on average.
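
The arithmetic behind the total can be sketched as a small shell function. This is an illustration only: `total_score` is a hypothetical helper, and it assumes the rubric permits no values other than the 4, 3, and 1 named in this step.

```bash
# Hypothetical helper: sum five criterion scores into a total out of 20.
# Assumption: only 4, 3, and 1 are valid per-criterion scores.
total_score() {
    if [ "$#" -ne 5 ]; then
        echo "expected 5 scores, got $#" >&2
        return 1
    fi
    total=0
    for score in "$@"; do
        case "$score" in
            4|3|1) total=$((total + score)) ;;
            *) echo "invalid score: $score" >&2; return 1 ;;
        esac
    done
    echo "$total"
}

total_score 4 3 4 1 3   # prints 15
```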

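### 4. Generate Report

Write the report in the structure given under Output Format, quoting or referencing specific lines of the audited file as evidence for each score.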

## Output Format

```markdown
## CLAUDE.md Audit Report

### Scores

| Criterion | Score | Rating |
|-----------|-------|--------|
| Minimality of Requirements | X/4 | [Excellent/Satisfactory/Needs Improvement] |
| Tooling and Environment | X/4 | [Excellent/Satisfactory/Needs Improvement] |
| Absence of Codebase Overviews | X/4 | [Excellent/Satisfactory/Needs Improvement] |
| Novelty vs. Redundancy | X/4 | [Excellent/Satisfactory/Needs Improvement] |
| Authorship and Curation | X/4 | [Excellent/Satisfactory/Needs Improvement] |
| **Total** | **X/20** | |

### Findings

#### [Criterion Name] — [Score]/4
**Evidence**: [Quote or reference specific lines]
**Issue**: [What's wrong and why it matters, citing research]
**Fix**: [Concrete rewrite or removal]

### Recommended Rewrite

[If score < 16, provide a complete rewritten version of the file]
```
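
As a sketch of how one row of the Scores table could be filled in, the functions below map a score to its rating label and print a table row. `rating_label` and `score_row` are hypothetical names used only for illustration, and the score-to-label mapping is the one given in the scoring step.

```bash
# Hypothetical helper: translate a numeric score into its rating label.
rating_label() {
    case "$1" in
        4) echo "Excellent" ;;
        3) echo "Satisfactory" ;;
        1) echo "Needs Improvement" ;;
        *) echo "unknown score: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: print one markdown table row.
# $1 = criterion name, $2 = score
score_row() {
    label=$(rating_label "$2") || return 1
    printf '| %s | %s/4 | %s |\n' "$1" "$2" "$label"
}

score_row "Novelty vs. Redundancy" 3
# prints: | Novelty vs. Redundancy | 3/4 | Satisfactory |
```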

## Scoring Thresholds

- **17-20**: Well-optimized file. Minor tweaks only.
- **13-16**: Functional but has clear areas for improvement.
- **9-12**: Significant issues reducing agent effectiveness.
- **5-8**: File is likely hurting more than helping. Consider a full rewrite.
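
The bands above amount to a simple range lookup, sketched here in shell. `score_band` is a hypothetical helper; the labels are shortened from the list above, and totals outside 5-20 are rejected on the assumption that every criterion scores at least 1.

```bash
# Hypothetical helper: map a total (5-20) to its threshold band.
score_band() {
    total="$1"
    # Reject empty or non-numeric input before doing integer comparisons.
    case "$total" in
        ''|*[!0-9]*) echo "not a whole number: $total" >&2; return 1 ;;
    esac
    if [ "$total" -lt 5 ] || [ "$total" -gt 20 ]; then
        echo "total out of range 5-20: $total" >&2
        return 1
    elif [ "$total" -ge 17 ]; then
        echo "well-optimized"
    elif [ "$total" -ge 13 ]; then
        echo "functional"
    elif [ "$total" -ge 9 ]; then
        echo "significant issues"
    else
        echo "likely hurting more than helping"
    fi
}

score_band 15   # prints: functional
```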

## Key Principles

- **Cut aggressively**: For every line, ask "Would removing this cause the agent to make mistakes?" If no, cut it.
- **Commands over prose**: Replace explanatory paragraphs with runnable commands.
- **Detect LLM generation**: Watch for telltale signs—exhaustive file trees, generic advice ("write clean code"), walls of boilerplate. These indicate an unedited LLM-generated file.
- **Check for README duplication**: If content appears in both CLAUDE.md and README.md, it should be removed from CLAUDE.md.
- **Verify tooling specificity**: Vague references like "run the tests" should be `pytest -x` or `npm test`.
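
A first pass over the generic-advice and vague-tooling checks could be automated with a phrase search. The sketch below is only a starting point: `flag_vague_lines` is a hypothetical helper, and its phrase list contains just the two examples quoted above, so a real audit would extend it and still need a manual read.

```bash
# Hypothetical helper: print numbered lines that contain a known vague phrase.
# The phrase list is an assumption seeded from the examples in this document.
flag_vague_lines() {
    # -n: prefix matches with line numbers, -i: ignore case, -E: extended regex.
    grep -n -i -E 'write clean code|run the tests' "$1"
}

# Usage (path is an example): flag_vague_lines CLAUDE.md
```

Each flagged line is a candidate for replacement with an exact command or for removal.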

Overview

This skill audits CLAUDE.md and agents.md instruction files for clarity, usefulness, and agent efficiency. It applies a research-backed rubric to score five criteria and returns actionable, command-oriented fixes and a recommended rewrite when needed.

How this skill works

The skill locates CLAUDE.md and agents.md files in a repo, gathers surrounding documentation to check for redundancy, and scores each file on minimality, tooling specificity, absence of codebase overviews, novelty, and authorship. For each criterion it cites evidence, explains why the issue matters, and supplies a concrete fix. When the total falls below 16 out of 20, it also provides a full rewrite of the file.

When to use it

- Before deploying an agent that will rely on CLAUDE.md or agents.md
- When you want a quick, research-backed audit of instruction effectiveness
- If you suspect instruction duplication with other docs in the repo
- When onboarding new contributors who will interact with repository agents
- When optimizing for reduced model inference steps and lower costs

Best practices

- Keep instructions minimal: include only what an agent must know to act
- Prefer explicit commands and exact tool invocations over prose
- Remove enumerated codebase overviews; let the agent discover files
- Avoid duplicating README content; reserve unique operational context here
- Flag or rewrite content likely generated solely by LLMs and add human edits

Example use cases

- Audit a fresh CLAUDE.md in a dotfiles repo managed with yadm before agent use
- Compare CLAUDE.md against README.md to detect harmful redundancy
- Score multiple agents.md variants to choose the most efficient instruction set
- Provide a concise, runnable replacement when the file adds unnecessary context
- Verify that tooling instructions include exact commands (e.g., git, yadm)

FAQ

What constitutes a harmful codebase overview?

Any enumerated directory or file summary that repeats repository structure; these tend to waste model context and steps.
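
One cheap signal for such an overview is a pasted directory tree. The sketch below counts lines containing the branch glyphs that `tree`-style listings typically use; `count_tree_lines` is a hypothetical helper, and the glyph heuristic is an assumption that will miss overviews written as plain bullet lists.

```bash
# Hypothetical helper: count lines that look like part of a directory tree.
count_tree_lines() {
    # -c: print the number of matching lines, -F: treat patterns as fixed strings.
    # grep exits non-zero when the count is 0; the count is still printed.
    grep -c -F -e '├── ' -e '└── ' "$1"
}

# Usage (path is an example): count_tree_lines CLAUDE.md
```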

How strict should tooling specs be?

Very strict: include exact command names and flags the agent must use (e.g., yadm status, pytest -x).