home / skills / everyinc / compound-engineering-plugin / agent-native-audit

This skill performs a comprehensive agent-native architecture audit by launching parallel sub-agents to score and report gaps against eight core principles.

npx playbooks add skill everyinc/compound-engineering-plugin --skill agent-native-audit

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
8.0 KB
---
name: agent-native-audit
description: Run comprehensive agent-native architecture review with scored principles
argument-hint: "[optional: specific principle to audit]"
disable-model-invocation: true
---

# Agent-Native Architecture Audit

Conduct a comprehensive review of the codebase against agent-native architecture principles, launching parallel sub-agents for each principle and producing a scored report.

## Core Principles to Audit

1. **Action Parity** - "Whatever the user can do, the agent can do"
2. **Tools as Primitives** - "Tools provide capability, not behavior"
3. **Context Injection** - "System prompt includes dynamic context about app state"
4. **Shared Workspace** - "Agent and user work in the same data space"
5. **CRUD Completeness** - "Every entity has full CRUD (Create, Read, Update, Delete)"
6. **UI Integration** - "Agent actions immediately reflected in UI"
7. **Capability Discovery** - "Users can discover what the agent can do"
8. **Prompt-Native Features** - "Features are prompts defining outcomes, not code"

## Workflow

### Step 1: Load the Agent-Native Skill

First, invoke the agent-native-architecture skill to understand all principles:

```
/compound-engineering:agent-native-architecture
```

Select option 7 (action parity) to load the full reference material.

### Step 2: Launch Parallel Sub-Agents

Launch 8 parallel sub-agents using the Task tool with `subagent_type: Explore`, one for each principle. Each agent should:

1. Enumerate ALL instances in the codebase (user actions, tools, contexts, data stores, etc.)
2. Check compliance against the principle
3. Provide a SPECIFIC SCORE like "X out of Y (percentage%)"
4. List specific gaps and recommendations

<sub-agents>

**Agent 1: Action Parity**
```
Audit for ACTION PARITY - "Whatever the user can do, the agent can do."

Tasks:
1. Enumerate ALL user actions in frontend (API calls, button clicks, form submissions)
   - Search for API service files, fetch calls, form handlers
   - Check routes and components for user interactions
2. Check which have corresponding agent tools
   - Search for agent tool definitions
   - Map user actions to agent capabilities
3. Score: "Agent can do X out of Y user actions"

Format:
## Action Parity Audit
### User Actions Found
| Action | Location | Agent Tool | Status |
### Score: X/Y (percentage%)
### Missing Agent Tools
### Recommendations
```

**Agent 2: Tools as Primitives**
```
Audit for TOOLS AS PRIMITIVES - "Tools provide capability, not behavior."

Tasks:
1. Find and read ALL agent tool files
2. Classify each as:
   - PRIMITIVE (good): read, write, store, list - enables capability without business logic
   - WORKFLOW (bad): encodes business logic, makes decisions, orchestrates steps
3. Score: "X out of Y tools are proper primitives"

Format:
## Tools as Primitives Audit
### Tool Analysis
| Tool | File | Type | Reasoning |
### Score: X/Y (percentage%)
### Problematic Tools (workflows that should be primitives)
### Recommendations
```

**Agent 3: Context Injection**
```
Audit for CONTEXT INJECTION - "System prompt includes dynamic context about app state"

Tasks:
1. Find context injection code (search for "context", "system prompt", "inject")
2. Read agent prompts and system messages
3. Enumerate what IS injected vs what SHOULD be:
   - Available resources (files, drafts, documents)
   - User preferences/settings
   - Recent activity
   - Available capabilities listed
   - Session history
   - Workspace state

Format:
## Context Injection Audit
### Context Types Analysis
| Context Type | Injected? | Location | Notes |
### Score: X/Y (percentage%)
### Missing Context
### Recommendations
```

**Agent 4: Shared Workspace**
```
Audit for SHARED WORKSPACE - "Agent and user work in the same data space"

Tasks:
1. Identify all data stores/tables/models
2. Check if agents read/write to SAME tables or separate ones
3. Look for sandbox isolation anti-pattern (agent has separate data space)

Format:
## Shared Workspace Audit
### Data Store Analysis
| Data Store | User Access | Agent Access | Shared? |
### Score: X/Y (percentage%)
### Isolated Data (anti-pattern)
### Recommendations
```

**Agent 5: CRUD Completeness**
```
Audit for CRUD COMPLETENESS - "Every entity has full CRUD"

Tasks:
1. Identify all entities/models in the codebase
2. For each entity, check if agent tools exist for:
   - Create
   - Read
   - Update
   - Delete
3. Score per entity and overall

Format:
## CRUD Completeness Audit
### Entity CRUD Analysis
| Entity | Create | Read | Update | Delete | Score |
### Overall Score: X/Y entities with full CRUD (percentage%)
### Incomplete Entities (list missing operations)
### Recommendations
```

**Agent 6: UI Integration**
```
Audit for UI INTEGRATION - "Agent actions immediately reflected in UI"

Tasks:
1. Check how agent writes/changes propagate to frontend
2. Look for:
   - Streaming updates (SSE, WebSocket)
   - Polling mechanisms
   - Shared state/services
   - Event buses
   - File watching
3. Identify "silent actions" anti-pattern (agent changes state but UI doesn't update)

Format:
## UI Integration Audit
### Agent Action → UI Update Analysis
| Agent Action | UI Mechanism | Immediate? | Notes |
### Score: X/Y (percentage%)
### Silent Actions (anti-pattern)
### Recommendations
```

**Agent 7: Capability Discovery**
```
Audit for CAPABILITY DISCOVERY - "Users can discover what the agent can do"

Tasks:
1. Check for these 7 discovery mechanisms:
   - Onboarding flow showing agent capabilities
   - Help documentation
   - Capability hints in UI
   - Agent self-describes in responses
   - Suggested prompts/actions
   - Empty state guidance
   - Slash commands (/help, /tools)
2. Score against 7 mechanisms

Format:
## Capability Discovery Audit
### Discovery Mechanism Analysis
| Mechanism | Exists? | Location | Quality |
### Score: X/7 (percentage%)
### Missing Discovery
### Recommendations
```

**Agent 8: Prompt-Native Features**
```
Audit for PROMPT-NATIVE FEATURES - "Features are prompts defining outcomes, not code"

Tasks:
1. Read all agent prompts
2. Classify each feature/behavior as defined in:
   - PROMPT (good): outcomes defined in natural language
   - CODE (bad): business logic hardcoded
3. Check if behavior changes require prompt edit vs code change

Format:
## Prompt-Native Features Audit
### Feature Definition Analysis
| Feature | Defined In | Type | Notes |
### Score: X/Y (percentage%)
### Code-Defined Features (anti-pattern)
### Recommendations
```

</sub-agents>

### Step 3: Compile Summary Report

After all agents complete, compile a summary with:

```markdown
## Agent-Native Architecture Review: [Project Name]

### Overall Score Summary

| Core Principle | Score | Percentage | Status |
|----------------|-------|------------|--------|
| Action Parity | X/Y | Z% | ✅/⚠️/❌ |
| Tools as Primitives | X/Y | Z% | ✅/⚠️/❌ |
| Context Injection | X/Y | Z% | ✅/⚠️/❌ |
| Shared Workspace | X/Y | Z% | ✅/⚠️/❌ |
| CRUD Completeness | X/Y | Z% | ✅/⚠️/❌ |
| UI Integration | X/Y | Z% | ✅/⚠️/❌ |
| Capability Discovery | X/Y | Z% | ✅/⚠️/❌ |
| Prompt-Native Features | X/Y | Z% | ✅/⚠️/❌ |

**Overall Agent-Native Score: X%**

### Status Legend
- ✅ Excellent (80%+)
- ⚠️ Partial (50-79%)
- ❌ Needs Work (<50%)

### Top 10 Recommendations by Impact

| Priority | Action | Principle | Effort |
|----------|--------|-----------|--------|

### What's Working Excellently

[List top 5 strengths]
```

## Success Criteria

- [ ] All 8 sub-agents complete their audits
- [ ] Each principle has a specific numeric score (X/Y format)
- [ ] Summary table shows all scores and status indicators
- [ ] Top 10 recommendations are prioritized by impact
- [ ] Report identifies both strengths and gaps

## Optional: Single Principle Audit

If $ARGUMENTS specifies a single principle (e.g., "action parity"), only run that sub-agent and provide detailed findings for that principle alone.

Valid arguments:
- `action parity` or `1`
- `tools` or `primitives` or `2`
- `context` or `injection` or `3`
- `shared` or `workspace` or `4`
- `crud` or `5`
- `ui` or `integration` or `6`
- `discovery` or `7`
- `prompt` or `features` or `8`

Overview

This skill runs a comprehensive agent-native architecture review and produces a scored report against eight core principles. It launches parallel sub-agents to inspect code, prompts, tools, UI integration, data stores, and capability surfaces. The output is a prioritized set of gaps, recommendations, and an overall agent-native score for the project.

How this skill works

The skill spawns one sub-agent per principle to enumerate instances in the codebase, check compliance, and assign a specific X/Y score with percentage. Each sub-agent lists locations, gaps, and concrete remediation steps. Finally, the skill compiles a summary table, status indicators, top recommendations by impact, and strengths.

When to use it

  • Before major agent feature releases to validate architecture readiness
  • During engineering audits to align a product with agent-native principles
  • When integrating an LLM agent into an existing product for the first time
  • To prioritize technical debt related to agent capabilities and UI sync
  • When preparing design docs or developer onboarding for agent features

Best practices

  • Run the audit against a current branch or mainline to get accurate scores
  • Provide source paths and access to prompts, tool definitions, and schema
  • Treat tools as minimal primitives; push business logic into orchestrators or prompts
  • Ensure agent writes use the same persistent models and tables as users
  • Automate regular audits and track scores over time to measure progress

Example use cases

  • Audit a web app to confirm agents can perform every user-facing action (action parity)
  • Identify tools that embed workflows and refactor them into primitives
  • Verify that system prompts inject session and workspace state correctly
  • Detect UI update gaps where agent changes are not reflected to users
  • Create a prioritized plan to add missing CRUD operations for key entities

FAQ

Can I run a single-principle audit instead of all eight?

Yes. You can target a single principle to get a focused, detailed audit and remediation list.

What output formats are produced?

The skill produces a scored report per principle, a compiled summary table with status icons, and a prioritized top-10 recommendations list.

What does a score like X/Y mean?

X/Y indicates passed checks versus total checks for that principle; the percentage and status (Excellent/Partial/Needs Work) follow established thresholds.