home / skills / yonatangross / orchestkit / skill-evolution

skill-evolution skill

safe

/plugins/ork/skills/skill-evolution

This skill helps you evolve and rollback skill definitions by analyzing usage patterns, edits, and success rates.

npx playbooks add skill yonatangross/orchestkit --skill skill-evolution

Review the files below or copy the command above to add this skill to your agents.

Files (3)

SKILL.md

12.5 KB

---
name: skill-evolution
license: MIT
compatibility: "Claude Code 2.1.34+."
description: "Evolves skills based on usage patterns. Use when improving or rolling back skill definitions."
argument-hint: "[subcommand] [skill-id]"
context: inherit
version: 1.0.0
author: OrchestKit
tags: [skill-management, evolution, versioning, analytics]
user-invocable: true
allowed-tools: [Read, Write, Edit, Grep, Glob]
complexity: medium
metadata:
  category: document-asset-creation
---

# Skill Evolution Manager

Enables skills to automatically improve based on usage patterns, user edits, and success rates. Provides version control with safe rollback capability.

## Overview

- Reviewing how skills are performing across sessions
- Identifying patterns in user edits to skill outputs
- Applying learned improvements to skill templates
- Rolling back problematic skill changes
- Tracking skill version history and success rates

## Quick Reference

| Command | Description |
|---------|-------------|
| `/ork:skill-evolution` | Show evolution report for all skills |
| `/ork:skill-evolution analyze <skill-id>` | Analyze specific skill patterns |
| `/ork:skill-evolution evolve <skill-id>` | Review and apply suggestions |
| `/ork:skill-evolution history <skill-id>` | Show version history |
| `/ork:skill-evolution rollback <skill-id> <version>` | Restore previous version |

---

## How It Works

The skill evolution system operates in three phases:

```
COLLECT                    ANALYZE                    ACT
───────                    ───────                    ───
┌─────────────┐           ┌─────────────┐           ┌─────────────┐
│ PostTool    │──────────▶│ Evolution   │──────────▶│ /ork:skill  │
│ Edit        │  patterns │ Analyzer    │ suggest   │ evolve      │
│ Tracker     │           │ Engine      │           │ command     │
└─────────────┘           └─────────────┘           └─────────────┘
     │                          │                          │
     ▼                          ▼                          ▼
┌─────────────┐           ┌─────────────┐           ┌─────────────┐
│ edit-       │           │ evolution-  │           │ versions/   │
│ patterns.   │           │ registry.   │           │ snapshots   │
│ jsonl       │           │ json        │           │             │
└─────────────┘           └─────────────┘           └─────────────┘
```

### Edit Pattern Categories

The system tracks these common edit patterns:

| Pattern | Description | Detection |
|---------|-------------|-----------|
| `add_pagination` | User adds pagination to API responses | `limit.*offset`, `cursor.*pagination` |
| `add_rate_limiting` | User adds rate limiting | `rate.?limit`, `throttl` |
| `add_error_handling` | User adds try/catch blocks | `try.*catch`, `except` |
| `add_types` | User adds TypeScript/Python types | `interface\s`, `Optional` |
| `add_validation` | User adds input validation | `validate`, `Pydantic`, `Zod` |
| `add_logging` | User adds logging/observability | `logger\.`, `console.log` |
| `remove_comments` | User removes generated comments | Pattern removal detection |
| `add_auth_check` | User adds authentication checks | `@auth`, `@require_auth` |

### Suggestion Thresholds

| Threshold | Default | Description |
|-----------|---------|-------------|
| Minimum Samples | 5 | Uses before generating suggestions |
| Add Threshold | 70% | Frequency to suggest adding pattern |
| Auto-Apply Confidence | 85% | Confidence for auto-application |
| Rollback Trigger | -20% | Success rate drop to trigger rollback |

---

## Subcommand: Report (Default)

**Usage:** `/ork:skill-evolution`

Shows evolution report for all tracked skills.

### Implementation

```bash
# Run the evolution engine report
"${CLAUDE_PROJECT_DIR}/.claude/scripts/evolution-engine.sh" report
```

### Sample Output

```
Skill Evolution Report
══════════════════════════════════════════════════════════════

Skills Summary:
┌────────────────────────────┬─────────┬─────────┬───────────┬────────────┐
│ Skill                      │ Uses    │ Success │ Avg Edits │ Suggestions│
├────────────────────────────┼─────────┼─────────┼───────────┼────────────┤
│ api-design-framework       │     156 │     94% │       1.8 │          2 │
│ database-schema-designer   │      89 │     91% │       2.1 │          1 │
│ fastapi-patterns           │      67 │     88% │       2.4 │          3 │
└────────────────────────────┴─────────┴─────────┴───────────┴────────────┘

Summary:
  Skills tracked: 3
  Total uses: 312
  Overall success rate: 91%

Top Pending Suggestions:
1. 93% | api-design-framework | add add_pagination
2. 88% | api-design-framework | add add_rate_limiting
3. 85% | fastapi-patterns | add add_error_handling
```

---

## Subcommand: Analyze

**Usage:** `/ork:skill-evolution analyze <skill-id>`

Analyzes edit patterns for a specific skill.

### Implementation

```bash
# Run analysis for specific skill
"${CLAUDE_PROJECT_DIR}/.claude/scripts/evolution-engine.sh" analyze "$SKILL_ID"
```

### Sample Output

```
Skill Analysis: api-design-framework
────────────────────────────────────
Uses: 156 | Success: 94% | Avg Edits: 1.8

Edit Patterns Detected:
┌──────────────────────────┬─────────┬──────────┬────────────┐
│ Pattern                  │ Freq    │ Samples  │ Confidence │
├──────────────────────────┼─────────┼──────────┼────────────┤
│ add_pagination           │    85%  │ 132/156  │       0.93 │
│ add_rate_limiting        │    72%  │ 112/156  │       0.88 │
│ add_error_handling       │    45%  │  70/156  │       0.56 │
└──────────────────────────┴─────────┴──────────┴────────────┘

Pending Suggestions:
1. 93% conf: ADD add_pagination to template
2. 88% conf: ADD add_rate_limiting to template

Run `/ork:skill-evolution evolve api-design-framework` to review
```

---

## Subcommand: Evolve

**Usage:** `/ork:skill-evolution evolve <skill-id>`

Interactive review and application of improvement suggestions.

### Implementation

When this subcommand is invoked:

1. **Get Suggestions:**
```bash
SUGGESTIONS=$("${CLAUDE_PROJECT_DIR}/.claude/scripts/evolution-engine.sh" suggest "$SKILL_ID")
```

2. **For Each Suggestion, Present Interactive Options:**

Use `AskUserQuestion` to let the user decide on each suggestion:

```json
{
  "questions": [{
    "question": "Apply suggestion: ADD add_pagination to template? (93% confidence, 132/156 users add this)",
    "header": "Evolution",
    "options": [
      {"label": "Apply", "description": "Add this pattern to the skill template"},
      {"label": "Skip", "description": "Skip for now, ask again later"},
      {"label": "Reject", "description": "Never suggest this again"}
    ],
    "multiSelect": false
  }]
}
```

3. **On Apply:**
   - Create version snapshot first
   - Apply the suggestion to skill files
   - Update evolution registry

4. **On Reject:**
   - Mark suggestion as rejected in registry
   - Won't be suggested again

### Applying Suggestions

When a user accepts a suggestion, the implementation depends on the suggestion type:

**For `add` suggestions to templates:**
- Add the pattern to the skill's template files
- Update SKILL.md with new guidance

**For `add` suggestions to references:**
- Create new reference file in `references/` directory

**For `remove` suggestions:**
- Remove the identified content
- Archive in version snapshot first

---

## Subcommand: History

**Usage:** `/ork:skill-evolution history <skill-id>`

Shows version history with performance metrics.

### Implementation

```bash
# Run version manager list
"${CLAUDE_PROJECT_DIR}/.claude/scripts/version-manager.sh" list "$SKILL_ID"
```

### Sample Output

```
Version History: api-design-framework
══════════════════════════════════════════════════════════════

Current Version: 1.2.0

┌─────────┬────────────┬─────────┬───────┬───────────┬────────────────────────────┐
│ Version │ Date       │ Success │ Uses  │ Avg Edits │ Changelog                  │
├─────────┼────────────┼─────────┼───────┼───────────┼────────────────────────────┤
│ 1.2.0   │ 2026-01-14 │    94%  │   156 │       1.8 │ Added pagination pattern   │
│ 1.1.0   │ 2026-01-05 │    89%  │    80 │       2.3 │ Added error handling ref   │
│ 1.0.0   │ 2025-11-01 │    78%  │    45 │       3.2 │ Initial release            │
└─────────┴────────────┴─────────┴───────┴───────────┴────────────────────────────┘
```

---

## Subcommand: Rollback

**Usage:** `/ork:skill-evolution rollback <skill-id> <version>`

Restores a skill to a previous version.

### Implementation

1. **Confirm with User:**

Use `AskUserQuestion` for confirmation:

```json
{
  "questions": [{
    "question": "Rollback api-design-framework from 1.2.0 to 1.0.0? Current version will be backed up.",
    "header": "Rollback",
    "options": [
      {"label": "Confirm Rollback", "description": "Restore version 1.0.0"},
      {"label": "Cancel", "description": "Keep current version"}
    ],
    "multiSelect": false
  }]
}
```

2. **On Confirm:**
```bash
"${CLAUDE_PROJECT_DIR}/.claude/scripts/version-manager.sh" restore "$SKILL_ID" "$VERSION"
```

3. **Report Result:**
```
Restored api-design-framework to version 1.0.0
Previous version backed up to: versions/.backup-1.2.0-1736867234
```

---

## Data Files

| File | Purpose | Format |
|------|---------|--------|
| `.claude/feedback/edit-patterns.jsonl` | Raw edit pattern events | JSONL (append-only) |
| `.claude/feedback/evolution-registry.json` | Aggregated suggestions | JSON |
| `.claude/feedback/metrics.json` | Skill usage metrics | JSON |
| `skills/<cat>/<name>/versions/` | Version snapshots | Directory |
| `skills/<cat>/<name>/versions/manifest.json` | Version metadata | JSON |

---

## Auto-Evolution Safety

The system includes safety mechanisms:

1. **Version Snapshots**: Always created before changes
2. **Rollback Triggers**: Auto-alert if success rate drops >20%
3. **Human Review**: High-confidence suggestions require approval
4. **Rejection Memory**: Rejected suggestions aren't re-suggested

### Health Monitoring

The system monitors skill health and can trigger warnings:

```
WARNING: api-design-framework success rate dropped from 94% to 71%
Consider: /ork:skill-evolution rollback api-design-framework 1.1.0
```

---

## References

- [Evolution Analysis Methodology](references/evolution-analysis.md)
- [Version Management Guide](references/version-management.md)

---

## Related Skills

- `configure` - Configure OrchestKit settings
- `doctor` - Diagnose OrchestKit issues
- `feedback-dashboard` - View comprehensive feedback metrics

Overview

This skill evolves skill definitions automatically by observing usage, user edits, and success metrics. It proposes high-confidence template changes, creates safe version snapshots, and supports interactive apply or rollback flows. Use it to continuously improve skill quality while keeping full version traceability.

How this skill works

The system collects post-tool edits and usage metrics, analyzes recurring edit patterns, then generates suggestions to add, remove, or modify templates and references. Suggestions enter an interactive review flow where you can Apply, Skip, or Reject; accepted changes are snapshotted and applied, rejected suggestions are recorded to avoid repeat prompts. Rollback and automated alerts protect production stability when success rates drop.

When to use it

When patterns of repeated user edits indicate missing best practices in a skill template
Before applying bulk improvements across skills to ensure safe versioning
To monitor and act on declining skill success rates with rollback options
When onboarding new conventions (types, validation, logging) across multiple skills
To generate prioritized, confidence-scored suggestions for maintainers

Best practices

Require human review for suggestions below auto-apply confidence thresholds
Keep the evolution registry and metrics files under version control for auditability
Create snapshots before any automated change and verify tests after apply
Use rejection memory to reduce noise from low-value suggestions
Monitor rollback triggers and set sensible success-rate baselines per skill

Example use cases

Detect many users adding pagination to API examples and propose adding pagination by default to the skill template
Auto-suggest adding TypeScript types when a high percentage of edits insert interfaces or Optional types
Propose adding rate-limiting scaffolding when edits commonly add throttling checks
Roll back a recent template change after automated monitoring detects a >20% drop in success rate
Inspect a skill's version history and performance metrics before releasing a new major change

FAQ

How does the system decide when to suggest a change?

Suggestions are based on observed edit-pattern frequencies, minimum sample counts, and confidence thresholds; defaults require multiple samples and a high percentage of users performing the same edit.

Are changes applied automatically?

High-confidence changes may be auto-applied per policy, but most suggestions are presented for interactive review. Snapshots are always created before any change.

Can I undo an applied evolution?

Yes. The history command lists versions and a rollback command restores a previous snapshot while backing up the current version.