home / skills / vibery-studio / templates / context-engineering

context-engineering skill

not checked

npx playbooks add skill vibery-studio/templates --skill context-engineering

Review the files below or copy the command above to add this skill to your agents.

Files (12)

SKILL.md

4.0 KB

---
name: context-engineering
description: Master context engineering for AI agents - token optimization, degradation patterns, compression, memory systems, multi-agent coordination, evaluation. Use when designing agents, debugging context failures, or building LLM pipelines.
version: 1.0.0
---

# Context Engineering

Context engineering curates the smallest high-signal token set for LLM tasks. The goal: maximize reasoning quality while minimizing token usage.

## When to Activate

- Designing/debugging agent systems
- Context limits constrain performance
- Optimizing cost/latency
- Building multi-agent coordination
- Implementing memory systems
- Evaluating agent performance
- Developing LLM-powered pipelines

## Core Principles

1. **Context quality > quantity** - High-signal tokens beat exhaustive content
2. **Attention is finite** - U-shaped curve favors beginning/end positions
3. **Progressive disclosure** - Load information just-in-time
4. **Isolation prevents degradation** - Partition work across sub-agents
5. **Measure before optimizing** - Know your baseline

## Quick Reference

| Topic            | When to Use                                        | Reference                                                       |
| ---------------- | -------------------------------------------------- | --------------------------------------------------------------- |
| **Fundamentals** | Understanding context anatomy, attention mechanics | [context-fundamentals.md](./references/context-fundamentals.md) |
| **Degradation**  | Debugging failures, lost-in-middle, poisoning      | [context-degradation.md](./references/context-degradation.md)   |
| **Optimization** | Compaction, masking, caching, partitioning         | [context-optimization.md](./references/context-optimization.md) |
| **Compression**  | Long sessions, summarization strategies            | [context-compression.md](./references/context-compression.md)   |
| **Memory**       | Cross-session persistence, knowledge graphs        | [memory-systems.md](./references/memory-systems.md)             |
| **Multi-Agent**  | Coordination patterns, context isolation           | [multi-agent-patterns.md](./references/multi-agent-patterns.md) |
| **Evaluation**   | Testing agents, LLM-as-Judge, metrics              | [evaluation.md](./references/evaluation.md)                     |
| **Tool Design**  | Tool consolidation, description engineering        | [tool-design.md](./references/tool-design.md)                   |
| **Pipelines**    | Project development, batch processing              | [project-development.md](./references/project-development.md)   |

## Key Metrics

- **Token utilization**: Warning at 70%, trigger optimization at 80%
- **Token variance**: Explains 80% of agent performance variance
- **Multi-agent cost**: ~15x single agent baseline
- **Compaction target**: 50-70% reduction, <5% quality loss
- **Cache hit target**: 70%+ for stable workloads

## Four-Bucket Strategy

1. **Write**: Save context externally (scratchpads, files)
2. **Select**: Pull only relevant context (retrieval, filtering)
3. **Compress**: Reduce tokens while preserving info (summarization)
4. **Isolate**: Split across sub-agents (partitioning)

## Anti-Patterns

- Exhaustive context over curated context
- Critical info in middle positions
- No compaction triggers before limits
- Single agent for parallelizable tasks
- Tools without clear descriptions

## Guidelines

1. Place critical info at beginning/end of context
2. Implement compaction at 70-80% utilization
3. Use sub-agents for context isolation, not role-play
4. Design tools with 4-question framework (what, when, inputs, returns)
5. Optimize for tokens-per-task, not tokens-per-request
6. Validate with probe-based evaluation
7. Monitor KV-cache hit rates in production
8. Start minimal, add complexity only when proven necessary

## Scripts

- [context_analyzer.py](./scripts/context_analyzer.py) - Context health analysis, degradation detection
- [compression_evaluator.py](./scripts/compression_evaluator.py) - Compression quality evaluation