This skill optimizes AI sessions by masking noisy outputs, compacting state, and preserving key context to reduce latency and prevent lost context.

```shell
npx playbooks add skill hoangnguyen0403/agent-skills-standard --skill context-optimization
```
---
name: Context Optimization
description: Techniques to maximize context window efficiency, reduce latency, and prevent "lost in the middle" issues through strategic masking and compaction.
metadata:
  labels: [context, optimization, tokens, memory, performance]
  triggers:
    files: ['*.log', 'chat-history.json']
    keywords: [reduce tokens, optimize context, summarize history, clear output]
---
## **Priority: P1 (OPTIMIZATION)**
Manage the Attention Budget. Treat context as a scarce resource.
## 1. Observation Masking (Noise Reduction)
**Problem**: Large tool outputs (logs, JSON lists) flood context and degrade reasoning.
**Solution**: Replace raw output with semantic summaries _after_ consumption.
1. **Identify**: outputs longer than 50 lines or larger than 1 KB.
2. **Extract**: Read critical data points immediately.
3. **Mask**: Rewrite history to replace raw data with `[Reference: <summary_of_findings>]`.
4. **See**: `references/masking.md` for patterns.
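The steps above can be sketched as a small helper. This is a hypothetical example, not an API from any framework: the name `maskObservation`, the `Finding` shape, and the 50-line / 1 KB thresholds all come from the list above and should be adapted to your agent.

```typescript
// Illustrative sketch of observation masking; names and thresholds are assumptions.
interface Finding {
  label: string;
  value: string;
}

const LINE_LIMIT = 50; // "outputs > 50 lines"
const BYTE_LIMIT = 1024; // "or > 1 KB"

function maskObservation(raw: string, findings: Finding[]): string {
  const tooLarge =
    raw.split("\n").length > LINE_LIMIT ||
    new TextEncoder().encode(raw).length > BYTE_LIMIT;
  if (!tooLarge) return raw; // small outputs stay verbatim

  // Replace the raw payload with a semantic summary reference,
  // keeping only the data points already extracted from it.
  const summary = findings.map((f) => `${f.label}=${f.value}`).join(", ");
  return `[Reference: ${summary}]`;
}
```

The key design point is that masking happens *after* consumption: the critical data points are extracted into `findings` first, so rewriting history loses no information the agent still needs.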
## 2. Context Compaction (State Preservation)
**Problem**: Long conversations drift from original intent.
**Solution**: Recursive summarization that preserves _State_ over _Dialogue_.
1. **Trigger**: Every 10 turns or 8k tokens.
2. **Compact**:
- **Keep**: User Goal, Active Task, Current Errors, Key Decisions.
- **Drop**: Chit-chat, intermediate tool calls, corrected assumptions.
3. **Format**: Update `System Prompt` or `Memory File` with compacted state.
4. **See**: `references/compaction.md` for algorithms.
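The keep/drop split above can be sketched as a pure function over the turn history. This is an illustrative sketch, assuming a `Turn` record tagged by kind; `CompactState` and `compact` are hypothetical names, not part of any framework.

```typescript
// Illustrative sketch of context compaction; types and names are assumptions.
interface Turn {
  kind: "goal" | "task" | "error" | "decision" | "chitchat" | "tool-call";
  text: string;
}

interface CompactState {
  goal?: string;
  activeTask?: string;
  errors: string[];
  decisions: string[];
}

function compact(history: Turn[]): CompactState {
  const state: CompactState = { errors: [], decisions: [] };
  for (const turn of history) {
    switch (turn.kind) {
      case "goal":     state.goal = turn.text; break;         // keep: user goal
      case "task":     state.activeTask = turn.text; break;   // keep: active task
      case "error":    state.errors.push(turn.text); break;   // keep: current errors
      case "decision": state.decisions.push(turn.text); break; // keep: key decisions
      default: break; // drop: chit-chat and intermediate tool calls
    }
  }
  return state;
}
```

Because the result preserves *State* rather than *Dialogue*, it can be written into the system prompt or a memory file and the raw turns discarded.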
## 3. KV-Cache Awareness (Latency)
**Goal**: Maximize pre-fill cache hits.
- **Static Prefix**: strict ordering: System -> Tools -> RAG -> User.
- **Append-Only**: Avoid inserting into the middle of history if possible.
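A minimal sketch of cache-friendly prompt assembly, assuming a generic `Message` shape (the names here are illustrative, not a provider API): the static prefix is built in the strict System -> Tools -> RAG -> User order and never edited, so its tokens stay byte-identical across turns and can be served from the provider's pre-fill cache.

```typescript
// Illustrative sketch of KV-cache-friendly ordering; names are assumptions.
interface Message {
  role: string;
  content: string;
}

function buildPrompt(
  system: string,
  tools: string,
  rag: string,
  turns: Message[],
): Message[] {
  // Static prefix in strict order; reordering or editing it invalidates the cache.
  const prefix: Message[] = [
    { role: "system", content: system },
    { role: "system", content: tools },
    { role: "system", content: rag },
  ];
  // Append-only: new turns go at the end, so earlier tokens keep
  // their positions and their cache entries remain valid.
  return [...prefix, ...turns];
}
```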
## References
- [Observation Masking Patterns](references/masking.md)
- [Compaction Algorithms](references/compaction.md)
This skill teaches practical techniques to maximize context window efficiency, reduce latency, and prevent "lost in the middle" issues through strategic masking and compaction. It treats context as a scarce resource, preserving only the state necessary for accurate, low-latency reasoning. The methods are language- and framework-agnostic; the examples target TypeScript-based agents.
The skill inspects conversation history, tool outputs, and memory to identify noisy or redundant content, then replaces or summarizes it to keep the actionable state compact. It applies observation masking for large outputs, recursive context compaction every few turns or token thresholds, and enforces KV-cache friendly ordering to reduce retrieval latency. The result is a small, high-signal context that preserves goals, active tasks, errors, and decisions while discarding transient chatter.
How often should I trigger compaction?
Trigger compaction every 10 turns or when you reach roughly 8k tokens, whichever comes first.
What exactly should I keep vs drop during compaction?
Keep user goal, active task, current errors, and key decisions. Drop chit-chat, transient tool calls, and intermediate corrected assumptions.