three-layer-agent-stack skill

This skill helps you design and implement three-layer agent stacks by coordinating model, API, and harness to deliver robust AI-powered workflows.

npx playbooks add skill coowoolf/insighthunt-skills --skill three-layer-agent-stack

SKILL.md

---
name: three-layer-agent-stack
description: Use when building AI-powered products or agents, when raw model intelligence isn't enough to solve user problems, or when designing the architecture for agentic workflows
---

# The Three-Layer Agent Stack

## Overview

A framework for building effective AI agents by synchronizing innovation across **three distinct layers**: Model, API, and Harness. Success requires tight integration across the layers rather than treating the model as a black box.

**Core principle:** Features like "compaction" (condensing context so an agent can keep working on long-running tasks) require simultaneous changes across all three layers.

## The Stack

```
┌─────────────────────────────────────────────────────────────────┐
│  LAYER 3: HARNESS / PRODUCT LAYER                               │
│  ─────────────────────────────────────────────────────────────  │
│  The environment that executes actions and provides context     │
│  • VS Code / IDE integration                                    │
│  • Terminal / Shell access                                      │
│  • Sandbox / Secure execution environment                       │
├─────────────────────────────────────────────────────────────────┤
│  LAYER 2: API LAYER                                             │
│  ─────────────────────────────────────────────────────────────  │
│  Interface handling state, context windows, and orchestration   │
│  • Context management / Compaction                              │
│  • State handoff between sessions                               │
│  • Tool routing and formatting                                  │
├─────────────────────────────────────────────────────────────────┤
│  LAYER 1: MODEL LAYER                                           │
│  ─────────────────────────────────────────────────────────────  │
│  Foundation model providing reasoning and intelligence          │
│  • Code generation / Reasoning                                  │
│  • Summarization for compaction                                 │
│  • Environment-specific training                                │
└─────────────────────────────────────────────────────────────────┘
```
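
As a rough illustration of how the three layers in the diagram could be wired together, here is a minimal Python sketch. All of the names (`Model`, `Api`, `Harness`, `ScriptedModel`, the `"done"` sentinel) are hypothetical and chosen for this example only. They are not taken from Codex or any real SDK, and the scripted model simply replays a fixed list of actions in place of a real foundation model.

```python
from dataclasses import dataclass, field
from typing import Callable, Protocol


class Model(Protocol):
    """Layer 1 (hypothetical interface): picks the next action from context."""

    def next_action(self, context: list[str]) -> str: ...


@dataclass
class Api:
    """Layer 2 (hypothetical): owns the context and talks to the model."""

    model: Model
    context: list[str] = field(default_factory=list)

    def step(self, observation: str) -> str:
        # Record what the harness observed, then ask the model what to do next.
        self.context.append(observation)
        return self.model.next_action(self.context)


@dataclass
class Harness:
    """Layer 3 (hypothetical): executes actions in an environment."""

    api: Api
    tools: dict[str, Callable[[], str]]

    def run(self, task: str, max_steps: int = 5) -> list[str]:
        transcript: list[str] = []
        observation = task
        for _ in range(max_steps):
            action = self.api.step(observation)
            if action == "done":
                break
            observation = self.tools[action]()  # run the requested tool
            transcript.append(f"{action} -> {observation}")
        return transcript


class ScriptedModel:
    """Stand-in for a real model: replays a fixed list of actions."""

    def __init__(self, actions: list[str]) -> None:
        self._actions = iter(actions)

    def next_action(self, context: list[str]) -> str:
        return next(self._actions, "done")


api = Api(ScriptedModel(["list_files", "run_tests"]))
harness = Harness(api, tools={"list_files": lambda: "main.py",
                              "run_tests": lambda: "2 passed"})
print(harness.run("fix the failing test"))
```

The point of the sketch is the direction of the calls: the harness never talks to the model directly, so any change in how context is stored or trimmed stays inside the API layer.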

## Key Principles

| Principle | Description |
|-----------|-------------|
| **Full-Stack Iteration** | Changes often need Model + API + Harness together |
| **Harness Specificity** | Models perform best when trained for specific environments |
| **Feedback Loops** | Product usage (Harness) must inform model training |
| **Safety Sandboxing** | Harness provides secure environment for code execution |

## Common Mistakes

- **Model-only optimization**: Changing the model without adapting the API or harness
- **Generic API assumptions**: Assuming a generic API supports agentic behaviors
- **No feedback loop**: The harness doesn't feed usage signals back into model training

## Real-World Example

Implementing "Compaction" to allow Codex to run for 24 hours:
- **Model**: Must understand summarization
- **API**: Must handle the context handoff
- **Harness**: Must prepare and format the payload

---

*Source: Alexander Embiricos (OpenAI Codex) via Lenny's Podcast*

Overview

This skill presents the Three-Layer Agent Stack framework for designing AI-powered products and agents. It shows how to align Model, API, and Harness layers so features and long-running behaviors work reliably. It emphasizes that improving agent capabilities requires coordinated changes across all layers, not model-only tweaks.

How this skill works

The skill inspects and organizes responsibilities into three layers: the Model layer (reasoning, generation, and environment-specific behavior), the API layer (context, state handoff, and tool orchestration), and the Harness layer (execution environment, UI, and security). It guides teams to map a feature across those layers, identify gaps, and plan joint iterations. It highlights feedback loops so runtime signals from the Harness inform model tuning and API changes.
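
One lightweight way to do the mapping described above is to list what a feature needs from each layer, list what each layer provides today, and take the difference. The sketch below assumes responsibilities can be tracked as plain strings. `LAYERS` and `find_gaps` are hypothetical helper names, not part of the skill.

```python
LAYERS = ("model", "api", "harness")


def find_gaps(needs: dict[str, set[str]],
              current: dict[str, set[str]]) -> dict[str, set[str]]:
    """Return, per layer, the responsibilities a feature needs but lacks."""
    gaps: dict[str, set[str]] = {}
    for layer in LAYERS:
        missing = needs.get(layer, set()) - current.get(layer, set())
        if missing:
            gaps[layer] = missing
    return gaps


compaction_needs = {
    "model": {"summarization"},
    "api": {"context handoff"},
    "harness": {"payload formatting"},
}
current_stack = {
    "model": {"summarization", "code generation"},
    "api": {"tool routing"},
    "harness": {"payload formatting"},
}
print(find_gaps(compaction_needs, current_stack))  # {'api': {'context handoff'}}
```

In this made-up inventory, the model and harness are ready for compaction but the API is not, which is the kind of mismatch a joint iteration would target first.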

When to use it

- Designing agentic workflows where the model must act in environments (edit files, run commands, use tools).
- Building long-running or stateful features like compaction, session handoff, or background tasks.
- Integrating models into products where safety, sandboxing, or execution context matter.
- Planning iterative improvements that require coordinated changes across tooling, API, and model.
- Troubleshooting failures that stem from mismatched assumptions between model, API, and harness.

Best practices

- Map each feature to responsibilities in Model, API, and Harness before implementation.
- Design APIs to explicitly handle state handoff, context compaction, and tool routing.
- Make the Harness the source of runtime telemetry and user signals to close feedback loops with model training.
- Create isolated sandboxes in the Harness for risky actions and enforce capability gating.
- Iterate across all three layers when introducing new behaviors, and avoid model-only fixes.
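
Capability gating and telemetry from the practices above can be sketched together. The `GatedHarness` class is hypothetical, and the gate here is only an allowlist check: it does not provide real isolation, which would need an actual sandbox (container, VM, or similar) around the action.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class GatedHarness:
    """Hypothetical harness: allowlist gate plus a telemetry event log."""

    allowed: set[str]
    events: list[dict] = field(default_factory=list)

    def execute(self, capability: str,
                action: Callable[[], str]) -> Optional[str]:
        permitted = capability in self.allowed
        # Log every attempt, including denials, so the signal can later
        # inform API changes and model training.
        self.events.append({"capability": capability, "permitted": permitted})
        if not permitted:
            return None
        return action()


harness = GatedHarness(allowed={"read_file"})
print(harness.execute("read_file", lambda: "file contents"))
print(harness.execute("delete_file", lambda: "deleted"))  # denied, prints None
print(harness.events)
```

Logging denied attempts is a deliberate choice in this sketch: a model that keeps requesting a blocked capability is exactly the kind of runtime signal a feedback loop is meant to surface.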

Example use cases

- Implementing compaction for a 24-hour agent: build summarization in the Model, context handoff in the API, and formatted payloads in the Harness.
- Adding a code-execution feature: the Model is tuned for the execution environment, the API routes outputs and logs, and the Harness provides secure execution and file access.
- Session handoff between devices: the API serializes state, the Model condenses intent, and the Harness restores context and permissions.
- Tool orchestration where multiple tools must be sequenced and audited across layers.
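
The session-handoff case can be illustrated with a small round trip. The `SessionState` fields, the JSON format, and the rule that permissions are re-checked against what the receiving device grants are all assumptions for this sketch. The source does not specify any of them.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class SessionState:
    """Hypothetical handoff record passed between devices."""

    session_id: str
    summary: str            # condensed intent, produced by the Model layer
    open_files: list[str]
    permissions: list[str]


def serialize(state: SessionState) -> str:
    """API layer: turn the session into a transportable string."""
    return json.dumps(asdict(state), sort_keys=True)


def restore(payload: str, granted_on_device: set[str]) -> SessionState:
    """Harness layer: rebuild the session, keeping only permissions
    that the receiving device actually grants."""
    data = json.loads(payload)
    data["permissions"] = [p for p in data["permissions"]
                           if p in granted_on_device]
    return SessionState(**data)


laptop = SessionState("s-1", "Refactor the auth module", ["auth.py"],
                      ["read", "write", "shell"])
phone = restore(serialize(laptop), granted_on_device={"read"})
print(phone.permissions)  # ['read']
```

Filtering permissions on restore, rather than trusting the serialized list, keeps the security decision in the Harness of the device that will actually execute actions.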

FAQ

Do I have to retrain the model for every product change?

No. Many product changes require API and Harness updates first. Retrain or fine-tune the model only when behaviors or environment-specific reasoning must improve.

How do I start applying this stack to an existing agent?

Inventory current responsibilities by layer, identify mismatches (e.g., the API lacks state handoff), and create a small pilot that implements a feature across all three layers to validate the approach.