
three-layer-agent-stack skill


This skill helps you design and implement three-layer agent stacks by coordinating model, API, and harness to deliver robust AI-powered workflows.

npx playbooks add skill coowoolf/insighthunt-skills --skill three-layer-agent-stack

---
name: three-layer-agent-stack
description: Use when building AI-powered products or agents, when raw model intelligence isn't enough to solve user problems, or when designing the architecture for agentic workflows
---

# The Three-Layer Agent Stack

## Overview

A framework for building effective AI agents by synchronizing innovation across **three distinct layers**: Model, API, and Harness. Success requires tight integration across the layers rather than treating the model as a black box.

**Core principle:** Features like "compaction", which lets an agent keep working on long-running tasks, require simultaneous changes across all three layers.

## The Stack

```
┌─────────────────────────────────────────────────────────────────┐
│  LAYER 3: HARNESS / PRODUCT LAYER                               │
│  ─────────────────────────────────────────────────────────────  │
│  The environment that executes actions and provides context     │
│  • VS Code / IDE integration                                    │
│  • Terminal / Shell access                                      │
│  • Sandbox / Secure execution environment                       │
├─────────────────────────────────────────────────────────────────┤
│  LAYER 2: API LAYER                                             │
│  ─────────────────────────────────────────────────────────────  │
│  Interface handling state, context windows, and orchestration   │
│  • Context management / Compaction                              │
│  • State handoff between sessions                               │
│  • Tool routing and formatting                                  │
├─────────────────────────────────────────────────────────────────┤
│  LAYER 1: MODEL LAYER                                           │
│  ─────────────────────────────────────────────────────────────  │
│  Foundation model providing reasoning and intelligence          │
│  • Code generation / Reasoning                                  │
│  • Summarization for compaction                                 │
│  • Environment-specific training                                │
└─────────────────────────────────────────────────────────────────┘
```
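
One way to make the layer boundaries concrete is to write each layer as its own interface. The following is a minimal Python sketch under stated assumptions, not how Codex is built: the `Model`, `Api`, and `Harness` names, their methods, and the stand-in classes are all hypothetical and exist only to mirror the diagram.

```python
from dataclasses import dataclass, field
from typing import Protocol


class Model(Protocol):
    """Layer 1: reasoning and generation (hypothetical interface)."""
    def complete(self, prompt: str) -> str: ...


class Harness(Protocol):
    """Layer 3: executes actions in an environment (hypothetical interface)."""
    def run(self, action: str) -> str: ...


@dataclass
class Api:
    """Layer 2: holds conversation state and routes between model and harness."""
    model: Model
    harness: Harness
    history: list[str] = field(default_factory=list)

    def step(self, user_input: str) -> str:
        # Record the request, ask the model for an action, then execute it.
        self.history.append(f"user: {user_input}")
        action = self.model.complete("\n".join(self.history))
        self.history.append(f"model: {action}")
        result = self.harness.run(action)
        self.history.append(f"harness: {result}")
        return result


class EchoModel:
    """Stand-in model: always proposes the same shell command."""
    def complete(self, prompt: str) -> str:
        return "echo hello"


class FakeHarness:
    """Stand-in harness: pretends to execute and never touches a real shell."""
    def run(self, action: str) -> str:
        return f"ran: {action}"
```

Because `Api` depends only on the two protocols, a change in one layer (for example, a model that emits a new action format) shows up immediately as a mismatch at the boundary instead of failing silently.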

## Key Principles

| Principle | Description |
|-----------|-------------|
| **Full-Stack Iteration** | Changes often need Model + API + Harness together |
| **Harness Specificity** | Models perform best when trained for specific environments |
| **Feedback Loops** | Product usage (Harness) must inform model training |
| **Safety Sandboxing** | Harness provides secure environment for code execution |

## Common Mistakes

- **Model-only optimization**: Changing model without adapting harness
- **Generic API assumptions**: Assuming generic API supports agentic behaviors
- **No feedback loop**: Harness doesn't feed back to model training

## Real-World Example

Implementing "compaction" so that Codex can run for 24 hours:
- **Model**: Must understand summarization
- **API**: Must handle the context handoff
- **Harness**: Must prepare and format the payload
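
The split above can be sketched as three small functions, one per layer. This is an illustrative sketch only: `summarize`, `compact`, `build_payload`, and the character budget are hypothetical names and simplifications. A real system would count tokens and call an actual model to write the summary.

```python
def summarize(messages: list[str]) -> str:
    """Model layer stand-in: a real model would write the summary.
    Here we keep the first 20 characters of each message."""
    return "SUMMARY: " + " | ".join(m[:20] for m in messages)


def compact(history: list[str], budget_chars: int, keep_recent: int = 2) -> list[str]:
    """API layer: when history exceeds the budget, replace older messages
    with a single summary and keep the most recent ones verbatim."""
    if sum(len(m) for m in history) <= budget_chars:
        return history
    old, recent = history[:-keep_recent], history[-keep_recent:]
    if not old:
        # Nothing older than the recent window, so there is nothing to summarize.
        return history
    return [summarize(old)] + recent


def build_payload(history: list[str], task: str) -> dict:
    """Harness layer: format the compacted context into a request payload."""
    return {"task": task, "context": history}
```

If any one function is missing, the feature fails: without `summarize` the history cannot shrink, without `compact` nothing triggers the handoff, and without `build_payload` the condensed context never reaches the next request.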

---

*Source: Alexander Embiricos (OpenAI Codex) via Lenny's Podcast*

Overview

This skill presents the Three-Layer Agent Stack framework for designing AI-powered products and agents. It shows how to align Model, API, and Harness layers so features and long-running behaviors work reliably. It emphasizes that improving agent capabilities requires coordinated changes across all layers, not model-only tweaks.

How this skill works

The skill divides responsibilities among three layers: the Model layer (reasoning, generation, and environment-specific behavior), the API layer (context, state handoff, and tool orchestration), and the Harness layer (execution environment, UI, and security). It guides teams to map a feature across those layers, identify gaps, and plan joint iterations. It also stresses feedback loops, so that runtime signals from the Harness inform model tuning and API changes.
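
The mapping step can be kept as plain data and checked mechanically. The sketch below is one possible shape, assuming a hypothetical `feature_map` dictionary and a hypothetical `find_gaps` helper. The entries are illustrative, and the second one deliberately leaves a layer unassigned to show how a gap is reported.

```python
LAYERS = ("model", "api", "harness")


def find_gaps(feature_map: dict[str, dict[str, str]]) -> dict[str, list[str]]:
    """Return, per feature, the layers that have no assigned responsibility."""
    gaps = {}
    for feature, owners in feature_map.items():
        missing = [layer for layer in LAYERS if not owners.get(layer)]
        if missing:
            gaps[feature] = missing
    return gaps


# Illustrative entries only; the second feature intentionally omits a layer.
feature_map = {
    "compaction": {
        "model": "summarize earlier context",
        "api": "hand off context between windows",
        "harness": "prepare and format the payload",
    },
    "session handoff": {
        "api": "serialize state",
        "harness": "restore context and permissions",
    },
}
```

Calling `find_gaps(feature_map)` on this data flags `"session handoff"` as missing a Model-layer owner, which is the kind of mismatch to resolve before planning a joint iteration.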

When to use it

  • Designing agentic workflows where the model must act in environments (edit files, run commands, use tools).
  • Building long-running or stateful features like compaction, session handoff, or background tasks.
  • Integrating models into products where safety, sandboxing, or execution context matter.
  • Planning iterative improvements that require coordinated changes across tooling, API, and model.
  • Troubleshooting failures that stem from mismatched assumptions between model, API, and harness.

Best practices

  • Map each feature to responsibilities in Model, API, and Harness before implementation.
  • Design APIs to explicitly handle state handoff, context compaction, and tool routing.
  • Make the Harness the source of runtime telemetry and user signals to close feedback loops with model training.
  • Create isolated sandboxes in the Harness for risky actions and enforce capability gating.
  • Iterate across all three layers when introducing new behaviors—avoid model-only fixes.
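
Capability gating and harness telemetry can be combined in one place. The following Python sketch assumes a hypothetical `GatedHarness` class with made-up capability names such as `read_file`. It does not execute anything and only marks where a real harness would dispatch to an isolated sandbox.

```python
from dataclasses import dataclass, field


class CapabilityError(PermissionError):
    """Raised when an action needs a capability the session was not granted."""


@dataclass
class GatedHarness:
    granted: set[str] = field(default_factory=set)
    telemetry: list[dict] = field(default_factory=list)

    def execute(self, action: str, needs: str) -> str:
        allowed = needs in self.granted
        # Record every attempt, allowed or not, so the harness can
        # supply runtime signals for later model and API changes.
        self.telemetry.append({"action": action, "needs": needs, "allowed": allowed})
        if not allowed:
            raise CapabilityError(f"{needs!r} not granted for {action!r}")
        # A real harness would dispatch to an isolated sandbox here.
        return f"executed {action}"
```

Logging denied attempts alongside successful ones is a deliberate choice in this sketch: refused actions are often the most useful signal about where the model's assumptions and the harness's rules disagree.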

Example use cases

  • Implementing compaction for a 24-hour agent: build summarization in the Model, context handoff in the API, and formatted payloads in the Harness.
  • Adding a code-execution feature: model learns environment-aware prompts, API routes outputs and logs, Harness provides secure execution and file access.
  • Session handoff between devices: API serializes state, Model condenses intent, Harness restores context and permissions.
  • Tool orchestration where multiple tools must be sequenced and audited across layers.
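
As a hedged illustration of the session-handoff case, the sketch below splits the work the same way: the Model condenses intent, the API serializes state, and the Harness restores context and permissions. All function names and the JSON layout are assumptions made for this example, and `condense_intent` is a trivial stand-in for a real model call.

```python
import json


def condense_intent(messages: list[str]) -> str:
    """Model layer stand-in: a real model would summarize the user's goal.
    Here we simply take the last message."""
    return messages[-1] if messages else ""


def serialize_state(messages: list[str], permissions: list[str]) -> str:
    """API layer: pack the condensed intent and permissions into JSON."""
    return json.dumps({
        "intent": condense_intent(messages),
        "permissions": sorted(permissions),
    })


def restore_session(blob: str, device_allows: set[str]) -> dict:
    """Harness layer: restore context, keeping only the permissions
    that the new device is allowed to grant."""
    state = json.loads(blob)
    state["permissions"] = [p for p in state["permissions"] if p in device_allows]
    return state
```

In this sketch the receiving Harness filters permissions itself instead of trusting the serialized state, so a handoff cannot widen what the agent is allowed to do on the new device.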

FAQ

Do I have to retrain the model for every product change?

No. Many product changes require API and Harness updates first. Retrain or fine-tune the model only when behaviors or environment-specific reasoning must improve.

How do I start applying this stack to an existing agent?

Inventory current responsibilities by layer, identify mismatches (e.g., the API lacks state handoff), and create a small pilot that implements a feature across all three layers to validate the approach.