
multi-agent-patterns skill

/plugins/ltk-core/skills/multi-agent-patterns

This skill helps design and reason about multi-agent systems using supervisor, swarm, and hierarchical patterns to improve coordination and context isolation.

npx playbooks add skill eyadsibai/ltk --skill multi-agent-patterns

SKILL.md
---
name: multi-agent-patterns
description: Use when designing multi-agent systems, implementing supervisor patterns, coordinating multiple agents, or asking about "multi-agent", "supervisor pattern", "swarm", "agent handoffs", "orchestration", "parallel agents"
version: 1.0.0
---

# Multi-Agent Architecture Patterns

Multi-agent architectures distribute work across multiple LLM instances, each with its own context window. The critical insight: sub-agents exist primarily to isolate context, not to mimic human role divisions.

## Why Multi-Agent?

**Context Bottleneck**: Single agents fill their context with conversation history, documents, and tool outputs. Performance degrades through the lost-in-the-middle effect and attention scarcity as the window fills.

**Token Economics**:

| Architecture | Token Multiplier |
|--------------|------------------|
| Single agent chat | 1× baseline |
| Single agent + tools | ~4× baseline |
| Multi-agent system | ~15× baseline |

**Parallelization**: Research tasks can search multiple sources simultaneously. Total time approaches longest subtask, not sum.
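The latency claim can be sketched with ordinary thread-based concurrency. `search_source` here is a hypothetical stand-in for a sub-agent's tool call, with a simulated delay instead of a real LLM request:

```python
import concurrent.futures
import time

def search_source(source: str) -> str:
    """Stand-in for one sub-agent searching one source (simulated latency)."""
    time.sleep(0.1)  # placeholder for an LLM or tool call
    return f"results from {source}"

sources = ["web", "docs", "internal-wiki"]

start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor() as pool:
    results = list(pool.map(search_source, sources))
elapsed = time.perf_counter() - start
# Wall-clock time tracks the slowest subtask (~0.1s),
# not the sum of all three run sequentially (~0.3s).
```

Real sub-agent calls are I/O-bound, so threads (or async) suffice; no process pool is needed.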

## Architectural Patterns

### Pattern 1: Supervisor/Orchestrator

```
User Query -> Supervisor -> [Specialist, Specialist] -> Aggregation -> Output
```

**Use when**: Clear decomposition, coordination needed, human oversight important.

**The Telephone Game Problem**: When supervisors relay sub-agent responses, they paraphrase and often introduce errors.

**Fix**: `forward_message` tool lets sub-agents respond directly:

```python
def forward_message(message: str, to_user: bool = True):
    """Forward a sub-agent response verbatim instead of paraphrasing it."""
    if to_user:
        return {"type": "direct_response", "content": message}
    return {"type": "internal", "content": message}  # stays within the supervisor loop
```

### Pattern 2: Peer-to-Peer/Swarm

```python
def transfer_to_agent_b():
    """Returning another agent signals a handoff (Swarm-style convention)."""
    return agent_b  # agent_b and the Agent class are assumed to be defined elsewhere

agent_a = Agent(name="Agent A", functions=[transfer_to_agent_b])
```

**Use when**: Flexible exploration, rigid planning counterproductive, emergent requirements.
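A minimal, self-contained sketch of the dispatch loop behind this convention: any function that returns an `Agent` triggers a handoff. The `Agent` dataclass and `run_with_handoffs` below are illustrative assumptions, not a real Swarm API:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Agent:
    name: str
    functions: list = field(default_factory=list)

def run_with_handoffs(agent: Agent, max_hops: int = 5) -> Agent:
    """Follow handoff returns until no function hands off (bounded by max_hops)."""
    for _ in range(max_hops):
        handoff: Optional[Agent] = None
        for fn in agent.functions:
            result = fn()
            if isinstance(result, Agent):  # returning an Agent is a handoff
                handoff = result
                break
        if handoff is None:
            return agent
        agent = handoff
    return agent

agent_b = Agent(name="Agent B")
agent_a = Agent(name="Agent A", functions=[lambda: agent_b])
final = run_with_handoffs(agent_a)
# control ends up with Agent B
```

Note the `max_hops` bound: even a sketch of peer-to-peer handoffs should cap chain length to avoid handoff loops.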

### Pattern 3: Hierarchical

```
Strategy Layer -> Planning Layer -> Execution Layer
```

**Use when**: Large-scale projects, enterprise workflows, clear separation of concerns.
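One way to read the three layers is as chained decomposition: strategy expands a goal into objectives, planning expands objectives into tasks, execution performs tasks. The functions below are hypothetical stand-ins for per-layer agent calls:

```python
def strategy_layer(goal: str) -> list[str]:
    """Break a goal into high-level objectives (stand-in for an agent call)."""
    return [f"objective: {goal} part {i}" for i in (1, 2)]

def planning_layer(objective: str) -> list[str]:
    """Turn one objective into concrete tasks."""
    return [f"task for {objective}"]

def execution_layer(task: str) -> str:
    """Carry out one task."""
    return f"done: {task}"

goal = "quarterly report"
results = [
    execution_layer(task)
    for objective in strategy_layer(goal)
    for task in planning_layer(objective)
]
# two objectives, one task each -> two completed results
```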

## Context Isolation

Primary purpose of multi-agent: context isolation.

**Mechanisms**:

- **Full context delegation**: Complex tasks needing full understanding
- **Instruction passing**: Simple, well-defined subtasks
- **File system memory**: Shared state without context bloat
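The file-system-memory mechanism can be sketched with a shared scratch directory: each agent writes a small JSON note, and a downstream agent loads only the notes it needs rather than the full transcripts. The helper names here are illustrative:

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def write_note(workdir: Path, agent: str, payload: dict) -> Path:
    """Persist one sub-agent's findings to shared scratch space."""
    path = workdir / f"{agent}.json"
    path.write_text(json.dumps(payload))
    return path

def read_notes(workdir: Path) -> dict:
    """A later agent loads compact notes, not full transcripts."""
    return {p.stem: json.loads(p.read_text()) for p in workdir.glob("*.json")}

with TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    write_note(workdir, "researcher", {"finding": "3 relevant papers"})
    write_note(workdir, "analyzer", {"finding": "trend is upward"})
    notes = read_notes(workdir)
# both findings are available without either agent seeing the other's context
```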

## Consensus and Coordination

**Weighted Voting**: Weight by confidence or expertise.
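A minimal weighted-vote sketch: each agent submits an answer with a confidence weight, and the answer with the largest total weight wins:

```python
from collections import defaultdict

def weighted_vote(votes: list[tuple[str, float]]) -> str:
    """Pick the answer with the largest total confidence weight."""
    totals: dict[str, float] = defaultdict(float)
    for answer, weight in votes:
        totals[answer] += weight
    return max(totals, key=totals.get)

winner = weighted_vote([("A", 0.9), ("B", 0.6), ("B", 0.5)])
# "B" wins with 1.1 total weight vs 0.9 for "A",
# even though "A" had the single most confident voter
```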

**Debate Protocols**: Agents critique each other's outputs. Adversarial critique often yields higher accuracy than collaborative consensus.

**Trigger-Based Intervention**:

- Stall triggers: detect that no progress has been made across recent steps
- Sycophancy triggers: detect agents echoing one another without independent reasoning
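A stall trigger can be as simple as comparing recent states: if the last few agent outputs are identical, intervene. This window-based check is one possible implementation, not a standard API:

```python
def stalled(history: list[str], window: int = 3) -> bool:
    """Stall trigger: the last `window` states are identical -> no progress."""
    if len(history) < window:
        return False
    return len(set(history[-window:])) == 1
```

In practice the comparison might use hashes or embeddings of outputs rather than exact string equality.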

## Failure Modes

| Failure | Mitigation |
|---------|------------|
| Supervisor Bottleneck | Output schema constraints, checkpointing |
| Coordination Overhead | Clear handoff protocols, batch results |
| Divergence | Objective boundaries, convergence checks |
| Error Propagation | Output validation, retry with circuit breakers |
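The last row, retry with a circuit breaker, can be sketched as a bounded retry wrapper: transient failures are retried, but a persistently failing agent trips the breaker instead of propagating bad output downstream. The names here are illustrative:

```python
class CircuitOpen(Exception):
    """Raised when repeated failures trip the breaker."""

def call_with_breaker(fn, max_failures: int = 3):
    """Retry a flaky agent call, but stop after max_failures errors."""
    failures = 0
    while failures < max_failures:
        try:
            return fn()
        except Exception:
            failures += 1
    raise CircuitOpen(f"giving up after {failures} failures")

attempts = {"n": 0}
def flaky():
    """Simulated sub-agent that fails twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

result = call_with_breaker(flaky)
# succeeds on the third attempt; a persistently failing agent raises CircuitOpen
```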

## Example: Research Team

```text
Supervisor
├── Researcher (web search, document retrieval)
├── Analyzer (data analysis, statistics)
├── Fact-checker (verification, validation)
└── Writer (report generation)
```

## Best Practices

1. Design for context isolation as primary benefit
2. Choose pattern based on coordination needs, not org metaphor
3. Implement explicit handoff protocols with state passing
4. Use weighted voting or debate for consensus
5. Monitor for supervisor bottlenecks
6. Validate outputs before passing between agents
7. Set time-to-live limits to prevent infinite loops
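Practice 7 can be sketched as a countdown around the agent loop: each iteration decrements a time-to-live budget, and the loop exits when the agent signals completion or the budget runs out. This is one simple shape for such a guard:

```python
def run_loop(step, ttl: int = 10):
    """Bound an agent loop with a time-to-live counter to prevent infinite runs."""
    for _ in range(ttl):
        result = step()
        if result is not None:  # the agent signalled completion
            return result
    return None  # TTL expired without convergence

counter = {"n": 0}
def step():
    """Simulated agent step that converges on the fourth iteration."""
    counter["n"] += 1
    return "answer" if counter["n"] == 4 else None

outcome = run_loop(step)
# converges in 4 steps, well inside the 10-step budget
```

The same pattern works with wall-clock deadlines or token budgets in place of an iteration count.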

Overview

This skill helps design and evaluate multi-agent system architectures and supervisor patterns for coordinating multiple LLM instances. It focuses on context isolation, handoff protocols, consensus methods, and failure-mode mitigations to improve reliability and scalability. Use it to choose patterns, define handoffs, and implement orchestration safeguards.

How this skill works

The skill inspects task decomposition needs and recommends one of three primary patterns: supervisor/orchestrator, peer-to-peer/swarm, or hierarchical layering. It evaluates context isolation strategies (instruction passing, full delegation, shared file memory), consensus mechanisms (weighted voting, debate), and common failure modes. It produces concrete protocol recommendations: forward_message-style direct responses, explicit handoff functions, output schemas, and circuit-breaker checkpoints.

When to use it

  • Designing systems that coordinate multiple LLM agents to avoid single-context bottlenecks
  • Implementing supervisor or orchestrator logic with human oversight or aggregation needs
  • Creating peer-to-peer or swarm setups for exploration and parallel search
  • Building hierarchical stacks for large-scale projects with strategy, planning, and execution layers
  • Defining handoff protocols, state passing, and output validation between agents

Best practices

  • Treat context isolation as the primary advantage; minimize shared context growth
  • Choose the pattern based on coordination requirements, not org metaphors
  • Implement explicit handoff functions that pass state and preserve provenance
  • Use output schemas and checkpoints to prevent supervisor bottlenecks and telephone-game errors
  • Apply weighted voting or adversarial debate for consensus and confidence weighting
  • Enforce time-to-live limits and circuit breakers to avoid infinite loops and error propagation

Example use cases

  • Research pipeline: supervisor delegates search, analysis, fact-checking, and writing, then aggregates results
  • Parallel sourcing: swarm of specialist agents probe different databases simultaneously to reduce latency
  • Enterprise workflow: hierarchical layers translate strategy into concrete execution plans and automated tasks
  • Resilient QA: multiple validators run adversarial critique and weighted voting before final publication
  • Tool-enabled handoffs: sub-agent returns direct_response objects to avoid coordinator mistranslation

FAQ

How do I prevent the supervisor from misrepresenting sub-agent outputs?

Use a forward_message-style direct response or structured output schema so sub-agents can return canonical responses that bypass paraphrase steps.

When should I use swarm vs. supervisor patterns?

Use swarm for flexible exploration and emergent requirements; use supervisor when decomposition, coordination, and human oversight are necessary.

How do I detect and stop runaway agent loops?

Set explicit time-to-live counters and circuit breakers, implement stall triggers for no-progress detection, and validate outputs at handoff points.