
multi-agent-patterns skill

/plugins/ltk-core/skills/multi-agent-patterns

This skill helps design and reason about multi-agent systems using supervisor, swarm, and hierarchical patterns to improve coordination and context isolation.

npx playbooks add skill eyadsibai/ltk --skill multi-agent-patterns

SKILL.md
---
name: multi-agent-patterns
description: Use when designing multi-agent systems, implementing supervisor patterns, coordinating multiple agents, or asking about "multi-agent", "supervisor pattern", "swarm", "agent handoffs", "orchestration", "parallel agents"
version: 1.0.0
---

# Multi-Agent Architecture Patterns

Multi-agent architectures distribute work across multiple LLM instances, each with its own context window. The critical insight: sub-agents exist primarily to isolate context, not to mimic human role divisions.

## Why Multi-Agent?

**Context Bottleneck**: Single agents fill their context with conversation history, documents, and tool outputs. Performance degrades through the lost-in-the-middle effect and attention scarcity as the window fills.

**Token Economics**:

| Architecture | Token Multiplier |
|--------------|------------------|
| Single agent chat | 1× baseline |
| Single agent + tools | ~4× baseline |
| Multi-agent system | ~15× baseline |

**Parallelization**: Research tasks can search multiple sources simultaneously. Total time approaches longest subtask, not sum.
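The latency claim can be sketched with ordinary thread-based concurrency. `search_source` here is a hypothetical stand-in for a sub-agent's tool call, with a simulated delay instead of a real LLM request:

```python
import concurrent.futures
import time

def search_source(source: str) -> str:
    """Stand-in for one sub-agent searching one source (simulated latency)."""
    time.sleep(0.1)  # placeholder for an LLM or tool call
    return f"results from {source}"

sources = ["web", "docs", "internal-wiki"]

start = time.perf_counter()
with concurrent.futures.ThreadPoolExecutor() as pool:
    results = list(pool.map(search_source, sources))
elapsed = time.perf_counter() - start
# Wall-clock time tracks the slowest subtask (~0.1s),
# not the sum of all three run sequentially (~0.3s).
```

Real sub-agent calls are I/O-bound, so threads (or async) suffice; no process pool is needed.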

## Architectural Patterns

### Pattern 1: Supervisor/Orchestrator

```
User Query -> Supervisor -> [Specialist, Specialist] -> Aggregation -> Output
```

**Use when**: Clear decomposition, coordination needed, human oversight important.

**The Telephone Game Problem**: When supervisors relay sub-agent responses, they paraphrase and often introduce errors.

**Fix**: `forward_message` tool lets sub-agents respond directly:

```python
def forward_message(message: str, to_user: bool = True):
    """Forward a sub-agent response verbatim instead of paraphrasing it."""
    if to_user:
        return {"type": "direct_response", "content": message}
    return {"type": "internal", "content": message}  # stays within the supervisor loop
```

### Pattern 2: Peer-to-Peer/Swarm

```python
def transfer_to_agent_b():
    """Returning another agent signals a handoff (Swarm-style convention)."""
    return agent_b  # agent_b and the Agent class are assumed to be defined elsewhere

agent_a = Agent(name="Agent A", functions=[transfer_to_agent_b])
```

**Use when**: Flexible exploration, rigid planning counterproductive, emergent requirements.
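A minimal, self-contained sketch of the dispatch loop behind this convention: any function that returns an `Agent` triggers a handoff. The `Agent` dataclass and `run_with_handoffs` below are illustrative assumptions, not a real Swarm API:

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Agent:
    name: str
    functions: list = field(default_factory=list)

def run_with_handoffs(agent: Agent, max_hops: int = 5) -> Agent:
    """Follow handoff returns until no function hands off (bounded by max_hops)."""
    for _ in range(max_hops):
        handoff: Optional[Agent] = None
        for fn in agent.functions:
            result = fn()
            if isinstance(result, Agent):  # returning an Agent is a handoff
                handoff = result
                break
        if handoff is None:
            return agent
        agent = handoff
    return agent

agent_b = Agent(name="Agent B")
agent_a = Agent(name="Agent A", functions=[lambda: agent_b])
final = run_with_handoffs(agent_a)
# control ends up with Agent B
```

Note the `max_hops` bound: even a sketch of peer-to-peer handoffs should cap chain length to avoid handoff loops.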

### Pattern 3: Hierarchical

```
Strategy Layer -> Planning Layer -> Execution Layer
```

**Use when**: Large-scale projects, enterprise workflows, clear separation of concerns.
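One way to read the three layers is as chained decomposition: strategy expands a goal into objectives, planning expands objectives into tasks, execution performs tasks. The functions below are hypothetical stand-ins for per-layer agent calls:

```python
def strategy_layer(goal: str) -> list[str]:
    """Break a goal into high-level objectives (stand-in for an agent call)."""
    return [f"objective: {goal} part {i}" for i in (1, 2)]

def planning_layer(objective: str) -> list[str]:
    """Turn one objective into concrete tasks."""
    return [f"task for {objective}"]

def execution_layer(task: str) -> str:
    """Carry out one task."""
    return f"done: {task}"

goal = "quarterly report"
results = [
    execution_layer(task)
    for objective in strategy_layer(goal)
    for task in planning_layer(objective)
]
# two objectives, one task each -> two completed results
```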

## Context Isolation

Primary purpose of multi-agent: context isolation.

**Mechanisms**:

- **Full context delegation**: Complex tasks needing full understanding
- **Instruction passing**: Simple, well-defined subtasks
- **File system memory**: Shared state without context bloat
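The file-system-memory mechanism can be sketched with a shared scratch directory: each agent writes a small JSON note, and a downstream agent loads only the notes it needs rather than the full transcripts. The helper names here are illustrative:

```python
import json
from pathlib import Path
from tempfile import TemporaryDirectory

def write_note(workdir: Path, agent: str, payload: dict) -> Path:
    """Persist one sub-agent's findings to shared scratch space."""
    path = workdir / f"{agent}.json"
    path.write_text(json.dumps(payload))
    return path

def read_notes(workdir: Path) -> dict:
    """A later agent loads compact notes, not full transcripts."""
    return {p.stem: json.loads(p.read_text()) for p in workdir.glob("*.json")}

with TemporaryDirectory() as tmp:
    workdir = Path(tmp)
    write_note(workdir, "researcher", {"finding": "3 relevant papers"})
    write_note(workdir, "analyzer", {"finding": "trend is upward"})
    notes = read_notes(workdir)
# both findings are available without either agent seeing the other's context
```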

## Consensus and Coordination

**Weighted Voting**: Weight by confidence or expertise.
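A minimal weighted-vote sketch: each agent submits an answer with a confidence weight, and the answer with the largest total weight wins:

```python
from collections import defaultdict

def weighted_vote(votes: list[tuple[str, float]]) -> str:
    """Pick the answer with the largest total confidence weight."""
    totals: dict[str, float] = defaultdict(float)
    for answer, weight in votes:
        totals[answer] += weight
    return max(totals, key=totals.get)

winner = weighted_vote([("A", 0.9), ("B", 0.6), ("B", 0.5)])
# "B" wins with 1.1 total weight vs 0.9 for "A",
# even though "A" had the single most confident voter
```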

**Debate Protocols**: Agents critique each other's outputs. Adversarial critique often yields higher accuracy than collaborative consensus.

**Trigger-Based Intervention**:

- Stall triggers: detect that no progress has been made across recent steps
- Sycophancy triggers: detect agents echoing one another without independent reasoning
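A stall trigger can be as simple as comparing recent states: if the last few agent outputs are identical, intervene. This window-based check is one possible implementation, not a standard API:

```python
def stalled(history: list[str], window: int = 3) -> bool:
    """Stall trigger: the last `window` states are identical -> no progress."""
    if len(history) < window:
        return False
    return len(set(history[-window:])) == 1
```

In practice the comparison might use hashes or embeddings of outputs rather than exact string equality.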

## Failure Modes

| Failure | Mitigation |
|---------|------------|
| Supervisor Bottleneck | Output schema constraints, checkpointing |
| Coordination Overhead | Clear handoff protocols, batch results |
| Divergence | Objective boundaries, convergence checks |
| Error Propagation | Output validation, retry with circuit breakers |
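The last row, retry with a circuit breaker, can be sketched as a bounded retry wrapper: transient failures are retried, but a persistently failing agent trips the breaker instead of propagating bad output downstream. The names here are illustrative:

```python
class CircuitOpen(Exception):
    """Raised when repeated failures trip the breaker."""

def call_with_breaker(fn, max_failures: int = 3):
    """Retry a flaky agent call, but stop after max_failures errors."""
    failures = 0
    while failures < max_failures:
        try:
            return fn()
        except Exception:
            failures += 1
    raise CircuitOpen(f"giving up after {failures} failures")

attempts = {"n": 0}
def flaky():
    """Simulated sub-agent that fails twice, then succeeds."""
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("transient error")
    return "ok"

result = call_with_breaker(flaky)
# succeeds on the third attempt; a persistently failing agent raises CircuitOpen
```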

## Example: Research Team

```text
Supervisor
├── Researcher (web search, document retrieval)
├── Analyzer (data analysis, statistics)
├── Fact-checker (verification, validation)
└── Writer (report generation)
```

## Best Practices

1. Design for context isolation as primary benefit
2. Choose pattern based on coordination needs, not org metaphor
3. Implement explicit handoff protocols with state passing
4. Use weighted voting or debate for consensus
5. Monitor for supervisor bottlenecks
6. Validate outputs before passing between agents
7. Set time-to-live limits to prevent infinite loops
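Practice 7 can be sketched as a countdown around the agent loop: each iteration decrements a time-to-live budget, and the loop exits when the agent signals completion or the budget runs out. This is one simple shape for such a guard:

```python
def run_loop(step, ttl: int = 10):
    """Bound an agent loop with a time-to-live counter to prevent infinite runs."""
    for _ in range(ttl):
        result = step()
        if result is not None:  # the agent signalled completion
            return result
    return None  # TTL expired without convergence

counter = {"n": 0}
def step():
    """Simulated agent step that converges on the fourth iteration."""
    counter["n"] += 1
    return "answer" if counter["n"] == 4 else None

outcome = run_loop(step)
# converges in 4 steps, well inside the 10-step budget
```

The same pattern works with wall-clock deadlines or token budgets in place of an iteration count.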

Overview

This skill helps design and evaluate multi-agent system architectures and supervisor patterns for coordinating multiple LLM instances. It focuses on context isolation, handoff protocols, consensus methods, and failure-mode mitigations to improve reliability and scalability. Use it to choose patterns, define handoffs, and implement orchestration safeguards.

How this skill works

The skill inspects task decomposition needs and recommends one of three primary patterns: supervisor/orchestrator, peer-to-peer/swarm, or hierarchical layering. It evaluates context isolation strategies (instruction passing, full delegation, shared file memory), consensus mechanisms (weighted voting, debate), and common failure modes. It produces concrete protocol recommendations: forward_message-style direct responses, explicit handoff functions, output schemas, and circuit-breaker checkpoints.

When to use it

  • Designing systems that coordinate multiple LLM agents to avoid single-context bottlenecks
  • Implementing supervisor or orchestrator logic with human oversight or aggregation needs
  • Creating peer-to-peer or swarm setups for exploration and parallel search
  • Building hierarchical stacks for large-scale projects with strategy, planning, and execution layers
  • Defining handoff protocols, state passing, and output validation between agents

Best practices

  • Treat context isolation as the primary advantage; minimize shared context growth
  • Choose the pattern based on coordination requirements, not org metaphors
  • Implement explicit handoff functions that pass state and preserve provenance
  • Use output schemas and checkpoints to prevent supervisor bottlenecks and telephone-game errors
  • Apply weighted voting or adversarial debate for consensus and confidence weighting
  • Enforce time-to-live limits and circuit breakers to avoid infinite loops and error propagation

Example use cases

  • Research pipeline: supervisor delegates search, analysis, fact-checking, and writing, then aggregates results
  • Parallel sourcing: swarm of specialist agents probe different databases simultaneously to reduce latency
  • Enterprise workflow: hierarchical layers translate strategy into concrete execution plans and automated tasks
  • Resilient QA: multiple validators run adversarial critique and weighted voting before final publication
  • Tool-enabled handoffs: sub-agent returns direct_response objects to avoid coordinator mistranslation

FAQ

How do I prevent the supervisor from misrepresenting sub-agent outputs?

Use a forward_message-style direct response or structured output schema so sub-agents can return canonical responses that bypass paraphrase steps.

When should I use swarm vs. supervisor patterns?

Use swarm for flexible exploration and emergent requirements; use supervisor when decomposition, coordination, and human oversight are necessary.

How do I detect and stop runaway agent loops?

Set explicit time-to-live counters and circuit breakers, implement stall triggers for no-progress detection, and validate outputs at handoff points.