home / skills / athola / claude-night-market / agent-teams

This skill coordinates multiple Claude Code agents via a filesystem-based protocol to manage parallel tasks, dependencies, and coordinated reviews.

npx playbooks add skill athola/claude-night-market --skill agent-teams

Review the files below or copy the command above to add this skill to your agents.

Files (7)
SKILL.md
8.4 KB
---
name: agent-teams
description: Coordinate Claude Code Agent Teams through filesystem-based protocol. Use
  when orchestrating multiple Claude agents on parallel tasks, need task dependency
  management, multi-agent code review or implementation. Do not use when single-agent
  work suffices, task is not parallelizable.
category: delegation-framework
tags:
- agent-teams
- multi-agent
- coordination
- tmux
- task-management
- messaging
dependencies:
- delegation-core
- leyline:damage-control
- leyline:risk-classification
tools:
- Bash
- Read
- Write
usage_patterns:
- team-orchestration
- parallel-implementation
- multi-agent-review
- task-dependency-management
complexity: advanced
estimated_tokens: 450
progressive_loading: true
modules:
- modules/team-management.md
- modules/messaging-protocol.md
- modules/task-coordination.md
- modules/spawning-patterns.md
- modules/crew-roles.md
- modules/health-monitoring.md
references:
- delegation-core/../../leyline/skills/error-patterns/SKILL.md
- delegation-core/../../leyline/skills/service-registry/SKILL.md
---
## Table of Contents

- [Overview](#overview)
- [When to Use](#when-to-use)
- [Prerequisites](#prerequisites)
- [Protocol Architecture](#protocol-architecture)
- [Quick Start](#quick-start)
- [Coordination Workflow](#coordination-workflow)
- [Module Reference](#module-reference)
- [Integration with Conjure](#integration-with-conjure)
- [Troubleshooting](#troubleshooting)
- [Exit Criteria](#exit-criteria)


# Agent Teams Coordination

## Overview

Claude Code Agent Teams enables multiple Claude CLI processes to collaborate on shared work through a filesystem-based coordination protocol. Each teammate runs as an independent `claude` process in a tmux pane, communicating via JSON files guarded by `fcntl` locks. No database, no daemon, no network layer.

This skill provides the patterns for orchestrating agent teams effectively.

## When To Use

- Parallel implementation across multiple files or modules
- Multi-agent code review (one agent reviews, another implements fixes)
- Large refactoring requiring coordinated changes across subsystems
- Tasks with natural parallelism that benefit from concurrent agents

## When NOT To Use

- Single-file changes or small tasks (overhead exceeds benefit)
- Tasks requiring tight sequential reasoning (agents coordinate loosely)
- When `claude` CLI is not available or tmux is not installed

## Prerequisites

```bash
# Verify Claude Code CLI
claude --version

# Verify tmux (required for split-pane mode)
tmux -V

# Enable experimental feature (set by spawner automatically)
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```

## Protocol Architecture

```
~/.claude/
  teams/<team-name>/
    config.json            # Team metadata + member roster
    inboxes/
      <agent-name>.json    # Per-agent message queue
      .lock                # fcntl exclusive lock
  tasks/<team-name>/
    1.json ... N.json      # Auto-incrementing task files
    .lock                  # fcntl exclusive lock
```

**Design principles:**
- **Filesystem is the database**: JSON files, atomic writes via `tempfile` + `os.replace`
- **fcntl locking**: Prevents concurrent read/write corruption on inboxes and tasks
- **Numbered tasks**: Auto-incrementing IDs with sequential file naming
- **Loose coupling**: Agents poll their own inbox; no push notifications

## Quick Start

### 1. Create a Team

```bash
# Programmatic team setup (via MCP or direct API)
# Team config written to ~/.claude/teams/<team-name>/config.json
```

The team config contains:
- `name`, `description`, `created_at` (ms timestamp)
- `lead_agent_id`, `lead_session_id`
- `members[]` — array of LeadMember and TeammateMember objects

### 2. Spawn Teammates

Each teammate is a separate `claude` CLI process launched with identity flags:

```bash
claude --agent-id "backend@my-team" \
       --agent-name "backend" \
       --team-name "my-team" \
       --agent-color "#FF6B6B" \
       --parent-session-id "$SESSION_ID" \
       --agent-type "general-purpose" \
       --model sonnet
```

See `modules/spawning-patterns.md` for tmux pane management and color assignment.

### 3. Create Tasks with Dependencies

```json
{
  "id": "1",
  "subject": "Implement API endpoints",
  "description": "Create REST endpoints for user management",
  "status": "pending",
  "owner": null,
  "blocks": ["3"],
  "blocked_by": [],
  "metadata": {}
}
```

See `modules/task-coordination.md` for state machine and dependency management.

### 4. Coordinate via Messages

```json
{
  "from": "team-lead",
  "text": "API endpoints are ready for integration testing",
  "timestamp": "2026-02-07T22:00:00Z",
  "read": false,
  "summary": "API ready"
}
```

See `modules/messaging-protocol.md` for message types and inbox operations.

## Coordination Workflow

1. **`agent-teams:team-created`** — Initialize team config and directories
2. **`agent-teams:teammates-spawned`** — Launch agents in tmux panes
3. **`agent-teams:tasks-assigned`** — Create tasks with dependencies, assign owners
4. **`agent-teams:coordination-active`** — Agents claim tasks, exchange messages, mark completion
5. **`agent-teams:team-shutdown`** — Graceful shutdown with approval protocol

## Crew Roles

Each team member has a `role` that determines their capabilities and task compatibility. Five roles are defined: `implementer` (default), `researcher`, `tester`, `reviewer`, and `architect`. Roles constrain which risk tiers an agent can handle — see `modules/crew-roles.md` for the full capability matrix and role-risk compatibility table.

## Health Monitoring

Team members can be monitored for health via heartbeat messages and claim expiry. The lead polls team health every 60s with a 2-stage stall detection protocol (health_check probe + 30s wait). Stalled agents have their tasks released and are restarted or replaced following the "replace don't wait" doctrine. See `modules/health-monitoring.md` for the full protocol and state machine.

## Module Reference

- **team-management.md**: Team lifecycle, config format, member management
- **messaging-protocol.md**: Message types, inbox operations, locking patterns
- **task-coordination.md**: Task CRUD, state machine, dependency cycle detection
- **spawning-patterns.md**: tmux spawning, CLI flags, pane management
- **crew-roles.md**: Role taxonomy, capability matrix, role-risk compatibility
- **health-monitoring.md**: Heartbeat protocol, stall detection, automated recovery

## Integration with Conjure

Agent Teams extends the conjure delegation model:

| Conjure Pattern | Agent Teams Equivalent |
|-----------------|----------------------|
| `delegation-core:task-assessed` | `agent-teams:team-created` |
| `delegation-core:handoff-planned` | `agent-teams:tasks-assigned` |
| `delegation-core:results-integrated` | `agent-teams:team-shutdown` |
| External LLM execution | Teammate agent execution |

Use `Skill(conjure:delegation-core)` first to determine if the task benefits from multi-agent coordination vs. single-service delegation.

## Troubleshooting

### Common Issues

**tmux not found**
Install via package manager: `brew install tmux` / `apt install tmux`

**Stale lock files**
If an agent crashes mid-operation, lock files may persist. Remove `.lock` files manually from `~/.claude/teams/<team>/inboxes/` or `~/.claude/tasks/<team>/`

**Orphaned tasks**
Tasks claimed by a crashed agent stay `in_progress` indefinitely. Use `modules/health-monitoring.md` for heartbeat-based stall detection and automatic task release. The health monitoring protocol detects unresponsive agents within 60s + 30s probe window and releases their tasks for reassignment.

**Message ordering**
Filesystem timestamp resolution varies (HFS+ = 1s granularity). Use numbered filenames or UUID-sorted names to avoid collision on rapid message bursts.

**Model errors on Bedrock/Vertex/Foundry (pre-2.1.39)**
Teammate agents could use incorrect model identifiers on enterprise providers, causing 400 errors. Upgrade to Claude Code 2.1.39+ for correct model ID qualification across all providers.

**Nested session guard (2.1.39+)**
If `claude` refuses to launch within an existing session, ensure you're using tmux pane splitting (not subshell invocation). The guard is intentional — see `modules/spawning-patterns.md` for details.

## Exit Criteria

- [ ] Team created with config and directories
- [ ] Teammates spawned and registered in config
- [ ] Tasks created with dependency graph (no cycles)
- [ ] Agents coordinating via inbox messages
- [ ] Graceful shutdown completed

Overview

This skill coordinates multiple Claude Code CLI processes into cooperating teams using a filesystem-based protocol. It enables parallel task execution, dependency management, and multi-agent code review without a centralized database or network daemon. The design favors simple JSON files, fcntl locks, and tmux-based spawning for predictable, reproducible orchestration.

How this skill works

Each teammate runs as an independent claude CLI process and communicates by reading and writing JSON files under ~/.claude/teams/<team> and ~/.claude/tasks/<team>. Files are written atomically and protected with fcntl locks; agents poll per-agent inbox files and claim numbered task files. A lightweight state machine manages task lifecycle, dependencies, health heartbeats, and automatic task release for stalled agents.

When to use it

  • Parallel implementation across multiple files or modules to speed delivery
  • Multi-agent code review where one agent proposes fixes and another implements them
  • Large refactors that touch many subsystems and benefit from concurrent ownership
  • Workflows that require task dependency management and explicit handoffs
  • When you can run claude CLI and tmux on the host machine

Best practices

  • Keep task granularity small and independent to reduce coordination overhead
  • Use numbered task files for ordering; prefer UUIDs for high-frequency messaging
  • Assign clear roles (implementer, tester, reviewer, researcher, architect) to avoid duplicate work
  • Enable and monitor heartbeat health checks; release stalled claims automatically
  • Run agents inside tmux panes and pass identity flags consistently to avoid session guards

Example use cases

  • Split a large feature across backend, frontend, and tests with three agents implementing respective pieces
  • One agent performs automated static analysis and opens tasks; another agent implements fixes and a third reviews changes
  • Run parallel test-writing and implementation: tester and implementer agents coordinate test-first development
  • Coordinate a rolling refactor where tasks list blocks/blocked_by dependencies to enforce ordering
  • Use team lead to monitor health and reassign tasks when agents stall

FAQ

Do I need a networked server or database?

No. The protocol uses the local filesystem (~/.claude) with atomic writes and fcntl locks; no network layer required.

When should I not use agent teams?

Avoid this skill for single-file edits, tightly sequential reasoning, or environments without claude CLI or tmux; overhead can outweigh benefit.