home / skills / athola / claude-night-market / agent-teams

agent-teams skill

safe

This skill coordinates multiple Claude Code agents via a filesystem-based protocol to manage parallel tasks, dependencies, and coordinated reviews.

npx playbooks add skill athola/claude-night-market --skill agent-teams

Review the files below or copy the command above to add this skill to your agents.

Files (7)

SKILL.md

8.4 KB

---
name: agent-teams
description: Coordinate Claude Code Agent Teams through filesystem-based protocol. Use
  when orchestrating multiple Claude agents on parallel tasks, need task dependency
  management, multi-agent code review or implementation. Do not use when single-agent
  work suffices, task is not parallelizable.
category: delegation-framework
tags:
- agent-teams
- multi-agent
- coordination
- tmux
- task-management
- messaging
dependencies:
- delegation-core
- leyline:damage-control
- leyline:risk-classification
tools:
- Bash
- Read
- Write
usage_patterns:
- team-orchestration
- parallel-implementation
- multi-agent-review
- task-dependency-management
complexity: advanced
estimated_tokens: 450
progressive_loading: true
modules:
- modules/team-management.md
- modules/messaging-protocol.md
- modules/task-coordination.md
- modules/spawning-patterns.md
- modules/crew-roles.md
- modules/health-monitoring.md
references:
- delegation-core/../../leyline/skills/error-patterns/SKILL.md
- delegation-core/../../leyline/skills/service-registry/SKILL.md
---
## Table of Contents

- [Overview](#overview)
- [When to Use](#when-to-use)
- [Prerequisites](#prerequisites)
- [Protocol Architecture](#protocol-architecture)
- [Quick Start](#quick-start)
- [Coordination Workflow](#coordination-workflow)
- [Module Reference](#module-reference)
- [Integration with Conjure](#integration-with-conjure)
- [Troubleshooting](#troubleshooting)
- [Exit Criteria](#exit-criteria)


# Agent Teams Coordination

## Overview

Claude Code Agent Teams enables multiple Claude CLI processes to collaborate on shared work through a filesystem-based coordination protocol. Each teammate runs as an independent `claude` process in a tmux pane, communicating via JSON files guarded by `fcntl` locks. No database, no daemon, no network layer.

This skill provides the patterns for orchestrating agent teams effectively.

## When To Use

- Parallel implementation across multiple files or modules
- Multi-agent code review (one agent reviews, another implements fixes)
- Large refactoring requiring coordinated changes across subsystems
- Tasks with natural parallelism that benefit from concurrent agents

## When NOT To Use

- Single-file changes or small tasks (overhead exceeds benefit)
- Tasks requiring tight sequential reasoning (agents coordinate loosely)
- When `claude` CLI is not available or tmux is not installed

## Prerequisites

```bash
# Verify Claude Code CLI
claude --version

# Verify tmux (required for split-pane mode)
tmux -V

# Enable experimental feature (set by spawner automatically)
export CLAUDE_CODE_EXPERIMENTAL_AGENT_TEAMS=1
```

## Protocol Architecture

```
~/.claude/
  teams/<team-name>/
    config.json            # Team metadata + member roster
    inboxes/
      <agent-name>.json    # Per-agent message queue
      .lock                # fcntl exclusive lock
  tasks/<team-name>/
    1.json ... N.json      # Auto-incrementing task files
    .lock                  # fcntl exclusive lock
```

**Design principles:**
- **Filesystem is the database**: JSON files, atomic writes via `tempfile` + `os.replace`
- **fcntl locking**: Prevents concurrent read/write corruption on inboxes and tasks
- **Numbered tasks**: Auto-incrementing IDs with sequential file naming
- **Loose coupling**: Agents poll their own inbox; no push notifications

## Quick Start

### 1. Create a Team

```bash
# Programmatic team setup (via MCP or direct API)
# Team config written to ~/.claude/teams/<team-name>/config.json
```

The team config contains:
- `name`, `description`, `created_at` (ms timestamp)
- `lead_agent_id`, `lead_session_id`
- `members[]` — array of LeadMember and TeammateMember objects

### 2. Spawn Teammates

Each teammate is a separate `claude` CLI process launched with identity flags:

```bash
claude --agent-id "backend@my-team" \
       --agent-name "backend" \
       --team-name "my-team" \
       --agent-color "#FF6B6B" \
       --parent-session-id "$SESSION_ID" \
       --agent-type "general-purpose" \
       --model sonnet
```

See `modules/spawning-patterns.md` for tmux pane management and color assignment.

### 3. Create Tasks with Dependencies

```json
{
  "id": "1",
  "subject": "Implement API endpoints",
  "description": "Create REST endpoints for user management",
  "status": "pending",
  "owner": null,
  "blocks": ["3"],
  "blocked_by": [],
  "metadata": {}
}
```

See `modules/task-coordination.md` for state machine and dependency management.

### 4. Coordinate via Messages

```json
{
  "from": "team-lead",
  "text": "API endpoints are ready for integration testing",
  "timestamp": "2026-02-07T22:00:00Z",
  "read": false,
  "summary": "API ready"
}
```

See `modules/messaging-protocol.md` for message types and inbox operations.

## Coordination Workflow

1. **`agent-teams:team-created`** — Initialize team config and directories
2. **`agent-teams:teammates-spawned`** — Launch agents in tmux panes
3. **`agent-teams:tasks-assigned`** — Create tasks with dependencies, assign owners
4. **`agent-teams:coordination-active`** — Agents claim tasks, exchange messages, mark completion
5. **`agent-teams:team-shutdown`** — Graceful shutdown with approval protocol

## Crew Roles

Each team member has a `role` that determines their capabilities and task compatibility. Five roles are defined: `implementer` (default), `researcher`, `tester`, `reviewer`, and `architect`. Roles constrain which risk tiers an agent can handle — see `modules/crew-roles.md` for the full capability matrix and role-risk compatibility table.

## Health Monitoring

Team members can be monitored for health via heartbeat messages and claim expiry. The lead polls team health every 60s with a 2-stage stall detection protocol (health_check probe + 30s wait). Stalled agents have their tasks released and are restarted or replaced following the "replace don't wait" doctrine. See `modules/health-monitoring.md` for the full protocol and state machine.

## Module Reference

- **team-management.md**: Team lifecycle, config format, member management
- **messaging-protocol.md**: Message types, inbox operations, locking patterns
- **task-coordination.md**: Task CRUD, state machine, dependency cycle detection
- **spawning-patterns.md**: tmux spawning, CLI flags, pane management
- **crew-roles.md**: Role taxonomy, capability matrix, role-risk compatibility
- **health-monitoring.md**: Heartbeat protocol, stall detection, automated recovery

## Integration with Conjure

Agent Teams extends the conjure delegation model:

| Conjure Pattern | Agent Teams Equivalent |
|-----------------|----------------------|
| `delegation-core:task-assessed` | `agent-teams:team-created` |
| `delegation-core:handoff-planned` | `agent-teams:tasks-assigned` |
| `delegation-core:results-integrated` | `agent-teams:team-shutdown` |
| External LLM execution | Teammate agent execution |

Use `Skill(conjure:delegation-core)` first to determine if the task benefits from multi-agent coordination vs. single-service delegation.

## Troubleshooting

### Common Issues

**tmux not found**
Install via package manager: `brew install tmux` / `apt install tmux`

**Stale lock files**
If an agent crashes mid-operation, lock files may persist. Remove `.lock` files manually from `~/.claude/teams/<team>/inboxes/` or `~/.claude/tasks/<team>/`

**Orphaned tasks**
Tasks claimed by a crashed agent stay `in_progress` indefinitely. Use `modules/health-monitoring.md` for heartbeat-based stall detection and automatic task release. The health monitoring protocol detects unresponsive agents within 60s + 30s probe window and releases their tasks for reassignment.

**Message ordering**
Filesystem timestamp resolution varies (HFS+ = 1s granularity). Use numbered filenames or UUID-sorted names to avoid collision on rapid message bursts.

**Model errors on Bedrock/Vertex/Foundry (pre-2.1.39)**
Teammate agents could use incorrect model identifiers on enterprise providers, causing 400 errors. Upgrade to Claude Code 2.1.39+ for correct model ID qualification across all providers.

**Nested session guard (2.1.39+)**
If `claude` refuses to launch within an existing session, ensure you're using tmux pane splitting (not subshell invocation). The guard is intentional — see `modules/spawning-patterns.md` for details.

## Exit Criteria

- [ ] Team created with config and directories
- [ ] Teammates spawned and registered in config
- [ ] Tasks created with dependency graph (no cycles)
- [ ] Agents coordinating via inbox messages
- [ ] Graceful shutdown completed

Overview

This skill coordinates multiple Claude Code CLI processes into cooperating teams using a filesystem-based protocol. It enables parallel task execution, dependency management, and multi-agent code review without a centralized database or network daemon. The design favors simple JSON files, fcntl locks, and tmux-based spawning for predictable, reproducible orchestration.

How this skill works

Each teammate runs as an independent claude CLI process and communicates by reading and writing JSON files under ~/.claude/teams/<team> and ~/.claude/tasks/<team>. Files are written atomically and protected with fcntl locks; agents poll per-agent inbox files and claim numbered task files. A lightweight state machine manages task lifecycle, dependencies, health heartbeats, and automatic task release for stalled agents.

When to use it

Parallel implementation across multiple files or modules to speed delivery
Multi-agent code review where one agent proposes fixes and another implements them
Large refactors that touch many subsystems and benefit from concurrent ownership
Workflows that require task dependency management and explicit handoffs
When you can run claude CLI and tmux on the host machine

Best practices

Keep task granularity small and independent to reduce coordination overhead
Use numbered task files for ordering; prefer UUIDs for high-frequency messaging
Assign clear roles (implementer, tester, reviewer, researcher, architect) to avoid duplicate work
Enable and monitor heartbeat health checks; release stalled claims automatically
Run agents inside tmux panes and pass identity flags consistently to avoid session guards

Example use cases

Split a large feature across backend, frontend, and tests with three agents implementing respective pieces
One agent performs automated static analysis and opens tasks; another agent implements fixes and a third reviews changes
Run parallel test-writing and implementation: tester and implementer agents coordinate test-first development
Coordinate a rolling refactor where tasks list blocks/blocked_by dependencies to enforce ordering
Use team lead to monitor health and reassign tasks when agents stall

FAQ

Do I need a networked server or database?

No. The protocol uses the local filesystem (~/.claude) with atomic writes and fcntl locks; no network layer required.

When should I not use agent teams?

Avoid this skill for single-file edits, tightly sequential reasoning, or environments without claude CLI or tmux; overhead can outweigh benefit.