home / skills / troykelly / claude-skills / autonomous-orchestration
npx playbooks add skill troykelly/claude-skills --skill autonomous-orchestrationReview the files below or copy the command above to add this skill to your agents.
---
name: autonomous-orchestration
description: Use when user requests autonomous operation across multiple issues. Orchestrates parallel workers using Task tool, monitors with TaskOutput, handles SLEEP/WAKE cycles, and works until scope is complete without user intervention.
allowed-tools:
- Bash
- Read
- Grep
- Glob
- Task
- TaskOutput
- mcp__github__*
- mcp__memory__*
model: opus
---
# Autonomous Orchestration
## Overview
Orchestrates long-running autonomous work across multiple issues using the **Task tool** to spawn parallel worker agents and **TaskOutput** to monitor their progress.
**Core principle:** GitHub is the source of truth. Workers are disposable. State survives restarts.
**Announce at start:** "I'm using autonomous-orchestration to work through [SCOPE]. Starting autonomous operation now."
## Prerequisites
- `worker-dispatch` skill for spawning workers with Task tool
- `worker-protocol` skill for worker behavior
- `ci-monitoring` skill for CI/WAKE handling
- GitHub CLI (`gh`) authenticated
- GitHub Project Board configured
## Parallel Execution Model
Workers are spawned as **background agents** using the Task tool:
```
Orchestrator
│
├── Task(run_in_background: true) → Worker Agent #1 (task_id: aa93f22)
├── Task(run_in_background: true) → Worker Agent #2 (task_id: b51e54b)
└── Task(run_in_background: true) → Worker Agent #3 (task_id: c72f3d1)
```
Monitor progress with TaskOutput:
```
TaskOutput(task_id: "aa93f22", block: false) → "Task is still running..."
TaskOutput(task_id: "b51e54b", block: false) → Completed with result
TaskOutput(task_id: "c72f3d1", block: false) → "Task is still running..."
```
**CRITICAL:** Spawn all workers in the SAME message for true concurrent execution.
## State Management
**CRITICAL:** All persistent state is stored in GitHub. Task IDs are ephemeral session state.
| State Store | Purpose | Used For |
|-------------|---------|----------|
| Project Board Status | THE source of truth | Ready, In Progress, In Review, Blocked, Done |
| Issue Comments | Activity log | Worker assignment, progress, deviations |
| Labels | Lineage only | `spawned-from:#N`, `depth:N`, `epic-*` |
| MCP Memory | Active marker + task tracking | **Active orchestration detection**, active task_ids |
| Orchestrator memory | Ephemeral session state | Map of issue# → task_id for monitoring |
**See:** `reference/state-management.md` for detailed state queries and updates.
## Context Compaction Survival
**CRITICAL:** Orchestration must survive mid-loop context compaction.
### On Start: Write Active Marker
```bash
# Write to MCP Memory when orchestration starts
mcp__memory__create_entities([{
"name": "ActiveOrchestration",
"entityType": "Orchestration",
"observations": [
"Status: ACTIVE",
"Scope: [MILESTONE/EPIC/unbounded]",
"Tracking Issue: #[NUMBER]",
"Started: [ISO_TIMESTAMP]",
"Repository: [owner/repo]",
"Phase: BOOTSTRAP|MAIN_LOOP",
"Last Loop: [ISO_TIMESTAMP]"
]
}])
```
### On Each Loop Iteration: Update Marker
```bash
mcp__memory__add_observations({
"observations": [{
"entityName": "ActiveOrchestration",
"contents": ["Last Loop: [ISO_TIMESTAMP]", "Phase: MAIN_LOOP"]
}]
})
```
### On Complete: Remove Marker
```bash
mcp__memory__delete_entities({
"entityNames": ["ActiveOrchestration"]
})
```
### On Session Resume (After Compaction)
Session-start skill checks for active orchestration:
```bash
# Check MCP Memory for active orchestration
ACTIVE=$(mcp__memory__open_nodes({"names": ["ActiveOrchestration"]}))
if [ -n "$ACTIVE" ]; then
echo "⚠️ ACTIVE ORCHESTRATION DETECTED"
echo "Scope: [from ACTIVE]"
echo "Tracking: [from ACTIVE]"
echo ""
echo "Resuming orchestration loop..."
# Invoke autonomous-orchestration skill to resume
fi
```
**This ensures:** Even if context compacts mid-loop, the next session will detect the active orchestration and resume it.
## Immediate Start (User Consent Implied)
**The user's request for autonomous operation IS their consent.** No additional confirmation required.
When the user requests autonomous work:
1. **Identify scope** - Parse user request for milestone, epic, specific issues, or "all"
2. **Announce intent** - Briefly state what you're about to do
3. **Start immediately** - Begin orchestration without waiting for additional input
```markdown
## Starting Autonomous Operation
**Scope:** [MILESTONE/EPIC/ISSUES or "all open issues"]
**Workers:** Up to 5 parallel
**Mode:** Continuous until complete
Beginning work now...
```
**Do NOT ask for "PROCEED" or any confirmation.** The user asked for autonomous operation - that is the confirmation.
## Automatic Scope Detection
When the user requests autonomous operation without specifying a scope:
### Priority Order
1. **User-specified scope** - If user mentions specific issues, epics, or milestones
2. **Urgent/High Priority standalone issues** - Issues with `priority:urgent` or `priority:high` labels not part of an epic
3. **Epic-based sequential work** - Work through epics in order, completing all issues within each epic
4. **Remaining standalone issues** - Any issues not part of an epic
```bash
detect_work_scope() {
# 1. Check for urgent/high priority standalone issues first
PRIORITY_ISSUES=$(gh issue list --state open \
--label "priority:urgent,priority:high" \
--json number,labels \
--jq '[.[] | select(.labels | map(.name) | any(startswith("epic-")) | not)] | .[].number')
if [ -n "$PRIORITY_ISSUES" ]; then
echo "priority_standalone"
echo "$PRIORITY_ISSUES"
return
fi
# 2. Get epics in order (by creation date)
EPICS=$(gh issue list --state open --label "type:epic" \
--json number,title,createdAt \
--jq 'sort_by(.createdAt) | .[].number')
if [ -n "$EPICS" ]; then
echo "epics"
echo "$EPICS"
return
fi
# 3. Fall back to all open issues
ALL_ISSUES=$(gh issue list --state open --json number --jq '.[].number')
echo "all_issues"
echo "$ALL_ISSUES"
}
```
## Continuous Operation Until Complete
Autonomous operation continues until ALL of:
- No open issues remain in scope
- No open PRs awaiting merge
- No issues in "In Progress" or "In Review" status
The operation does NOT pause for:
- Progress updates
- Confirmation between issues
- Switching between epics
- Any user input (unless blocked by a fatal error)
## PR Resolution Bootstrap Phase
**CRITICAL:** Before spawning ANY new workers, resolve all existing open PRs first.
**Bootstrap Flow:** Get open PRs (exclude release/*, release-placeholder, do-not-merge) → For each: Check CI → Verify review → Merge if ready → Then start main loop
### Bootstrap Implementation
```bash
resolve_existing_prs() {
echo "=== PR RESOLUTION BOOTSTRAP ==="
# Get all open PRs, excluding release placeholders
OPEN_PRS=$(gh pr list --json number,headRefName,labels \
--jq '[.[] | select(
(.headRefName | startswith("release/") | not) and
(.labels | map(.name) | index("release-placeholder") | not)
)] | .[].number')
if [ -z "$OPEN_PRS" ]; then
echo "No actionable PRs to resolve. Proceeding to main loop."
return 0
fi
echo "Found PRs to resolve: $OPEN_PRS"
for pr in $OPEN_PRS; do
echo "Processing PR #$pr..."
# Get CI status
ci_status=$(gh pr checks "$pr" --json state --jq '.[].state' 2>/dev/null | sort -u)
# Get linked issue
ISSUE=$(gh pr view "$pr" --json body --jq '.body' | grep -oE 'Closes #[0-9]+' | grep -oE '[0-9]+' | head -1)
if [ -z "$ISSUE" ]; then
echo " ⚠ No linked issue found, skipping"
continue
fi
# Check if CI passed
if echo "$ci_status" | grep -q "FAILURE"; then
echo " ❌ CI failing - triggering ci-monitoring for PR #$pr"
# Invoke ci-monitoring skill to fix
handle_ci_failure "$pr"
continue
fi
if echo "$ci_status" | grep -q "PENDING"; then
echo " ⏳ CI pending for PR #$pr, will check in main loop"
continue
fi
if echo "$ci_status" | grep -q "SUCCESS"; then
# Verify review artifact
REVIEW_EXISTS=$(gh api "/repos/$OWNER/$REPO/issues/$ISSUE/comments" \
--jq '[.[] | select(.body | contains("<!-- REVIEW:START -->"))] | length' 2>/dev/null || echo "0")
if [ "$REVIEW_EXISTS" = "0" ]; then
echo " ⚠ No review artifact - requesting review for #$ISSUE"
gh issue comment "$ISSUE" --body "## Review Required
PR #$pr has passing CI but no review artifact.
**Action needed:** Complete comprehensive-review and post artifact to this issue.
---
*Bootstrap phase - Orchestrator*"
continue
fi
# All checks pass - merge
echo " ✅ Merging PR #$pr"
gh pr merge "$pr" --squash --delete-branch
mark_issue_done "$ISSUE"
fi
done
echo "=== BOOTSTRAP COMPLETE ==="
}
```
### Release Placeholder Detection
PRs are excluded from bootstrap resolution if:
| Condition | Example |
|-----------|---------|
| Branch starts with `release/` | `release/v2.0.0`, `release/2025-01` |
| Has `release-placeholder` label | Manual exclusion |
| Has `do-not-merge` label | Explicit hold |
## Orchestration Loop
**Main Loop Flow:** Monitor workers (TaskOutput) + Check CI/PRs + Spawn workers (Task) → Evaluate state (complete/sleep/continue)
### Loop Steps
1. **Monitor Active Workers** - Use `TaskOutput(block: false)` for each active task_id
2. **Handle Completed Workers** - Check GitHub for PR/completion status
3. **Check CI/PRs** - Monitor for merge readiness, verify review artifacts
4. **MERGE GREEN PRs** - Any PR with passing CI is merged IMMEDIATELY
5. **Spawn Workers** - Up to 5 parallel workers using `Task(run_in_background: true)`
6. **Evaluate State** - Determine next action (continue, sleep, complete)
7. **Brief Pause** - 30 second interval between iterations
### Worker Monitoring with TaskOutput
Each loop iteration checks all active workers:
```markdown
## Monitor Active Workers
For each task_id in active_workers:
TaskOutput(task_id: "[ID]", block: false, timeout: 1000)
Interpret results:
- "Task is still running..." → Continue monitoring
- Completed → Check GitHub for PR, update project board, remove from active list
- Error → Handle failure, potentially spawn replacement
```
### Spawning Parallel Workers
Spawn multiple workers in ONE message for concurrent execution:
```markdown
## Dispatch Workers
1. Count available slots: 5 - len(active_workers)
2. Get Ready issues from project board
3. For each issue to dispatch:
Task(
description: "Issue #123 worker",
prompt: [WORKER_PROMPT],
subagent_type: "general-purpose",
run_in_background: true
)
4. Store returned task_id → issue mapping
```
### CRITICAL: Merge Green PRs Immediately
**Every loop iteration must check for and merge passing PRs:**
```bash
# In each loop iteration
for pr in $(gh pr list --json number,statusCheckRollup --jq '.[] | select(.statusCheckRollup | all(.conclusion == "SUCCESS")) | .number'); do
# Check for do-not-merge label
if ! gh pr view "$pr" --json labels --jq '.labels[].name' | grep -q "do-not-merge"; then
echo "Merging PR #$pr (CI passed)"
gh pr merge "$pr" --squash --delete-branch
# Get linked issue and mark done
ISSUE=$(gh pr view "$pr" --json body --jq '.body' | grep -oE 'Closes #[0-9]+' | grep -oE '[0-9]+' | head -1)
if [ -n "$ISSUE" ]; then
# Update project board status to Done
mark_issue_done "$ISSUE"
fi
fi
done
```
**Do NOT:**
- Report "PRs are ready for merge" and stop
- Wait for user to request merge
- Summarize completed work and ask for next steps
**The loop continues until scope is complete. Green PR = immediate merge.**
## Scope Types
### Milestone
```bash
gh issue list --milestone "v1.0.0" --state open --json number --jq '.[].number'
```
### Epic
```bash
gh issue list --label "epic-dark-mode" --state open --json number --jq '.[].number'
```
### Unbounded (All Open Issues)
```bash
gh issue list --state open --json number --jq '.[].number'
```
**Do NOT ask for "UNBOUNDED" confirmation.** The user's request is their consent.
## Failure Handling
Workers that fail do NOT immediately become blocked:
```
Attempt 1 → Research → Attempt 2 → Research → Attempt 3 → Research → Attempt 4 → BLOCKED
```
Only after 3+ research cycles is an issue marked as blocked.
**See:** `reference/failure-recovery.md` for research cycle implementation.
### Blocked Determination
An issue is only marked blocked when:
- Multiple research cycles completed (3+)
- Research concludes "impossible without external input"
- Examples: missing credentials, requires human decision, external service down
## SLEEP/WAKE
### Entering SLEEP
Orchestration sleeps when:
- All issues are either blocked or in review
- No work can proceed without external event
State is posted to GitHub tracking issue (survives crashes).
### WAKE Mechanisms
- **SessionStart hook** - Checks CI status on new Claude session
- **Manual** - `claude --resume [SESSION_ID]`
## Checklist
Before starting orchestration:
- [ ] Scope identified (explicit or auto-detected)
- [ ] GitHub CLI authenticated (`gh auth status`)
- [ ] Tracking issue exists with `orchestration-tracking` label
- [ ] Project board configured with Status field
- [ ] Active marker written to MCP Memory
Bootstrap phase:
- [ ] Existing open PRs detected
- [ ] Release placeholders excluded (`release/*`, `release-placeholder`, `do-not-merge` labels)
- [ ] CI status checked for each PR
- [ ] Review artifacts verified before merge
- [ ] PRs merged or flagged for attention
- [ ] Bootstrap complete before spawning workers
During orchestration:
- [ ] Workers spawned using `Task(run_in_background: true)`
- [ ] Task IDs stored for each worker (issue# → task_id mapping)
- [ ] Worker status monitored with `TaskOutput(block: false)`
- [ ] Completed workers detected and handled
- [ ] Project board status updated (In Progress, In Review, Done)
- [ ] CI status monitored for open PRs
- [ ] Green PRs merged IMMEDIATELY
- [ ] Review artifacts verified before PR merge
- [ ] Failed workers trigger research cycles
- [ ] Handovers spawn replacement workers
- [ ] SLEEP entered when only waiting on CI
- [ ] Deviation resolution checked each loop
- [ ] Status posted to tracking issue
## Review Enforcement
**CRITICAL:** The orchestrator verifies review compliance:
1. **Before PR merge:**
- Review artifact exists in issue comments
- Review status is COMPLETE
- Unaddressed findings = 0
2. **Child issues (from deferred findings):**
- Follow full `issue-driven-development` process
- Have their own code reviews
- Track via `spawned-from:#N` label
3. **Deviation handling:**
- Parent status set to Blocked on project board
- Resumes only when all children closed
## Integration
This skill coordinates:
| Skill | Purpose |
|-------|---------|
| `worker-dispatch` | Spawning workers |
| `worker-protocol` | Worker behavior |
| `worker-handover` | Context passing |
| `ci-monitoring` | CI and WAKE handling |
| `research-after-failure` | Research cycles |
| `issue-driven-development` | Worker follows this |
| `comprehensive-review` | Workers must complete before PR |
| `project-board-enforcement` | ALL state queries and updates |
## Reference Files
- `reference/state-management.md` - State queries, updates, deviation handling
- `reference/loop-implementation.md` - Full loop code and helpers
- `reference/failure-recovery.md` - Research cycles, blocked handling, SLEEP/WAKE