
turbo skill


This skill accelerates code generation by delegating to a hosted LLM: it produces multi-file implementations for the speed-run pipeline, with surgical fixes applied where tests fail.

npx playbooks add skill 2389-research/claude-plugins --skill turbo

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
3.5 KB
---
name: turbo
description: Direct code generation via hosted LLM (Cerebras). Write a contract prompt, generate code, fix surgically. Part of speed-run pipeline.
---

# Turbo

Direct code generation via hosted LLM. Claude writes the contract, Cerebras implements the code, files are written directly to disk.

**Announce:** "I'm using speed-run:turbo for hosted code generation."

## When to Use

**Use turbo for:**
- Algorithmic code (rate limiters, parsers, state machines)
- Multiple files (3+)
- Boilerplate-heavy implementations
- Token-constrained sessions

**Use Claude direct instead for:**
- CRUD/storage operations (Claude is cheaper due to no fix overhead)
- Single implementation with complex coordination
- Speed-critical tasks where fix cycles are costly

## Tradeoffs

| Aspect | Claude Direct | Turbo (Hosted LLM) |
|--------|---------------|---------------------|
| Speed | ~10s | ~0.5s |
| Token Cost | Higher | ~90% savings |
| First-pass Quality | ~100% | 80-95% |
| Fixes Needed | 0 | 0-2 typical |

## Workflow

### Step 1: Write Contract Prompt

Structure your prompt with exact specifications:

```
Build [X] with [tech stack].

## DATA CONTRACT (use exactly these models):

[Pydantic models / interfaces with exact field names and types]

Example:
class Task(BaseModel):
    id: str
    title: str
    completed: bool = False
    created_at: datetime

class TaskCreate(BaseModel):
    title: str

## API CONTRACT (use exactly these routes):

POST /tasks -> Task           # Create task
GET /tasks -> list[Task]      # List all tasks
GET /tasks/{id} -> Task       # Get single task
DELETE /tasks/{id} -> dict    # Delete task
POST /reset -> dict           # Reset state (for testing)

## ALGORITHM:

1. [Step-by-step logic for the implementation]
2. [Include state management details]
3. [Include edge case handling]

## RULES:

- Use FastAPI with uvicorn
- Store data in [storage mechanism]
- Return 404 for missing resources
- POST /reset must clear all state and return {"status": "ok"}
```
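For reference, the behavior this contract pins down can be sketched in plain Python. Dataclasses and a module-level dict stand in for the Pydantic models and storage that the generated FastAPI code would use, so the sketch stays dependency-free; `create_task` and the other function names are illustrative, not part of the contract:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from uuid import uuid4

@dataclass
class Task:
    id: str
    title: str
    completed: bool = False
    created_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

# In-memory store; the contract's "storage mechanism" placeholder.
_tasks: dict[str, Task] = {}

def create_task(title: str) -> Task:      # POST /tasks
    task = Task(id=str(uuid4()), title=title)
    _tasks[task.id] = task
    return task

def list_tasks() -> list[Task]:           # GET /tasks
    return list(_tasks.values())

def get_task(task_id: str) -> Task:       # GET /tasks/{id}
    return _tasks[task_id]                # KeyError maps to HTTP 404

def delete_task(task_id: str) -> dict:    # DELETE /tasks/{id}
    del _tasks[task_id]                   # KeyError here also maps to 404
    return {"status": "deleted"}

def reset() -> dict:                      # POST /reset
    _tasks.clear()
    return {"status": "ok"}
```

Pinning field names, route shapes, and the reset semantics this precisely is what lets the hosted LLM match the contract exactly on the first pass.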

### Step 2: Generate Code

```
mcp__speed-run__generate_and_write_files
  prompt: [contract prompt]
  output_dir: [target directory]
```

Returns only metadata (files written, line counts). Claude never sees the generated code.

### Step 3: Run Tests

Run the test suite against generated code.

### Step 4: Fix (if needed)

For failures, use the **Claude Edit tool** to apply surgical fixes (typically 1-4 lines each).

Common fixes:
| Error Type | Frequency | Fix Complexity |
|------------|-----------|----------------|
| Missing utility functions | Occasional | 4 lines |
| Logic edge cases | Occasional | 1-2 lines |
| Import ordering | Rare | 1 line |

### Step 5: Re-test

Repeat Steps 3-4 until all tests pass. Even with fixes, total token cost is much lower than Claude generating everything.
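The test/fix loop of Steps 3-5 can be sketched as a small driver. The `test_cmd` argument and the cycle cap are assumptions for illustration; in the real workflow, Claude applies an Edit-tool fix between failing cycles rather than looping blindly:

```python
import subprocess
import sys

def run_test_cycles(test_cmd, max_cycles=3):
    """Run the test suite up to max_cycles times; return the cycle on
    which it went green, or None if it never did. In the real workflow
    a surgical 1-4 line fix is applied between failing cycles."""
    for cycle in range(1, max_cycles + 1):
        result = subprocess.run(test_cmd, capture_output=True, text=True)
        if result.returncode == 0:
            return cycle
        # <- apply the Edit-tool fix here, then re-run
    return None

# Stand-in for "pytest" so the sketch is self-contained:
green_on = run_test_cycles([sys.executable, "-c", "assert 1 + 1 == 2"])
```

Since most runs need 0-2 fixes, the loop usually terminates on the first or second cycle.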

## What the Hosted LLM Gets Right (~90%)

- Data models match contract exactly
- Routes/endpoints correct
- Core algorithm logic
- Basic error handling

## Configuration

| Variable | Default | Description |
|----------|---------|-------------|
| `CEREBRAS_API_KEY` | (required) | Your API key |
| `CEREBRAS_MODEL` | `gpt-oss-120b` | Model to use |

Available models:

| Model | Price (in/out) | Speed | Notes |
|-------|----------------|-------|-------|
| `gpt-oss-120b` | $0.35/$0.75 | 3000 t/s | **Default** - best value, clean output |
| `llama-3.3-70b` | $0.85/$1.20 | 2100 t/s | Reliable fallback |
| `qwen-3-32b` | $0.40/$0.80 | 2600 t/s | Has verbose `<think>` tags |
| `llama3.1-8b` | $0.10/$0.10 | 2200 t/s | Cheapest, may need more fixes |

Overview

This skill provides direct code generation via a hosted LLM (Cerebras) to rapidly produce multi-file implementations from a precise contract prompt. It integrates into the speed-run pipeline: write an exact contract, generate files, run tests, and apply small surgical fixes until tests pass. The flow trades a slightly lower first-pass quality for large gains in speed and token cost over having Claude generate everything directly.

How this skill works

You write a contract-style prompt that defines data models, API routes, algorithm steps, and implementation rules. The skill calls the hosted LLM to generate project files and writes them directly to disk, returning metadata about files produced. Tests are run against the generated code and any failures are fixed with small edits using an edit tool, repeating until tests pass.

When to use it

  • Generate algorithmic or infrastructure code (rate limiters, parsers, state machines).
  • Create projects that require multiple files (3+ files) or heavy boilerplate.
  • Scenarios with token-constrained sessions where cost matters.
  • When you want fast first-pass generation and can tolerate 0–2 small fix cycles.

Best practices

  • Write a precise contract prompt: explicit data models, API routes, algorithm steps, and rules.
  • Include exact types and field names in data contracts and explicit route signatures in API contracts.
  • Design tests that exercise edge cases so fixes remain small and focused.
  • Keep fixes surgical: prefer targeted edits for 1–4 lines rather than broad rewrites.
  • Choose models based on tradeoffs: default for speed/value, larger models for fewer fixes.

Example use cases

  • Generate a FastAPI service with clear Pydantic models and endpoints, then run integration tests.
  • Create multi-file libraries (parser + tokenizer + utils) where boilerplate dominates.
  • Implement state machines or rate limiters with defined state serialization and reset endpoints.
  • Scaffold complex algorithmic components that need consistent interfaces across files.
  • Rapidly prototype service backends for tests where POST /reset clears state for repeatable runs.
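To give a sense of scale for the rate-limiter use case above, here is a minimal token-bucket sketch, the kind of small, self-contained algorithmic component turbo handles well. The class and parameter names are illustrative, not part of any contract:

```python
import time

class TokenBucket:
    """Minimal token-bucket rate limiter: refill continuously at `rate`
    tokens per second, allow a request if a token is available."""

    def __init__(self, rate: float, capacity: float):
        self.rate = rate            # tokens refilled per second
        self.capacity = capacity    # maximum burst size
        self.tokens = capacity      # start full
        self.updated = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=5, capacity=2)
# A burst of 3 immediate requests against capacity 2:
results = [bucket.allow() for _ in range(3)]
```

A contract prompt for this component would pin down the refill formula, the clock source, and the `allow` signature exactly, just as the task-store contract pins down models and routes.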

FAQ

How many fix cycles are typical?

Most runs need 0–2 surgical fixes; typical issues are small utilities or edge-case logic adjustments.

When should I use Claude direct instead?

Use Claude direct for CRUD or storage tasks, single implementations with complex coordination, or speed-critical work where fix cycles are costly; in those cases the fix overhead outweighs turbo's token savings.