
This skill enables hands-free coding by converting speech to code through Claude, accelerating development workflows.

npx playbooks add skill willsigmon/sigstack --skill speech-to-code-expert

Review the files below or copy the command above to add this skill to your agents.

Files (1): SKILL.md (3.6 KB)
---
name: Speech-to-Code Expert
description: Voice coding - speech to code, voice commands for development, hands-free programming
allowed-tools: Read, Edit, Bash, WebFetch
model: sonnet
---

# Speech-to-Code Expert

Code with your voice. Perfect for vibe coders.

## The Voice Coding Stack

```
Voice → Transcription → Claude → Code
```

1. **Sled**: Your voice interface to Claude
2. **Whisper/Deepgram**: Transcription
3. **Claude Code**: Understands intent, writes code
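The three layers above can be glued together with a couple of subprocess calls. This is a hedged sketch, not Sled's actual wiring: it assumes the `whisper-cpp` binary and the `claude` CLI (Claude Code's non-interactive `-p` print mode) are on your PATH, and the model path is just an example.

```python
import subprocess

def whisper_cmd(wav_path: str, model: str = "models/ggml-base.en.bin") -> list[str]:
    # whisper.cpp invocation; --no-timestamps keeps stdout to plain text
    return ["whisper-cpp", "-m", model, "-f", wav_path, "--no-timestamps"]

def transcribe(wav_path: str) -> str:
    """Voice -> text via local whisper.cpp."""
    out = subprocess.run(whisper_cmd(wav_path),
                         capture_output=True, text=True, check=True)
    return out.stdout.strip()

def speak_to_claude(wav_path: str) -> str:
    """Text -> code: forward the transcript to Claude Code in print mode."""
    prompt = transcribe(wav_path)
    out = subprocess.run(["claude", "-p", prompt],
                         capture_output=True, text=True, check=True)
    return out.stdout
```

After recording "Create a SwiftUI view for user profile" to `request.wav`, `speak_to_claude("request.wav")` would return Claude's generated code.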

## Sled Setup

Your local voice → Claude pipeline.

```bash
cd ~/Developer/sled
./sled start

# Now speak naturally:
# "Create a SwiftUI view for user profile with avatar and name"
```

## Voice Patterns for Coding

### Feature Requests
```
"Add a button to the login screen that says 'Forgot Password'
and navigates to a password reset view"
```

### Bug Fixes
```
"The app crashes when I tap the save button.
Look at the SaveHandler and find the bug"
```

### Refactoring
```
"This function is too long. Break it into smaller functions
that each do one thing"
```

### Questions
```
"Explain how the authentication flow works in this codebase"
```

## Custom Voice Commands

### Sled Commands
```bash
# .sled/commands.yaml
commands:
  build:
    trigger: "build the app"
    action: "xcodebuild -scheme App"

  test:
    trigger: "run tests"
    action: "swift test"

  commit:
    trigger: "commit changes"
    action: "git add -A && git commit -m"
```
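A dispatcher for triggers like these takes only a few lines. The table below mirrors the YAML above; the case-insensitive substring matcher is an illustrative stand-in for whatever matching Sled actually performs:

```python
# Mirrors .sled/commands.yaml above as a plain dict
COMMANDS = {
    "build":  {"trigger": "build the app",  "action": "xcodebuild -scheme App"},
    "test":   {"trigger": "run tests",      "action": "swift test"},
    "commit": {"trigger": "commit changes", "action": "git add -A && git commit -m"},
}

def match_command(transcript: str) -> str | None:
    """Return the shell action whose trigger phrase appears in the transcript."""
    spoken = transcript.lower()
    for cmd in COMMANDS.values():
        if cmd["trigger"] in spoken:
            return cmd["action"]
    return None  # no trigger matched; fall through to free-form Claude prompt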

## Tips for Voice Coding

### 1. Be Specific About Location
```
❌ "Add a button"
✅ "Add a button below the email field in LoginView.swift"
```

### 2. Describe Visual Outcome
```
❌ "Make it look better"
✅ "Add 16 points of padding, round the corners to 12 pixels,
    and add a subtle shadow"
```

### 3. Name Things Clearly
```
❌ "Create a thing for users"
✅ "Create a UserProfile struct with name, email, and avatar URL"
```

### 4. Reference Existing Code
```
"Make this button look like the primary button in
the design system we already have"
```

## Voice + Claude Vision Workflow

```
1. Voice: "Take a screenshot of the simulator"
2. Voice: "What's wrong with this UI?"
3. [Claude analyzes screenshot]
4. Voice: "Fix the spacing issues Claude found"
5. [Claude makes changes]
6. Voice: "Take another screenshot to verify"
```
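Step 1 of that loop, capturing the simulator, maps to Apple's `xcrun simctl` tool. A minimal sketch; the output path is an example:

```python
import subprocess

def screenshot_cmd(path: str = "simulator.png") -> list[str]:
    # Standard capture command for the currently booted iOS Simulator
    return ["xcrun", "simctl", "io", "booted", "screenshot", path]

def capture(path: str = "simulator.png") -> str:
    """Save a simulator screenshot and return its path for Claude to analyze."""
    subprocess.run(screenshot_cmd(path), check=True)
    return path
```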

## Dictation Tools

### macOS Built-in
```
System Settings → Keyboard → Dictation → On
Press Fn Fn to activate
```

### Whisper.cpp (Local)
```bash
# Fast local transcription
brew install whisper-cpp
whisper-cpp -m models/ggml-base.en.bin -f audio.wav
```

### Deepgram (API)
```python
# Real-time transcription (Deepgram Python SDK v2-style client)
import os

from deepgram import Deepgram

dg = Deepgram(os.environ["DEEPGRAM_API_KEY"])
# Stream microphone audio through dg.transcription.live(...) for live coding
```

## VS Code Voice Extensions

### Voice Control
- **Voice Control for VS Code**: Basic commands
- **Talon**: Advanced voice coding (programmable)
- **Cursorless**: Structural voice editing

### Talon Example
```talon
# ~/.talon/user/code.talon
save file: key(cmd-s)
new function <user.text>:
    insert("function {text}() {{\n\n}}")
    key(up)
```

## Accessibility Benefits

Voice coding is for everyone:
- RSI prevention/recovery
- Mobility limitations
- Multitasking (pacing while coding)
- Often faster than typing when describing changes in natural language

## Best Practices

### 1. Use Checkpoints
```
"Before making changes, let's save the current state"
"Now make the change"
"If that didn't work, revert to the checkpoint"
```

### 2. Verify Changes
```
"Read back what you just wrote"
"Take a screenshot and show me"
```

### 3. Iterate
```
"Almost right, but make the font slightly larger"
"Good, now apply the same style to the other buttons"
```

Use when: Hands-free coding, accessibility, natural language development, Sled workflow

## Overview

This skill turns spoken language into working code and development actions so you can program hands-free. It connects local or cloud transcription (Whisper, Deepgram) to a Claude coding engine and an optional Sled voice interface. The result is voice-driven feature creation, bug fixes, refactors, and repository commands without touching the keyboard.

## How this skill works

Your voice is captured and transcribed, then passed to Claude which interprets intent and generates code changes or commands. A local Sled agent can stream audio to the transcription layer and relay Claude's edits back to your project. Optional vision inputs let you capture screenshots and ask Claude to analyze and fix UI issues iteratively.

## When to use it

- Implementing or prototyping UI features by voice (e.g., "Add a profile view").
- Fixing bugs described in natural language, pointing to specific files or functions.
- Refactoring long functions into smaller units using high-level instructions.
- Hands-free coding during recovery from RSI or when multitasking.
- Running repository actions via voice (build, test, commit) through Sled commands.

## Best practices

- Be specific about location and file names (e.g., "in LoginView.swift below the email field").
- Describe visual outcomes with exact values (padding, corner radius, spacing).
- Name data structures and functions clearly when requesting new code.
- Use checkpoints: save state before edits and verify changes after each step.
- Iterate: ask for small adjustments rather than broad, vague edits.

## Example use cases

- Voice request: "Create a SwiftUI UserProfile view with avatar, name, and email". Claude generates the view and adds the struct.
- Bug triage: "The app crashes when tapping save. Inspect SaveHandler and fix the null pointer". Claude locates and patches the bug.
- Refactor: "Split this long function into three smaller ones that each do one thing". Claude rewrites the function and updates callers.
- UI tuning with vision: take a screenshot, ask "What's wrong with this UI?", then apply spacing fixes and re-check.
- Repo commands: "Build the app", "Run tests", or "Commit changes" using Sled command triggers.

## FAQ

### Do I need cloud transcription to use this?

No. You can run transcription fully locally with whisper.cpp; Deepgram is a cloud API, which you'd choose if you prefer a managed, real-time speech service.

### How do I ensure code quality from voice-generated edits?

Use checkpoints, run tests, request Claude to explain changes, and review diffs before committing.