This skill transcribes audio or video into SRT subtitles using ElevenLabs Scribe v2.
To add this skill to your agents, run `npx playbooks add skill aviz85/claude-skills-library --skill transcribe`.
---
name: transcribe
description: "Transcribe audio/video to SRT subtitles using ElevenLabs Scribe v2. Use for: transcription, subtitles, captions, SRT generation."
setup_complete: false
setup: "./SETUP.md"
---
# Transcribe
> **First time?** If `setup_complete` is `false` above, follow the steps in `./SETUP.md` first, then set `setup_complete: true`.
Generate SRT subtitle files from audio/video using ElevenLabs Scribe v2.
## Quick Start
```bash
cd ~/.claude/skills/transcribe/scripts
# Basic transcription (auto-detect language)
npx ts-node transcribe.ts -i /path/to/video.mp4 -o /path/to/output.srt
# Specify language
npx ts-node transcribe.ts -i /path/to/video.mp4 -o /path/to/output.srt -l en
# Custom subtitle length (max words per entry)
npx ts-node transcribe.ts -i /path/to/video.mp4 -o /path/to/output.srt --max-words 6
# Custom max duration per subtitle
npx ts-node transcribe.ts -i /path/to/video.mp4 -o /path/to/output.srt --max-duration 4.0
```
## Options
| Option | Short | Default | Description |
|--------|-------|---------|-------------|
| `--input` | `-i` | (required) | Input audio/video file |
| `--output` | `-o` | (required) | Output SRT file path |
| `--language` | `-l` | auto | Language code (en, he, ar, etc.) |
| `--max-words` | | 5 | Max words per subtitle entry |
| `--max-duration` | | 3.0 | Max seconds per subtitle entry |
| `--max-chars` | | 70 | Max characters per subtitle entry |
| `--timing-offset` | | 0.25 | Timing offset in seconds |
| `--json` | | false | Also output raw transcript JSON |
## Language Codes
- `en` - English
- `he` - Hebrew
- `ar` - Arabic
- `es` - Spanish
- `fr` - French
- `de` - German
- `ru` - Russian
- `zh` - Chinese
- `ja` - Japanese
- (or omit for auto-detection)
## Output
The script generates:
1. `.srt` file - Standard subtitle file
2. `.json` file (optional) - Raw transcript with word-level timestamps
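
For readers unfamiliar with the SRT layout, here is a minimal sketch of how timed entries are typically rendered: a 1-based index, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` time range, the text, and a blank line between entries. The `SubtitleEntry` type and both function names are hypothetical illustrations, not taken from `transcribe.ts`.

```typescript
// Hypothetical shape for one subtitle entry; not taken from transcribe.ts.
interface SubtitleEntry {
  start: number; // seconds
  end: number; // seconds
  text: string;
}

// Left-pad a non-negative integer with zeros to the given width.
function zeroPad(n: number, width: number): string {
  let out = String(n);
  while (out.length < width) out = "0" + out;
  return out;
}

// Format seconds as an SRT timestamp: HH:MM:SS,mmm (comma before milliseconds).
function formatSrtTime(seconds: number): string {
  const totalMs = Math.max(0, Math.round(seconds * 1000));
  const ms = totalMs % 1000;
  const totalSec = Math.floor(totalMs / 1000);
  const s = totalSec % 60;
  const m = Math.floor(totalSec / 60) % 60;
  const h = Math.floor(totalSec / 3600);
  return `${zeroPad(h, 2)}:${zeroPad(m, 2)}:${zeroPad(s, 2)},${zeroPad(ms, 3)}`;
}

// Render entries as SRT text: index line, time range line, text line,
// with a blank line separating consecutive entries.
function toSrt(entries: SubtitleEntry[]): string {
  return entries
    .map(
      (e, i) =>
        `${i + 1}\n${formatSrtTime(e.start)} --> ${formatSrtTime(e.end)}\n${e.text}\n`
    )
    .join("\n");
}
```

Negative times are clamped to zero here as a defensive choice; how the actual script handles them is not documented.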
## Environment
The API key is stored in `scripts/.env`:
```
ELEVENLABS_API_KEY=your_key_here
```
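
As an illustration of how a `KEY=value` file like the one above can be read, the following sketch parses `.env`-style text and checks that `ELEVENLABS_API_KEY` is set to something other than the placeholder. Both `parseEnv` and `requireApiKey` are hypothetical helpers written for this example; the real script may load the file differently.

```typescript
// Parse KEY=value lines from .env-style text.
// Blank lines and lines starting with "#" are skipped.
function parseEnv(content: string): Record<string, string> {
  const vars: Record<string, string> = {};
  const lines = content.split(/\r?\n/);
  for (const rawLine of lines) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no "=" or empty key: ignore the line
    const key = line.slice(0, eq).trim();
    let value = line.slice(eq + 1).trim();
    // Strip one pair of matching surrounding quotes, if present.
    const first = value.charAt(0);
    const last = value.charAt(value.length - 1);
    if (value.length >= 2 && (first === '"' || first === "'") && last === first) {
      value = value.slice(1, -1);
    }
    vars[key] = value;
  }
  return vars;
}

// Return the ElevenLabs key, or throw if it is absent or still the placeholder.
function requireApiKey(vars: Record<string, string>): string {
  const key = vars["ELEVENLABS_API_KEY"];
  if (!key || key === "your_key_here") {
    throw new Error("ELEVENLABS_API_KEY is missing or still set to the placeholder");
  }
  return key;
}
```

In Node, the text could come from `fs.readFileSync("scripts/.env", "utf8")`. Failing early on a missing key gives a clearer error than a rejected API request later.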
This skill converts audio or video into SRT subtitle files using ElevenLabs Scribe v2. It produces timecoded subtitle entries and can also output a raw JSON transcript with word-level timestamps. The tool supports language auto-detection or explicit language codes and exposes controls for subtitle length and timing.
The script sends the input audio or video file to ElevenLabs Scribe v2, which returns a transcription with timestamps. It then segments the transcript into SRT subtitle entries according to the `--max-words`, `--max-duration`, and `--max-chars` limits, and applies a small timing offset (0.25 seconds by default) to improve sync. With `--json`, it also saves the raw transcript as JSON for word-level timing or post-processing.
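
The segmentation step can be sketched as a greedy pass over word-level timestamps: keep adding words to the current entry until the next word would exceed any of the three limits, then start a new entry. This is a minimal sketch under stated assumptions, not the script's actual algorithm. The `Word`, `Limits`, and `Entry` types and the `segmentWords` function are hypothetical, the real Scribe response fields may be named differently, and the timing offset is left out because its direction is not documented here.

```typescript
// Hypothetical word shape; the real Scribe response fields may differ.
interface Word {
  text: string;
  start: number; // seconds
  end: number; // seconds
}

interface Limits {
  maxWords: number;
  maxDuration: number; // seconds
  maxChars: number;
}

interface Entry {
  start: number;
  end: number;
  text: string;
}

// Defaults mirror the documented option defaults (5 words, 3.0 s, 70 chars).
const DEFAULT_LIMITS: Limits = { maxWords: 5, maxDuration: 3.0, maxChars: 70 };

// Greedily group words into entries, closing the current entry whenever
// adding the next word would break any limit.
function segmentWords(words: Word[], limits: Limits = DEFAULT_LIMITS): Entry[] {
  const entries: Entry[] = [];
  let current: Word[] = [];

  const flush = (): void => {
    if (current.length === 0) return;
    entries.push({
      start: current[0].start,
      end: current[current.length - 1].end,
      text: current.map((w) => w.text).join(" "),
    });
    current = [];
  };

  for (const word of words) {
    if (current.length > 0) {
      const candidateText = current.map((w) => w.text).join(" ") + " " + word.text;
      const candidateDuration = word.end - current[0].start;
      const overLimit =
        current.length + 1 > limits.maxWords ||
        candidateDuration > limits.maxDuration ||
        candidateText.length > limits.maxChars;
      if (overLimit) flush();
    }
    // A single word that exceeds a limit on its own still gets its own entry.
    current.push(word);
  }
  flush();
  return entries;
}
```

A greedy pass keeps the logic simple and runs in one scan. A production implementation might also prefer to break at punctuation or pauses, which this sketch does not attempt.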
**What language codes are supported?**
Common codes such as `en`, `he`, `ar`, `es`, `fr`, `de`, `ru`, `zh`, and `ja` are supported; omit `--language` to use auto-detection.
**Can I customize subtitle length and timing?**
Yes. Use `--max-words`, `--max-duration`, `--max-chars`, and `--timing-offset` to control segmentation and sync.
**Where do I put my ElevenLabs API key?**
Put it in `scripts/.env` as `ELEVENLABS_API_KEY=your_key_here`; the script reads the key from that file.