This skill extracts YouTube transcripts and subtitles using CLI, browser, or API, enabling quick text access for videos.
Add this skill to your agents with `npx playbooks add skill eyadsibai/ltk --skill youtube-transcribe`.
---
name: youtube-transcribe
description: Use when "youtube transcript", "extract subtitles", "video captions", "get transcript", "video to text"
version: 1.0.0
---
# YouTube Transcript Extraction
Extract subtitles and transcripts from YouTube videos.
---
## Methods
| Method | Tool | When to Use |
|--------|------|-------------|
| **CLI** | yt-dlp | Fast, reliable, preferred |
| **Browser** | Chrome automation | Fallback when CLI fails |
| **API** | youtube-transcript-api | Python programmatic access |
---
## yt-dlp Method (Preferred)
### Basic Command
```bash
yt-dlp --write-auto-sub --write-sub --sub-lang en --skip-download -o "%(title)s.%(ext)s" "VIDEO_URL"
```
### Key Flags
| Flag | Purpose |
|------|---------|
| `--write-sub` | Download manual subtitles |
| `--write-auto-sub` | Download auto-generated subtitles |
| `--sub-lang LANG` | Specify language (en, zh-Hans, etc.) |
| `--skip-download` | Don't download video |
| `--cookies-from-browser chrome` | Use browser cookies for restricted videos |
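
When the command is driven from a script, the flags above can be assembled as an argument list, which avoids shell quoting problems with URLs and output templates. The following is a minimal sketch, not part of the skill itself: the function names `build_subtitle_command` and `fetch_subtitles` are hypothetical, and it assumes `yt-dlp` is installed and on the `PATH`.

```python
import subprocess


def build_subtitle_command(url, langs=("en",), cookies_browser=None):
    """Assemble a yt-dlp argument list that fetches subtitles only."""
    cmd = [
        "yt-dlp",
        "--write-sub",       # manual subtitles, when the uploader provided them
        "--write-auto-sub",  # auto-generated subtitles as well
        "--sub-lang", ",".join(langs),  # e.g. "en" or "en,es,zh-Hans"
        "--skip-download",   # subtitles only, no video file
        "-o", "%(title)s.%(ext)s",
    ]
    if cookies_browser:
        # Reuse a logged-in browser session for restricted videos.
        cmd += ["--cookies-from-browser", cookies_browser]
    cmd.append(url)
    return cmd


def fetch_subtitles(url, **kwargs):
    """Run yt-dlp; raises CalledProcessError if it exits with an error."""
    return subprocess.run(build_subtitle_command(url, **kwargs), check=True)
```

Calling `fetch_subtitles("VIDEO_URL", langs=("en", "es"), cookies_browser="chrome")` would run the same command as above with two languages and browser cookies.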
### Common Issues
| Issue | Solution |
|-------|----------|
| Sign-in required | Add `--cookies-from-browser chrome` |
| No subtitles found | Run `yt-dlp --list-subs "VIDEO_URL"` to see which languages exist; the video may have no captions at all |
| Age-restricted | Use cookies from logged-in browser |
---
## Browser Automation Fallback
When CLI fails, use browser automation:
1. **Open video page** - Navigate to YouTube URL
2. **Expand description** - Click "...more" button
3. **Open transcript** - Click "Show transcript" button
4. **Extract text** - Query DOM for transcript segments
### DOM Selectors
| Element | Selector |
|---------|----------|
| Transcript segments | `ytd-transcript-segment-renderer` |
| Timestamp | `.segment-timestamp` |
| Text | `.segment-text` |
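
Once the transcript panel is open, the page HTML can be handed to a parser. The sketch below uses only the Python standard library and the selectors from the table; it assumes the HTML has already been captured by whatever browser automation tool is in use, and the class name `TranscriptParser` and function `parse_transcript_html` are hypothetical. YouTube's markup changes over time, so the selectors should be treated as assumptions to verify.

```python
from html.parser import HTMLParser


class TranscriptParser(HTMLParser):
    """Collect (timestamp, text) pairs from transcript panel HTML."""

    def __init__(self):
        super().__init__()
        self.segments = []    # finished (timestamp, text) pairs
        self._current = None  # segment being built, or None outside one
        self._field = None    # "timestamp" or "text" while inside that element
        self._depth = 0       # nesting depth inside the active field element

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if tag == "ytd-transcript-segment-renderer":
            self._current = {"timestamp": "", "text": ""}
        elif self._current is not None and self._field is None:
            if "segment-timestamp" in classes:
                self._field, self._depth = "timestamp", 1
            elif "segment-text" in classes:
                self._field, self._depth = "text", 1
        elif self._field is not None:
            # Nested markup inside a field; assumes no unclosed void tags.
            self._depth += 1

    def handle_endtag(self, tag):
        if tag == "ytd-transcript-segment-renderer" and self._current is not None:
            timestamp = self._current["timestamp"].strip()
            text = " ".join(self._current["text"].split())  # collapse whitespace
            self.segments.append((timestamp, text))
            self._current, self._field = None, None
        elif self._field is not None:
            self._depth -= 1
            if self._depth == 0:
                self._field = None

    def handle_data(self, data):
        if self._field is not None:
            self._current[self._field] += data


def parse_transcript_html(html):
    """Return a list of (timestamp, text) tuples found in the HTML."""
    parser = TranscriptParser()
    parser.feed(html)
    parser.close()
    return parser.segments
```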
---
## Output Formats
| Format | Extension | Use Case |
|--------|-----------|----------|
| **VTT** | .vtt | Web standard, includes timing |
| **SRT** | .srt | Video editing, media players |
| **TXT** | .txt | Plain text, no timing |
### Convert VTT to Plain Text
```bash
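# Strip the header, cue timings, inline tags, and blank lines.
# Caption lines that begin with a digit (e.g. "2024 was...") are kept.
sed -E 's/<[^>]*>//g; /^WEBVTT/d; /^(Kind|Language):/d; /-->/d; /^[[:space:]]*$/d' video.vtt > video.txt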
```
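For cases where a one-liner is not enough, the same conversion can be sketched in Python. This version also removes inline tags and collapses the repeated lines that auto-generated captions tend to contain. It assumes the file starts with a `WEBVTT` header block followed by a blank line and has no cue identifiers, `NOTE`, or `STYLE` blocks; the function name `vtt_to_text` is hypothetical.

```python
import re

TAG_RE = re.compile(r"<[^>]+>")  # inline tags such as <c> or <00:00:01.000>


def vtt_to_text(vtt):
    """Return caption text from a WebVTT string, without timing or markup."""
    lines = []
    in_header = True
    for raw in vtt.splitlines():
        line = TAG_RE.sub("", raw).strip()
        if in_header:
            # Everything up to the first blank line is the WEBVTT header block.
            if not line:
                in_header = False
            continue
        if not line or "-->" in line:
            continue  # skip blank lines and cue timing lines
        if lines and lines[-1] == line:
            continue  # keep one copy of consecutive duplicate lines
        lines.append(line)
    return "\n".join(lines)
```

A typical use would be `Path("video.txt").write_text(vtt_to_text(Path("video.vtt").read_text(encoding="utf-8")))` with `pathlib.Path` imported.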
---
## Language Codes
| Language | Code |
|----------|------|
| English | `en` |
| Chinese (Simplified) | `zh-Hans` |
| Chinese (Traditional) | `zh-Hant` |
| Spanish | `es` |
| Multiple (comma-separated) | `en,es,zh-Hans` |
---
## Best Practices
| Practice | Why |
|----------|-----|
| Try manual subs first | Higher quality than auto-generated |
| Use cookies for restricted | Avoids sign-in errors |
| Check multiple languages | Some videos have better subs in other languages |
| Verify transcript exists | Not all videos have captions |
This skill extracts subtitles and transcripts from YouTube videos using three complementary methods: a CLI downloader (yt-dlp), browser automation, and a Python API. It prioritizes fast, reliable CLI extraction and falls back to DOM scraping or the youtube-transcript-api when needed. Outputs can be saved as VTT, SRT, or plain TXT for downstream processing.
The preferred path calls yt-dlp with flags to write manual or auto-generated subtitles and skip the video download, producing VTT/SRT files. If the CLI cannot access captions (for example, the video is age-restricted or requires sign-in), the skill can automate a browser to open the transcript panel and scrape DOM elements for timestamps and text. If the video has no captions at all, neither path will find a transcript. For programmatic integration in Python, it can use youtube-transcript-api to fetch segments directly into structured data.
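
For the Python path, two small helpers are usually needed around the library call, since youtube-transcript-api works with video IDs rather than full URLs. The sketch below is hedged on two points: it assumes each fetched segment can be represented as a dict with `text`, `start`, and `duration` keys, and it leaves out the fetch call itself because the library's entry point differs between versions. The names `extract_video_id` and `format_segments` are hypothetical.

```python
from urllib.parse import urlparse, parse_qs

YOUTUBE_HOSTS = {"youtube.com", "www.youtube.com", "m.youtube.com", "music.youtube.com"}


def extract_video_id(url):
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/").split("/")[0] or None
    if host in YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            return parse_qs(parsed.query).get("v", [None])[0]
        parts = parsed.path.strip("/").split("/")
        if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live"):
            return parts[1]
    return None


def format_segments(segments):
    """Render segments (assumed dicts with 'text' and 'start') as '[mm:ss] text'."""
    lines = []
    for seg in segments:
        minutes, seconds = divmod(int(seg["start"]), 60)
        lines.append(f"[{minutes:02d}:{seconds:02d}] {seg['text']}")
    return "\n".join(lines)
```

The ID returned by `extract_video_id` is what would be passed to the library, and `format_segments` turns the result into readable text once it is in the assumed dict form.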
**What command gets both manual and auto subtitles quickly?**
Use `yt-dlp` with `--write-sub --write-auto-sub --sub-lang en --skip-download` and the video URL.
**How do I handle sign-in or age-restricted videos?**
Pass browser cookies to yt-dlp (e.g., `--cookies-from-browser chrome`) or authenticate the automated browser before scraping.