This skill summarizes YouTube videos by extracting transcripts and captions to deliver quick insights and key points without watching.
To add this skill to your agents, run `npx playbooks add skill openclaw/skills --skill yt-summarize`.
---
name: youtube-summarize
description: Summarize YouTube videos by extracting transcripts and captions. Use when you need to get a quick summary of a video, extract key points, or analyze video content without watching it.
metadata: {"openclaw":{"requires":{"bins":["yt-dlp"]},"install":[{"id":"python","kind":"pip","package":"yt-dlp","bins":["yt-dlp"],"label":"Install yt-dlp (pip)"}]}}
---
# YouTube Video Summarizer
## Setup
Install yt-dlp:
```bash
pip install yt-dlp
```
## Extract Transcript
Get auto-generated subtitles:
```bash
yt-dlp --write-auto-sub --sub-lang en --skip-download --sub-format vtt -o "%(title)s" "VIDEO_URL"
```
Get manual subtitles (if available):
```bash
yt-dlp --write-sub --sub-lang en --skip-download --sub-format vtt -o "%(title)s" "VIDEO_URL"
```
List available subtitles:
```bash
yt-dlp --list-subs "VIDEO_URL"
```
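Choosing between `--write-sub` and `--write-auto-sub` can also be done programmatically. The sketch below assumes the info dict printed by `yt-dlp --dump-json` exposes `subtitles` (manual tracks) and `automatic_captions` (auto-generated tracks) as mappings keyed by language code. The helper name `choose_subtitle_flag` and the sample dict are hypothetical, for illustration only.

```python
def choose_subtitle_flag(info, lang="en"):
    """Return the yt-dlp flag to use for `lang`, or None if no track exists.

    Assumes `info` is the parsed output of `yt-dlp --dump-json`.
    """
    manual = info.get("subtitles") or {}
    auto = info.get("automatic_captions") or {}
    if lang in manual:
        # Prefer human-written subtitles when the uploader provided them.
        return "--write-sub"
    if lang in auto:
        return "--write-auto-sub"
    return None


# Hand-written excerpt of an info dict (not real video data).
sample_info = {
    "subtitles": {"es": []},
    "automatic_captions": {"en": [], "es": []},
}

for code in ("es", "en", "de"):
    print(code, choose_subtitle_flag(sample_info, code))
```

Preferring manual tracks is a design choice made here because they tend to be more accurate than auto-generated captions.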
## Extract as Plain Text
Download and convert to text:
```bash
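# Drop blank lines, header fields, cue numbers, and timing lines; strip inline tags;
# then remove repeated lines with awk, which keeps the spoken order (sort -u would scramble it)
yt-dlp --write-auto-sub --sub-lang en --skip-download --sub-format vtt -o "transcript" "VIDEO_URL" && \
sed -e '/^[[:space:]]*$/d' -e '/^WEBVTT/d' -e '/^Kind:/d' -e '/^Language:/d' -e '/^[0-9][0-9]*$/d' -e '/-->/d' -e 's/<[^>]*>//g' transcript.en.vtt | awk '!seen[$0]++' > transcript.txt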
```
## Quick Transcript to Stdout
```bash
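# Subtitles are written to files even with `-o -`, so stage the file in a temp directory
tmp=$(mktemp -d)
yt-dlp --write-auto-sub --sub-lang en --skip-download --sub-format json3 -o "$tmp/sub" "VIDEO_URL" >/dev/null 2>&1 && \
python3 -c "
import sys, json
with open(sys.argv[1], encoding='utf-8') as f:
    data = json.load(f)
for event in data.get('events', []):
    for seg in event.get('segs', []):
        if text := seg.get('utf8', '').strip():
            print(text, end=' ')
print()" "$tmp/sub.en.json3"
rm -rf "$tmp"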
```
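When a summary should point back to moments in the video, the json3 events can be turned into timestamped lines instead of one long string. This is a minimal sketch, assuming each event carries its start time in milliseconds under `tStartMs` alongside the `segs`/`utf8` fields used above. The function name and the sample data are hypothetical.

```python
def json3_to_timestamped_lines(data):
    """Turn json3 caption events into '[m:ss] text' lines."""
    lines = []
    for event in data.get("events", []):
        # Join the segments of one event; some events hold only a newline.
        text = "".join(seg.get("utf8", "") for seg in event.get("segs", [])).strip()
        if not text:
            continue
        seconds = int(event.get("tStartMs", 0)) // 1000
        lines.append(f"[{seconds // 60}:{seconds % 60:02d}] {text}")
    return lines


# Hand-written sample in the assumed json3 shape (not real caption data).
sample_json3 = {
    "events": [
        {"tStartMs": 0, "segs": [{"utf8": "hello"}, {"utf8": " world"}]},
        {"tStartMs": 65000, "segs": [{"utf8": "\n"}]},
        {"tStartMs": 83500, "segs": [{"utf8": "second line"}]},
    ]
}

for line in json3_to_timestamped_lines(sample_json3):
    print(line)
```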
## Get Video Metadata
```bash
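yt-dlp --dump-json "VIDEO_URL" | python3 -c "
import sys, json
d = json.load(sys.stdin)
duration = int(d.get('duration') or 0)  # duration may be missing or null
views = d.get('view_count')
print(f\"Title: {d['title']}\")
print(f\"Channel: {d.get('channel', 'N/A')}\")
print(f\"Duration: {duration//60}:{duration%60:02d}\")
# The ',' format spec only works on numbers, so fall back to plain text
print(f\"Views: {views:,}\" if isinstance(views, int) else 'Views: N/A')
print(f\"Upload: {d.get('upload_date', 'N/A')}\")
print(f\"Description:\n{(d.get('description') or '')[:500]}...\")"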
```
## Summarization Workflow
1. Extract transcript:
```bash
yt-dlp --write-auto-sub --sub-lang en --skip-download --sub-format vtt -o "video" "VIDEO_URL"
```
2. Clean VTT to plain text:
```bash
python3 -c "
import re
with open('video.en.vtt', 'r', encoding='utf-8') as f:
    content = f.read()
# Remove VTT headers and timestamps
content = re.sub(r'WEBVTT.*?\n\n', '', content, flags=re.DOTALL)
content = re.sub(r'\d+:\d+:\d+\.\d+ --> \d+:\d+:\d+\.\d+.*?\n', '', content)
content = re.sub(r'<[^>]+>', '', content)
lines = [l.strip() for l in content.split('\n') if l.strip()]
# Drop consecutive duplicates (auto-generated captions repeat lines across cues)
unique = []
for l in lines:
    if l not in unique[-1:]:
        unique.append(l)
print(' '.join(unique))" > transcript.txt
```
3. Send to LLM for summarization (the transcript is now ready for Claude to analyze)
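The cleaning step in the workflow above can also be kept as a reusable function, which is easier to test than an inline one-liner. This is a sketch of the same approach (strip the header, timing lines, and inline tags, then drop consecutive duplicates). The name `vtt_to_text` and the sample input are hypothetical.

```python
import re

TIMING = re.compile(r"^\d+:\d+:\d+\.\d+ --> \d+:\d+:\d+\.\d+")
TAG = re.compile(r"<[^>]+>")


def vtt_to_text(vtt):
    """Convert WebVTT caption text into a single plain-text string."""
    body = vtt
    if body.startswith("WEBVTT"):
        # The header block ends at the first blank line.
        parts = body.split("\n\n", 1)
        body = parts[1] if len(parts) == 2 else ""
    kept = []
    for raw in body.splitlines():
        if TIMING.match(raw):
            continue  # cue timing line
        line = TAG.sub("", raw).strip()
        # Skip empty lines and lines repeated from the previous cue.
        if line and (not kept or kept[-1] != line):
            kept.append(line)
    return " ".join(kept)


# Hand-written sample imitating the rolling auto-caption layout.
sample_vtt = (
    "WEBVTT\nKind: captions\nLanguage: en\n\n"
    "00:00:00.000 --> 00:00:02.000 align:start position:0%\n"
    "hello<00:00:01.000><c> world</c>\n\n"
    "00:00:02.000 --> 00:00:04.000\n"
    "hello world\n"
    "next line\n"
)

print(vtt_to_text(sample_vtt))
```

Only consecutive duplicates are removed, so a phrase the speaker genuinely repeats later in the video is kept.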
## Multi-language Support
Extract subtitles in other languages:
```bash
# Russian
yt-dlp --write-auto-sub --sub-lang ru --skip-download "VIDEO_URL"
# Spanish
yt-dlp --write-auto-sub --sub-lang es --skip-download "VIDEO_URL"
# Multiple languages
yt-dlp --write-auto-sub --sub-lang "en,ru,es" --skip-download "VIDEO_URL"
```
## Chapter Extraction
Get video chapters (if available):
```bash
yt-dlp --dump-json "VIDEO_URL" | python3 -c "
import sys, json
d = json.load(sys.stdin)
# 'chapters' can be present but null, so fall back to an empty list
for ch in d.get('chapters') or []:
    start = int(ch['start_time'])
    print(f\"{start//60}:{start%60:02d} - {ch['title']}\")"
```
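For videos longer than an hour, a `minutes:seconds` label grows past 59 minutes (for example `62:05`). A small formatting helper can switch to `h:mm:ss` when needed. This is an illustrative sketch; `format_timestamp`, `chapter_lines`, and the sample dict are hypothetical names, and the `chapters`, `start_time`, and `title` fields are the ones used in the command above.

```python
def format_timestamp(seconds):
    """Format a second count as m:ss, or h:mm:ss past one hour."""
    total = int(seconds)
    hours, rem = divmod(total, 3600)
    minutes, secs = divmod(rem, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"


def chapter_lines(info):
    """Build 'timestamp - title' lines from a yt-dlp info dict."""
    # 'chapters' may be absent or null when the video has none.
    return [
        f"{format_timestamp(ch['start_time'])} - {ch['title']}"
        for ch in info.get("chapters") or []
    ]


# Hand-written sample (not real video data).
sample_chapters = {
    "chapters": [
        {"start_time": 0.0, "title": "Intro"},
        {"start_time": 3725.0, "title": "Q&A"},
    ]
}

for line in chapter_lines(sample_chapters):
    print(line)
```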
## Common Options
| Option | Description |
|--------|-------------|
| `--sub-lang en` | Subtitle language (en, ru, es, de, fr, etc.) |
| `--write-auto-sub` | Get auto-generated captions |
| `--write-sub` | Get manual subtitles |
| `--sub-format vtt` | Output format (vtt, srt, json3) |
| `--skip-download` | Don't download video |
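When driving yt-dlp from a script, the options in the table can be assembled into an argument list rather than a shell string. The sketch below only composes the flags documented above. The function name, its parameters, and its defaults are hypothetical.

```python
def build_subtitle_command(url, langs=("en",), auto=True, sub_format="vtt", output="video"):
    """Compose a yt-dlp argument list for a subtitle-only download."""
    return [
        "yt-dlp",
        "--write-auto-sub" if auto else "--write-sub",
        "--sub-lang", ",".join(langs),
        "--sub-format", sub_format,
        "--skip-download",
        "-o", output,
        url,
    ]


cmd = build_subtitle_command("VIDEO_URL", langs=("en", "ru"))
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)`; passing a list instead of a shell string avoids quoting problems with URLs that contain `&` or `?`.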
## Notes
- Auto-generated subtitles may have errors
- Not all videos have subtitles available
- Some videos have subtitles disabled by uploader
- Use `--sub-lang` with appropriate language code
- Transcripts work best for spoken content (lectures, podcasts, tutorials)
This skill extracts transcripts and captions from YouTube videos and produces concise summaries and key points. It automates subtitle download, cleans VTT/SRT output into plain text, and prepares transcripts for LLM summarization. Ideal for quickly understanding video content without watching the full video.
The skill uses yt-dlp to download auto-generated or manual subtitles in formats like VTT, SRT, or json3. It provides small Python cleaning snippets to strip timestamps, HTML tags, and duplicate lines, producing a clean transcript file. The cleaned transcript is then ready to send to a language model for summarization, chapter extraction, or multilingual analysis.
What if subtitles are not available for a video?
If subtitles are disabled or missing, the skill cannot extract a transcript; consider using an automatic speech-to-text service on the downloaded audio instead.
Are auto-generated captions reliable?
Auto-generated captions are convenient but can contain errors, especially with noise, accents, or technical terms. Always review or correct transcripts before using them for critical tasks.