home / skills / openclaw / skills / youtube-transcript-yt-dlp

youtube-transcript-yt-dlp skill

/skills/itzsubhadip/youtube-transcript-yt-dlp

This skill extracts YouTube transcripts from existing captions using yt-dlp, caches results locally, and optionally outputs text or JSON for quick

npx playbooks add skill openclaw/skills --skill youtube-transcript-yt-dlp

Review the files below or copy the command above to add this skill to your agents.

Files (3)
SKILL.md
3.8 KB
---
name: yt_transcript
description: Extract YouTube video transcripts from existing captions (manual or auto-generated) using yt-dlp, with optional timestamps and local SQLite caching. Use when the user asks for a YouTube transcript, captions, subtitles, or wants to turn a YouTube link into text for summarization/search.
metadata: {"openclaw":{"requires":{"bins":["python3","yt-dlp"]},"os":["linux","darwin","win32"]}}
user-invocable: true
---

# YouTube Transcript (Captions-Only)

This skill extracts transcripts from **existing** YouTube captions.

**Primary behavior**
- Prefer **manual subtitles** when available.
- Fall back to **auto-generated captions**.
- Output either:
  - **JSON** segments (default) or
  - **plain text** (`--text`)
- Cache results locally in **SQLite** for speed.

**Reliability behavior**
- If YouTube blocks anonymous access (bot-check), provide **cookies.txt**.
- If `yt-dlp` reports no captions for a video, the script tries a fallback:
  1) YouTube’s **transcript panel** (youtubei `get_transcript`) when accessible

This published version intentionally **does not** call third-party transcript providers.

**Privacy note:** This published version only contacts YouTube directly (via `yt-dlp` and the transcript panel fallback). It does **not** send video IDs/URLs to third-party transcript providers.

**Cookies:** Cookies are treated as secrets.
- The script supports `--cookies` / `YT_TRANSCRIPT_COOKIES`, but does **not** auto-load cookies from inside the skill directory.
- Store cookies under `~/.config/yt-transcript/`.

**Path safety:** This skill restricts `--cookies` and `--cache` paths to approved directories.
- cookies allowed under: `~/.config/yt-transcript/`
- cache allowed under: `{baseDir}/cache/` and `~/.config/yt-transcript/`

## How to run

Script path:
- `{baseDir}/scripts/yt_transcript.py`

Typical usage:
- `python3 {baseDir}/scripts/yt_transcript.py <youtube_url_or_id>`
- `python3 {baseDir}/scripts/yt_transcript.py <url> --lang en`
- `python3 {baseDir}/scripts/yt_transcript.py <url> --text`
- `python3 {baseDir}/scripts/yt_transcript.py <url> --no-ts`

Cookies (optional, but often required on VPS IPs):
- `python3 {baseDir}/scripts/yt_transcript.py <url> --cookies /path/to/youtube-cookies.txt`
- or set env var: `YT_TRANSCRIPT_COOKIES=/path/to/youtube-cookies.txt`

**Publishing safety note:** Cookies are optional, so `YT_TRANSCRIPT_COOKIES` is intentionally **not required** by skill metadata. Only set it if you need authenticated access.

**Best practice:** store cookies **outside** the skill folder (so you never accidentally publish them), e.g. `~/.config/yt-transcript/youtube-cookies.txt`, and point to it via `--cookies` or `YT_TRANSCRIPT_COOKIES`.

## What the script returns

### JSON mode (default)
A JSON object:
- `video_id`: 11-char id
- `lang`: chosen language
- `source`: `manual` | `auto` | `panel`
- `segments`: list of `{ start, duration, text }` (or text-only when `--no-ts`)

### Text mode (`--text`)
A newline-separated transcript.
- By default timestamps are included as `[12.34s]`.
- Use `--no-ts` to output only the text lines.

## Caching

Default cache DB:
- `{baseDir}/cache/transcripts.sqlite`

Cache key includes:
- `video_id`, `lang`, `source`, `include_timestamp`, `format`

## Cookie handling (important)

- Cookies must be in **Netscape cookies.txt** format.
- Treat cookies as **secrets**.
- **Never commit / publish cookies** to ClawHub.

Recommended local path (ignored by git/publish):
- `{baseDir}/cache/youtube-cookies.txt` (chmod 600)

## Notes (safety + reliability)

- Only accept a YouTube URL or an 11-character video ID.
- Do not forward arbitrary user-provided flags into the command.
- If `yt-dlp` is missing, instruct the user to install it (recommended):
  - install pipx
  - `pipx install yt-dlp`
  - ensure `yt-dlp` is on PATH

Overview

This skill extracts transcripts from existing YouTube captions using yt-dlp and an optional YouTube transcript panel fallback. It prefers manual subtitles, falls back to auto-generated captions, and can output JSON segments or plain text. Results can be cached locally in SQLite for faster repeated access.

How this skill works

Given a YouTube URL or 11-character video ID, the tool queries YouTube via yt-dlp to fetch available captions and chooses manual subtitles when present, otherwise falls back to auto captions or the transcript panel. It formats output as JSON segments (with start, duration, text) or newline text, optionally including timestamps. The script supports an optional cookies file for authenticated access and stores cached transcripts in a local SQLite database to avoid repeated downloads.

When to use it

  • You want a machine-readable transcript (JSON segments) for search or indexing.
  • You need a plain-text transcript for summarization or downstream NLP tasks.
  • You want to convert a YouTube link into text without sending video data to third-party services.
  • You need reliable access to manual subtitles and a fallback to auto-generated captions.
  • You want to cache transcripts locally to speed up repeated processing.

Best practices

  • Provide a valid YouTube URL or the 11-character video ID only; other inputs are rejected.
  • Use cookies (Netscape cookies.txt) stored outside the project, e.g. ~/.config/yt-transcript/, when running on VPS or blocked IPs.
  • Prefer JSON mode for structured processing; use --text and --no-ts when you need plain text without timestamps.
  • Keep the cookies file and cache under approved paths to avoid accidental publication.
  • Install yt-dlp via pipx and ensure it is on PATH if the script reports it missing.

Example use cases

  • Fetch manual captions for a lecture video and index segments into a search engine.
  • Convert a tutorial video into plain text for a summarizer or note-taking pipeline.
  • Batch-download transcripts for offline archiving with local SQLite caching.
  • Use the transcript panel fallback when yt-dlp reports no captions to retrieve auto-generated text.
  • Provide timestamped segments to align transcripts with other media or annotations.

FAQ

What output formats are available?

JSON segments (default) and plain text via --text; timestamps can be removed with --no-ts.

When should I supply cookies?

Supply cookies when YouTube blocks anonymous access or you need access to restricted captions; use a Netscape cookies.txt file stored outside the repo.

Where is the cache stored?

Default cache is a transcripts.sqlite file under the skill cache directory; only approved cache paths are allowed.