youtube-transcript skill

/youtube-transcript

This skill fetches the transcript for a YouTube video, enabling quick summarization and analysis.

npx playbooks add skill badlogic/pi-skills --skill youtube-transcript

Review the files below or copy the command above to add this skill to your agents.

Files (3)
SKILL.md
715 B
---
name: youtube-transcript
description: Fetch transcripts from YouTube videos for summarization and analysis.
---

# YouTube Transcript

Fetch transcripts from YouTube videos.

## Setup

```bash
cd {baseDir}
npm install
```

## Usage

```bash
{baseDir}/transcript.js <video-id-or-url>
```

Accepts video ID or full URL:
- `EBw7gsDPAYQ`
- `https://www.youtube.com/watch?v=EBw7gsDPAYQ`
- `https://youtu.be/EBw7gsDPAYQ`

## Output

Timestamped transcript entries:

```
[0:00] All right. So, I got this UniFi Theta
[0:15] I took the camera out, painted it
[1:23] And here's the final result
```

## Notes

- Requires the video to have captions/transcripts available
- Works with auto-generated and manual transcripts

Overview

This skill fetches transcripts from YouTube videos for summarization, analysis, or downstream processing. It accepts either a video ID or a full YouTube URL and returns timestamped transcript lines. It supports both auto-generated and manually provided captions when they are available.

How this skill works

You provide a video identifier or URL, and the script queries YouTube's caption data for that video. It parses the available caption tracks and outputs a simple, timestamped transcript format. If no captions exist, the tool reports that a transcript is unavailable.
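The final formatting step can be sketched as follows. This is a minimal illustration, not the script's actual implementation: the entry shape (`{ start, text }`, with `start` in seconds) and the function name `formatEntries` are assumptions for the example.

```javascript
// Hypothetical sketch: turn parsed caption entries into the
// "[m:ss] text" lines this skill outputs. Assumes each entry
// has a start time in seconds and a text field.
function formatEntries(entries) {
  return entries
    .map(({ start, text }) => {
      const m = Math.floor(start / 60);
      const s = String(Math.floor(start % 60)).padStart(2, "0");
      return `[${m}:${s}] ${text}`;
    })
    .join("\n");
}
```

For example, an entry starting at 83 seconds becomes a `[1:23]` line.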

When to use it

  • You need a quick, timestamped transcript for summarization or note-taking.
  • You are preparing content analysis, keyword extraction, or topic modeling from video audio.
  • You are automating subtitling workflows that require raw captions as input.
  • You are ingesting video text into an NLP pipeline for sentiment, entity, or trend analysis.
  • You are prepping material for translation or accessibility audits.

Best practices

  • Provide a full YouTube URL or the plain video ID to ensure correct lookup.
  • Verify the video actually has captions; auto-generated captions are common but not guaranteed.
  • Use the timestamped output as input to summarizers or chunking tools for long videos.
  • Handle cases where captions are missing by falling back to speech-to-text only when needed.
  • Respect copyright and terms of service when storing or redistributing transcript content.
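The chunking practice above can be sketched as a small helper that parses the `[m:ss] text` output into time-based chunks for a summarizer. This is an assumption-laden example: the `chunkTranscript` name is hypothetical, and it only handles the minutes:seconds timestamps shown in the Output section (not hour-long `h:mm:ss` stamps).

```javascript
// Sketch: group "[m:ss] text" transcript lines into ~N-second chunks
// so long videos can be summarized piece by piece.
// Assumes the timestamped format this skill emits; ignores other lines.
function chunkTranscript(text, chunkSeconds = 300) {
  const entries = [];
  for (const line of text.split("\n")) {
    const m = line.match(/^\[(\d+):(\d{2})\]\s*(.*)$/);
    if (!m) continue;
    entries.push({ seconds: Number(m[1]) * 60 + Number(m[2]), text: m[3] });
  }
  const chunks = [];
  for (const e of entries) {
    const idx = Math.floor(e.seconds / chunkSeconds);
    (chunks[idx] ??= []).push(e.text);
  }
  // Drop empty slots from sparse ranges and join each chunk's lines.
  return chunks.filter(Boolean).map((texts) => texts.join(" "));
}
```

Each returned string covers roughly `chunkSeconds` of video and can be fed to a summarizer independently.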

Example use cases

  • Summarize a 30-minute talk by fetching the transcript and running a summarizer over timed chunks.
  • Extract speaker-mention timestamps to link video moments to show notes or highlights.
  • Feed transcripts into search indexes so users can search within video content.
  • Compare auto-generated captions to manual captions for quality assessment.
  • Translate transcripts into another language before creating translated subtitles.

FAQ

What input formats are accepted?

You can pass a plain video ID (e.g., EBw7gsDPAYQ), a full watch URL, or a youtu.be short link.
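Normalizing those three input forms to a bare video ID can be done like this. The helper name `extractVideoId` is hypothetical and this is a sketch of one approach, not necessarily how `transcript.js` does it.

```javascript
// Sketch: accept a plain 11-character video ID, a watch URL,
// or a youtu.be short link, and return the bare video ID.
function extractVideoId(input) {
  // Plain video ID (letters, digits, underscore, hyphen)
  if (/^[\w-]{11}$/.test(input)) return input;
  const url = new URL(input);
  // Short link: https://youtu.be/<id>
  if (url.hostname === "youtu.be") return url.pathname.slice(1);
  // Watch URL: https://www.youtube.com/watch?v=<id>
  const v = url.searchParams.get("v");
  if (v) return v;
  throw new Error(`Could not extract a video ID from: ${input}`);
}
```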

Does it work with auto-generated captions?

Yes. The tool supports both auto-generated and manual captions if they exist for the video.

What happens if a video has no captions?

The script will report that no transcript is available. You can then choose to run an external speech-to-text step.