home / skills / openclaw / skills / subtitles

subtitles skill

unsafe

This skill fetches YouTube subtitles via TranscriptAPI, returning timed or clean text to support translation, language learning, or reading along.

npx playbooks add skill openclaw/skills --skill subtitles

Review the files below or copy the command above to add this skill to your agents.

Files (3)

SKILL.md

3.9 KB

---
name: subtitles
description: Get subtitles from YouTube videos for translation, language learning, or reading along. Use when the user asks for subtitles, subs, foreign language text, or wants to read video content. Supports multiple languages and timestamped output for sync'd reading.
homepage: https://transcriptapi.com
user-invocable: true
metadata: {"openclaw":{"emoji":"🗨️","requires":{"env":["TRANSCRIPT_API_KEY"],"bins":["node"],"config":["~/.openclaw/openclaw.json"]},"primaryEnv":"TRANSCRIPT_API_KEY"}}
---

# Subtitles

Fetch YouTube video subtitles via [TranscriptAPI.com](https://transcriptapi.com).

## Setup

If `$TRANSCRIPT_API_KEY` is not set, help the user create an account (100 free credits, no card):

**Step 1 — Register:** Ask user for their email.

```bash
node ./scripts/tapi-auth.js register --email USER_EMAIL
```

→ OTP sent to email. Ask user: _"Check your email for a 6-digit verification code."_

**Step 2 — Verify:** Once user provides the OTP:

```bash
node ./scripts/tapi-auth.js verify --token TOKEN_FROM_STEP_1 --otp CODE
```

> API key saved to `~/.openclaw/openclaw.json`. See **File Writes** below for details. Existing file is backed up before modification.

Manual option: [transcriptapi.com/signup](https://transcriptapi.com/signup) → Dashboard → API Keys.

## File Writes

The verify and save-key commands save the API key to `~/.openclaw/openclaw.json` (sets `skills.entries.transcriptapi.apiKey` and `enabled: true`). **Existing file is backed up to `~/.openclaw/openclaw.json.bak` before modification.**

To use the API key in terminal/CLI outside the agent, add to your shell profile manually:
`export TRANSCRIPT_API_KEY=<your-key>`

## GET /api/v2/youtube/transcript

```bash
curl -s "https://transcriptapi.com/api/v2/youtube/transcript\
?video_url=VIDEO_URL&format=text&include_timestamp=false&send_metadata=true" \
  -H "Authorization: Bearer $TRANSCRIPT_API_KEY"
```

| Param               | Values                  | Use case                                       |
| ------------------- | ----------------------- | ---------------------------------------------- |
| `video_url`         | YouTube URL or video ID | Required                                       |
| `format`            | `json`, `text`          | `json` for sync'd subs with timing             |
| `include_timestamp` | `true`, `false`         | `false` for clean text for reading/translation |
| `send_metadata`     | `true`, `false`         | Include title, channel, description            |

**For language learning** — clean text without timestamps:

```bash
curl -s "https://transcriptapi.com/api/v2/youtube/transcript\
?video_url=VIDEO_ID&format=text&include_timestamp=false" \
  -H "Authorization: Bearer $TRANSCRIPT_API_KEY"
```

**For translation** — structured segments:

```bash
curl -s "https://transcriptapi.com/api/v2/youtube/transcript\
?video_url=VIDEO_ID&format=json&include_timestamp=true" \
  -H "Authorization: Bearer $TRANSCRIPT_API_KEY"
```

**Response** (`format=json`):

```json
{
  "video_id": "dQw4w9WgXcQ",
  "language": "en",
  "transcript": [
    { "text": "We're no strangers to love", "start": 18.0, "duration": 3.5 }
  ]
}
```

**Response** (`format=text`, `include_timestamp=false`):

```json
{
  "video_id": "dQw4w9WgXcQ",
  "language": "en",
  "transcript": "We're no strangers to love\nYou know the rules and so do I..."
}
```

## Tips

- Many videos have auto-generated subtitles in multiple languages.
- Use `format=json` to get timing for each line (great for sync'd reading).
- Use `include_timestamp=false` for clean text suitable for translation apps.

## Errors

| Code | Action                                 |
| ---- | -------------------------------------- |
| 402  | No credits — transcriptapi.com/billing |
| 404  | No subtitles available                 |
| 408  | Timeout — retry once after 2s          |

1 credit per request. Free tier: 100 credits, 300 req/min.

Overview

This skill fetches subtitles from YouTube videos for translation, language learning, or reading along. It retrieves plain text or timestamped JSON transcripts from TranscriptAPI and supports multiple languages and synced output. Use it to get clean text for study or structured segments for timed display or translation workflows.

How this skill works

The skill calls TranscriptAPI's YouTube transcript endpoint using a provided API key. You supply a YouTube URL or video ID and choose format=text for clean, timestamp-free text or format=json with include_timestamp=true for timestamped segments. The API returns language metadata and either a single text block or an array of timed transcript segments.

When to use it

You want readable plain text for translating or studying spoken content.
You need timestamped segments for synced subtitles, karaoke, or scroll-along reading.
You are building language learning exercises from authentic video dialogue.
You need video metadata (title, channel, description) alongside the transcript.
You want to extract subtitles from auto-generated or uploaded captions for backup or analysis.

Best practices

Provide the exact YouTube URL or video ID to avoid lookup errors.
Use format=text and include_timestamp=false for clean text suited to translation or flashcard creation.
Use format=json with include_timestamp=true when you need precise start times and durations for syncing.
Handle API errors: retry once after short time for timeouts, notify users when subtitles are unavailable (404), and surface billing issues (402).
Store the API key securely (environment variable) and limit requests — each transcript costs one credit.

Example use cases

Generate a plain transcript to paste into a translator or vocabulary extractor.
Create timed subtitle files (SRT/JSON) for a custom player or language study tool.
Fetch multiple language alternatives to compare original vs. translated captions for learning.
Batch-download transcripts for archiving a channel's content or building searchable text.
Extract dialogue lines and timestamps to build listening comprehension exercises.

FAQ

Do I need an API key?

Yes. The skill requires a TranscriptAPI key provided via environment variable or saved config.

Which format is best for language learning?

format=text with include_timestamp=false produces clean continuous text ideal for reading and translation practice.

How do I get timing for synced display?

Request format=json and include_timestamp=true to receive an array of segments with start and duration for each line.