home / skills / jst-well-dan / skill-box / advanced-video-downloader

advanced-video-downloader skill

safe

/content-pipeline/advanced-video-downloader

This skill downloads and transcribes videos from 1000+ platforms, supports quality, audio extraction, playlists, and AI transcription to Markdown.

npx playbooks add skill jst-well-dan/skill-box --skill advanced-video-downloader

Review the files below or copy the command above to add this skill to your agents.

Files (6)

SKILL.md

10.5 KB

---
name: advanced-video-downloader
description: Download and transcribe videos from YouTube, Bilibili, TikTok and 1000+ platforms. Use when user requests video download, transcription (转录/字幕提取), or converting video to text/markdown. Supports quality selection, audio extraction, playlist downloads, cookie-based authentication, and AI-powered transcription via SiliconFlow API (免费转录).
---

# Advanced Video Downloader

## Overview

This skill provides comprehensive video downloading and transcription capabilities from 1000+ platforms including YouTube, Bilibili, TikTok, Twitter, Instagram, and more. It combines:
- **yt-dlp**: Powerful video downloading tool
- **SiliconFlow API**: Free AI-powered transcription to convert videos to Markdown

## When to Use This Skill

Activate this skill when the user:
- Explicitly requests to download a video ("download this video", "下载视频")
- Provides video URLs from any platform
- Mentions saving videos for offline viewing
- Wants to extract audio from videos
- Needs to download multiple videos or playlists
- Asks about video quality options
- Requests video transcription ("转录视频", "提取字幕", "视频转文字")
- Wants to convert video/audio to text or Markdown
- Asks to download AND transcribe a video in one workflow

## Core Capabilities

### 1. Single Video Download
Download individual videos from any supported platform with automatic quality selection.

**Example usage:**
```
User: "Download this YouTube video: https://youtube.com/watch?v=abc123"
User: "下载这个B站视频: https://bilibili.com/video/BV1xxx"
```

### 2. Batch & Playlist Download
Download multiple videos or entire playlists at once.

**Example usage:**
```
User: "Download all videos from this playlist"
User: "Download these 3 videos: [URL1], [URL2], [URL3]"
```

### 3. Audio Extraction
Extract audio only from videos, saving as MP3 or M4A.

**Example usage:**
```
User: "Download only the audio from this video"
User: "Convert this video to MP3"
```

### 4. Quality Selection
Choose specific video quality (4K, 1080p, 720p, etc.).

**Example usage:**
```
User: "Download in 4K quality"
User: "Get the 720p version to save space"
```

### 5. Video/Audio Transcription
Convert video or audio files to Markdown text using SiliconFlow's free AI transcription API.

**Example usage:**
```
User: "Transcribe this video to text" / "转录这个视频"
User: "Download and transcribe this YouTube video"
User: "将这个音频转成文字"
User: "Extract transcript from this MP4 file"
```

**Supported formats:**
- Audio: MP3, WAV, M4A, FLAC, AAC, OGG, OPUS, WMA
- Video: MP4, AVI, MOV, MKV, FLV, WMV, WEBM, M4V

## Response Pattern

When a user requests video download:

### Step 1: Identify the Platform and URL(s)
```python
# Extract video URL(s) from user message
# Identify platform: YouTube, Bilibili, TikTok, etc.
```

### Step 2: Check Tool Availability
```bash
# Check if yt-dlp is installed
yt-dlp --version
```

### Step 3: Select Appropriate yt-dlp Command

Based on platform and requirements:
- **YouTube, Twitter, Instagram, TikTok**: Basic command works
- **Bilibili**: Basic command works for most videos
- **Quality selection**: Use `-f` with height filter
- **Audio only**: Use `-x --audio-format mp3`
- **Playlists**: Use playlist-specific output template

### Step 4: Execute Download

Use yt-dlp directly with appropriate options:

```bash
# Basic download (best quality MP4)
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best[ext=mp4]/best" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Specific quality (1080p)
yt-dlp -f "bestvideo[height<=1080]+bestaudio/best[height<=1080]" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Audio only (MP3)
yt-dlp -x --audio-format mp3 -o "%(title)s.%(ext)s" "VIDEO_URL"

# With cookies file (for protected content)
yt-dlp --cookies cookies.txt -o "%(title)s.%(ext)s" "VIDEO_URL"

# Playlist download
yt-dlp -o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" "PLAYLIST_URL"
```

### Step 5: Report Results
After download completes, report:
- ✅ Video title and duration
- ✅ File size and format
- ✅ Save location
- ✅ Download speed and time taken
- ⚠️ Any warnings or quality limitations

**Example output:**
```
✅ Downloaded: "Video Title Here"
   Duration: 15:30
   Quality: 1080p MP4
   Size: 234 MB
   Location: ./Video Title Here.mp4
   Time: 45 seconds at 5.2 MB/s
```

## Transcription Response Pattern

When a user requests video/audio transcription:

### Step 1: Check Prerequisites
```bash
# Verify SiliconFlow API key is available
echo $SILICONFLOW_API_KEY
# Or user must provide via --api-key parameter
```

**API Key Setup:**
- Get free API key from: https://cloud.siliconflow.cn/account/ak
- Copy `.env.example` to `.env` and add your API key
- Or set environment variable: `SILICONFLOW_API_KEY=sk-xxx`

### Step 2: Validate File
Ensure the file exists and is a supported format (audio or video).

### Step 3: Execute Transcription
Use the bundled script `scripts/transcribe_siliconflow.py`:

```bash
# Basic transcription
python scripts/transcribe_siliconflow.py --file video.mp4 --api-key sk-xxx

# With custom output path
python scripts/transcribe_siliconflow.py --file audio.mp3 --output transcript.md --api-key sk-xxx

# Using environment variable for API key
python scripts/transcribe_siliconflow.py --file video.mp4
```

### Step 4: Report Transcription Results
```
✅ Transcription complete!
   File: video.mp4
   Output: 2025-01-15-video.md
   Size: 12.5 KB

   Preview:
   --------------------------------------------------
   [First 200 characters of transcription...]
   --------------------------------------------------
```

## Combined Workflow: Download + Transcribe

For requests like "Download and transcribe this video":

```bash
# Step 1: Download video
yt-dlp -f "bestvideo[ext=mp4]+bestaudio[ext=m4a]/best" --merge-output-format mp4 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Step 2: Transcribe the downloaded file
python scripts/transcribe_siliconflow.py --file "Downloaded Video Title.mp4" --api-key sk-xxx
```

## Platform-Specific Notes

### YouTube
- Fully supported by yt-dlp
- No authentication needed for public videos
- Supports all quality levels including 4K/8K

### Bilibili
- Supported by yt-dlp
- High-quality downloads may require login cookies
- Use `--cookies` with cookies.txt for member-only content

### Other Platforms
- Most platforms work well with yt-dlp
- Check `references/supported_platforms.md` for full list

## Handling Cookies for Protected Content

For platforms requiring authentication (Bilibili VIP, member-only content, etc.):

### Method 1: Export Cookies File (Recommended)
```bash
# Use browser extension "Get cookies.txt LOCALLY"
# Export cookies.txt, then:
yt-dlp --cookies cookies.txt "VIDEO_URL"
```

### Method 2: Manual Cookies File
```bash
# Create cookies.txt in Netscape format
# Use browser extension "Get cookies.txt LOCALLY"
# Then use with yt-dlp
yt-dlp --cookies cookies.txt "VIDEO_URL"
```

## Troubleshooting

### Issue: Video quality lower than expected
**Solution:**
1. Check if platform requires login for HD
2. Use `--cookies cookies.txt` for authenticated access
3. Explicitly specify quality with `-f` parameter

### Issue: Download very slow
**Solution:**
1. Check internet connection
2. Try different time of day (peak hours affect speed)
3. Use `--concurrent-fragments` for faster downloads

### Issue: "Video unavailable" or geo-restricted
**Solution:**
1. Video may be region-locked
2. Use proxy/VPN if legally permitted
3. Check if video is still available on platform

### Issue: Transcription API key error
**Solution:**
1. Verify API key starts with `sk-`
2. Get free key from: https://cloud.siliconflow.cn/account/ak
3. Set environment variable: `SILICONFLOW_API_KEY=sk-xxx`

### Issue: Transcription returns empty text
**Solution:**
1. Check if audio/video has clear speech
2. Verify file format is supported
3. File may be too short or contain only music

## Common Commands

### Quality Presets

```bash
# 4K (2160p)
yt-dlp -f "bestvideo[height<=2160]+bestaudio/best[height<=2160]" --merge-output-format mp4 "VIDEO_URL"

# 1080p (Full HD)
yt-dlp -f "bestvideo[height<=1080]+bestaudio/best[height<=1080]" --merge-output-format mp4 "VIDEO_URL"

# 720p (HD)
yt-dlp -f "bestvideo[height<=720]+bestaudio/best[height<=720]" --merge-output-format mp4 "VIDEO_URL"

# 480p (SD)
yt-dlp -f "bestvideo[height<=480]+bestaudio/best[height<=480]" --merge-output-format mp4 "VIDEO_URL"
```

### Audio Extraction

```bash
# Extract as MP3
yt-dlp -x --audio-format mp3 -o "%(title)s.%(ext)s" "VIDEO_URL"

# Extract as M4A (better quality)
yt-dlp -x --audio-format m4a -o "%(title)s.%(ext)s" "VIDEO_URL"
```

### Batch Downloads

```bash
# Download multiple URLs from file
yt-dlp -a urls.txt

# Download playlist with custom naming
yt-dlp -o "%(playlist)s/%(playlist_index)s - %(title)s.%(ext)s" "PLAYLIST_URL"

# Download channel's videos
yt-dlp -o "%(uploader)s/%(title)s.%(ext)s" "CHANNEL_URL"
```

## Bundled Resources

### Configuration

#### `.env.example`
Template for environment variables. Copy to `.env` and add your SiliconFlow API key.

### Scripts

#### `scripts/transcribe_siliconflow.py`
AI-powered transcription script using SiliconFlow's free API.

**Usage:**
```bash
python scripts/transcribe_siliconflow.py --file <audio/video> [--api-key <key>] [--output <path>]
```

**Parameters:**
- `--file, -f`: Input audio/video file (required)
- `--api-key, -k`: SiliconFlow API key (or use `SILICONFLOW_API_KEY` env var)
- `--output, -o`: Output markdown file path (default: `YYYY-MM-DD-filename.md`)
- `--model, -m`: Model to use (default: `FunAudioLLM/SenseVoiceSmall`)

### References

#### `references/supported_platforms.md`
Comprehensive list of 1000+ supported platforms with platform-specific notes and requirements.

#### `references/quality_formats.md`
Detailed explanation of video formats, codecs, and quality selection strategies.

## Tips for Best Results

1. **Always specify quality if user has preference** - saves bandwidth and storage
2. **Batch downloads save time** - use playlist URLs when possible
3. **Audio extraction is faster** - recommend for podcast/music content
4. **Check file size before downloading** - warn user for very large files (>1GB)
5. **Transcription works best with clear audio** - consider extracting audio first for better results

## Sources

- [yt-dlp Documentation](https://github.com/yt-dlp/yt-dlp)
- [yt-dlp Installation Guide](https://github.com/yt-dlp/yt-dlp#installation)
- [SiliconFlow API Documentation](https://docs.siliconflow.cn/)
- [SiliconFlow Free API Key](https://cloud.siliconflow.cn/account/ak)

Overview

This skill downloads and transcribes videos from YouTube, Bilibili, TikTok and 1000+ other platforms. It combines yt-dlp for robust downloading with SiliconFlow API for AI-powered transcription to Markdown. It supports quality selection, audio extraction, playlist downloads, and cookie-based authentication for protected content. Use it to get offline copies, extract audio, or convert spoken content into editable text.

How this skill works

The skill parses user messages to extract video URLs and detect the platform, then chooses yt-dlp commands tailored to the request (single video, playlist, audio-only, or specific quality). For protected or member-only content it accepts a cookies file for authenticated downloads. After download it can call the SiliconFlow transcription API to convert audio/video to Markdown using a provided API key or environment variable. It reports results including title, duration, file size, location, and any warnings.

When to use it

User asks to download a video URL or says “download this video”
User wants audio extraction or conversion to MP3/M4A
User requests transcription, subtitle extraction, or video→text/Markdown
User needs batch or playlist downloads
Content requires login/cookies for high-quality access
User wants to choose specific quality (4K, 1080p, 720p)

Best practices

Ask whether the user prefers video, audio-only, or both before starting
Request a cookies.txt file when member-only or high-quality content is needed
Warn users about large file sizes (>1GB) and offer lower-quality alternatives
Recommend extracting audio first if transcription quality is the priority
Require the SiliconFlow API key (env var or provided) before transcription

Example use cases

Download a single YouTube video in 1080p and save as MP4
Extract MP3 audio from a TikTok or Instagram video for podcast use
Download an entire playlist and organize files by playlist index
Download a Bilibili member-only episode using cookies.txt
Download a lecture video and transcribe it to Markdown with timestamps

FAQ

Do I need an API key for transcription?

Yes. Transcription via SiliconFlow requires an API key (set SILICONFLOW_API_KEY or provide with the command).

Can I download playlists or many URLs at once?

Yes. The skill supports batch downloads and playlist downloads with customizable output templates.