home / skills / ceeon / videocut-skills / 字幕
This skill automates video subtitle generation and burning by transcribing with Whisper, correcting, reviewing, timestamp matching, and producing ready-to-burn
npx playbooks add skill ceeon/videocut-skills --skill 字幕Review the files below or copy the command above to add this skill to your agents.
---
name: videocut:字幕
description: 字幕生成与烧录。转录→词典纠错→审核→烧录。触发词:加字幕、生成字幕、字幕
---
# 字幕
> 转录 → 纠错 → 审核 → 匹配 → 烧录
## 流程
```
1. 转录视频(Whisper)
↓
2. 词典纠错 + 分句
↓
3. 输出字幕稿(纯文本,一句一行)
↓
【用户审核修改】
↓
4. 用户给回修改后的文本
↓
5. 我匹配时间戳 → 生成 SRT
↓
6. 烧录字幕(FFmpeg)
```
## 转录
使用 OpenAI Whisper 模型进行语音转文字:
```bash
whisper video.mp4 --model medium --language zh --output_format json
```
| 模型 | 用途 |
|------|------|
| `medium` | 默认,平衡速度与准确率 |
| `large-v3` | 高精度,较慢 |
输出 JSON 包含逐词时间戳,用于后续 SRT 生成。
---
## 字幕规范
| 规则 | 说明 |
|------|------|
| 一屏一行 | 不换行,不堆叠 |
| ≤15字/行 | 超过15字必须拆分(4:3竖屏) |
| 句尾无标点 | `你好` 不是 `你好。` |
| 句中保留标点 | `先点这里,再点那里` |
---
## 词典纠错
读取 `词典.txt`,每行一个正确写法:
```
skills
Claude
iPhone
```
我自动识别变体:`claude` → `Claude`
---
## 字幕稿格式
**我给用户的**(纯文本,≤15字/行):
```
今天给大家分享一个技巧
很多人可能不知道
其实这个功能
藏在设置里面
你只要点击这里
就能看到了
```
**用户修改后给回我**,我再匹配时间戳生成 SRT。
---
## 样式
默认:24号白字、黑色描边、底部居中
**可选样式:**
| 样式 | 说明 |
|------|------|
| 默认 | 白字黑边 |
| 黄字 | 黄字黑边(醒目) |
用户可说:
- "字大一点" → 32号
- "放顶部" → 顶部居中
- "黄色字幕" → 黄字黑边
---
## 输出
```
01-xxx_字幕稿.txt # 纯文本,用户编辑
01-xxx.srt # 字幕文件
01-xxx-字幕.mp4 # 带字幕视频
```
This skill automates subtitle generation and hardcoding for videos: transcribe, correct with a custom dictionary, let the user review, then generate timed SRT and burn subtitles into the video. It produces an editable plain-text subtitle draft, accepts user edits, matches timestamps, and outputs SRT and a final burned-in MP4. The workflow balances accuracy and speed using Whisper and FFmpeg.
The skill transcribes audio with Whisper to get word-level timestamps, then applies dictionary-based corrections and sentence splitting to produce a one-line-per-sentence draft (≤15 characters per line). The user reviews and returns edits; the skill aligns the edited lines to the original timestamps to create an SRT. Finally, FFmpeg is used to render chosen subtitle styles and burn them into the video.
What subtitle format is produced?
The skill outputs a plain-text subtitle draft, a timed SRT file, and a burned-in MP4.
How do I ensure product names are correct?
Add correct spellings into dictionary.txt; the system will normalize variants automatically.