
This skill converts SRT subtitles into structured JSON, extracting timing, duration, and text for analysis and downstream processing.

Add this skill to your agents with the following command:

npx playbooks add skill nanmicoder/claude-code-skills --skill srt-to-structured-data

SKILL.md
---
name: srt-to-structured-data
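description: |
  Converts SRT subtitle files into structured JSON data.
  Trigger scenarios:
  (1) An SRT subtitle file needs to be parsed
  (2) Subtitles need to be converted to JSON or another structured format
  (3) Subtitle timecodes and text need to be extracted
  (4) Video subtitle data needs to be processed or analyzed
  (5) A plain-text transcript or statistics need to be generated from subtitles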
metadata:
  author: nanmi
  version: "1.0.0"
---

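# SRT Subtitles to Structured Data

Parses SRT subtitle files into structured JSON, with support for extracting timecodes, computing durations, and generating statistics.

## Quick Start

### Basic Usage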

```bash
# Parse an SRT file and print the result to the terminal
python <skill_path>/scripts/parse_srt.py input.srt

# Write the result to a file
python <skill_path>/scripts/parse_srt.py input.srt -o output.json

# Include statistics
python <skill_path>/scripts/parse_srt.py input.srt --stats

# Output plain text only (timecodes removed)
python <skill_path>/scripts/parse_srt.py input.srt --text-only
```

**Note:** `<skill_path>` is the path where this skill is installed, typically `~/.claude/plugins/srt-to-structured-data@claude-code-skills/skills/srt-to-structured-data`

## Output Format

### Structured JSON Data

The `statistics` object is included only when `--stats` is passed.

```json
{
  "subtitles": [
    {
      "index": 1,
      "start_time": "00:00:00,000",
      "end_time": "00:00:02,566",
      "start_ms": 0,
      "end_ms": 2566,
      "duration_ms": 2566,
      "text": "Clawdbot真的太火太火太火了"
    },
    {
      "index": 2,
      "start_time": "00:00:02,633",
      "end_time": "00:00:04,766",
      "start_ms": 2633,
      "end_ms": 4766,
      "duration_ms": 2133,
      "text": "Github一天直接涨了5万星"
    }
  ],
  "statistics": {
    "total_count": 2,
    "total_duration_ms": 4699,
    "total_duration_formatted": "00:04",
    "avg_duration_ms": 2349
  }
}
```
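
The statistics in the example above can be reproduced from the `subtitles` array. The following is a minimal sketch, not the script's actual implementation: `compute_statistics` is a hypothetical helper, and the rules it applies (total as the sum of `duration_ms`, `MM:SS` formatting truncated to whole seconds, average rounded down) are inferred from the sample values only.

```python
def compute_statistics(subtitles):
    """Summarize subtitle entries shaped like the JSON example above."""
    total_count = len(subtitles)
    # Assumption: the total is the sum of per-entry durations,
    # so gaps between subtitles are not counted (2566 + 2133 = 4699).
    total_ms = sum(entry["duration_ms"] for entry in subtitles)
    minutes, seconds = divmod(total_ms // 1000, 60)
    return {
        "total_count": total_count,
        "total_duration_ms": total_ms,
        # Assumption: MM:SS, truncated to whole seconds.
        "total_duration_formatted": f"{minutes:02d}:{seconds:02d}",
        # Assumption: integer average, rounded down; 0 for an empty list.
        "avg_duration_ms": total_ms // total_count if total_count else 0,
    }


subtitles = [
    {"index": 1, "start_ms": 0, "end_ms": 2566, "duration_ms": 2566},
    {"index": 2, "start_ms": 2633, "end_ms": 4766, "duration_ms": 2133},
]
stats = compute_statistics(subtitles)
print(stats)
```

With the two sample entries, this yields the same four values shown in the `statistics` object.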

### Plain-Text Output

With the `--text-only` flag, only the subtitle text is output, one entry per line:

```
Clawdbot真的太火太火太火了
Github一天直接涨了5万星
```

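## Command-Line Arguments

| Argument | Description |
|------|------|
| `input.srt` | Path to the input SRT subtitle file |
| `-o, --output` | Output file path (defaults to the terminal) |
| `--stats` | Include statistics in the JSON output |
| `--text-only` | Output plain text only, without timecodes or indices |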

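## Field Reference

| Field | Type | Description |
|------|------|------|
| `index` | int | Subtitle sequence number |
| `start_time` | string | Start time (original SRT format) |
| `end_time` | string | End time (original SRT format) |
| `start_ms` | int | Start time in milliseconds |
| `end_ms` | int | End time in milliseconds |
| `duration_ms` | int | Duration in milliseconds |
| `text` | string | Subtitle text |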
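
When consuming the JSON downstream, the fields above map naturally onto a typed record. The sketch below is one possible way to load them. `Subtitle` and `load_subtitles` are hypothetical names, and the consistency check assumes that `duration_ms` always equals `end_ms - start_ms`, which holds for the sample output but is not stated explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Subtitle:
    index: int
    start_time: str
    end_time: str
    start_ms: int
    end_ms: int
    duration_ms: int
    text: str


def load_subtitles(data):
    """Build Subtitle records from the parsed JSON and check their timing."""
    records = []
    for raw in data["subtitles"]:
        # Raises TypeError if an entry has missing or unexpected keys.
        sub = Subtitle(**raw)
        # Assumption: duration_ms is always end_ms - start_ms.
        if sub.duration_ms != sub.end_ms - sub.start_ms:
            raise ValueError(
                f"entry {sub.index}: duration_ms does not match end_ms - start_ms"
            )
        records.append(sub)
    return records
```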

## Usage Examples

### Scenario 1: Analyzing a subtitle file

```bash
python <skill_path>/scripts/parse_srt.py video.srt --stats -o analysis.json
```

### Scenario 2: Extracting plain text for translation

```bash
python <skill_path>/scripts/parse_srt.py video.srt --text-only -o transcript.txt
```

### Scenario 3: Calling the script from Python

```python
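import json
import subprocess
import sys

# Replace <skill_path> with the skill's installation path.
result = subprocess.run(
    [sys.executable, '<skill_path>/scripts/parse_srt.py',
     'input.srt', '--stats'],
    capture_output=True, text=True,
    check=True,  # raise CalledProcessError if the script exits non-zero
)
# The script prints JSON to stdout when -o is not given.
data = json.loads(result.stdout)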
```

Overview

This skill converts SRT subtitle files into structured JSON data for programmatic use. It extracts timecodes, computes millisecond durations, and can emit pure text transcripts or summary statistics. The tool is designed for quick command-line use and easy integration into pipelines.

How this skill works

The script parses standard SRT blocks (index, time range, text), normalizes start and end times, and computes start_ms, end_ms, and duration_ms for each subtitle. It can output a JSON object containing an array of subtitle entries and optional statistics, or output a plain text transcript when requested. Command-line flags control output location, inclusion of statistics, and text-only mode.
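
The parsing steps described above can be sketched as follows. This is an illustrative approximation, not the code in `parse_srt.py`: the names `to_ms` and `parse_srt` are hypothetical, malformed blocks are simply skipped, and joining multi-line subtitle text with a newline is an assumption.

```python
import re

# Matches a time range line such as "00:00:02,633 --> 00:00:04,766".
TIME_RANGE = re.compile(
    r"(\d{2}:\d{2}:\d{2},\d{3})\s*-->\s*(\d{2}:\d{2}:\d{2},\d{3})"
)


def to_ms(timestamp):
    """Convert an 'HH:MM:SS,mmm' timestamp to integer milliseconds."""
    clock, millis = timestamp.split(",")
    hours, minutes, seconds = (int(part) for part in clock.split(":"))
    return ((hours * 60 + minutes) * 60 + seconds) * 1000 + int(millis)


def parse_srt(content):
    """Parse SRT text into a list of subtitle dictionaries."""
    # Drop a leading BOM and normalize Windows line endings.
    content = content.lstrip("\ufeff").replace("\r\n", "\n").strip()
    entries = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", content):
        lines = block.split("\n")
        if len(lines) < 2 or not lines[0].strip().isdigit():
            continue  # not an "index, time range, text" block
        match = TIME_RANGE.match(lines[1].strip())
        if not match:
            continue
        start_time, end_time = match.groups()
        start_ms, end_ms = to_ms(start_time), to_ms(end_time)
        entries.append({
            "index": int(lines[0].strip()),
            "start_time": start_time,
            "end_time": end_time,
            "start_ms": start_ms,
            "end_ms": end_ms,
            "duration_ms": end_ms - start_ms,
            # Assumption: multi-line text is joined with a newline.
            "text": "\n".join(line.strip() for line in lines[2:]),
        })
    return entries
```

Calling `parse_srt` on the contents of a file read as text returns entries with the same keys as the documented JSON schema.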

When to use it

  • You need to transform SRT captions into JSON for analysis or storage.
  • You want subtitle timestamps converted into millisecond offsets for synchronization.
  • You need a plain transcript extracted from subtitles for translation or NLP.
  • You are building video-processing pipelines that require structured subtitle data.
  • You want quick statistics like total subtitle count and total duration.

Best practices

  • Validate SRT encoding (UTF-8 recommended) before parsing to avoid character issues.
  • Use --stats for analytics workflows and omit it for minimal JSON output.
  • Use --text-only when feeding text to translation or speech models to avoid timecodes.
  • Trim or normalize long subtitle texts if you plan to index or store them in smaller DB fields.
  • Run the parser as a preprocessing step in automated video pipelines to keep downstream steps simple.
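
The encoding check in the first practice can be done before the parser is invoked. The sketch below is one way to do it, assuming UTF-8 is the only accepted encoding. `read_utf8_srt` is a hypothetical helper and is not part of the skill.

```python
from pathlib import Path


def read_utf8_srt(path):
    """Return the file's text if it is valid UTF-8, else raise ValueError."""
    raw = Path(path).read_bytes()
    try:
        # "utf-8-sig" accepts plain UTF-8 and also strips a leading BOM.
        return raw.decode("utf-8-sig")
    except UnicodeDecodeError as exc:
        raise ValueError(
            f"{path} is not valid UTF-8; re-encode it before parsing"
        ) from exc
```

Files that fail the check can be converted to UTF-8 with an editor or a conversion tool before being passed to the script.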

Example use cases

  • Convert a video's SRT to JSON for feeding into a subtitle search index.
  • Extract transcript lines with --text-only to send to a translation API.
  • Generate subtitle statistics (total_count, total_duration_ms, avg_duration_ms) for QA reports.
  • Compute precise start_ms and end_ms for subtitle-driven clip extraction.
  • Integrate into a batch job to normalize many SRT files into a consistent JSON schema.
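
As an illustration of the clip-extraction use case, the sketch below selects time ranges from the JSON output by keyword. `find_clips` and its `padding_ms` parameter are hypothetical and exist only for this example. The function relies solely on the documented `text`, `start_ms`, and `end_ms` fields.

```python
def find_clips(data, keyword, padding_ms=500):
    """Return (start_seconds, end_seconds) for entries whose text contains keyword."""
    clips = []
    needle = keyword.lower()
    for entry in data["subtitles"]:
        if needle in entry["text"].lower():
            # Pad both sides, but never start before the beginning of the video.
            start_ms = max(0, entry["start_ms"] - padding_ms)
            end_ms = entry["end_ms"] + padding_ms
            clips.append((start_ms / 1000, end_ms / 1000))
    return clips
```

The resulting second offsets can be handed to whatever cutting tool the pipeline uses.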

FAQ

What formats are supported as input?

Standard SRT files are supported. Ensure the file follows SRT block formatting (index, time range, text).

How do I get only the transcript without timestamps?

Run the script with the --text-only flag to output each subtitle text on a separate line.

Can I include statistics in the JSON output?

Yes. Use the --stats flag to add total_count, total_duration_ms, total_duration_formatted, and avg_duration_ms.