
This skill helps extract readable transcripts from Claude Code and Codex CLI sessions into markdown, enabling quick review and search.

npx playbooks add skill codingheader/myskills --skill 0xbigboss-extract-transcripts


SKILL.md
---
name: extract-transcripts
description: Extract readable transcripts from Claude Code and Codex CLI session JSONL files
---

# Extract Transcripts

Extracts readable markdown transcripts from Claude Code and Codex CLI session JSONL files.

## Scripts

### Claude Code Sessions

```bash
# Extract a single session
python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl>

# With tool calls and thinking blocks
python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> --include-tools --include-thinking

# Extract all sessions from a directory
python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <directory> --all

# Output to file
python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> -o output.md

# Summary only (quick overview)
python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <session.jsonl> --summary

# Skip empty/warmup-only sessions
python3 ~/.claude/skills/extract-transcripts/extract_transcript.py <directory> --all --skip-empty
```

**Options:**
- `--include-tools`: Include tool calls and results
- `--include-thinking`: Include Claude's thinking blocks
- `--all`: Process all .jsonl files in directory
- `-o, --output`: Output file path (default: stdout)
- `--summary`: Only output brief summary
- `--skip-empty`: Skip empty and warmup-only sessions
- `--min-messages N`: Minimum messages for --skip-empty (default: 2)
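
The options above map naturally onto an `argparse` parser. The following is a minimal sketch only: the flag names and defaults come from the list above, but the positional argument name `path`, the help strings, and the function name `build_parser` are assumptions, and the real script may be organized differently.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser mirroring the documented options (a sketch, not the script's own code)."""
    parser = argparse.ArgumentParser(prog="extract_transcript.py")
    # "path" is a hypothetical name for the positional argument.
    parser.add_argument("path", help="Session .jsonl file, or a directory when --all is given")
    parser.add_argument("--include-tools", action="store_true", help="Include tool calls and results")
    parser.add_argument("--include-thinking", action="store_true", help="Include thinking blocks")
    parser.add_argument("--all", action="store_true", help="Process all .jsonl files in the directory")
    parser.add_argument("-o", "--output", default=None, help="Output file path (default: stdout)")
    parser.add_argument("--summary", action="store_true", help="Only output a brief summary")
    parser.add_argument("--skip-empty", action="store_true", help="Skip empty and warmup-only sessions")
    parser.add_argument("--min-messages", type=int, default=2, metavar="N",
                        help="Minimum messages for --skip-empty")
    return parser


# Parse an explicit argument list instead of sys.argv so the example is reproducible.
args = build_parser().parse_args(["sessions/", "--all", "--skip-empty", "--min-messages", "3"])
print(args.all, args.skip_empty, args.min_messages, args.output)
```

Passing an explicit list to `parse_args` makes it easy to check how a combination of flags is interpreted before running it against real session files.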

### Codex CLI Sessions

```bash
# Extract a Codex session
python3 ~/.claude/skills/extract-transcripts/extract_codex_transcript.py <session.jsonl>

# Extract from Codex history file
python3 ~/.claude/skills/extract-transcripts/extract_codex_transcript.py ~/.codex/history.jsonl --history
```

## Session File Locations

### Claude Code
- Sessions: `~/.claude/projects/<project-path>/<session-id>.jsonl`

### Codex CLI
- Sessions: `~/.codex/sessions/<session_id>/rollout.jsonl`
- History: `~/.codex/history.jsonl`
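
To find session files programmatically, the layouts above can be expressed as glob patterns. This is a sketch under assumptions: the helper names (`find_session_files`, `claude_sessions`, `codex_sessions`) are hypothetical, and the newest-first ordering is an illustrative choice rather than something the skill's scripts are known to do.

```python
from pathlib import Path
from typing import List, Optional


def find_session_files(root: Path, pattern: str) -> List[Path]:
    """Return files under root that match pattern, newest first."""
    if not root.is_dir():
        return []  # the tool may simply not be installed on this machine
    files = [p for p in root.glob(pattern) if p.is_file()]
    return sorted(files, key=lambda p: p.stat().st_mtime, reverse=True)


def claude_sessions(home: Optional[Path] = None) -> List[Path]:
    # Layout listed above: ~/.claude/projects/<project-path>/<session-id>.jsonl
    home = home or Path.home()
    return find_session_files(home / ".claude" / "projects", "*/*.jsonl")


def codex_sessions(home: Optional[Path] = None) -> List[Path]:
    # Layout listed above: ~/.codex/sessions/<session_id>/rollout.jsonl
    home = home or Path.home()
    return find_session_files(home / ".codex" / "sessions", "*/rollout.jsonl")
```

The `home` parameter exists only so the helpers can be pointed at a test directory; calling them without arguments uses the current user's home directory.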

## DuckDB-Based Transcript Index

For querying across many sessions, use the DuckDB-based indexer:

```bash
# Index all sessions (incremental - only new/changed files)
python3 ~/.claude/skills/extract-transcripts/transcript_index.py index

# Force full reindex
python3 ~/.claude/skills/extract-transcripts/transcript_index.py index --full

# Limit number of files to process
python3 ~/.claude/skills/extract-transcripts/transcript_index.py index --limit 10

# List recent sessions
python3 ~/.claude/skills/extract-transcripts/transcript_index.py recent
python3 ~/.claude/skills/extract-transcripts/transcript_index.py recent --limit 20
python3 ~/.claude/skills/extract-transcripts/transcript_index.py recent --project myapp
python3 ~/.claude/skills/extract-transcripts/transcript_index.py recent --since 7d

# Search across sessions
python3 ~/.claude/skills/extract-transcripts/transcript_index.py search "error handling"
python3 ~/.claude/skills/extract-transcripts/transcript_index.py search "query" --cwd ~/myproject

# Show a session transcript
python3 ~/.claude/skills/extract-transcripts/transcript_index.py show <file_path>
python3 ~/.claude/skills/extract-transcripts/transcript_index.py show <file_path> --summary
```

**Requirements:** DuckDB (`pip install duckdb`)

**Database location:** `~/.claude/transcript-index/sessions.duckdb`
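
The `index` command is described above as incremental, processing only new or changed files. How the indexer detects changes is not documented here. One common approach, sketched below with hypothetical names, is to compare each file's modification time and size with what was recorded on the previous run. A plain dict stands in for whatever the indexer actually stores in DuckDB.

```python
from pathlib import Path
from typing import Dict, List, Tuple

Signature = Tuple[int, int]


def file_signature(path: Path) -> Signature:
    """Cheap change marker: (modification time in nanoseconds, size in bytes)."""
    stat = path.stat()
    return (stat.st_mtime_ns, stat.st_size)


def files_to_index(paths: List[Path], seen: Dict[str, Signature], full: bool = False) -> List[Path]:
    """Return files that are new or changed since the last run, or every file when full=True."""
    pending = []
    for path in paths:
        signature = file_signature(path)
        if full or seen.get(str(path)) != signature:
            pending.append(path)
            # Record the signature so the next incremental run can skip this file.
            seen[str(path)] = signature
    return pending
```

Here `full=True` plays the role of `index --full`: every file is returned regardless of what was recorded earlier.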

## Output Format

Transcripts are formatted as markdown with:
- Session metadata (date, duration, model, working directory, git branch)
- User messages prefixed with `## User`
- Assistant responses prefixed with `## Assistant`
- Tool calls in code blocks (if --include-tools)
- Thinking in blockquotes (if --include-thinking)
- Tool usage summary for Codex sessions
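
As an illustration of the layout above, the sketch below renders one message as markdown. It assumes message content arrives as a list of blocks typed `text`, `thinking`, or `tool_use`; that shape, the `**Tool:**` label, and the function name `render_message` are assumptions for illustration, not the scripts' confirmed behavior.

```python
import json
from typing import List

FENCE = "`" * 3  # markdown code fence, built at runtime


def render_message(role: str, blocks: List[dict], include_tools: bool = False,
                   include_thinking: bool = False) -> str:
    """Render one message: a '## Role' heading followed by its content blocks."""
    lines = [f"## {role.capitalize()}", ""]
    for block in blocks:
        kind = block.get("type")
        if kind == "text":
            lines += [block.get("text", ""), ""]
        elif kind == "thinking" and include_thinking:
            # Blockquote: prefix every line of the thinking text with "> ".
            lines += [f"> {line}" for line in block.get("thinking", "").splitlines()]
            lines.append("")
        elif kind == "tool_use" and include_tools:
            # Tool call: name on its own line, input as JSON inside a code block.
            call = json.dumps(block.get("input", {}), indent=2)
            lines += [f"**Tool:** {block.get('name', 'unknown')}", FENCE + "json", call, FENCE, ""]
    return "\n".join(lines).rstrip() + "\n"


print(render_message(
    "assistant",
    [{"type": "thinking", "thinking": "Check the logs first."},
     {"type": "text", "text": "I'll look at the logs."}],
    include_thinking=True,
))
```

With both flags left at their defaults, only the heading and the text blocks are emitted, which matches the concise output described for runs without `--include-tools` or `--include-thinking`.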

Overview

This skill extracts readable Markdown transcripts from Claude Code and Codex CLI session JSONL files. It converts session JSONL into organized transcripts showing metadata, user messages, assistant replies, and optional tool calls or internal thinking. The output is suitable for archiving, searching, or sharing conversational history.

How this skill works

The scripts parse JSONL session files produced by Claude Code and Codex CLI, normalize message roles and timestamps, and render a clean Markdown transcript. Options let you include tool invocation blocks and model thinking blocks, produce summaries only, or batch-process directories. A separate DuckDB indexer can ingest transcripts for fast cross-session search and querying.
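
The parsing step can be sketched roughly as follows. The record shape used here (a top-level `timestamp` plus a `message` object with `role` and `content`) is an assumption for illustration, as are the function names; the real session schema may differ, and the actual scripts handle more cases.

```python
import json
from datetime import datetime


def load_messages(lines):
    """Parse JSONL lines into (role, timestamp, text) tuples, skipping unusable records."""
    messages = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # tolerate a truncated or corrupt line
        message = record.get("message") if isinstance(record, dict) else None
        if not isinstance(message, dict) or message.get("role") not in ("user", "assistant"):
            continue  # not a conversational record in the assumed schema
        content = message.get("content", "")
        if isinstance(content, list):
            # Keep only the text blocks; tool calls and thinking are ignored in this sketch.
            content = "\n".join(b.get("text", "") for b in content
                                if isinstance(b, dict) and b.get("type") == "text")
        stamp = record.get("timestamp")
        # Normalize a trailing "Z" so datetime.fromisoformat accepts it on older Pythons.
        when = datetime.fromisoformat(stamp.replace("Z", "+00:00")) if stamp else None
        messages.append((message["role"], when, content))
    return messages


def session_duration(messages):
    """Return last timestamp minus first timestamp, or None if no message has one."""
    times = [when for _, when, _ in messages if when is not None]
    return max(times) - min(times) if times else None
```

Skipping malformed lines rather than raising keeps one bad record from aborting a batch run over a whole directory.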

When to use it

  • You need readable, shareable logs of assistant sessions for audits or documentation.
  • You want to extract only meaningful exchanges and skip warmup/empty sessions.
  • You need to include or exclude tool calls and internal thinking for compliance or clarity.
  • You want to index many sessions for search using DuckDB.
  • You need quick summaries instead of full transcripts for review.

Best practices

  • Run with --all on a session directory (Claude) or with --history on the history file (Codex) to process many sessions at once.
  • Use --skip-empty with --all to avoid storing warmups or noise.
  • Pass --include-tools and --include-thinking only when needed, since tool output and thinking blocks can contain sensitive internal content.
  • Use the DuckDB indexer for recurring search and analytics across sessions.
  • Specify -o to write transcripts to files for archival and version control.

Example use cases

  • Convert a single Claude session JSONL into a Markdown file for handoff to teammates.
  • Process a project’s session directory to produce a browsable archive of assistant interactions.
  • Index months of sessions into DuckDB and run text searches like “error handling” or specific code snippets.
  • Extract a Codex CLI session and use its tool-usage summary to review the automated steps taken.
  • Generate quick summary overviews of sessions for management or compliance review.

FAQ

Which files can this skill process?

It handles Claude Code session .jsonl files and Codex CLI session or history JSONL files located in the standard application paths or any supplied path.

How do I include tool calls and internal thinking in transcripts?

Use the --include-tools flag to add tool invocation blocks and --include-thinking to include model thinking blocks; omit them to keep transcripts concise.

How do I search across many transcripts?

Use the included DuckDB indexer: run the index command to ingest transcripts, then use search and recent subcommands to find sessions and content quickly.