file-converter skill

This skill converts and transforms files across formats and media, enabling batch processing and consistent outputs for workflows.

npx playbooks add skill georgekhananaev/claude-skills-vault --skill file-converter

---
name: file-converter
description: Convert & transform files - images (resize, format, HEIC), markdown (PDF/HTML), data (CSV/JSON/YAML/TOML/XML), SVG, base64, text encoding. Cross-platform, single & batch mode. This skill should be used when converting file formats, resizing images, generating PDFs from markdown, or transforming data between formats.
author: George Khananaev
---

# File Converter

Convert files between formats w/ single & batch support. All scripts use consistent CLI patterns.

## When to Use

- Convert images between PNG, JPG, WEBP, BMP, TIFF, GIF, ICO, AVIF, HEIC/HEIF
- Resize/crop images w/ fit modes (contain, cover, fill, inside, outside)
- Convert markdown -> PDF or HTML w/ themes
- Convert HTML -> markdown (w/ tag stripping control)
- Transform CSV <-> JSON <-> YAML <-> TOML <-> XML
- SVG <-> raster conversion (PNG, JPG, WEBP, BMP, TIFF)
- Base64 encode/decode files (w/ data URI support, stdin)
- Fix text encoding issues (detect, convert w/ error handling strategies)

## Quick Routing

| Task | Script | Deps |
|------|--------|------|
| Image convert/resize | `convert_image.py` | Pillow, pillow-heif (opt) |
| Markdown -> HTML | `md_to_html.py` | markdown, pygments |
| Markdown -> PDF | `md_to_pdf.py` | markdown + weasyprint\|pdfkit |
| HTML -> Markdown | `html_to_md.py` | markdownify, bs4 |
| CSV/JSON/YAML/TOML/XML | `csv_json_yaml.py` | pyyaml, tomli-w, xmltodict, dicttoxml (per format) |
| SVG convert | `svg_convert.py` | cairosvg, Pillow |
| Base64 encode/decode | `base64_codec.py` | (none) |
| Text encoding | `text_encoding.py` | chardet (opt) |
| Cross-platform utils | `platform_utils.py` | (none) - shared by pdf/svg scripts |

## Install Deps

```bash
# All deps (recommended)
pip install Pillow markdown pygments weasyprint markdownify beautifulsoup4 cairosvg pyyaml chardet tomli-w xmltodict dicttoxml

# Optional
pip install pillow-heif                    # HEIC/HEIF support

# Minimal (per task)
pip install Pillow                          # Images only
pip install markdown pygments               # MD -> HTML only
pip install markdown weasyprint             # MD -> PDF only (macOS: brew install pango)
pip install markdownify beautifulsoup4      # HTML -> MD only
pip install pyyaml                          # YAML support
pip install tomli-w                         # TOML write (read: Python 3.11+ built-in)
pip install xmltodict dicttoxml             # XML support
pip install cairosvg                        # SVG -> raster (macOS: brew install cairo)
pip install chardet                         # Encoding detection
```

## CLI Patterns

All scripts share consistent arg patterns:

```bash
# Single file
python3 scripts/<script>.py input.ext output.ext

# Batch (directory or glob)
python3 scripts/<script>.py *.ext --output-dir ./out
python3 scripts/<script>.py ./dir/ --output-dir ./out --format ext
```
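
The shared pattern above could be parsed along these lines. This is a minimal sketch, not the scripts' actual parser: the function name `parse_cli` and the rule that two paths without `--output-dir` mean single-file mode are assumptions for illustration. Only the `--output-dir` and `--format` flags come from the examples above.

```python
import argparse


def parse_cli(argv):
    """Hypothetical parser mirroring the single/batch CLI pattern."""
    parser = argparse.ArgumentParser(description="convert files")
    parser.add_argument("paths", nargs="+",
                        help="inputs; in single mode the second path is the output")
    parser.add_argument("--output-dir", help="write batch results here")
    parser.add_argument("--format", help="target extension for batch mode")
    args = parser.parse_args(argv)

    if args.output_dir is None:
        # Single mode (assumed rule): exactly one input and one output path.
        if len(args.paths) != 2:
            parser.error("single mode expects: input output")
        return {"mode": "single", "input": args.paths[0], "output": args.paths[1]}

    # Batch mode: every positional is an input (the shell expands globs).
    return {"mode": "batch", "inputs": args.paths,
            "output_dir": args.output_dir, "format": args.format}


print(parse_cli(["photo.png", "photo.webp"]))
print(parse_cli(["a.png", "b.png", "--output-dir", "out", "--format", "webp"]))
```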

## 1. Image Convert & Resize

```bash
# Format conversion
python3 scripts/convert_image.py photo.png photo.webp
python3 scripts/convert_image.py photo.jpg photo.avif --quality 80

# Resize
python3 scripts/convert_image.py photo.jpg thumb.jpg --width 300
python3 scripts/convert_image.py photo.png banner.png --width 1200 --height 400 --fit cover

# Batch convert directory
python3 scripts/convert_image.py ./photos/ --output-dir ./webp --format webp --width 1200 --quality 85
python3 scripts/convert_image.py *.png --output-dir ./thumbs --format jpg --width 300 --height 300 --fit cover
```

**Fit modes:**

| Mode | Behavior |
|------|----------|
| contain | Fit inside bounds, preserve ratio (def) |
| cover | Fill bounds, crop overflow |
| fill | Stretch to exact dimensions |
| inside | Like contain, but only shrink (never enlarge) |
| outside | Like cover, but never crop |

**Supported:** PNG, JPG, WEBP, BMP, TIFF, GIF, ICO, AVIF, HEIC/HEIF (w/ pillow-heif)

Auto-fixes EXIF orientation. Guards against decompression bombs (300M pixel limit).
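
To make the fit modes table concrete, the sketch below computes the final output size for each mode. It follows the table's descriptions only; `fit_size` is a hypothetical helper, and the rounding behavior is an assumption rather than what `convert_image.py` necessarily does.

```python
def fit_size(src_w, src_h, box_w, box_h, mode="contain"):
    """Return the final (width, height) for one fit mode (illustrative only)."""
    if mode == "fill":
        return box_w, box_h  # stretch; aspect ratio is not preserved

    shrink = min(box_w / src_w, box_h / src_h)  # whole image fits inside the box
    grow = max(box_w / src_w, box_h / src_h)    # image covers the whole box

    if mode == "contain":
        scale = shrink
    elif mode == "inside":
        scale = min(shrink, 1.0)  # like contain, but never enlarge
    elif mode in ("cover", "outside"):
        scale = grow
    else:
        raise ValueError(f"unknown fit mode: {mode}")

    if mode == "cover":
        return box_w, box_h  # the scaled image is cropped to the box
    # Rounding is an assumption; a real implementation may truncate instead.
    return max(1, round(src_w * scale)), max(1, round(src_h * scale))


for mode in ("contain", "cover", "fill", "inside", "outside"):
    print(mode, fit_size(4000, 3000, 1200, 400, mode))
```

For a 4000x3000 source and a 1200x400 box, `contain` yields 533x400, `cover` yields 1200x400 after cropping, and `outside` yields 1200x900 because nothing is cropped.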

## 2. Markdown -> HTML

```bash
# Single file
python3 scripts/md_to_html.py README.md readme.html
python3 scripts/md_to_html.py doc.md doc.html --theme dark

# Batch
python3 scripts/md_to_html.py ./docs/ --output-dir ./site --theme github
```

**Themes:** github (def), dark, minimal, print

Features: fenced code blocks, syntax highlighting, tables, TOC w/ permalinks, responsive CSS.

## 3. Markdown -> PDF

```bash
# Single file
python3 scripts/md_to_pdf.py report.md report.pdf
python3 scripts/md_to_pdf.py spec.md spec.pdf --theme report

# Batch
python3 scripts/md_to_pdf.py ./docs/ --output-dir ./pdfs --theme report
```

**Themes:** default, report (formal w/ serif), minimal

**PDF engines:** weasyprint (preferred; needs the native pango library, e.g. `brew install pango` on macOS) or pdfkit (requires wkhtmltopdf). The script auto-detects the available engine.

## 4. HTML -> Markdown

```bash
# Single file
python3 scripts/html_to_md.py page.html page.md

# Strip unwanted tags (default: script, style, noscript)
python3 scripts/html_to_md.py page.html page.md --strip script style nav footer

# Keep all HTML tags (no stripping)
python3 scripts/html_to_md.py page.html page.md --keep-all

# Batch
python3 scripts/html_to_md.py ./site/ --output-dir ./docs
```

## 5. Data Formats (CSV/JSON/YAML/TOML/XML)

```bash
# Any direction
python3 scripts/csv_json_yaml.py data.csv data.json
python3 scripts/csv_json_yaml.py data.json data.yaml
python3 scripts/csv_json_yaml.py config.yaml config.json
python3 scripts/csv_json_yaml.py config.toml config.json
python3 scripts/csv_json_yaml.py data.json data.xml

# Batch
python3 scripts/csv_json_yaml.py *.csv --output-dir ./json --format json
```

**Supported:** CSV, JSON, YAML (.yaml/.yml), TOML (.toml), XML (.xml). All directions supported where deps are installed.
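
As an illustration of the simplest direction, CSV to JSON needs only the standard library. The sketch below is not taken from `csv_json_yaml.py`; `csv_to_json` is a hypothetical helper, and it assumes the CSV has a header row. Note that the `csv` module returns every value as a string, so any type coercion would be an extra step.

```python
import csv
import io
import json


def csv_to_json(csv_text):
    """Convert CSV text with a header row into a JSON array of objects."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    # Every value is still a string here; numbers are not coerced.
    return json.dumps(rows, indent=2)


print(csv_to_json("name,age\nAda,36\nLinus,54\n"))
```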

## 6. SVG Conversion

```bash
# SVG -> raster
python3 scripts/svg_convert.py icon.svg icon.png --width 512
python3 scripts/svg_convert.py logo.svg logo.jpg --width 1024 --quality 90

# Raster -> SVG (embedded image wrapper)
python3 scripts/svg_convert.py photo.png photo.svg

# Batch
python3 scripts/svg_convert.py *.svg --output-dir ./png --format png --width 256
```

## 7. Base64 Encode/Decode

```bash
# Encode to stdout
python3 scripts/base64_codec.py encode image.png

# Encode to data URI (for HTML/CSS embedding)
python3 scripts/base64_codec.py encode image.png --data-uri

# Encode to file
python3 scripts/base64_codec.py encode image.png -o image.b64

# Decode
python3 scripts/base64_codec.py decode image.b64 -o image.png

# Batch
python3 scripts/base64_codec.py encode *.png --output-dir ./b64
```
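
For reference, a data URI is a MIME type followed by the base64 payload. The sketch below shows one way to build it with the standard library; `to_data_uri` is a hypothetical helper, and guessing the MIME type from the file name is an assumption about how `--data-uri` might work, not a description of the script.

```python
import base64
import mimetypes


def to_data_uri(data, filename):
    """Build a data URI; the MIME type is guessed from the file name."""
    mime, _ = mimetypes.guess_type(filename)
    mime = mime or "application/octet-stream"  # fallback for unknown types
    encoded = base64.b64encode(data).decode("ascii")
    return f"data:{mime};base64,{encoded}"


print(to_data_uri(b"hi", "note.txt"))  # data:text/plain;base64,aGk=
```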

## 8. Text Encoding

```bash
# Detect encoding
python3 scripts/text_encoding.py detect file.txt
python3 scripts/text_encoding.py detect *.txt

# Convert encoding (single mode requires -o to prevent accidental overwrite)
python3 scripts/text_encoding.py convert file.txt --to utf-8 -o output.txt
python3 scripts/text_encoding.py convert file.txt --from latin-1 --to utf-8 -o output.txt

# Handle unmappable characters
python3 scripts/text_encoding.py convert file.txt --to ascii --errors replace -o clean.txt
python3 scripts/text_encoding.py convert file.txt --to ascii --errors ignore -o clean.txt

# Batch convert to UTF-8
python3 scripts/text_encoding.py convert *.txt --to utf-8 --output-dir ./utf8
```

**Error modes:** strict (default, fail on unmappable), replace (use ? placeholder), ignore (skip unmappable chars).
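
The three error modes share their names with Python's standard codec error handlers. The sketch below shows how those handlers behave; whether `text_encoding.py` passes `--errors` straight through to them is an assumption, and `convert_bytes` is a hypothetical helper.

```python
def convert_bytes(data, src, dst, errors="strict"):
    """Decode from the source encoding, then encode with the given error mode."""
    return data.decode(src).encode(dst, errors=errors)


latin1 = "café".encode("latin-1")  # b'caf\xe9'

print(convert_bytes(latin1, "latin-1", "utf-8"))             # b'caf\xc3\xa9'
print(convert_bytes(latin1, "latin-1", "ascii", "replace"))  # b'caf?'
print(convert_bytes(latin1, "latin-1", "ascii", "ignore"))   # b'caf'

try:
    convert_bytes(latin1, "latin-1", "ascii")  # strict is the default
except UnicodeEncodeError as exc:
    print("strict failed at character", exc.start)  # strict failed at character 3
```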

## Common Workflows

### Web optimization pipeline

```bash
# Convert photos to WEBP, resize for web, generate thumbnails
python3 scripts/convert_image.py ./photos/ --output-dir ./web --format webp --width 1200 --quality 80
python3 scripts/convert_image.py ./photos/ --output-dir ./thumbs --format webp --width 300 --height 300 --fit cover --quality 75
```

### Documentation pipeline

```bash
# Generate HTML site from markdown docs
python3 scripts/md_to_html.py ./docs/ --output-dir ./site --theme github

# Generate PDF reports
python3 scripts/md_to_pdf.py ./docs/ --output-dir ./pdfs --theme report
```

### Data migration pipeline

```bash
# CSV -> JSON for API import
python3 scripts/csv_json_yaml.py ./exports/ --output-dir ./json --format json

# JSON config -> YAML
python3 scripts/csv_json_yaml.py config.json config.yaml
```

### Icon generation pipeline

```bash
# SVG -> multiple PNG sizes for app icons
for size in 16 32 64 128 256 512; do
  python3 scripts/svg_convert.py icon.svg "icon-${size}.png" --width "$size"
done
```

## Error Handling

All scripts:
- Print errors per-file in batch mode, continue w/ remaining files
- Exit 1 on fatal errors (missing deps, no input)
- Print size before/after for each conversion
- Create output directories automatically
- Handle KeyboardInterrupt gracefully (exit 130)
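
The behavior listed above can be sketched as a small batch loop. This is an illustration, not the scripts' code: `run_batch` and `main` are hypothetical names, and returning 0 after partial failures is an assumption, since the exit status for that case is not documented here.

```python
import sys


def run_batch(paths, convert):
    """Apply convert() to each path; report failures and keep going."""
    failed = []
    for path in paths:
        try:
            convert(path)
        except Exception as exc:  # per-file error: log it and continue
            print(f"error: {path}: {exc}", file=sys.stderr)
            failed.append(path)
    return failed


def main(paths, convert):
    """Return the process exit status for a batch run."""
    if not paths:
        print("error: no input files", file=sys.stderr)
        return 1  # fatal: nothing to process
    try:
        run_batch(paths, convert)
    except KeyboardInterrupt:
        return 130  # conventional status for Ctrl+C (128 + SIGINT)
    return 0  # assumption: per-file failures do not change the status
```

A script would end with `sys.exit(main(paths, convert))`, so the returned status becomes the process exit code.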

## Cross-Platform Support

Scripts work on macOS, Linux, and Windows. Native library paths (cairo, pango, gobject) are auto-configured via `platform_utils.py`:
- **macOS:** `/opt/homebrew/lib`, `/usr/local/lib`
- **Linux:** `/usr/local/lib`, `/usr/lib/x86_64-linux-gnu`
- **Windows:** GTK runtime, MSYS2, Conda paths + `os.add_dll_directory()`

## Stdin Support

`md_to_html.py` and `base64_codec.py` accept `-` for stdin input:

```bash
cat README.md | python3 scripts/md_to_html.py - output.html
cat file.bin | python3 scripts/base64_codec.py encode - -o file.b64
```
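
Treating `-` as stdin is a common CLI convention. A minimal sketch of how a script might implement it follows; `read_input` is a hypothetical helper, not a function taken from these scripts.

```python
import sys
from pathlib import Path


def read_input(path):
    """Return the input bytes, treating "-" as standard input."""
    if path == "-":
        # Read raw bytes so binary input (e.g. for base64) is not decoded.
        return sys.stdin.buffer.read()
    return Path(path).read_bytes()
```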

## Integration

**Pairs with:** `token-optimizer` (compress markdown before PDF), `code-quality` (validate scripts)

Overview

This skill converts and transforms files across images, markup, and data formats with single-file and batch support. It streamlines common tasks like image resizing and format conversion, markdown to PDF/HTML generation, and bidirectional data format transforms. Cross-platform utilities and consistent CLI patterns make it suitable for automation in build and documentation pipelines.

How this skill works

Each script follows a consistent CLI pattern: a single input file, or a directory/glob with an output directory and an explicit format flag. Image tools use Pillow (plus the optional pillow-heif) for conversion, resizing, fit modes, and EXIF orientation fixes. Markdown and HTML scripts render and convert using markdown, weasyprint or pdfkit, and markdownify/bs4. Data converters rely on format-specific libraries (pyyaml, tomli-w, xmltodict, dicttoxml) to translate between CSV, JSON, YAML, TOML, and XML. The Base64 utility reads from stdin and writes to stdout, and the text-encoding utility offers strict, replace, and ignore error modes.

When to use it

- Convert photos to web-friendly formats (WebP/AVIF) and generate thumbnails in bulk
- Generate PDF reports or HTML sites from markdown with selectable themes
- Transform CSV exports into JSON/YAML/TOML for API imports or config migration
- Convert SVG icons to multiple raster sizes or embed rasters into SVG wrappers
- Encode or decode files as base64 data URIs for embedding
- Detect and fix text encoding issues or batch-convert files to UTF-8

Best practices

- Install only the deps needed per task to keep environments lean (see per-task deps)
- Write batch output to a separate --output-dir so source files are never overwritten
- Prefer weasyprint for Markdown -> PDF when it is available; fall back to pdfkit only if wkhtmltopdf is installed
- Use explicit --format flags for batch inputs (directories/globs), where the target format cannot be inferred from an output file name
- Test transforms on a small sample before running large batches to validate fit modes and error modes

Example use cases

- Web optimization: convert photos to WebP at 1200px and create 300px cover thumbnails for a site
- Docs pipeline: render a docs directory to HTML with a GitHub theme and output a PDF book using the report theme
- Data migration: convert legacy CSV exports into JSON for ingestion into a modern API
- Icon generation: produce multi-size raster icons from a single SVG for mobile and web assets
- Repair text corpora: detect mixed encodings and convert files to UTF-8 with replace/ignore strategies

FAQ

What happens if a dependency is missing?

A missing dependency is a fatal error: the script prints an error and exits with status 1. Per-file errors in batch mode are handled differently: each one is printed, and the run continues with the remaining files.

Can I stream input via stdin?

Yes. md_to_html.py and base64_codec.py accept - as stdin input for piping workflows.

How are image fit modes handled?

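Fit modes are contain, cover, fill, inside, and outside. Contain (the default) fits the image inside the bounds and preserves the aspect ratio, cover fills the bounds and crops the overflow, and fill stretches to the exact dimensions. Inside behaves like contain but never enlarges, and outside behaves like cover but never crops.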