file-converter skill

This skill generates Python code to convert files between document, data, and image formats, handling edge cases and producing ready-to-run scripts.

npx playbooks add skill 89jobrien/steve --skill file-converter

---
name: file-converter
description: This skill handles file format conversions across documents (PDF, DOCX,
  Markdown, HTML, TXT), data files (JSON, CSV, YAML, XML, TOML), and images (PNG,
  JPG, WebP, SVG, GIF). Use when the user requests converting, transforming, or exporting
  files between formats. Generates conversion code dynamically based on the specific
  request.
author: Joseph OBrien
status: unpublished
updated: '2025-12-23'
version: 1.0.1
tag: skill
type: skill
---

# File Converter

## Overview

Convert files between formats across three categories: documents, data files, and images. Generate Python code dynamically for each conversion request, selecting appropriate libraries and handling edge cases.

## Conversion Categories

### Documents

| From | To | Recommended Library |
|------|-----|---------------------|
| Markdown | HTML | `markdown` or `mistune` |
| HTML | Markdown | `markdownify` or `html2text` |
| HTML | PDF | `weasyprint` or `pdfkit` (requires wkhtmltopdf) |
| PDF | Text | `pypdf` or `pdfplumber` |
| DOCX | Markdown | `mammoth` |
| DOCX | PDF | `docx2pdf` (Windows/macOS, requires Microsoft Word) or LibreOffice CLI |
| Markdown | PDF | Convert via HTML first, then to PDF |

### Data Files

| From | To | Recommended Library |
|------|-----|---------------------|
| JSON | YAML | `pyyaml` |
| YAML | JSON | `pyyaml` |
| JSON | CSV | `pandas` or stdlib `csv` + `json` |
| CSV | JSON | `pandas` or stdlib `csv` + `json` |
| JSON | TOML | `tomli-w` (write); `tomllib`/`tomli` only read TOML, for the reverse direction |
| XML | JSON | `xmltodict` |
| JSON | XML | `dicttoxml` or `xmltodict.unparse` |

### Images

| From | To | Recommended Library |
|------|-----|---------------------|
| PNG/JPG/WebP/GIF | Any raster | `Pillow` (PIL) |
| SVG | PNG/JPG | `cairosvg` or `svglib` + `reportlab` |
| PNG | SVG | `potrace` (CLI) for tracing; expects bitmap input (PBM/PGM/PPM/BMP), so convert the PNG first; limited fidelity |

## Workflow

1. Identify source format (from file extension or user statement)
2. Identify target format
3. Check `references/` for format-specific guidance
4. Generate conversion code using recommended library
5. Handle edge cases (encoding, transparency, nested structures)
6. Execute conversion and report results
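
Steps 1, 2, and 4 can be sketched as a small lookup. This is a minimal illustration, not the skill's actual implementation: the `RECOMMENDED` and `ALIASES` tables and both function names are hypothetical, and the table covers only a few of the pairs listed under Conversion Categories.

```python
from pathlib import Path

# Hypothetical lookup built from the tables above: (source, target) -> library.
RECOMMENDED = {
    ("md", "html"): "markdown",
    ("html", "md"): "markdownify",
    ("json", "yaml"): "pyyaml",
    ("yaml", "json"): "pyyaml",
    ("csv", "json"): "csv + json (stdlib)",
    ("png", "webp"): "Pillow",
    ("svg", "png"): "cairosvg",
}

# Extensions that name the same format under a different spelling.
ALIASES = {"markdown": "md", "htm": "html", "yml": "yaml", "jpeg": "jpg"}


def detect_format(filename: str) -> str:
    """Return a normalized format name taken from the file extension."""
    ext = Path(filename).suffix.lower().lstrip(".")
    if not ext:
        raise ValueError(f"cannot infer format: {filename!r} has no extension")
    return ALIASES.get(ext, ext)


def pick_library(source_file: str, target_file: str) -> str:
    """Identify both formats, then look up the recommended library."""
    pair = (detect_format(source_file), detect_format(target_file))
    if pair not in RECOMMENDED:
        raise LookupError(f"no recommended library for {pair[0]} -> {pair[1]}")
    return RECOMMENDED[pair]
```

When the extension is missing or ambiguous, the sketch raises instead of guessing, which matches the workflow's fallback of asking the user to state the format.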

## Quick Patterns

### Data: JSON to YAML

```python
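import json
import yaml  # provided by the PyYAML package

# Read the JSON source with an explicit encoding.
with open("input.json", encoding="utf-8") as f:
    data = json.load(f)

# safe_dump emits only standard YAML tags; sort_keys=False keeps the JSON key order.
with open("output.yaml", "w", encoding="utf-8") as f:
    yaml.safe_dump(data, f, default_flow_style=False, allow_unicode=True, sort_keys=False)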
```

### Data: CSV to JSON

```python
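import csv
import json

# newline="" lets the csv module handle line endings inside quoted fields.
with open("input.csv", newline="", encoding="utf-8") as f:
    data = list(csv.DictReader(f))  # every value is read as a string

# ensure_ascii=False writes non-ASCII characters as-is instead of \u escapes.
with open("output.json", "w", encoding="utf-8") as f:
    json.dump(data, f, indent=2, ensure_ascii=False)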
```
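
The reverse direction, JSON to CSV, needs nested objects flattened before they fit into columns. The following is a sketch using only the standard library; the dotted-key convention, the choice to store lists as JSON strings, and the names `flatten` and `json_records_to_csv` are assumptions made for illustration.

```python
import csv
import io
import json


def flatten(obj: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into one level with dotted keys; lists become JSON strings."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        elif isinstance(value, list):
            flat[name] = json.dumps(value)
        else:
            flat[name] = value
    return flat


def json_records_to_csv(records: list) -> str:
    """Convert a list of JSON objects to CSV text; the header is the union of all keys."""
    rows = [flatten(record) for record in records]
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)
    buffer = io.StringIO(newline="")
    # restval fills columns that a given record does not have.
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()
```

Records with different keys still produce a rectangular table, because missing fields are written as empty cells.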

### Document: Markdown to HTML

```python
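import markdown

# Read the Markdown source with an explicit encoding.
with open("input.md", encoding="utf-8") as f:
    md_content = f.read()

# The extensions enable pipe tables and ``` fenced code blocks.
html = markdown.markdown(md_content, extensions=["tables", "fenced_code"])

# The result is an HTML fragment, not a full document with <html>/<head>.
with open("output.html", "w", encoding="utf-8") as f:
    f.write(html)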
```

### Image: PNG to WebP

```python
from PIL import Image

# The context manager closes the source file once the save completes.
with Image.open("input.png") as img:
    img.save("output.webp", "WEBP", quality=85)
```
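
WebP keeps an alpha channel, but JPEG has none, so a transparent PNG has to be composited onto a solid background before it is saved as JPG. A minimal sketch with Pillow, where the `flatten_alpha` helper and the white default background are assumptions for illustration:

```python
from PIL import Image


def flatten_alpha(img: Image.Image, background=(255, 255, 255)) -> Image.Image:
    """Composite an image onto a solid background and return it in RGB mode."""
    rgba = img.convert("RGBA")
    # Opaque canvas of the same size, filled with the background color.
    canvas = Image.new("RGBA", rgba.size, tuple(background) + (255,))
    return Image.alpha_composite(canvas, rgba).convert("RGB")
```

A typical call would be `flatten_alpha(Image.open("input.png")).save("output.jpg", "JPEG", quality=90)`; the file names and quality value are placeholders.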

### Image: SVG to PNG

```python
import cairosvg

# scale=2 renders at twice the SVG's nominal size for a sharper raster.
cairosvg.svg2png(url="input.svg", write_to="output.png", scale=2)
```

## Resources

Detailed guidance for complex conversions is in `references/`:

- `references/document-conversions.md` - PDF handling, encoding issues, styling preservation
- `references/data-conversions.md` - Schema handling, type coercion, nested structures
- `references/image-conversions.md` - Quality settings, transparency, color profiles

Consult these references when handling edge cases or when the user has specific quality/fidelity requirements.

Overview

This skill converts files between common document, data, and image formats and generates runnable Python conversion code tailored to the request. It picks appropriate libraries, accounts for common edge cases like encoding and transparency, and can produce both code snippets and runnable scripts. Use it when you need reliable, reproducible format transformations or export pipelines.

How this skill works

The skill detects source and target formats from filenames or user instructions, then selects a recommended Python library and generates conversion code with sensible defaults. It includes handling for encoding, nested data structures, and image quality or transparency settings. For complex cases it suggests patterns and checks for platform constraints (for example, tools that require native binaries). Finally, it can present the code and explain execution steps and caveats.
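
The platform-constraint check mentioned above could look something like the following sketch. The `NATIVE_DEPENDENCIES` mapping and the function name are hypothetical, and the executable names (for example `soffice` for LibreOffice) are assumptions that can differ between installations.

```python
import shutil

# Assumed mapping from a conversion tool to the executable it needs on PATH.
NATIVE_DEPENDENCIES = {
    "pdfkit": "wkhtmltopdf",
    "LibreOffice CLI": "soffice",
    "potrace": "potrace",
}


def missing_native_tools(libraries: list) -> dict:
    """Return {library: executable} for each library whose binary is not on PATH."""
    missing = {}
    for lib in libraries:
        exe = NATIVE_DEPENDENCIES.get(lib)
        # Libraries without an entry are treated as pure Python and skipped.
        if exe is not None and shutil.which(exe) is None:
            missing[lib] = exe
    return missing
```

Running such a check before generating code lets the skill fall back to an alternative, such as `weasyprint` in place of `pdfkit`, when a binary is absent.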

When to use it

  • Convert documents between Markdown, HTML, PDF, DOCX, and plain text.
  • Transform data files among JSON, YAML, CSV, XML, and TOML for ingestion or storage.
  • Batch convert or resize images (PNG, JPG, WebP, GIF, SVG) and control quality/transparency.
  • Generate reproducible Python scripts for automated pipelines or CI tasks.
  • Get examples that handle encoding, nested structures, or format-specific pitfalls.

Best practices

  • Identify formats by extension and verify content before batch runs to avoid corrupted outputs.
  • Prefer the stable libraries listed for each conversion (e.g., Pillow for raster images, pyyaml for YAML).
  • Normalize data types and schemas before converting between CSV and hierarchical formats to avoid type loss.
  • Test conversions on representative samples and include logging for failures and skipped items.
  • Specify quality, scale, and color/profile settings for image exports to preserve fidelity.
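
Verifying content before a batch run, as the first practice suggests, can be done by comparing the extension with the file's leading bytes. The sketch below covers only a handful of binary formats, and the names `SIGNATURES`, `sniff_format`, and `extension_matches_content` are made up for illustration.

```python
from typing import Optional

# Leading byte signatures for a few binary formats.
SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpg": b"\xff\xd8\xff",
    "gif": b"GIF8",
    "pdf": b"%PDF-",
}


def sniff_format(header: bytes) -> Optional[str]:
    """Guess a format from the first bytes of a file, or return None if unknown."""
    for name, magic in SIGNATURES.items():
        if header.startswith(magic):
            return name
    # WebP is a RIFF container: "RIFF", four size bytes, then "WEBP".
    if header[:4] == b"RIFF" and header[8:12] == b"WEBP":
        return "webp"
    return None


def extension_matches_content(path: str) -> bool:
    """Return True when the file extension agrees with the sniffed content."""
    with open(path, "rb") as f:
        detected = sniff_format(f.read(12))
    ext = path.rsplit(".", 1)[-1].lower()
    ext = "jpg" if ext == "jpeg" else ext
    return detected == ext
```

Files that fail the check can be logged and skipped rather than passed to a converter that would produce corrupted output.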

Example use cases

  • Generate a Python script to convert a folder of Markdown files to HTML and then to PDF for printing.
  • Convert JSON API dumps to CSV for spreadsheet analysis, flattening nested object fields into columns so they are preserved.
  • Transform a collection of SVG logos to optimized PNG and WebP variants with specified scales and quality.
  • Produce a one-off script to extract text from PDFs into plain text using recommended PDF libraries.

FAQ

Which libraries should I use for common conversions?

Use Pillow for raster image work, pyyaml for YAML, pandas or csv+json for CSV/JSON, and markdown/mistune for Markdown to HTML. The skill chooses the recommended option per conversion.

Can the skill handle large batches and streaming data?

Yes, with adaptation. Generated code can be reworked for streaming or chunked processing. For very large files, use libraries that support streaming reads and writes, and add error handling and retries.
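
As one example of a streaming adaptation, the CSV to JSON pattern can write JSON Lines (one object per line) instead of a single array, so no more than one row is held in memory. This swaps the output format, which is an assumption about what the consumer accepts; the function name `csv_to_jsonl` is hypothetical.

```python
import csv
import json
from typing import IO


def csv_to_jsonl(src: IO[str], dst: IO[str]) -> int:
    """Stream CSV rows to JSON Lines one at a time; return the number of rows written."""
    count = 0
    for row in csv.DictReader(src):
        # Each row is serialized and written immediately, then discarded.
        dst.write(json.dumps(row, ensure_ascii=False) + "\n")
        count += 1
    return count
```

The function takes open text streams, so it works with files opened via `open("input.csv", newline="", encoding="utf-8")` as well as with in-memory buffers in tests.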