markdown-converter skill

This skill converts documents and assets to Markdown using uvx markitdown, preserving structure for easy LLM processing and text analysis.

This is most likely a fork of the markdown-converter skill from steipete.

npx playbooks add skill intellectronica/agent-skills --skill markdown-converter

Review the files below or copy the command above to add this skill to your agents.

Files (1)
SKILL.md
---
name: markdown-converter
description: Convert documents and files to Markdown using markitdown. Use when converting PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls), HTML, CSV, JSON, XML, images (with EXIF/OCR), audio (with transcription), ZIP archives, YouTube URLs, or EPubs to Markdown format for LLM processing or text analysis.
---

# Markdown Converter

Convert files to Markdown using `uvx markitdown` — no installation required.

## Basic Usage

```bash
# Convert to stdout
uvx markitdown input.pdf

# Save to file
uvx markitdown input.pdf -o output.md
uvx markitdown input.docx > output.md

# From stdin
cat input.pdf | uvx markitdown
```

## Supported Formats

- **Documents**: PDF, Word (.docx), PowerPoint (.pptx), Excel (.xlsx, .xls)
- **Web/Data**: HTML, CSV, JSON, XML
- **Media**: Images (EXIF + OCR), Audio (EXIF + transcription)
- **Other**: ZIP (iterates contents), YouTube URLs, EPub

## Options

```bash
-o OUTPUT      # Output file
-x EXTENSION   # Hint file extension (for stdin)
-m MIME_TYPE   # Hint MIME type
-c CHARSET     # Hint charset (e.g., UTF-8)
-d             # Use Azure Document Intelligence
-e ENDPOINT    # Document Intelligence endpoint
--use-plugins  # Enable 3rd-party plugins
--list-plugins # Show installed plugins
```

## Examples

```bash
# Convert Word document
uvx markitdown report.docx -o report.md

# Convert Excel spreadsheet
uvx markitdown data.xlsx > data.md

# Convert PowerPoint presentation
uvx markitdown slides.pptx -o slides.md

# Convert with file type hint (for stdin)
cat document | uvx markitdown -x .pdf > output.md

# Use Azure Document Intelligence for better PDF extraction
uvx markitdown scan.pdf -d -e "https://your-resource.cognitiveservices.azure.com/"
```

## Notes

- Output preserves document structure: headings, tables, lists, links
- First run caches dependencies; subsequent runs are faster
- For complex PDFs with poor extraction, use `-d` with Azure Document Intelligence

Overview

This skill converts a wide range of documents and media into clean, structured Markdown using the markitdown tool. It supports PDFs, Office files, HTML, CSV/JSON/XML, images (with EXIF/OCR), audio (with transcription), ZIP archives, YouTube links, and ePubs. Use it to make files ready for LLM ingestion, text analysis, or content repurposing. No separate installation step is required: running the tool through uvx fetches it on first use and caches its dependencies for later runs.

How this skill works

The skill runs markitdown to parse input files and emit Markdown that preserves headings, lists, tables, and links. It accepts files, stdin, and URLs, and it can iterate ZIP contents or transcribe audio before conversion. Options let you hint file extension, MIME type or charset, enable Azure Document Intelligence for difficult PDFs, and toggle third-party plugins for specialized parsing.

When to use it

  • Preparing documents for LLM prompts or vectorization (embeddings)
  • Converting reports, slides, or spreadsheets into editable Markdown
  • Extracting readable text from scans or images via OCR
  • Batch-processing archives (ZIP) or YouTube transcripts into text
  • Transforming web or data formats (HTML/CSV/JSON/XML) to Markdown for review

Best practices

  • Provide a file extension hint (-x) when piping content via stdin to improve format detection
  • Use -o to write output to a file for large conversions instead of stdout
  • Enable Azure Document Intelligence (-d) for scanned or complex PDFs to improve extraction quality
  • Run --list-plugins to inspect available plugins before enabling them with --use-plugins
  • Expect the first run to be slower while dependencies are downloaded and cached; subsequent conversions start faster
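
The advice about `-d` can be sketched as a fallback: try the plain conversion first and retry with Azure Document Intelligence only when nothing usable comes back. This is a minimal sketch, not part of the skill itself. The function name `convert_pdf`, the `DOCINTEL_ENDPOINT` environment variable, and the `MARKITDOWN` override hook are all hypothetical, and treating an empty output file as a sign of failed extraction is an assumption.

```bash
#!/usr/bin/env bash
# Hypothetical helper: convert a PDF, retrying with Azure Document
# Intelligence when the plain conversion fails or writes an empty file.
# DOCINTEL_ENDPOINT is an assumed variable holding the endpoint URL.
# MARKITDOWN is an assumed hook for swapping the command (e.g. in tests).
convert_pdf() {
  local input="$1" output="$2"

  # First attempt: plain conversion; -s checks the output is non-empty.
  if ${MARKITDOWN:-uvx markitdown} "$input" -o "$output" && [ -s "$output" ]; then
    return 0
  fi

  # Without an endpoint there is nothing to fall back to.
  if [ -z "${DOCINTEL_ENDPOINT:-}" ]; then
    echo "no text extracted and DOCINTEL_ENDPOINT is unset: $input" >&2
    return 1
  fi

  # Second attempt: same flags as the documented -d / -e example.
  ${MARKITDOWN:-uvx markitdown} "$input" -o "$output" -d -e "$DOCINTEL_ENDPOINT"
}
```

A call might look like `DOCINTEL_ENDPOINT="https://your-resource.cognitiveservices.azure.com/" convert_pdf scan.pdf scan.md`.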

Example use cases

  • Convert a research PDF to Markdown for summarization and note-taking
  • Turn a PowerPoint deck into a Markdown outline for content editing or a blog draft
  • Convert spreadsheets to Markdown tables for quick data review or import into Markdown-based tools
  • Process a folder of mixed files inside a ZIP archive and convert each item to Markdown
  • Transcribe a YouTube lecture and convert the transcript to structured Markdown for study notes
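
When the files to convert sit loose in a directory rather than inside a ZIP, a small loop can apply the same command to each one. The sketch below is illustrative only: `md_name`, `convert_dir`, and the `MARKITDOWN` override hook are hypothetical names, and the list of extensions is just a sample of the supported formats.

```bash
#!/usr/bin/env bash
# Hypothetical helper: derive the Markdown output path for an input file
# by replacing its final extension with .md.
md_name() {
  local input="$1"
  printf '%s.md\n' "${input%.*}"
}

# Hypothetical helper: convert every matching document in a directory.
# MARKITDOWN is an assumed hook for swapping the command (e.g. in tests).
convert_dir() {
  local dir="$1" f
  for f in "$dir"/*.pdf "$dir"/*.docx "$dir"/*.pptx "$dir"/*.xlsx; do
    [ -e "$f" ] || continue   # skip glob patterns that matched nothing
    ${MARKITDOWN:-uvx markitdown} "$f" -o "$(md_name "$f")" \
      || echo "failed: $f" >&2   # report the failure and keep going
  done
}
```

Running `convert_dir ./reports` would then write `report.md` next to `report.docx`, and so on for each matching file.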

FAQ

Can the skill handle scanned PDFs or images?

Yes. It uses OCR and can optionally use Azure Document Intelligence (-d) for better results on complex scans.

How do I convert piped input or stdin?

Pipe bytes to the tool and provide a hint (-x .pdf or -m MIME_TYPE) so the converter recognizes the format.
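
As a minimal sketch of that pattern, the wrapper below passes an extension hint along with piped input. The name `stdin_to_md` and the `MARKITDOWN` override hook are hypothetical; the wrapper only normalizes the hint to the leading-dot form used in this document's examples (`-x .pdf`).

```bash
#!/usr/bin/env bash
# Hypothetical wrapper: convert stdin to Markdown with an extension hint.
# MARKITDOWN is an assumed hook for swapping the command (e.g. in tests).
stdin_to_md() {
  local ext="$1"
  case "$ext" in
    .*) ;;               # already in the ".pdf" form
    *)  ext=".$ext" ;;   # turn "pdf" into ".pdf"
  esac
  ${MARKITDOWN:-uvx markitdown} -x "$ext"
}
```

Usage would look like `cat document | stdin_to_md pdf > output.md`.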

Does it preserve tables and links?

Yes. The output aims to preserve document structure including headings, tables, lists, and links whenever extraction succeeds.