MarkItDown MCP server for AI agents

MarkItDown is a lightweight Python utility that converts various file formats to Markdown for use with LLMs and text analysis pipelines. It preserves important document structure like headings, lists, tables, and links while converting documents such as PDFs, PowerPoint presentations, Word documents, Excel spreadsheets, images, audio files, and more into Markdown format.

Prerequisites

MarkItDown requires Python 3.10 or higher. It's recommended to use a virtual environment:

With standard Python:

python -m venv .venv
source .venv/bin/activate

With uv:

uv venv --python=3.12 .venv
source .venv/bin/activate
# NOTE: Be sure to use 'uv pip install' rather than just 'pip install'

With Anaconda:

conda create -n markitdown python=3.12
conda activate markitdown

Installation

Install MarkItDown with pip:

pip install 'markitdown[all]'

Or install from source:

git clone git@github.com:microsoft/markitdown.git
cd markitdown
pip install -e 'packages/markitdown[all]'

Usage

Command-Line Usage

Convert a file to Markdown:

markitdown path-to-file.pdf > document.md

Specify an output file:

markitdown path-to-file.pdf -o document.md

Pipe content:

cat path-to-file.pdf | markitdown
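
The same commands can be driven from a script. The following is a minimal sketch, not part of MarkItDown itself: build_command and convert_with_cli are hypothetical helper names, and the only CLI features used are the positional input path and the -o option shown above.

```python
import shutil
import subprocess
from pathlib import Path


def build_command(input_path, output_path=None):
    """Return the argv list for one markitdown CLI call."""
    cmd = ["markitdown", str(input_path)]
    if output_path is not None:
        # -o is the output-file option shown in the examples above.
        cmd += ["-o", str(output_path)]
    return cmd


def convert_with_cli(input_path, output_path):
    """Run the CLI if it is on PATH; return True when it exits with 0."""
    if shutil.which("markitdown") is None:
        return False  # CLI not installed in this environment
    completed = subprocess.run(build_command(input_path, output_path), check=False)
    return completed.returncode == 0


source = Path("path-to-file.pdf")
print(build_command(source, source.with_suffix(".md")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces.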

Optional Dependencies

You can install only the dependencies you need:

pip install 'markitdown[pdf,docx,pptx]'

Available optional dependencies:

[all]: Installs all optional dependencies
[pptx]: For PowerPoint files
[docx]: For Word files
[xlsx]: For Excel files
[xls]: For older Excel files
[pdf]: For PDF files
[outlook]: For Outlook messages
[az-doc-intel]: For Azure Document Intelligence
[audio-transcription]: For audio transcription
[youtube-transcription]: For YouTube video transcription
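
To work out which extras a set of files needs, something like the sketch below may help. The extras names are taken from the list above; the mapping from file extensions to extras (in particular ".msg" for Outlook) and the helper name pip_command_for are assumptions made for illustration.

```python
from pathlib import Path

# Extension-to-extra mapping. The extras names come from the list above;
# the file extensions (especially ".msg" for Outlook) are assumptions.
EXTRA_FOR_SUFFIX = {
    ".pptx": "pptx",
    ".docx": "docx",
    ".xlsx": "xlsx",
    ".xls": "xls",
    ".pdf": "pdf",
    ".msg": "outlook",
}


def pip_command_for(paths):
    """Build a pip command that installs only the extras these files need."""
    suffixes = {Path(p).suffix.lower() for p in paths}
    extras = sorted({EXTRA_FOR_SUFFIX[s] for s in suffixes if s in EXTRA_FOR_SUFFIX})
    if not extras:
        return "pip install markitdown"
    return "pip install 'markitdown[" + ",".join(extras) + "]'"


print(pip_command_for(["report.pdf", "slides.PPTX", "notes.txt"]))
# prints: pip install 'markitdown[pdf,pptx]'
```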

Using Plugins

List installed plugins:

markitdown --list-plugins

Enable plugins:

markitdown --use-plugins path-to-file.pdf

Azure Document Intelligence

Use Microsoft Document Intelligence for conversion:

markitdown path-to-file.pdf -o document.md -d -e "<document_intelligence_endpoint>"

More information about setting up an Azure Document Intelligence Resource can be found at Microsoft Learn.

Python API

Basic usage:

from markitdown import MarkItDown

md = MarkItDown(enable_plugins=False)  # Set to True to enable plugins
result = md.convert("test.xlsx")
print(result.text_content)
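
For more than one file, the same two calls (convert and text_content) can be wrapped in a loop. This is a rough sketch: convert_folder and its suffixes parameter are hypothetical, and the function accepts any object with a compatible convert method so it can be tried without MarkItDown installed.

```python
from pathlib import Path


def convert_folder(converter, src_dir, out_dir, suffixes=(".pdf", ".docx", ".xlsx")):
    """Convert matching files in src_dir and write one .md file per input.

    `converter` is any object with a convert(path) method whose result has a
    text_content attribute, such as the MarkItDown instance shown above.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(Path(src_dir).iterdir()):
        # Skip sub-directories and files whose type was not requested.
        if not path.is_file() or path.suffix.lower() not in suffixes:
            continue
        result = converter.convert(str(path))
        target = out_dir / (path.stem + ".md")
        target.write_text(result.text_content, encoding="utf-8")
        written.append(target)
    return written
```

With MarkItDown installed, a call might look like convert_folder(MarkItDown(enable_plugins=False), "docs", "markdown"). Files that share a stem (report.pdf and report.docx) would overwrite each other's output in this sketch.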

Document Intelligence conversion:

from markitdown import MarkItDown

md = MarkItDown(docintel_endpoint="<document_intelligence_endpoint>")
result = md.convert("test.pdf")
print(result.text_content)

Using LLMs for image descriptions:

from markitdown import MarkItDown
from openai import OpenAI

client = OpenAI()
md = MarkItDown(llm_client=client, llm_model="gpt-4o", llm_prompt="optional custom prompt")
result = md.convert("example.jpg")
print(result.text_content)

Docker Usage

docker build -t markitdown:latest .
docker run --rm -i markitdown:latest < ~/your-file.pdf > output.md

MCP Server

MarkItDown offers an MCP (Model Context Protocol) server for integration with LLM applications like Claude Desktop. To use the MCP server, you'll need to install the markitdown-mcp package, which is available in the main MarkItDown repository.

Installing the MCP Server

pip install 'markitdown-mcp'

Running the MCP Server

markitdown-mcp

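By default the server communicates over STDIO, which is what clients that launch it as a subprocess expect. To serve over HTTP instead, pass the --http option, optionally with --host and --port:

markitdown-mcp --http --host 127.0.0.1 --port 3001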

Once running, the MCP server can be integrated with LLM applications that support the Model Context Protocol to provide document conversion capabilities.
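
As an illustration of what such an integration does under the hood, here is a hedged sketch of a client written with the separate mcp Python SDK. Several things are assumptions rather than facts from this page: that the server speaks MCP over standard input/output when launched as a subprocess without options, that it exposes a tool named convert_to_markdown, and that the tool takes a single uri argument. Listing the tools with session.list_tools() is the reliable way to confirm the names.

```python
import asyncio
from pathlib import Path


def to_file_uri(path):
    """Turn a local path into an absolute file: URI."""
    return Path(path).resolve().as_uri()


async def convert_via_mcp(path):
    """Start markitdown-mcp as a subprocess and ask it to convert one file."""
    # Imported lazily so the helper above works without the `mcp` package.
    from mcp import ClientSession, StdioServerParameters
    from mcp.client.stdio import stdio_client

    params = StdioServerParameters(command="markitdown-mcp", args=[])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Assumed tool name and argument; verify with session.list_tools().
            result = await session.call_tool(
                "convert_to_markdown", {"uri": to_file_uri(path)}
            )
            # Keep only the text blocks of the tool result.
            return "".join(
                block.text for block in result.content if hasattr(block, "text")
            )


def convert(path):
    """Synchronous wrapper around convert_via_mcp."""
    return asyncio.run(convert_via_mcp(path))
```

Calling convert("test.pdf") requires both the mcp and markitdown-mcp packages to be installed in the active environment.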

How to install this MCP server

For Claude Code

To add this MCP server to Claude Code, run this command in your terminal:

claude mcp add-json "markitdown-mcp" '{"command":"npx","args":["-y","markitdown-mcp"]}'

See the official Claude Code MCP documentation for more details.

For Cursor

There are two ways to add an MCP server to Cursor. The most common way is to add the server globally in the ~/.cursor/mcp.json file so that it is available in all of your projects.

If you only need the server in a single project, add it to that project's .cursor/mcp.json file instead, creating the file if it does not exist.

Adding an MCP server to Cursor globally

To add a global MCP server, go to Cursor Settings > Tools & Integrations and click "New MCP Server".

Clicking that button opens the ~/.cursor/mcp.json file, where you can add your server like this:

{
    "mcpServers": {
        "markitdown-mcp": {
            "command": "npx",
            "args": [
                "-y",
                "markitdown-mcp"
            ]
        }
    }
}

Adding an MCP server to a project

To add an MCP server to a project, create a new .cursor/mcp.json file in the project or add the server to the existing one. The entry looks exactly the same as in the global example above.
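
If the file already lists other servers, editing it by hand risks clobbering them. The sketch below shows one way to merge an entry in programmatically; add_mcp_server is a hypothetical helper, and the only structure it assumes is the "mcpServers" object shown in the example above.

```python
import json
from pathlib import Path


def add_mcp_server(config_path, name, command, args):
    """Add or update one entry under "mcpServers", keeping other entries."""
    config_path = Path(config_path)
    if config_path.exists():
        config = json.loads(config_path.read_text(encoding="utf-8"))
    else:
        config = {}
    servers = config.setdefault("mcpServers", {})
    servers[name] = {"command": command, "args": list(args)}
    # Create the .cursor directory on first use.
    config_path.parent.mkdir(parents=True, exist_ok=True)
    config_path.write_text(json.dumps(config, indent=4) + "\n", encoding="utf-8")
    return config


# Project-level file; use Path.home() / ".cursor" / "mcp.json" for the global one.
# add_mcp_server(".cursor/mcp.json", "markitdown-mcp", "npx", ["-y", "markitdown-mcp"])
```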

How to use the MCP server

Once the server is installed, you might need to head back to Settings > MCP and click the refresh button.

The Cursor agent will then be able to see the tools the added MCP server provides and will call them when it needs to.

You can also explicitly ask the agent to use the tool by mentioning the tool name and describing what the function does.

For Claude Desktop

To add this MCP server to Claude Desktop:

1. Find your configuration file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

2. Add this to your configuration file:

{
    "mcpServers": {
        "markitdown-mcp": {
            "command": "npx",
            "args": [
                "-y",
                "markitdown-mcp"
            ]
        }
    }
}

3. Restart Claude Desktop for the changes to take effect
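
Step 1 can also be scripted. The sketch below resolves the configuration file from the three locations listed above; claude_desktop_config_path is a hypothetical helper, and the fallback to AppData/Roaming when %APPDATA% is unset is an assumption.

```python
import os
import platform
from pathlib import Path


def claude_desktop_config_path(system=None, home=None, appdata=None):
    """Return the claude_desktop_config.json location for a platform."""
    system = system or platform.system()
    home = Path(home) if home is not None else Path.home()
    if system == "Darwin":  # macOS
        base = home / "Library" / "Application Support" / "Claude"
    elif system == "Windows":
        # Assumed fallback when %APPDATA% is not set in the environment.
        appdata = appdata or os.environ.get("APPDATA") or home / "AppData" / "Roaming"
        base = Path(appdata) / "Claude"
    else:  # treated as Linux
        base = home / ".config" / "Claude"
    return base / "claude_desktop_config.json"


print(claude_desktop_config_path())
```

The system, home, and appdata parameters exist only so the lookup can be checked for a platform other than the one running the script.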