PDF Reader MCP server for AI agents

@sylphx/pdf-reader-mcp is a production-ready MCP server that lets AI agents extract text, images, and metadata from PDF files. Pages are processed in parallel, which the project reports as 5-10x faster processing, and content is ordered by Y-coordinate so that text and images keep the document's natural reading order.

Installation

Claude Code

claude mcp add pdf-reader -- npx @sylphx/pdf-reader-mcp

Claude Desktop

Add to claude_desktop_config.json:

{
  "mcpServers": {
    "pdf-reader": {
      "command": "npx",
      "args": ["@sylphx/pdf-reader-mcp"]
    }
  }
}

VS Code

code --add-mcp '{"name":"pdf-reader","command":"npx","args":["@sylphx/pdf-reader-mcp"]}'

Cursor

Open Settings → MCP → Add new MCP Server
Select Command type
Enter: npx @sylphx/pdf-reader-mcp

Windsurf

Add to your Windsurf MCP config:

{
  "mcpServers": {
    "pdf-reader": {
      "command": "npx",
      "args": ["@sylphx/pdf-reader-mcp"]
    }
  }
}

Cline

Add to Cline's MCP settings:

{
  "mcpServers": {
    "pdf-reader": {
      "command": "npx",
      "args": ["@sylphx/pdf-reader-mcp"]
    }
  }
}

Warp

Go to Settings → AI → Manage MCP Servers → Add
Command: npx, Args: @sylphx/pdf-reader-mcp

Smithery (One-click)

npx -y @smithery/cli install @sylphx/pdf-reader-mcp --client claude

Manual Installation

# Quick start - zero installation
npx @sylphx/pdf-reader-mcp

# Or install globally
npm install -g @sylphx/pdf-reader-mcp

Quick Start

Basic Usage

{
  "sources": [{
    "path": "documents/report.pdf"
  }],
  "include_full_text": true,
  "include_metadata": true,
  "include_page_count": true
}

Result:

✅ Full text content extracted
✅ PDF metadata (author, title, dates)
✅ Total page count
✅ Structural sharing - unchanged parts preserved

Extract Specific Pages

{
  "sources": [{
    "path": "documents/manual.pdf",
    "pages": "1-5,10,15-20"
  }],
  "include_full_text": true
}

Absolute Paths (v1.3.0+)

// Windows - Both formats work!
{
  "sources": [{
    "path": "C:\\Users\\John\\Documents\\report.pdf"
  }],
  "include_full_text": true
}

// Unix/Mac
{
  "sources": [{
    "path": "/home/user/documents/contract.pdf"
  }],
  "include_full_text": true
}

Extract Images with Natural Ordering

{
  "sources": [{
    "path": "presentation.pdf",
    "pages": [1, 2, 3]
  }],
  "include_images": true,
  "include_full_text": true
}

Response includes:

Text and images in exact document order (Y-coordinate sorted)
Base64-encoded images with metadata (width, height, format)
Natural reading flow preserved for AI comprehension
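
As a rough illustration of Y-coordinate ordering, the sketch below sorts a page's mixed text and image items from top to bottom. The PageItem shape and the orderByY name are hypothetical, since the server's internal types are not shown here. It also assumes positions are in default PDF user space, where y = 0 is at the bottom of the page, so a larger y means higher on the page.

```typescript
// Hypothetical item shape; not taken from the server's source.
interface PageItem {
  kind: "text" | "image";
  y: number;       // vertical position, assuming PDF user space (origin at bottom-left)
  content: string; // text, or base64 image data
}

// Reading order runs top to bottom, which means descending y under the
// bottom-left-origin assumption. slice() keeps the input array unmodified.
function orderByY(items: PageItem[]): PageItem[] {
  return items.slice().sort((a, b) => b.y - a.y);
}

const page: PageItem[] = [
  { kind: "text", y: 100, content: "footer" },
  { kind: "text", y: 700, content: "title" },
  { kind: "image", y: 400, content: "figure" },
];
console.log(orderByY(page).map((item) => item.content));
```

Sorting on a single coordinate is the simplest reading of "Y-coordinate sorted"; multi-column layouts would need more than this.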

Batch Processing

{
  "sources": [
    { "path": "C:\\Reports\\Q1.pdf", "pages": "1-10" },
    { "path": "/home/user/Q2.pdf", "pages": "1-10" },
    { "url": "https://example.com/Q3.pdf" }
  ],
  "include_full_text": true
}

⚡ All PDFs processed in parallel automatically!
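
One way to get this kind of concurrency with per-source error isolation is to map every source to a promise that never rejects and await them together with Promise.all. This is a sketch of the pattern only: readAll, SourceResult, and the reader callback are hypothetical names, and the server's actual implementation may differ.

```typescript
type Source = { path?: string; url?: string };

type SourceResult =
  | { source: Source; ok: true; text: string }
  | { source: Source; ok: false; error: string };

// Runs `reader` on every source concurrently. A failure in one source is
// captured in its own result instead of rejecting the whole batch.
async function readAll(
  sources: Source[],
  reader: (source: Source) => Promise<string>
): Promise<SourceResult[]> {
  return Promise.all(
    sources.map(async (source): Promise<SourceResult> => {
      try {
        return { source, ok: true, text: await reader(source) };
      } catch (err) {
        const error = err instanceof Error ? err.message : String(err);
        return { source, ok: false, error };
      }
    })
  );
}

// Fake reader for demonstration; a real one would load and parse the PDF.
const fakeReader = async (source: Source): Promise<string> => {
  if (source.path === "missing.pdf") throw new Error("File not found");
  return "text of " + (source.path ?? source.url ?? "unknown");
};

readAll([{ path: "Q1.pdf" }, { path: "missing.pdf" }], fakeReader).then((results) =>
  console.log(results.map((r) => (r.ok ? "ok" : "failed: " + r.error)))
);
```

Results come back in the same order as the input sources, because Promise.all preserves order regardless of which promise settles first.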

Features

Core Capabilities

✅ Text Extraction - Full document or specific pages with intelligent parsing
✅ Image Extraction - Base64-encoded with complete metadata (width, height, format)
✅ Content Ordering - Y-coordinate based layout preservation for natural reading flow
✅ Metadata Extraction - Author, title, creation date, and custom properties
✅ Page Counting - Fast enumeration without loading full content
✅ Dual Sources - Local files (absolute or relative paths) and HTTP/HTTPS URLs
✅ Batch Processing - Multiple PDFs processed concurrently

Advanced Features

⚡ 5-10x Performance - Parallel page processing with Promise.all
🎯 Smart Pagination - Extract ranges like "1-5,10-15,20"
🖼️ Multi-Format Images - RGB, RGBA, Grayscale with automatic detection
🛡️ Path Flexibility - Windows, Unix, and relative paths all supported (v1.3.0)
🔍 Error Resilience - Per-page error isolation with detailed messages
📏 Large File Support - Efficient streaming and memory management
📝 Type Safe - Full TypeScript with strict mode enabled

API Reference

`read_pdf` Tool

The single tool that handles all PDF operations.
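
For context, MCP clients invoke a tool by sending a JSON-RPC 2.0 request with the method tools/call. The sketch below builds that envelope around a read_pdf call. buildToolCall is a hypothetical helper name; in practice your MCP client constructs this message for you, so this is only meant to show where the JSON arguments from the examples end up.

```typescript
// Builds the JSON-RPC 2.0 envelope for an MCP "tools/call" request.
// Hypothetical helper for illustration; MCP clients normally do this internally.
function buildToolCall(id: number, toolName: string, args: Record<string, unknown>) {
  return {
    jsonrpc: "2.0" as const,
    id,
    method: "tools/call" as const,
    params: { name: toolName, arguments: args },
  };
}

const request = buildToolCall(1, "read_pdf", {
  sources: [{ path: "documents/report.pdf" }],
  include_full_text: true,
});
console.log(JSON.stringify(request, null, 2));
```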

Parameters

Parameter	Type	Description	Default
`sources`	Array	List of PDF sources to process	Required
`include_full_text`	boolean	Extract full text content	`false`
`include_metadata`	boolean	Extract PDF metadata	`true`
`include_page_count`	boolean	Include total page count	`true`
`include_images`	boolean	Extract embedded images	`false`

Source Object

{
  path?: string;        // Local file path (absolute or relative)
  url?: string;         // HTTP/HTTPS URL to PDF
  pages?: string | number[];  // Pages to extract: "1-5,10" or [1,2,3]
}
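
The parameter table and source object can be written down as TypeScript types, with the documented defaults applied when a flag is omitted. This is a sketch only: ReadPdfArgs, withDefaults, and hasSingleLocation are hypothetical names, and the rule that each source sets exactly one of path or url is an assumption, not something stated above.

```typescript
interface PdfSource {
  path?: string;             // local file path (absolute or relative)
  url?: string;              // HTTP/HTTPS URL to a PDF
  pages?: string | number[]; // "1-5,10" or [1, 2, 3]
}

interface ReadPdfArgs {
  sources: PdfSource[];
  include_full_text?: boolean;
  include_metadata?: boolean;
  include_page_count?: boolean;
  include_images?: boolean;
}

// Fills in the defaults listed in the parameter table.
function withDefaults(args: ReadPdfArgs): Required<ReadPdfArgs> {
  return {
    sources: args.sources,
    include_full_text: args.include_full_text ?? false,
    include_metadata: args.include_metadata ?? true,
    include_page_count: args.include_page_count ?? true,
    include_images: args.include_images ?? false,
  };
}

// Assumption: a source should name exactly one location, path or url.
function hasSingleLocation(source: PdfSource): boolean {
  return (source.path !== undefined) !== (source.url !== undefined);
}

console.log(withDefaults({ sources: [{ path: "large.pdf" }] }));
```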

Examples

Metadata only (fast):

{
  "sources": [{ "path": "large.pdf" }],
  "include_metadata": true,
  "include_page_count": true,
  "include_full_text": false
}

From URL:

{
  "sources": [{
    "url": "https://arxiv.org/pdf/2301.00001.pdf"
  }],
  "include_full_text": true
}

Page ranges:

{
  "sources": [{
    "path": "manual.pdf",
    "pages": "1-5,10-15,20"  // Pages 1,2,3,4,5,10,11,12,13,14,15,20
  }]
}
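
To make the range syntax concrete, here is a minimal sketch of how a spec like "1-5,10-15,20" expands into page numbers. parsePageRanges is a hypothetical helper, not the server's parser; it assumes 1-based pages and rejects anything other than single numbers and closed ranges, which may be stricter or looser than what the server accepts.

```typescript
// Hypothetical helper: expands "1-5,10-15,20" into a sorted list of
// unique, 1-based page numbers.
function parsePageRanges(spec: string): number[] {
  const pages = new Set<number>();
  for (const part of spec.split(",")) {
    const token = part.trim();
    if (token === "") continue; // tolerate stray commas
    const match = /^(\d+)(?:-(\d+))?$/.exec(token);
    if (!match) throw new Error(`Invalid page range: "${token}"`);
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) throw new Error(`Invalid page range: "${token}"`);
    for (let page = start; page <= end; page++) pages.add(page);
  }
  return Array.from(pages).sort((a, b) => a - b);
}

console.log(parsePageRanges("1-5,10-15,20"));
// → [1, 2, 3, 4, 5, 10, 11, 12, 13, 14, 15, 20]
```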

Troubleshooting

"Absolute paths are not allowed"

Solution: Upgrade to v1.3.0+

npm update @sylphx/pdf-reader-mcp

Restart your MCP client completely.

"File not found"

Causes:

File doesn't exist at path
Wrong working directory
Permission issues

Solutions:

Use absolute path:

{ "path": "C:\\Full\\Path\\file.pdf" }

Or configure cwd:

{
  "pdf-reader-mcp": {
    "command": "npx",
    "args": ["@sylphx/pdf-reader-mcp"],
    "cwd": "/path/to/docs"
  }
}
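
The working directory matters because a relative path only has meaning relative to some base directory. The sketch below shows the usual Node.js resolution rule using the built-in path module. That the server resolves relative paths against its cwd in exactly this way is an assumption drawn from the causes listed above, and resolveAgainstCwd is a hypothetical helper.

```typescript
import * as path from "node:path";

// Hypothetical helper: absolute paths pass through unchanged, relative
// paths are resolved against the given working directory. The `impl`
// parameter selects POSIX or Windows rules so the demo is deterministic.
function resolveAgainstCwd(
  cwd: string,
  filePath: string,
  impl: typeof path.posix = path
): string {
  return impl.isAbsolute(filePath) ? filePath : impl.resolve(cwd, filePath);
}

// Relative path: the result depends on cwd.
console.log(resolveAgainstCwd("/path/to/docs", "reports/q1.pdf", path.posix));
// Absolute Windows path: cwd is ignored.
console.log(resolveAgainstCwd("C:\\Other", "C:\\Full\\Path\\file.pdf", path.win32));
```

If a relative path fails with "File not found", comparing it against the directory the MCP client actually launches the server from is usually the quickest check.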

"No tools showing up"

Solution:

npm cache clean --force
rm -rf node_modules package-lock.json
npm install @sylphx/pdf-reader-mcp@latest

Restart MCP client completely.

How to install this MCP server

For Claude Code

To add this MCP server to Claude Code, run this command in your terminal:

claude mcp add-json "pdf-reader-mcp" '{"command":"npx","args":["@sylphx/pdf-reader-mcp"],"name":"PDF Reader (npx)"}'

See the official Claude Code MCP documentation for more details.

For Cursor

There are two ways to add an MCP server to Cursor. The most common way is to add the server globally in the ~/.cursor/mcp.json file so that it is available in all of your projects.

If you only need the server in a single project, add it to that project's .cursor/mcp.json file instead, creating the file if it does not exist.

Adding an MCP server to Cursor globally

To add a global MCP server, go to Cursor Settings > Tools & Integrations and click "New MCP Server".

Clicking that button opens the ~/.cursor/mcp.json file, where you can add the server like this:

{
    "mcpServers": {
        "pdf-reader-mcp": {
            "command": "npx",
            "args": [
                "@sylphx/pdf-reader-mcp"
            ],
            "name": "PDF Reader (npx)"
        }
    }
}

Adding an MCP server to a project

To add an MCP server to a project, create a .cursor/mcp.json file in the project or edit the existing one. The entry is identical to the global MCP server example above.

How to use the MCP server

Once the server is installed, you might need to head back to Settings > MCP and click the refresh button.

The Cursor agent can then see the tools the server exposes and will call them when it needs to.

You can also explicitly ask the agent to use a tool by mentioning its name (here, read_pdf) and describing what you want it to do.

For Claude Desktop

To add this MCP server to Claude Desktop:

1. Find your configuration file:

macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
Windows: %APPDATA%\Claude\claude_desktop_config.json
Linux: ~/.config/Claude/claude_desktop_config.json

2. Add this to your configuration file:

{
    "mcpServers": {
        "pdf-reader-mcp": {
            "command": "npx",
            "args": [
                "@sylphx/pdf-reader-mcp"
            ],
            "name": "PDF Reader (npx)"
        }
    }
}

3. Restart Claude Desktop for the changes to take effect
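
When the configuration file already lists other servers, the new entry has to be merged into the existing mcpServers object rather than replacing the whole file. A minimal sketch of that merge follows; addServer and the McpConfig types are hypothetical, and reading and writing the file itself is left out.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
  cwd?: string;
}

interface McpConfig {
  mcpServers?: Record<string, McpServerEntry>;
  [key: string]: unknown; // any other settings in the file are kept as-is
}

// Returns a new config with `entry` added under `name`, preserving
// existing servers and unrelated top-level keys.
function addServer(config: McpConfig, name: string, entry: McpServerEntry): McpConfig {
  return {
    ...config,
    mcpServers: { ...(config.mcpServers ?? {}), [name]: entry },
  };
}

const existing: McpConfig = {
  mcpServers: { other: { command: "node", args: ["other-server.js"] } },
};
const updated = addServer(existing, "pdf-reader", {
  command: "npx",
  args: ["@sylphx/pdf-reader-mcp"],
});
console.log(JSON.stringify(updated, null, 2));
```

The same shape applies to the Cursor, Windsurf, and Cline configurations shown earlier, which all use a top-level mcpServers object.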
