PDF Reader MCP Server
Provides enterprise-grade PDF processing for AI agents, extracting text, images, and metadata from PDFs via a single MCP tool.
Configuration
{
  "mcpServers": {
    "bach-ai-tools-pdf-reader-mcp": {
      "command": "npx",
      "args": [
        "@bachstudio/pdf-reader-mcp"
      ]
    }
  }
}

PDF Reader MCP is a production-ready Model Context Protocol server that enables AI agents to process PDFs at enterprise scale. It can extract full text, images, and metadata with fast, parallelized processing and reliable per-page error handling, preserving document structure for AI comprehension.
Run PDF Reader MCP as an MCP server and connect your MCP client to it. The client loads the server configuration and then issues processing requests for PDFs. Use it to extract text, images, and metadata from local files or URLs, with support for page ranges, per-page content ordering by Y-coordinate, and batch processing.
Prerequisites: Node.js 22 or newer and a package manager (pnpm is recommended; npm, yarn, and other npm-compatible tools also work). Install and run the MCP server using the commands below.
# Quick start - no installation required for the host tool
npx @bachstudio/pdf-reader-mcp
# Or install with pnpm (recommended)
pnpm add @bachstudio/pdf-reader-mcp
# Or using npm
npm install @bachstudio/pdf-reader-mcp
# Or using yarn
yarn add @bachstudio/pdf-reader-mcp
# For Claude Desktop (easiest integration)
npx -y @smithery/cli install @bachstudio/pdf-reader-mcp --client claude

{
  "mcpServers": {
    "pdf-reader-mcp": {
      "command": "npx",
      "args": ["@bachstudio/pdf-reader-mcp"]
    }
  }
}

After configuring the client, you can request processing for a document by specifying the source path (or URL) and optional parameters such as whether to include full text, metadata, and page count. The server returns text, metadata, and page structure in a single response, with images included if requested.
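As a rough illustration of such a request, the sketch below builds a JSON-RPC message for MCP's `tools/call` method. The tool name `read_pdf` and the argument names (`sources`, `include_full_text`, `include_metadata`, `include_page_count`) and their defaults are assumptions made for this example; the authoritative names come from the input schema the server reports via `tools/list`.

```typescript
// Shape of a JSON-RPC 2.0 request for MCP's `tools/call` method.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical options for this sketch; the real argument names are
// defined by the tool's input schema, not by this example.
interface ReadOptions {
  includeFullText?: boolean;
  includeMetadata?: boolean;
  includePageCount?: boolean;
}

function buildReadRequest(
  id: number,
  source: string,
  options: ReadOptions = {}
): ToolCallRequest {
  // Treat http(s) sources as URLs and everything else as a local path.
  const isUrl = /^https?:\/\//i.test(source);
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: {
      name: "read_pdf", // assumed tool name
      arguments: {
        sources: [isUrl ? { url: source } : { path: source }],
        include_full_text: options.includeFullText ?? false, // assumed default
        include_metadata: options.includeMetadata ?? true, // assumed default
        include_page_count: options.includePageCount ?? true, // assumed default
      },
    },
  };
}

const exampleRequest = buildReadRequest(1, "./documents/report.pdf", {
  includeFullText: true,
});
console.log(JSON.stringify(exampleRequest, null, 2));
```

In practice an MCP client library would send this message for you; the sketch only shows how the source and optional flags might be laid out in the call.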
You can target a subset of pages or include images in the response. Specify page ranges like "1-5,10-15" and enable image extraction to receive base64-encoded images along with their dimensions and format.
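To make the page range format concrete, here is a minimal sketch of how a string like "1-5,10-15" could be expanded into individual page numbers. The function name `parsePageRanges` is hypothetical and this is not the server's own parser; it only illustrates the notation under the assumption that pages are 1-based and ranges are inclusive.

```typescript
// Expand a page range string such as "1-5,10-15" into a sorted list of
// unique, 1-based page numbers. Illustrative only; not the server's parser.
function parsePageRanges(spec: string): number[] {
  const pages: number[] = [];
  for (const part of spec.split(",")) {
    const trimmed = part.trim();
    if (trimmed === "") continue; // tolerate stray commas
    // Accept either a single page ("7") or an inclusive range ("3-9").
    const match = /^(\d+)(?:-(\d+))?$/.exec(trimmed);
    if (!match) throw new Error(`Invalid page range: "${trimmed}"`);
    const start = Number(match[1]);
    const end = match[2] !== undefined ? Number(match[2]) : start;
    if (start < 1 || end < start) {
      throw new Error(`Invalid page range: "${trimmed}"`);
    }
    for (let page = start; page <= end; page++) {
      if (pages.indexOf(page) === -1) pages.push(page); // skip duplicates
    }
  }
  return pages.sort((a, b) => a - b);
}

console.log(parsePageRanges("1-5,10-15"));
// prints [1, 2, 3, 4, 5, 10, 11, 12, 13, 14, 15]
```

Validating the range on the client side before sending a request can surface typos such as "5-1" early, whatever the server itself does with malformed input.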
Process multiple PDFs in parallel by listing multiple sources in a single request. The server handles each PDF concurrently, leveraging multi-core systems for faster throughput.
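The idea of concurrent batch handling with isolated failures can be sketched as follows. This is not the server's implementation: `processBatch` and `fakeProcess` are hypothetical names, and the sketch shows only promise-based concurrency on a single thread, so it does not by itself demonstrate multi-core use.

```typescript
// Result for one source: either a value or an error message, never both.
type BatchResult<T> =
  | { source: string; ok: true; value: T }
  | { source: string; ok: false; error: string };

async function processBatch<T>(
  sources: string[],
  processOne: (source: string) => Promise<T>
): Promise<BatchResult<T>[]> {
  // Start every task before awaiting any of them so they run concurrently.
  const tasks = sources.map(async (source): Promise<BatchResult<T>> => {
    try {
      return { source, ok: true, value: await processOne(source) };
    } catch (err) {
      // A failure is recorded for this source only; the others continue.
      const message = err instanceof Error ? err.message : String(err);
      return { source, ok: false, error: message };
    }
  });
  return Promise.all(tasks);
}

// Stand-in for real PDF processing: fails for one source to show isolation.
async function fakeProcess(source: string): Promise<number> {
  if (source.indexOf("broken") !== -1) throw new Error("cannot parse");
  return source.length;
}

processBatch(["a.pdf", "broken.pdf", "c.pdf"], fakeProcess).then((results) => {
  for (const r of results) {
    console.log(r.source, r.ok ? "ok" : `failed: ${r.error}`);
  }
});
```

Returning one result per source, in input order, lets a caller match outcomes to the sources it listed even when some of them fail.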
The server exposes a single MCP tool that handles all PDF operations: text extraction, image extraction, content ordering by Y-coordinate, metadata extraction, and page counting.
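Content ordering by Y-coordinate can be pictured with a small sketch. The `PageItem` shape and the `orderByY` function are hypothetical and not taken from the server; the example assumes the default PDF coordinate system, where the origin is at the bottom-left of the page, so a larger Y value means the item sits higher up.

```typescript
// Hypothetical representation of one extracted item on a page.
interface PageItem {
  kind: "text" | "image";
  y: number; // vertical position on the page
  content: string;
}

// Return the items in top-to-bottom reading order.
// With a bottom-left origin, larger y values come first.
function orderByY(items: PageItem[], originAtBottom = true): PageItem[] {
  // Copy first so the caller's array is not mutated.
  return items
    .slice()
    .sort((a, b) => (originAtBottom ? b.y - a.y : a.y - b.y));
}

const pageItems: PageItem[] = [
  { kind: "text", y: 50, content: "Footer" },
  { kind: "image", y: 400, content: "figure-1" },
  { kind: "text", y: 700, content: "Heading" },
];

console.log(orderByY(pageItems).map((item) => item.content));
// prints ["Heading", "figure-1", "Footer"]
```

Sorting text and images together by position is what keeps a figure between the paragraphs that surround it, rather than grouping all images after all text.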