
Trafilatura MCP Server

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "fvanevski-trafilatura_mcp": {
      "command": "uv",
      "args": [
        "run",
        "python3",
        "trafilatura_mcp.py"
      ]
    }
  }
}
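
If you would rather add this entry programmatically than edit the file by hand, the merge can be sketched in Python as follows. The helper name `add_server` and the sample `existing` config are hypothetical; only the `mcpServers` structure, the server key, and the launch command come from the configuration above.

```python
import json


def add_server(config: dict, name: str, command: str, args: list) -> dict:
    """Return a copy of `config` with one entry added under "mcpServers"."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))
    servers[name] = {"command": command, "args": list(args)}
    merged["mcpServers"] = servers
    return merged


# Hypothetical existing client config that already lists another server.
existing = {"mcpServers": {"other-server": {"command": "other", "args": []}}}

updated = add_server(
    existing,
    "fvanevski-trafilatura_mcp",
    "uv",
    ["run", "python3", "trafilatura_mcp.py"],
)
print(json.dumps(updated, indent=2))
```

Because `trafilatura_mcp.py` is given as a relative path, the client presumably has to launch the command from the directory that contains the script. That is an inference from the configuration, not something this page states.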

This MCP server wraps the Trafilatura library to fetch web pages and extract their main content and metadata. It is designed for MCP-compatible clients, which can use it to retrieve article text, titles, authors, dates, and more from a given URL, with options to include or exclude comments and tables. The server runs asynchronously and communicates over standard I/O (stdio) for broad compatibility with MCP tooling.

How to use

Connect an MCP client to the Trafilatura MCP Server, list the available tools, and then invoke the fetch_and_extract tool with a URL. The tool returns the main article content along with metadata such as title, author, and date. You can tailor the output by enabling or disabling optional data such as comments and tables.
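
To make the list-then-call flow concrete, here is a minimal sketch of the JSON-RPC 2.0 messages an MCP client sends for those two steps. The method names `tools/list` and `tools/call` come from the MCP specification; the argument names `url`, `include_comments`, and `include_tables` are assumptions, since this page does not show the tool's input schema. A real session also begins with an `initialize` handshake, which an MCP client library normally handles for you.

```python
import json
from itertools import count

_request_ids = count(1)


def jsonrpc_request(method: str, params: dict = None) -> dict:
    """Build a JSON-RPC 2.0 request object with a fresh numeric id."""
    message = {"jsonrpc": "2.0", "id": next(_request_ids), "method": method}
    if params is not None:
        message["params"] = params
    return message


def list_tools_request() -> dict:
    """Request that asks the server which tools it exposes."""
    return jsonrpc_request("tools/list")


def call_tool_request(url: str, include_comments: bool, include_tables: bool) -> dict:
    """Request that invokes fetch_and_extract (argument names are assumed)."""
    return jsonrpc_request(
        "tools/call",
        {
            "name": "fetch_and_extract",
            "arguments": {
                "url": url,
                "include_comments": include_comments,
                "include_tables": include_tables,
            },
        },
    )


listing = list_tools_request()
call = call_tool_request("https://example.com/article", False, True)

# Over the stdio transport, each message is written as one line of JSON.
for message in (listing, call):
    print(json.dumps(message))
```

Check the schema returned by `tools/list` before relying on these argument names; the server's actual parameters may differ.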

How to install

Prerequisites: Python 3.12 or newer and uv, which the client configuration uses to launch the server.

Check that the prerequisites are available with the following commands.

python3 -V
uv --version

# Optional: verify that Node.js and npx are available for testing with MCP Inspector
node -v
npx -v

# No separate dependency installation step is listed here;
# this assumes the server runtime (uv) handles dependency setup
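
The same checks can be scripted. The sketch below is a hypothetical helper, not part of the server: it tests the version of the interpreter that runs it (which may differ from the `python3` on your PATH) and looks for a `uv` executable.

```python
import shutil
import sys


def check_prerequisites(version_info=sys.version_info, which=shutil.which) -> list:
    """Return a list of problems; an empty list means both checks passed."""
    problems = []
    # Compare only (major, minor) against the stated minimum of 3.12.
    if tuple(version_info[:2]) < (3, 12):
        problems.append("Python 3.12 or newer is required")
    # shutil.which returns None when the executable is not on PATH.
    if which("uv") is None:
        problems.append("uv was not found on PATH")
    return problems


if __name__ == "__main__":
    issues = check_prerequisites()
    for issue in issues:
        print(f"missing prerequisite: {issue}")
    if not issues:
        print("prerequisites look fine")
```

The `version_info` and `which` parameters exist only so the function can be exercised without changing the real environment.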

Additional sections

Configuration for the server is minimal; there are no external API keys or configuration files required to run the core server.

Testing and development can be done using the MCP Inspector tool, which can launch the server and provide an interactive shell for commands like list_tools and call_tool.
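
As a rough sketch of how that launch might look, the snippet below builds the command line that starts MCP Inspector with the server as its child process. It assumes the Inspector is run through its npm package `@modelcontextprotocol/inspector` and that you are in the directory containing `trafilatura_mcp.py`; the function name `inspector_command` is hypothetical.

```python
import shlex

# Server launch command, copied from the client configuration on this page.
SERVER_COMMAND = ["uv", "run", "python3", "trafilatura_mcp.py"]


def inspector_command(server_command: list) -> list:
    """Prefix the server command with the npx call that starts MCP Inspector."""
    return ["npx", "@modelcontextprotocol/inspector", *server_command]


command = inspector_command(SERVER_COMMAND)

# Print the command so it can be pasted into a terminal.
print(shlex.join(command))

# To launch it from Python instead: subprocess.run(command, check=True)
```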

Notes on running in your environment: make sure a Python 3.12+ interpreter and uv are both installed and available so the server can start.

Available tools

fetch_and_extract

Fetch a URL, extract the main text content and metadata with Trafilatura, and return structured output. You can configure whether to include comments and tables in the extraction.
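
For orientation, the following sketch shows roughly what such a tool could do with Trafilatura's public API. It is not the server's actual implementation, which this page does not show. `trafilatura.fetch_url` and `trafilatura.extract` are real library functions; the function names, defaults, and return shape here are assumptions.

```python
import json


def build_extract_options(include_comments: bool = True, include_tables: bool = True) -> dict:
    """Map the tool's two toggles onto keyword arguments for trafilatura.extract."""
    return {
        "include_comments": include_comments,
        "include_tables": include_tables,
        "with_metadata": True,    # ask for title, author, date, and similar fields
        "output_format": "json",  # structured output instead of plain text
    }


def fetch_and_extract(url: str, include_comments: bool = True, include_tables: bool = True):
    """Download a page and return extracted content as a dict, or None on failure."""
    import trafilatura  # imported lazily so the helper above works without it

    downloaded = trafilatura.fetch_url(url)
    if downloaded is None:
        return None  # download failed
    options = build_extract_options(include_comments, include_tables)
    raw = trafilatura.extract(downloaded, url=url, **options)
    return json.loads(raw) if raw else None


options = build_extract_options(include_comments=False, include_tables=True)
print(options)
```

Calling `fetch_and_extract("https://example.com/article", include_comments=False)` requires the `trafilatura` package and network access.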