Web Content Extractor MCP server

Extracts and processes web content using TypeScript, Cheerio, and Turndown for tasks like scraping, summarization, and data transformation.
Provider
Brian W. Smith
Release date
Jan 08, 2025
Language
TypeScript
Stats
7 stars

The MCP Webscan Server scans and analyzes web content through a Model Context Protocol (MCP) interface. Its tools fetch web pages, extract links, crawl sites, check for broken links, find URLs that match a pattern, and generate sitemaps.

Installation Options

Quick Installation via Smithery

The fastest way to install the Webscan server for Claude Desktop is through Smithery:

npx -y @smithery/cli install mcp-server-webscan --client claude

Manual Installation

If you prefer to install manually:

# Clone the repository
git clone <repository-url>
cd mcp-server-webscan

# Install dependencies
npm install

# Build the project
npm run build

Starting the Server

Once installed, start the server with:

npm start

The server communicates over the stdio transport, so it works with MCP clients such as Claude Desktop.

Configuring with Claude Desktop

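Add the server to the mcpServers section of your Claude Desktop configuration file (claude_desktop_config.json), replacing path/to/mcp-server-webscan with the location of your clone: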

{
  "mcpServers": {
    "webscan": {
      "command": "node",
      "args": ["path/to/mcp-server-webscan/build/index.js"],
      "env": {
        "NODE_ENV": "development",
        "LOG_LEVEL": "info"
      }
    }
  }
}

Available Tools

Fetch Page

Fetches a web page and converts its HTML to Markdown for easier analysis.

Parameters:

  • url (required): URL of the page to fetch
  • selector (optional): CSS selector to target specific content

Example usage:

Could you fetch the content from https://example.com and convert it to Markdown?
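Behind a prompt like this, an MCP client sends a JSON-RPC "tools/call" request to the server. The sketch below shows roughly what that request could look like. The tool name "fetch-page" and the helper buildToolCall are assumptions made for illustration; the source does not list the registered tool names, so check the server's tool list for the real one.

```typescript
// Shape of an MCP "tools/call" request (JSON-RPC 2.0).
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Hypothetical helper: builds the request a client would send to invoke a tool.
function buildToolCall(
  id: number,
  name: string,
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// "fetch-page" is an assumed tool name, not confirmed by the source.
const fetchPageRequest = buildToolCall(1, "fetch-page", {
  url: "https://example.com",
  selector: "main", // optional: restrict conversion to the <main> element
});
```

In practice the client builds this request for you; the sketch only makes the url and selector parameters concrete.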

Extract Links

Extracts all links from a web page along with their text.

Parameters:

  • url (required): URL of the page to analyze
  • baseUrl (optional): Base URL to filter links
  • limit (optional, default: 100): Maximum number of links to return
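As a rough illustration of how baseUrl and limit might interact, the sketch below resolves each href against the page URL, keeps only links under baseUrl, and stops at limit. The name selectLinks, the PageLink type, and the prefix-match interpretation of baseUrl are assumptions; the source does not say how the server actually filters.

```typescript
interface PageLink {
  href: string;
  text: string;
}

// Hypothetical helper: resolve, filter, and cap links taken from a page.
function selectLinks(
  links: PageLink[],
  pageUrl: string,
  baseUrl?: string,
  limit = 100,
): PageLink[] {
  const selected: PageLink[] = [];
  for (const link of links) {
    if (selected.length >= limit) break;
    let absolute: string;
    try {
      // Relative hrefs such as "/docs" are resolved against the page URL.
      absolute = new URL(link.href, pageUrl).href;
    } catch {
      continue; // skip hrefs that cannot be parsed as URLs
    }
    // Assumption: baseUrl filters by simple string prefix.
    if (baseUrl !== undefined && !absolute.startsWith(baseUrl)) continue;
    selected.push({ href: absolute, text: link.text.trim() });
  }
  return selected;
}
```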

Crawl Site

Recursively crawls a website up to a specified depth.

Parameters:

  • url (required): Starting URL to crawl
  • maxDepth (optional, default: 2): Maximum crawl depth (0-5)
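A depth-limited crawl can be sketched as a breadth-first traversal that records the depth at which each URL was first seen. This is an illustration of the idea, not the server's code: the names crawl and LinkLookup are hypothetical, a queue is used in place of recursion, and link lookup is passed in as a synchronous function so the example stays self-contained (a real crawler would await network fetches).

```typescript
// Hypothetical: returns the outgoing links of a page.
type LinkLookup = (url: string) => string[];

// Breadth-first crawl that never follows links from pages at maxDepth.
function crawl(startUrl: string, getLinks: LinkLookup, maxDepth = 2): Map<string, number> {
  if (!Number.isInteger(maxDepth) || maxDepth < 0 || maxDepth > 5) {
    throw new RangeError("maxDepth must be an integer from 0 to 5");
  }
  const depthByUrl = new Map<string, number>([[startUrl, 0]]);
  const queue: string[] = [startUrl];
  // URLs pushed during iteration are visited later by the same loop.
  for (const url of queue) {
    const depth = depthByUrl.get(url) ?? 0;
    if (depth >= maxDepth) continue; // depth limit reached for this page
    for (const next of getLinks(url)) {
      if (!depthByUrl.has(next)) {
        depthByUrl.set(next, depth + 1); // first visit fixes the depth
        queue.push(next);
      }
    }
  }
  return depthByUrl;
}
```

Tracking visited URLs in the map also prevents loops when pages link back to each other.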

Check Links

Identifies broken links on a web page.

Parameters:

  • url (required): URL to check links for
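The source does not define what counts as "broken". One plausible rule, sketched below, treats a link as broken when the request failed outright or returned an HTTP status of 400 or above. The LinkCheck type and findBrokenLinks function are hypothetical, and the status data is assumed to have been collected already.

```typescript
interface LinkCheck {
  url: string;
  status: number | null; // null means the request itself failed
}

// Assumed rule: failed requests and 4xx/5xx responses count as broken.
function findBrokenLinks(checks: LinkCheck[]): LinkCheck[] {
  return checks.filter((check) => check.status === null || check.status >= 400);
}
```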

Find Patterns

Locates URLs matching a specific pattern.

Parameters:

  • url (required): URL to search in
  • pattern (required): JavaScript-compatible regex pattern to match URLs against
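Because pattern is a JavaScript-compatible regex, matching can be pictured as compiling it with RegExp and testing each discovered URL. The helper names compilePattern and findMatchingUrls below are hypothetical, and rejecting invalid patterns with an error is an assumption about how the server behaves.

```typescript
// Hypothetical helper: compile the pattern, reporting invalid regex syntax clearly.
function compilePattern(pattern: string): RegExp {
  try {
    return new RegExp(pattern);
  } catch {
    throw new Error(`Invalid regex pattern: ${pattern}`);
  }
}

// Keep only the URLs that the pattern matches anywhere in the string.
function findMatchingUrls(urls: string[], pattern: string): string[] {
  const regex = compilePattern(pattern);
  return urls.filter((url) => regex.test(url));
}
```

For example, the pattern "/blog/\\d{4}/" (written as a TypeScript string literal) selects URLs that contain a four-digit year segment under /blog/.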

Generate Site Map

Creates a simple XML sitemap by crawling a website.

Parameters:

  • url (required): Root URL for sitemap crawl
  • maxDepth (optional, default: 2): Maximum crawl depth for discovering URLs (0-5)
  • limit (optional, default: 1000): Maximum number of URLs to include in the sitemap
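Once the crawl has produced a list of URLs, a simple sitemap is a urlset element with one url/loc entry per page, following the sitemaps.org 0.9 schema. The sketch below is a minimal illustration; the function names escapeXml and buildSitemap are hypothetical, and removing duplicates before applying limit is an assumption.

```typescript
// Escape the five characters that are special in XML text and attributes.
function escapeXml(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&apos;");
}

// Hypothetical helper: build a minimal XML sitemap from crawled URLs.
function buildSitemap(urls: string[], limit = 1000): string {
  // Assumption: duplicates are dropped before the limit is applied.
  const unique = Array.from(new Set(urls)).slice(0, limit);
  const entries = unique.map((url) => `  <url><loc>${escapeXml(url)}</loc></url>`);
  return [
    '<?xml version="1.0" encoding="UTF-8"?>',
    '<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">',
    ...entries,
    "</urlset>",
  ].join("\n");
}
```

Escaping matters because query strings often contain "&", which is not valid unescaped inside a loc element.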

Error Handling

The server handles the following error cases:

  • Invalid parameters
  • Network errors
  • Content parsing errors
  • URL validation failures

Errors are formatted according to the MCP specification, so clients can surface them directly when debugging.
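In MCP, a tool can report a failure by returning a result whose isError flag is true and whose content carries the message. The sketch below applies that shape to parameter validation for a crawl request. The names errorResult and validateCrawlArgs are hypothetical, and the source does not say whether this server reports failures this way or through JSON-RPC protocol errors.

```typescript
// Minimal shape of an MCP tool result that carries text content.
interface ToolResult {
  content: { type: "text"; text: string }[];
  isError?: boolean;
}

// Hypothetical helper: wrap a message as an MCP tool error result.
function errorResult(message: string): ToolResult {
  return { content: [{ type: "text", text: message }], isError: true };
}

// Hypothetical validator: returns an error result, or null when the arguments are valid.
function validateCrawlArgs(args: { url?: unknown; maxDepth?: unknown }): ToolResult | null {
  const url = args.url;
  if (typeof url !== "string") {
    return errorResult("url is required and must be a string");
  }
  try {
    new URL(url); // throws on strings that are not absolute URLs
  } catch {
    return errorResult(`Invalid URL: ${url}`);
  }
  const depth = args.maxDepth ?? 2; // documented default
  if (typeof depth !== "number" || !Number.isInteger(depth) || depth < 0 || depth > 5) {
    return errorResult("maxDepth must be an integer from 0 to 5");
  }
  return null;
}
```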

How to add this MCP server to Cursor

There are two ways to add an MCP server to Cursor. The most common way is to add the server globally in the ~/.cursor/mcp.json file so that it is available in all of your projects.

If you only need the server in a single project, add it to that project's .cursor/mcp.json file instead, creating the file if it does not exist.

Adding an MCP server to Cursor globally

To add a global MCP server, go to Cursor Settings > MCP and click "Add new global MCP server".

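When you click that button, the ~/.cursor/mcp.json file opens and you can add your server. The generic example below registers a server named cursor-rules-mcp; for the Webscan server, use the "webscan" entry from the Claude Desktop configuration above instead: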

{
    "mcpServers": {
        "cursor-rules-mcp": {
            "command": "npx",
            "args": [
                "-y",
                "cursor-rules-mcp"
            ]
        }
    }
}

Adding an MCP server to a project

To add an MCP server to a project, create a new .cursor/mcp.json file in the project or add the server to the existing one. The entry has the same format as the global MCP server example above.

How to use the MCP server

Once the server is installed, you might need to head back to Settings > MCP and click the refresh button.

The Cursor agent can then see the tools the server provides and will call them when it needs to.

You can also explicitly ask the agent to use a tool by mentioning the tool's name and describing what it does.
