Qwen3-VL Video Understanding MCP Server (Blaxel)

Provides MCP-based video and image analysis via Blaxel using Qwen3-VL-8B-Instruct, including summarization and text extraction.

Installation
Add the following to your MCP client configuration file.

Configuration

```json
{
  "mcpServers": {
    "adamanz-qwen-video-blaxel-mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/qwen-video-blaxel-mcp",
        "run",
        "server.py"
      ],
      "env": {
        "BLAXEL_MODEL": "qwen-qwen3-vl-8b-instruct",
        "BLAXEL_API_KEY": "YOUR_API_KEY",
        "BLAXEL_API_URL": "https://api.blaxel.ai/v1"
      }
    }
  }
}
```

Deploy and run the Qwen3-VL Video Understanding MCP Server on Blaxel so that Claude and other agents can analyze videos and images. The server uses Qwen3-VL-8B-Instruct running on Blaxel’s H100 GPUs to provide video analysis, image analysis, video summarization, text extraction, and video Q&A, all driven by a lightweight MCP server that runs locally.

How to use

You interact with this MCP server from your MCP client. You can analyze a video by URL with a custom prompt, analyze an image by URL, generate summaries in different styles, extract on-screen text and transcriptions, and ask targeted questions about video content. When you start the server, you provide your Blaxel API key and model; your client then calls the server's tools to perform these actions.

Typical usage patterns include sending a video URL and a natural language prompt to describe or query the content, requesting a summarized version of the video in a chosen style, or extracting both text and speech from video footage. Your MCP client will expose these functions so you can integrate them into your workflows and agents.
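The usage patterns above map onto MCP tool calls, which travel as JSON-RPC 2.0 messages with the method `tools/call`. The following is a minimal sketch of what such requests could look like. The tool names and the three summary styles come from this document; the argument names `video_url`, `prompt`, and `style` are assumptions, so check the input schema your client reports for each tool before relying on them.

```python
import json
from itertools import count

# Summary styles named in the tool list of this document.
SUMMARY_STYLES = ("brief", "standard", "detailed")

# JSON-RPC requires an id per request; a simple counter is enough here.
_request_ids = count(1)


def tool_call(name: str, arguments: dict) -> dict:
    """Build an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


def analyze_video_request(video_url: str, prompt: str) -> dict:
    # `video_url` and `prompt` are assumed argument names.
    return tool_call("analyze_video", {"video_url": video_url, "prompt": prompt})


def summarize_video_request(video_url: str, style: str = "standard") -> dict:
    # `style` is an assumed argument name; the allowed values are documented.
    if style not in SUMMARY_STYLES:
        raise ValueError(f"style must be one of {SUMMARY_STYLES}, got {style!r}")
    return tool_call("summarize_video", {"video_url": video_url, "style": style})


if __name__ == "__main__":
    request = analyze_video_request(
        "https://example.com/demo.mp4", "Describe what happens in this clip."
    )
    print(json.dumps(request, indent=2))
```

In practice your MCP client builds and sends these messages for you; the sketch only makes the shape of a call visible.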

How to install

Prerequisites: Python 3.10 or newer, and ffmpeg for video frame extraction. You also need an active Blaxel account and access to the Blaxel CLI. The client configurations in this document launch the server with uv, so uv must be installed if you use them as written.
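A quick way to confirm these prerequisites before going further is to check the interpreter version and look for the required executables on your PATH. This is a minimal sketch; the executable names `ffmpeg`, `blaxel`, and `uv` are taken from the commands used in this document, and the helper name `check_prerequisites` is hypothetical.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)


def check_prerequisites(tools=("ffmpeg", "blaxel", "uv")) -> dict:
    """Return a mapping of prerequisite name to True (found) or False."""
    results = {"python>=3.10": sys.version_info >= MIN_PYTHON}
    for tool in tools:
        # shutil.which returns None when the executable is not on PATH.
        results[tool] = shutil.which(tool) is not None
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```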

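1) Deploy the model to Blaxel by applying the manifest below. This registers the Qwen3-VL-8B-Instruct model on your Blaxel account with GPU support. The organization and integrationConnections values appear to belong to the original author's account, so you will likely need to replace them with your own.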

```bash
cat << 'EOF' | blaxel apply -f -
apiVersion: blaxel.ai/v1alpha1
kind: Model
metadata:
  name: qwen-qwen3-vl-8b-instruct
  displayName: Qwen/Qwen3-VL-8B-Instruct
spec:
  enabled: true
  policies: []
  flavors:
    - name: nvidia-h100/x4
      type: gpu
  runtime:
    model: Qwen/Qwen3-VL-8B-Instruct
    type: hf_private_endpoint
    image: ''
    args: []
    endpointName: qwenqwen3-vl-8b-instruct-nvidia-h100
    organization: adamanz
  integrationConnections:
    - huggingface-4s2m2h
EOF
```

Or use the provided config:

```bash
blaxel apply -f blaxel-model.yaml
```

2) Get your API key from Blaxel to authenticate requests.

```bash
blaxel auth token
```

3) Install the MCP Server locally. Change to the project directory and install in editable mode.

```bash
cd qwen-video-blaxel-mcp
pip install -e .
```

Or use the uv runner for development:

```bash
uv pip install -e .
```

4) Configure environment variables for your Blaxel credentials and model.

```bash
cp .env.example .env
# Edit .env with your Blaxel API key and desired model
```

5) Add the MCP server configuration to Claude Desktop so you can run it directly from the Claude interface.

```json
{
  "mcpServers": {
    "qwen_blaxel_mcp": {
      "command": "uv",
      "args": [
        "--directory",
        "/path/to/qwen-video-blaxel-mcp",
        "run",
        "server.py"
      ],
      "env": {
        "BLAXEL_API_KEY": "your-blaxel-api-key",
        "BLAXEL_MODEL": "qwen-qwen3-vl-8b-instruct"
      }
    }
  }
}
```

6) Restart Claude Desktop to load the new MCP server. The qwen_blaxel_mcp tools will be available for use from your Claude interface.
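If the Claude Desktop configuration file already lists other MCP servers, the new entry should be merged in rather than pasted over the whole file. The sketch below shows one way to do that. The helper name `add_mcp_server` is hypothetical, and the demo writes to a temporary file because the real location of the configuration file depends on your platform and is not specified here.

```python
import json
import tempfile
from pathlib import Path


def add_mcp_server(config_path: Path, name: str, entry: dict) -> dict:
    """Insert or replace one entry under "mcpServers", keeping the others."""
    config = {}
    if config_path.exists():
        text = config_path.read_text(encoding="utf-8")
        # Treat an empty file like a missing one.
        config = json.loads(text) if text.strip() else {}
    config.setdefault("mcpServers", {})[name] = entry
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config


if __name__ == "__main__":
    # Same entry as in step 5; replace the path and key with your own values.
    entry = {
        "command": "uv",
        "args": ["--directory", "/path/to/qwen-video-blaxel-mcp", "run", "server.py"],
        "env": {
            "BLAXEL_API_KEY": "your-blaxel-api-key",
            "BLAXEL_MODEL": "qwen-qwen3-vl-8b-instruct",
        },
    }
    with tempfile.TemporaryDirectory() as tmp:
        path = Path(tmp) / "claude_desktop_config.json"
        print(json.dumps(add_mcp_server(path, "qwen_blaxel_mcp", entry), indent=2))
```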

Additional notes and setup details

The server needs your Blaxel API key to reach Blaxel, where the Qwen3-VL-8B-Instruct model runs on H100 GPUs. ffmpeg is required to extract frames from videos for analysis. You pass video and image URLs when calling the tools from your MCP client.
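To make the role of ffmpeg concrete, here is a minimal sketch of frame extraction at a fixed sampling rate using ffmpeg's `fps` video filter. It is not the server's actual implementation, which is not shown in this document; the function names, the default of one frame per second, and the output file pattern are all assumptions.

```python
import subprocess
from pathlib import Path


def build_ffmpeg_command(source: str, out_dir: Path, fps: float = 1.0) -> list[str]:
    """Build an ffmpeg command that samples `fps` frames per second as JPEGs."""
    return [
        "ffmpeg",
        "-hide_banner",
        "-loglevel", "error",
        "-i", source,                      # local path or URL readable by ffmpeg
        "-vf", f"fps={fps}",               # sample frames at a fixed rate
        str(out_dir / "frame_%04d.jpg"),   # numbered output files
    ]


def extract_frames(source: str, out_dir: Path, fps: float = 1.0) -> list[Path]:
    """Run ffmpeg and return the extracted frame paths in order."""
    out_dir.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_ffmpeg_command(source, out_dir, fps), check=True)
    return sorted(out_dir.glob("frame_*.jpg"))


if __name__ == "__main__":
    # Only print the command; running it requires ffmpeg and a real input file.
    print(" ".join(build_ffmpeg_command("clip.mp4", Path("frames"))))
```

Lowering the sampling rate reduces how many frames are produced for a long video, at the cost of missing short events.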

Supported formats: video formats include mp4, webm, mov, and avi; image formats include jpg, jpeg, png, gif, and webp.

To verify API connectivity and model readiness, run a quick configuration check. The check_configuration tool listed under Available tools checks the current Blaxel API configuration.
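Before involving the server at all, a local pre-check of the settings can catch the most common mistakes, such as a missing variable or a placeholder key left in place. The sketch below does not contact Blaxel and so does not test connectivity or model readiness. The variable names and placeholder strings come from the configuration examples in this document; the function names and the simple `KEY=VALUE` parsing of `.env` are assumptions.

```python
import os
from pathlib import Path

REQUIRED = ("BLAXEL_API_KEY", "BLAXEL_MODEL")
# Placeholder values used in the configuration examples.
PLACEHOLDERS = {"YOUR_API_KEY", "your-blaxel-api-key"}


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values


def check_settings(settings: dict[str, str]) -> list[str]:
    """Return a list of human-readable problems; empty means none found."""
    problems = []
    for name in REQUIRED:
        value = settings.get(name, "")
        if not value:
            problems.append(f"{name} is not set")
        elif value in PLACEHOLDERS:
            problems.append(f"{name} still holds a placeholder value")
    return problems


if __name__ == "__main__":
    env_file = Path(".env")
    file_values = parse_env_text(env_file.read_text(encoding="utf-8")) if env_file.exists() else {}
    # Real environment variables take precedence over the .env file.
    settings = {**file_values, **os.environ}
    for message in check_settings(settings) or ["no problems found"]:
        print(message)
```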

Security: Keep your Blaxel API key secret. Store it in environment variables rather than in shared files, and restrict access to the Claude Desktop configuration, which contains the key in plain text.

Available tools

analyze_video

Analyze a video from a URL with a custom prompt to extract content, frames, and insights.

analyze_image

Analyze a single image from a URL and generate descriptive responses based on the prompt.

summarize_video

Create a summary of the video in a chosen style (brief, standard, or detailed).

video_qa

Answer specific questions about the video content, given a URL.

extract_video_text

Extract on-screen text and transcribe spoken content from a video.

check_configuration

Check the current Blaxel API configuration for correctness.

list_capabilities

List all capabilities exposed by this MCP server.
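After restarting your client, you can confirm that all of the tools above are actually advertised. MCP clients discover tools with a JSON-RPC `tools/list` request, and the response carries a `tools` array whose items each have a `name`. The following sketch builds that request and compares a response against the tool names listed in this document; the helper names are hypothetical, and the sample response is hand-written for illustration rather than captured from the server.

```python
import json

# Tool names listed in this document.
EXPECTED_TOOLS = {
    "analyze_video",
    "analyze_image",
    "summarize_video",
    "video_qa",
    "extract_video_text",
    "check_configuration",
    "list_capabilities",
}


def list_tools_request(request_id: int = 1) -> dict:
    """Build an MCP `tools/list` request as a JSON-RPC 2.0 message."""
    return {"jsonrpc": "2.0", "id": request_id, "method": "tools/list"}


def missing_tools(response: dict) -> set[str]:
    """Return expected tool names that the response does not advertise."""
    tools = response.get("result", {}).get("tools", [])
    advertised = {tool["name"] for tool in tools}
    return EXPECTED_TOOLS - advertised


if __name__ == "__main__":
    print(json.dumps(list_tools_request()))
    # Hand-written sample response, for illustration only.
    sample = {
        "jsonrpc": "2.0",
        "id": 1,
        "result": {"tools": [{"name": "analyze_video"}, {"name": "analyze_image"}]},
    }
    print("missing:", sorted(missing_tools(sample)))
```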