Gemini Vision MCP Server

Provides an MCP endpoint for analyzing images and YouTube videos through the Gemini API, with a 16 MB size limit per image.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "artin0123-gemini-vision-mcp": {
      "command": "node",
      "args": [
        "/absolute/path/to/gemini-vision-mcp/dist/index.js"
      ],
      "env": {
        "GEMINI_MODEL": "models/gemini-flash-lite-latest",
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}

This Gemini-based MCP server analyzes images and YouTube videos that you reference by URL. It connects to the Gemini API, enforces a 16 MB size limit per image to keep responses fast, and lets you override the model if needed. It’s useful for automated image and video analysis workflows where you do not want to handle files locally.

How to use

To use this MCP server, start it locally and connect your MCP client to it. Provide the Gemini API key to authorize requests, and optionally override the Gemini model for different analysis capabilities. Once the server is running, two tools are available for analyzing media from URLs: analyze_image for image URLs and analyze_youtube_video for YouTube URLs.
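As an illustration of what a client sends when it invokes one of these tools, the sketch below builds an MCP `tools/call` request (JSON-RPC 2.0). The tool names come from this page; the argument names `imageUrls` and `prompt` are assumptions made for the example, so check the input schema the server reports through `tools/list` before relying on them.

```typescript
// JSON-RPC 2.0 request shape that MCP uses for tool invocation.
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// Build a tools/call request for one of the two tools named on this page.
function buildToolCall(
  id: number,
  name: "analyze_image" | "analyze_youtube_video",
  args: Record<string, unknown>,
): ToolCallRequest {
  return { jsonrpc: "2.0", id, method: "tools/call", params: { name, arguments: args } };
}

// Hypothetical argument names (imageUrls, prompt): not confirmed by the source.
const imageCall = buildToolCall(1, "analyze_image", {
  imageUrls: ["https://example.com/cat.png"],
  prompt: "Describe this image.",
});

// Print the request as it would appear on the wire.
console.log(JSON.stringify(imageCall, null, 2));
```

Most MCP clients construct this message for you; the sketch is mainly useful when debugging or scripting against the server directly.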

How to install

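Prerequisites you need before installation: Node.js with npm (both are used by the build and run commands in this section) and a Gemini API key.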

      Step-by-step build and run flow

      # Clone the repository
      # Note: use the actual repository URL where you maintain your MCP server code
      # git clone https://github.com/Artin0123/gemini-vision-mcp.git
      # cd gemini-vision-mcp
      
      # Install dependencies
      npm install
      
      # Compile TypeScript to dist/
      npm run build
      
      # Start the server using the node runtime and the built entry point
      node /absolute/path/to/gemini-vision-mcp/dist/index.js

      Configuration overview

      Configure the server by creating an MCP configuration that specifies how to start the node-based server, which API key to use, and which model to load by default.

      How to configure the MCP server

      {
        "mcpServers": {
          "gemini_media": {
            "command": "node",
            "args": ["/absolute/path/to/gemini-vision-mcp/dist/index.js"],
            "env": {
              "GEMINI_API_KEY": "your_api_key_here",
              "GEMINI_MODEL": "models/gemini-flash-lite-latest"
            }
          }
        }
      }

      Notes on keys, models, and behavior

      If no Gemini API key is provided, the server can still start, but any tool invocation will return a configuration error until a valid API key is configured.
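The behavior described above (start without a key, fail at invocation time) can be sketched as a check that runs inside each tool handler rather than at startup. This is a minimal illustration, not the server's actual code: the function name `requireApiKey`, the result type, and the error message text are all hypothetical.

```typescript
// Environment variables passed in explicitly so the check is easy to test.
type KeyEnv = Record<string, string | undefined>;

type KeyCheck =
  | { ok: true; apiKey: string }
  | { ok: false; error: string };

// Hypothetical helper: called at the start of every tool invocation.
function requireApiKey(env: KeyEnv): KeyCheck {
  const key = env["GEMINI_API_KEY"]?.trim();
  if (!key) {
    // Assumed wording; the real server's message may differ.
    return { ok: false, error: "Configuration error: GEMINI_API_KEY is not set." };
  }
  return { ok: true, apiKey: key };
}

// Usage: the server process keeps running, only the tool call fails.
const check = requireApiKey({ GEMINI_API_KEY: "" });
console.log(check.ok ? "key present" : check.error);
```

Deferring the check this way lets a client connect and list tools even before the key is configured.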

      Model override: the server defaults to the Gemini model gemini-flash-lite-latest (written as models/gemini-flash-lite-latest in the configuration examples). You can override it by setting the GEMINI_MODEL environment variable or by specifying modelName in your client configuration.
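One way to picture the override logic is a small resolver that falls back to the default when nothing else is set. The sketch below rests on two assumptions that this page does not state: that a client-supplied `modelName` takes precedence over `GEMINI_MODEL`, and that the default is spelled with the `models/` prefix used in the configuration examples. The function name `resolveModel` is hypothetical.

```typescript
// Default taken from the configuration examples on this page.
const DEFAULT_MODEL = "models/gemini-flash-lite-latest";

// Hypothetical resolver. Assumed precedence: modelName, then GEMINI_MODEL, then the default.
function resolveModel(
  env: Record<string, string | undefined>,
  modelName?: string,
): string {
  const fromClient = modelName?.trim();
  const fromEnv = env["GEMINI_MODEL"]?.trim();
  // Empty strings are falsy, so blank values fall through to the next source.
  return fromClient || fromEnv || DEFAULT_MODEL;
}

// Usage with a placeholder model id (not a real model name).
console.log(resolveModel({ GEMINI_MODEL: "models/example-model" }));
```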

      Supported tools

      analyze_image: Analyze one or more image URLs. The maximum file size per image is 16 MB. Each image is downloaded from its URL, and the size limit is enforced to keep response times short.
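A size check like the one described for analyze_image might look as follows. This is a sketch under stated assumptions: it treats 16 MB as 16 × 1024 × 1024 bytes (the page does not say whether the limit is binary or decimal), and the names `MAX_IMAGE_BYTES` and `checkImageSize` are hypothetical.

```typescript
// Assumption: "16 MB" means 16 * 1024 * 1024 bytes.
const MAX_IMAGE_BYTES = 16 * 1024 * 1024;

interface SizeCheck {
  ok: boolean;
  message: string;
}

// Hypothetical helper: validate the byte length of a downloaded image.
function checkImageSize(byteLength: number): SizeCheck {
  if (byteLength > MAX_IMAGE_BYTES) {
    const sizeMb = (byteLength / (1024 * 1024)).toFixed(1);
    return { ok: false, message: `Image is ${sizeMb} MB, which exceeds the 16 MB limit.` };
  }
  return { ok: true, message: "Image size is within the limit." };
}

// Usage: a 20 MB payload is rejected before it is sent to the Gemini API.
console.log(checkImageSize(20 * 1024 * 1024).message);
```

Rejecting oversized images before the API call avoids spending a request on input that would slow the response.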

      analyze_youtube_video: Analyze a YouTube video from its URL. There is no size limit because YouTube videos are streamed directly by the Gemini API.

      Development and maintenance

      Install dependencies, run tests, and build when developing or maintaining the MCP server.

      npm install
      npm test
      npm run build

      License

      MIT license.

      Available tools

      analyze_image

      Analyze one or more image URLs with a maximum file size of 16 MB per image.

      analyze_youtube_video

      Analyze a YouTube video URL; there is no size limit because the video is streamed via the Gemini API.