MCP server for Gemini 2.5 Flash Image (Nano Banana) - Generate and edit images with AI
Configuration
{
  "mcpServers": {
    "brunoqgalvao-gemini-image-mcp-server": {
      "command": "npx",
      "args": [
        "-y",
        "github:brunoqgalvao/gemini-image-mcp-server"
      ],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
You can integrate Gemini Image tool capabilities directly into your AI workflows by running a focused MCP server locally. This server exposes image generation and editing as MCP endpoints, so your assistant can generate images, edit existing ones, or compose multiple images with simple prompts. It’s designed to be easy to connect to Claude Code or other MCP clients and to keep API keys securely configured on your side.
You connect to the Gemini Image MCP server from your MCP client by configuring a server entry that points to the MCP runtime and provides your API key. Once connected, you can issue prompts to generate new images, apply edits to existing images, or combine multiple inputs into a single composition. The server will handle the API calls and save outputs to your specified paths.
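As a minimal sketch of that server entry, the snippet below builds the same `mcpServers` structure shown in the configuration and takes the key from the environment instead of hardcoding it. The function name `build_server_entry` and the default entry name are illustrative choices, not part of the server, and where the resulting JSON belongs depends on your MCP client.

```python
import json
import os


def build_server_entry(api_key: str, name: str = "gemini-image") -> dict:
    """Return an mcpServers config with one entry that launches the server via npx."""
    if not api_key:
        raise ValueError("GEMINI_API_KEY is empty; set it before generating the config")
    return {
        "mcpServers": {
            name: {
                "command": "npx",
                "args": ["-y", "github:brunoqgalvao/gemini-image-mcp-server"],
                "env": {"GEMINI_API_KEY": api_key},
            }
        }
    }


if __name__ == "__main__":
    # Read the key from the environment so it never lands in source control.
    key = os.environ.get("GEMINI_API_KEY")
    if key:
        print(json.dumps(build_server_entry(key), indent=2))
    else:
        print("Set GEMINI_API_KEY before generating the config.")
```

Generating the file this way keeps the placeholder `your_api_key_here` out of anything you might commit.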
Key capabilities you can leverage through the MCP server include text-to-image generation, image editing via natural language instructions, multi-image composition, and flexible aspect ratios. You can also request image-only outputs or save the full API response for debugging or auditing.
Example workflow patterns you can implement in your MCP client include: generate a sunset over mountains and save as sunset.png; edit an image to remove the background; or create a collage by combining several input images. Your client simply passes prompts and image paths to the MCP endpoint and the server returns the produced image and any metadata.
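Under the hood, an MCP client invokes a server tool with a JSON-RPC 2.0 `tools/call` request. The sketch below shows roughly what such a message could look like for the sunset example. Most clients build this for you; the tool name `generate_image` and the argument names `prompt` and `output_path` are hypothetical placeholders, so check the tool list the server reports for the real ones.

```python
import json
from itertools import count

# Monotonic request ids, as JSON-RPC expects a unique id per request.
_request_ids = count(1)


def tool_call_request(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP `tools/call` request as a JSON-RPC 2.0 message."""
    message = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(message)


# Hypothetical tool and argument names, used here for illustration only.
request = tool_call_request(
    "generate_image",
    {"prompt": "a sunset over mountains", "output_path": "sunset.png"},
)
```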
Prerequisites: you need Node.js installed to run the MCP server entrypoint. You can also use the Python CLI version if you prefer Python, but this guide focuses on the MCP server approach.
Option A: Install the MCP server using npx (recommended) and configure your MCP client to use it.
{
  "mcpServers": {
    "gemini-image": {
      "command": "npx",
      "args": ["-y", "github:brunoqgalvao/gemini-image-mcp-server"],
      "env": {
        "GEMINI_API_KEY": "your_api_key_here"
      }
    }
  }
}
Option B: If you prefer a local, self-hosted setup, run the MCP server directly from the repository: clone it, install dependencies, and start the server locally. You will need to provide your Google Gemini API key in the environment.
Environment variables you should prepare include GEMINI_API_KEY with your Google Gemini API key. Ensure you keep this key secure and do not commit it to public repositories.
Common issues include missing API keys, invalid image paths, or unsupported aspect ratios. Double-check that your GEMINI_API_KEY is correctly set in the MCP client configuration and that any input image paths you pass exist and are accessible.
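The checks above can be sketched as a small pre-flight helper that runs before you call the server. This is an illustrative client-side script, not part of the server: the function name `preflight` is made up, and `KNOWN_ASPECT_RATIOS` contains only the ratios this page names, so the server may accept more.

```python
import os
from pathlib import Path

# Only the ratios named on this page; the server's full list may be longer.
KNOWN_ASPECT_RATIOS = {"1:1", "16:9", "21:9"}


def preflight(image_paths, aspect_ratio=None, env=None):
    """Return a list of problems found; an empty list means none were detected."""
    env = os.environ if env is None else env
    problems = []
    if not env.get("GEMINI_API_KEY"):
        problems.append("GEMINI_API_KEY is not set")
    for path in image_paths:
        candidate = Path(path)
        if not candidate.is_file():
            problems.append(f"input image not found: {path}")
        elif not os.access(candidate, os.R_OK):
            problems.append(f"input image not readable: {path}")
    if aspect_ratio is not None and aspect_ratio not in KNOWN_ASPECT_RATIOS:
        problems.append(f"aspect ratio not in the known set: {aspect_ratio}")
    return problems
```

Returning a list rather than raising on the first failure lets you report every problem in one pass.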
If you encounter network errors, verify that the MCP server process has outbound network access and that no firewall or proxy blocks its requests to the Gemini API.
All generated outputs are saved to the path you specify in the MCP client call, and you can request an image-only response if you do not want any accompanying text.
The server supports multiple aspect ratios, with common options including 1:1, 16:9, and 21:9. You can mix input images for edits or combine several images into a single composition.
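When planning a layout, it can help to know what pixel dimensions a given ratio implies. The helper below is plain arithmetic on the ratio string and makes no claim about the sizes the model actually returns; `dimensions_for` and its `long_side` parameter are illustrative names.

```python
def dimensions_for(aspect_ratio: str, long_side: int) -> tuple[int, int]:
    """Return (width, height) in pixels with the longer side equal to long_side."""
    width_part, height_part = (int(part) for part in aspect_ratio.split(":"))
    if width_part <= 0 or height_part <= 0:
        raise ValueError(f"invalid aspect ratio: {aspect_ratio}")
    if width_part >= height_part:
        # Landscape or square: the width is the long side.
        return long_side, round(long_side * height_part / width_part)
    # Portrait: the height is the long side.
    return round(long_side * width_part / height_part), long_side
```

For example, `dimensions_for("16:9", 1024)` returns `(1024, 576)`.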
- Text-to-Image Generation: Create images from textual prompts.
- Image Editing: Apply edits to images using natural language instructions.
- Multi-Image Composition: Combine multiple inputs into one output.
- Flexible Aspect Ratios: Support for ten commonly used aspect ratios.
- Character Consistency: Maintain appearance across generations with clear prompts.
- MCP Server Integration: Connects with Claude Code and other MCP clients.
- Command-Line Interface: Quick operations via the MCP client.
- Python API: Use the module in your own projects if you prefer Python workflows.
Create images from textual prompts using Gemini Image capabilities.
Modify existing images based on natural language instructions.
Combine several input images into a single output image.
Support multiple aspect ratios for flexible output formats.
Maintain consistent character appearance across generations with clear prompts.
Expose endpoints that integrate with Claude Code and other MCP clients.
Provide a straightforward command-line interface for quick image tasks.
Offer a Python API for embedding Gemini Image features into Python projects.