Minimax AI MCP server

Integrates with Minimax's AI services to provide image generation and text-to-speech capabilities through a Node.js server, enabling editors to access image-01 and speech-01 models for creating visuals and natural-sounding audio without leaving the editing environment.
Provider: PsychArch
Release date: Mar 16, 2025
Language: TypeScript
Stats: 46 stars

The Minimax MCP Tools server is a Model Context Protocol implementation that connects to the Minimax API to provide image generation and text-to-speech. It lets you generate images from text prompts and convert text to natural-sounding speech, with options for voice, speed, pitch, emotion, and audio format.

Installation and Setup

Prerequisites

  • Node.js 16 or higher
  • A Minimax API key (obtain from Minimax Platform)
  • A Minimax Group ID (needed for text-to-speech)

Configuration

Create or update your MCP configuration file with the following settings:

{
  "mcpServers": {
    "minimax-mcp-tools": {
      "command": "npx",
      "args": [
        "minimax-mcp-tools"
      ],
      "env": {
        "MINIMAX_API_KEY": "your-minimax-api-key",
        "MINIMAX_GROUP_ID": "your-minimax-group-id"
      }
    }
  }
}

Using Image Generation

Generate images based on text prompts by providing parameters in the following format:

{
  "prompt": "A mountain landscape at sunset",
  "aspectRatio": "16:9",
  "n": 1,
  "outputFile": "/absolute/path/to/image.jpg",
  "subjectReference": "/path/to/reference.jpg"
}

Parameters

  • prompt (required): Description of the image to generate
  • outputFile (required): Absolute path to save the generated image file. The directory must already exist.
  • aspectRatio (optional): Image aspect ratio (default: "1:1", options: "1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9")
  • n (optional): Number of images to generate (default: 1, range: 1-9). For multiple images, files will be automatically numbered.
  • subjectReference (optional): Path to a local image file or public URL for character reference. Supported formats: JPG, JPEG, PNG
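
The constraints listed above can be checked before a request is sent. The following is a minimal sketch, not part of the server itself: the ImageParams type and the checkImageParams and looksAbsolute helpers are hypothetical names introduced for illustration, and the absolute-path test is a simplified heuristic.

```typescript
// Hypothetical type mirroring the documented image generation parameters.
interface ImageParams {
  prompt: string;
  outputFile: string;
  aspectRatio?: string;
  n?: number;
  subjectReference?: string;
}

const ASPECT_RATIOS = ["1:1", "16:9", "4:3", "3:2", "2:3", "3:4", "9:16", "21:9"];

// Simplified heuristic (an assumption): POSIX paths start with "/",
// Windows paths start with a drive letter such as "C:\".
function looksAbsolute(p: string): boolean {
  return p.charAt(0) === "/" || /^[A-Za-z]:[\\/]/.test(p);
}

// Returns a list of problems; an empty list means the parameters look valid.
function checkImageParams(p: ImageParams): string[] {
  const problems: string[] = [];
  if (p.prompt.trim() === "") problems.push("prompt is required");
  if (!looksAbsolute(p.outputFile)) problems.push("outputFile must be an absolute path");
  const ratio = p.aspectRatio ?? "1:1"; // documented default
  if (ASPECT_RATIOS.indexOf(ratio) === -1) problems.push(`unsupported aspectRatio: ${ratio}`);
  const n = p.n ?? 1; // documented default
  if (Math.floor(n) !== n || n < 1 || n > 9) problems.push("n must be an integer from 1 to 9");
  return problems;
}
```

The sketch does not verify that the output directory exists, which the server also requires.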

Using Text-to-Speech

Convert text to speech with various customization options:

{
  "text": "Hello, this is a test of the text-to-speech functionality.",
  "model": "speech-02-hd",
  "voiceId": "female-shaonv",
  "speed": 1.0,
  "volume": 1.0,
  "pitch": 0,
  "emotion": "happy",
  "format": "mp3",
  "outputFile": "/absolute/path/to/audio.mp3",
  "subtitleEnable": true
}

Basic Parameters

  • text (required): Text to convert to speech (max 10,000 characters)
  • outputFile (required): Absolute path to save the generated audio file
  • model (optional): Model version (default: "speech-02-hd")
    • speech-02-hd: High-definition model with excellent timbre similarity and audio quality
    • speech-02-turbo: Fast model with low latency and enhanced multilingual capabilities
  • voiceId (optional): Voice ID to use (default: "male-qn-qingse")
  • speed (optional): Speech speed (default: 1.0, range: 0.5-2.0)
  • volume (optional): Speech volume (default: 1.0, range: 0.1-10.0)
  • pitch (optional): Speech pitch (default: 0, range: -12 to 12)
  • emotion (optional): Emotion of the speech (default: "neutral", options: "happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral")
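
As an illustration of the ranges above, here is a minimal sketch of a pre-flight check. SpeechParams, inRange, and checkSpeechParams are hypothetical names, not part of the server, and measuring the text limit with JavaScript's string length is an assumption about how characters are counted.

```typescript
// Hypothetical type mirroring the documented basic text-to-speech parameters.
interface SpeechParams {
  text: string;
  outputFile: string;
  model?: "speech-02-hd" | "speech-02-turbo";
  voiceId?: string;
  speed?: number;
  volume?: number;
  pitch?: number;
  emotion?: string;
}

const EMOTIONS = ["happy", "sad", "angry", "fearful", "disgusted", "surprised", "neutral"];

function inRange(value: number, min: number, max: number): boolean {
  return value >= min && value <= max;
}

// Returns a list of problems; an empty list means the parameters look valid.
// Missing optional values fall back to the documented defaults.
function checkSpeechParams(p: SpeechParams): string[] {
  const problems: string[] = [];
  if (p.text.length === 0) problems.push("text is required");
  // Assumption: length is measured in UTF-16 code units.
  if (p.text.length > 10000) problems.push("text exceeds 10,000 characters");
  if (!inRange(p.speed ?? 1.0, 0.5, 2.0)) problems.push("speed must be between 0.5 and 2.0");
  if (!inRange(p.volume ?? 1.0, 0.1, 10.0)) problems.push("volume must be between 0.1 and 10.0");
  if (!inRange(p.pitch ?? 0, -12, 12)) problems.push("pitch must be between -12 and 12");
  if (EMOTIONS.indexOf(p.emotion ?? "neutral") === -1) problems.push("unsupported emotion");
  return problems;
}
```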

Voice Mixing

You can mix up to 4 different voices with weight settings:

"timberWeights": [
  { "voice_id": "male-qn-qingse", "weight": 70 },
  { "voice_id": "female-shaonv", "weight": 30 }
]
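
The four-voice limit can be enforced before the request is built. This is a minimal sketch: VoiceWeight and checkVoiceMix are hypothetical names, and the source does not say whether weights must sum to a particular total, so the sketch checks only the number of voices and that each entry names a voice.

```typescript
// Hypothetical type for one entry in the voice mixing list shown above.
interface VoiceWeight {
  voice_id: string;
  weight: number;
}

const MAX_MIXED_VOICES = 4; // documented limit

// Returns a list of problems; an empty list means the mix looks valid.
function checkVoiceMix(weights: VoiceWeight[]): string[] {
  const problems: string[] = [];
  if (weights.length > MAX_MIXED_VOICES) {
    problems.push(`at most ${MAX_MIXED_VOICES} voices can be mixed, got ${weights.length}`);
  }
  for (const entry of weights) {
    if (entry.voice_id.trim() === "") problems.push("every entry needs a voice_id");
  }
  return problems;
}
```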

Audio Settings

  • format (optional): Audio format (default: "mp3", options: "mp3", "pcm", "flac", "wav")
  • sampleRate (optional): Sample rate in Hz (default: 32000, options: 8000, 16000, 22050, 24000, 32000, 44100)
  • bitrate (optional): Bitrate for MP3 format (default: 128000, options: 32000, 64000, 128000, 256000)
  • channel (optional): Number of audio channels (default: 1, options: 1=mono, 2=stereo)
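
To show how the defaults and allowed values above combine, here is a minimal sketch that fills in missing audio settings and rejects unsupported ones. AudioSettings and resolveAudioSettings are hypothetical names; the server's own handling may differ.

```typescript
// Hypothetical type for the documented audio settings.
interface AudioSettings {
  format: string;
  sampleRate: number;
  bitrate: number;
  channel: number;
}

const FORMATS = ["mp3", "pcm", "flac", "wav"];
const SAMPLE_RATES = [8000, 16000, 22050, 24000, 32000, 44100];
const BITRATES = [32000, 64000, 128000, 256000];
const CHANNELS = [1, 2]; // 1 = mono, 2 = stereo

// Fills in the documented defaults, then rejects values outside the documented options.
function resolveAudioSettings(overrides: Partial<AudioSettings>): AudioSettings {
  const settings: AudioSettings = {
    format: overrides.format ?? "mp3",
    sampleRate: overrides.sampleRate ?? 32000,
    bitrate: overrides.bitrate ?? 128000,
    channel: overrides.channel ?? 1,
  };
  if (FORMATS.indexOf(settings.format) === -1) throw new Error(`unsupported format: ${settings.format}`);
  if (SAMPLE_RATES.indexOf(settings.sampleRate) === -1) throw new Error(`unsupported sampleRate: ${settings.sampleRate}`);
  // bitrate is documented for MP3 only, so it is checked only for that format.
  if (settings.format === "mp3" && BITRATES.indexOf(settings.bitrate) === -1) {
    throw new Error(`unsupported bitrate: ${settings.bitrate}`);
  }
  if (CHANNELS.indexOf(settings.channel) === -1) throw new Error(`unsupported channel: ${settings.channel}`);
  return settings;
}
```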

Advanced Features

  • latexRead (optional): Whether to read LaTeX formulas (default: false)
  • pronunciationDict (optional): List of pronunciation replacements:
    "pronunciationDict": ["处理/(chu3)(li3)", "危险/dangerous"]
    
  • stream (optional): Whether to use streaming mode (default: false)
  • languageBoost (optional): Enhance recognition of specific languages
  • subtitleEnable (optional): Whether to enable subtitle generation (default: false)

How to add this MCP server to Cursor

There are two ways to add an MCP server to Cursor. The most common way is to add the server globally in the ~/.cursor/mcp.json file so that it is available in all of your projects.

If you only need the server in a single project, add it to that project's .cursor/mcp.json file instead, creating the file if it does not exist.

Adding an MCP server to Cursor globally

To add a global MCP server, go to Cursor Settings > MCP and click "Add new global MCP server".

Clicking that button opens the ~/.cursor/mcp.json file, where you can add your server. The entry below is a generic example; for this server, use the entry shown under Configuration above.

{
    "mcpServers": {
        "cursor-rules-mcp": {
            "command": "npx",
            "args": [
                "-y",
                "cursor-rules-mcp"
            ]
        }
    }
}

Adding an MCP server to a project

To add an MCP server to a project, create a new .cursor/mcp.json file in the project or add the server to the existing one. The entry has exactly the same format as the global MCP server example above.
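
When a configuration file already lists other servers, the new entry is added alongside them under mcpServers. The following is a minimal sketch of that merge on in-memory objects: McpServerEntry, McpConfig, and addServer are hypothetical names, and reading or writing the file itself is left out.

```typescript
// Hypothetical types describing the mcp.json structure shown in this document.
interface McpServerEntry {
  command: string;
  args?: string[];
  env?: Record<string, string>;
}

interface McpConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Returns a new config with the entry added; the input config is not modified.
// An existing entry with the same name is replaced.
function addServer(config: McpConfig, name: string, entry: McpServerEntry): McpConfig {
  return { ...config, mcpServers: { ...config.mcpServers, [name]: entry } };
}

const existing: McpConfig = {
  mcpServers: {
    "cursor-rules-mcp": { command: "npx", args: ["-y", "cursor-rules-mcp"] },
  },
};

const merged = addServer(existing, "minimax-mcp-tools", {
  command: "npx",
  args: ["minimax-mcp-tools"],
  env: {
    MINIMAX_API_KEY: "your-minimax-api-key",
    MINIMAX_GROUP_ID: "your-minimax-group-id",
  },
});

// Serialize with two-space indentation, ready to be written to mcp.json.
const mergedJson = JSON.stringify(merged, null, 2);
```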

How to use the MCP server

Once the server is installed, you might need to go back to Settings > MCP and click the refresh button.

The Cursor agent can then see the tools the MCP server provides and will call them when it needs to.

You can also explicitly ask the agent to use a tool by mentioning the tool name and describing what it does.
