Minimax MCP Tools Server
Async MCP server with Minimax API integration for image generation and text-to-speech
Configuration
{
  "mcpServers": {
    "psycharch-minimax-mcp-tools": {
      "command": "npx",
      "args": [
        "minimax-mcp-tools"
      ],
      "env": {
        "MINIMAX_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
You can run the Minimax MCP Tools server to generate images and speech asynchronously, with smart rate limiting and a barrier mechanism that synchronizes all submitted tasks. This setup suits batch content creation, multimedia pipelines, and scalable asset production where you want to submit many tasks in parallel and collect the results together at the end.
You interact with the Minimax MCP Tools server by submitting tasks for image generation and speech generation. Each submission returns a task ID so you can continue submitting more work without waiting. When you are ready to gather results for all tasks in a batch, you invoke a barrier operation that waits for every submitted task to complete and then returns a comprehensive results summary. This approach maximizes throughput by saturating rate limits and then synchronizing at the end.
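The submit-then-barrier flow described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the server's implementation: `TaskBatch`, `fake_generate`, and the `task-N` ID format are hypothetical names made up for this sketch, and the "generation" is a short sleep.

```python
import asyncio
import itertools


class TaskBatch:
    """Hypothetical in-memory stand-in for the server's task tracking."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._tasks = {}  # task ID -> asyncio.Task

    def submit(self, coro):
        """Schedule work in the background and return a task ID immediately."""
        task_id = f"task-{next(self._ids)}"
        self._tasks[task_id] = asyncio.create_task(coro)
        return task_id

    async def barrier(self):
        """Wait for every submitted task, then return results keyed by task ID."""
        ids = list(self._tasks)
        outcomes = await asyncio.gather(
            *self._tasks.values(), return_exceptions=True
        )
        self._tasks.clear()
        return dict(zip(ids, outcomes))


async def fake_generate(kind, output_file, delay):
    """Placeholder for a real generation call; it only sleeps and echoes the path."""
    await asyncio.sleep(delay)
    return {"kind": kind, "outputFile": output_file}


async def main():
    batch = TaskBatch()
    # Both submissions return at once; neither waits for the other to finish.
    first = batch.submit(fake_generate("image", "slide1.png", 0.02))
    second = batch.submit(fake_generate("speech", "slide1.mp3", 0.01))
    # The barrier is the only point where the caller blocks.
    results = await batch.barrier()
    return first, second, results


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Passing `return_exceptions=True` to `asyncio.gather` means one failed task shows up as an exception object in the summary instead of aborting the whole batch, which is one reasonable way to produce a complete result set.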
Key capabilities you can leverage include: submitting multiple image generation requests in parallel with individual prompts, and submitting multiple speech generation tasks for different text chunks or voices. Use the barrier to obtain all results and file paths for the produced assets in one consolidated response.
Prerequisites: Node.js (an LTS version) and npm must be installed on your system. The server is started with npx, which is bundled with npm 5.2 and later.
To run the MCP server locally, set your API key in the environment and then start the server with npx. The key authenticates the server with the Minimax service.
# Windows users (PowerShell)
$Env:MINIMAX_API_KEY = "your_api_key_here"
# macOS/Linux users (bash/zsh)
export MINIMAX_API_KEY=your_api_key_here
# Start the MCP server via npx
npx minimax-mcp-tools
The server uses an asynchronous submit-and-barrier pattern designed for batch content creation. You submit tasks for image and speech generation; each submission returns a task ID immediately while the work continues in the background. A rate limiter enforces caps on image generation and on speech generation, and a barrier mechanism waits for all submitted tasks to complete and then returns a unified results summary.
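To make the rate-limiting idea concrete, here is a minimal sliding-window limiter in Python. It is a sketch of one common technique and an assumption about how such a limiter could work; the source does not state the server's algorithm or its actual caps, so `SlidingWindowLimiter` and the limit of 2 calls per 0.2 seconds are invented for the demonstration.

```python
import asyncio
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` acquisitions in any `period`-second window."""

    def __init__(self, max_calls, period):
        self.max_calls = max_calls
        self.period = period
        self._stamps = deque()  # start times of calls still inside the window
        self._lock = asyncio.Lock()

    async def acquire(self):
        # Holding the lock while waiting keeps callers in first-come order.
        async with self._lock:
            while True:
                now = time.monotonic()
                # Drop timestamps that have left the window.
                while self._stamps and now - self._stamps[0] >= self.period:
                    self._stamps.popleft()
                if len(self._stamps) < self.max_calls:
                    self._stamps.append(now)
                    return
                # Sleep until the oldest call in the window expires.
                await asyncio.sleep(self.period - (now - self._stamps[0]))


async def demo():
    # Made-up cap for illustration: 2 calls per 0.2 seconds.
    limiter = SlidingWindowLimiter(max_calls=2, period=0.2)
    start = time.monotonic()
    finished = []

    async def worker(name):
        await limiter.acquire()
        finished.append((name, time.monotonic() - start))

    await asyncio.gather(*(worker(f"job-{i}") for i in range(4)))
    return finished


if __name__ == "__main__":
    for name, elapsed in asyncio.run(demo()):
        print(f"{name} admitted after {elapsed:.2f}s")
```

With this cap, the first two jobs are admitted immediately and the remaining two wait until the window has room again. A server with separate caps per task type could hold one such limiter for images and another for speech.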
Typical flows include narrating a slideshow by generating many slide images and corresponding narration in parallel, producing chapters with multiple voice characters for an audiobook, and generating website assets that share a consistent aesthetic and audio across projects. This is especially effective in LLM-driven pipelines where you need both visuals and audio outputs in tandem.
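As an illustration of the slideshow flow, the following Python sketch builds the sequence of tool calls an LLM-driven pipeline might issue: one image task and one narration task per slide, then a single barrier. The tool names and the `prompt`, `text`, and `outputFile` parameters come from the tool list in this document; the `{"tool", "arguments"}` envelope, the `slide_NN` file names, and the empty argument object for `task_barrier` are assumptions made for this example.

```python
def plan_slideshow(slides):
    """Return tool calls for every slide, followed by a single barrier call.

    `slides` is a list of (image_prompt, narration_text) pairs.
    """
    calls = []
    for index, (prompt, narration) in enumerate(slides, start=1):
        calls.append({
            "tool": "submit_image_generation",
            # Hypothetical file naming; the server may expect a different path style.
            "arguments": {"prompt": prompt, "outputFile": f"slide_{index:02d}.png"},
        })
        calls.append({
            "tool": "submit_speech_generation",
            "arguments": {"text": narration, "outputFile": f"slide_{index:02d}.mp3"},
        })
    # The barrier goes last so that it covers every submission above.
    calls.append({"tool": "task_barrier", "arguments": {}})
    return calls


if __name__ == "__main__":
    plan = plan_slideshow([
        ("A sunrise over a mountain lake", "Welcome to the first slide."),
        ("A city skyline at dusk", "Here is the second slide."),
    ])
    for call in plan:
        print(call["tool"], call["arguments"])
```

Keeping the barrier as the final call mirrors the guidance elsewhere in this document: submit everything first, then synchronize once.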
To enable the MCP server, add an entry to your MCP client settings that runs the minimax-mcp-tools package via npx and passes your API key in the MINIMAX_API_KEY environment variable. The configuration shown under Configuration names this entry psycharch-minimax-mcp-tools.
submit_image_generation - Submit an image generation task. Required: prompt, outputFile. Optional: aspectRatio, customSize, seed, subjectReference, style.
submit_speech_generation - Submit a speech generation task. Required: text, outputFile. Optional: highQuality, voiceId, speed, volume, pitch, emotion, format, sampleRate, bitrate, languageBoost, intensity, timbre, sound_effects.
task_barrier - Wait for all submitted tasks to complete and retrieve results. This is essential for batch processing to obtain a complete set of outputs.
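The required and optional parameters listed above can be checked on the client side before a call is sent. The Python sketch below encodes only the parameter names given in this document; it does not check value types or allowed values, since the source does not specify them. `TOOL_PARAMS` and `check_arguments` are hypothetical helpers, and treating `task_barrier` as taking no parameters is an assumption.

```python
TOOL_PARAMS = {
    "submit_image_generation": {
        "required": {"prompt", "outputFile"},
        "optional": {"aspectRatio", "customSize", "seed", "subjectReference", "style"},
    },
    "submit_speech_generation": {
        "required": {"text", "outputFile"},
        "optional": {
            "highQuality", "voiceId", "speed", "volume", "pitch", "emotion",
            "format", "sampleRate", "bitrate", "languageBoost", "intensity",
            "timbre", "sound_effects",
        },
    },
    # Assumption: no parameters are documented for the barrier.
    "task_barrier": {"required": set(), "optional": set()},
}


def check_arguments(tool, arguments):
    """Return a list of problems; an empty list means the names look valid."""
    spec = TOOL_PARAMS.get(tool)
    if spec is None:
        return [f"unknown tool: {tool}"]
    given = set(arguments)
    allowed = spec["required"] | spec["optional"]
    problems = [
        f"missing required parameter: {name}"
        for name in sorted(spec["required"] - given)
    ]
    problems += [
        f"unexpected parameter: {name}" for name in sorted(given - allowed)
    ]
    return problems


if __name__ == "__main__":
    print(check_arguments("submit_speech_generation", {"text": "Hello"}))
```

Catching a missing `outputFile` before submission avoids spending a rate-limited slot on a request that would be rejected anyway.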
Keep your API key secure and do not expose it in client-side code or logs. Use environment variables to manage credentials and rotate keys periodically. When running the server locally, ensure your environment is trusted and your network access is controlled.
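One way to follow this advice in a script that launches or configures the server is to read the key from the environment, fail early if it is missing, and mask it before anything is logged. The sketch below is illustrative only; `load_api_key` and `mask` are hypothetical helpers, and only the variable name MINIMAX_API_KEY comes from this document.

```python
import os


def load_api_key(environ=os.environ):
    """Read MINIMAX_API_KEY from the environment, failing early if it is absent."""
    key = environ.get("MINIMAX_API_KEY", "").strip()
    if not key:
        raise RuntimeError("MINIMAX_API_KEY is not set")
    return key


def mask(secret, visible=4):
    """Return a log-safe form that reveals only the last `visible` characters."""
    if len(secret) <= visible:
        return "*" * len(secret)
    return "*" * (len(secret) - visible) + secret[-visible:]


if __name__ == "__main__":
    try:
        print("Using API key", mask(load_api_key()))
    except RuntimeError as error:
        print("Configuration problem:", error)
```

Accepting the environment as a parameter keeps the helper easy to test without touching the real process environment.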
If you encounter rate limiting, review your submission strategy to balance quick bursts with barrier synchronization. If a task appears stuck, check the task status with the provided task IDs and ensure the barrier is invoked only after all intended tasks have been submitted.