Say TTS MCP Server
Provides multiple text-to-speech services via the MCP protocol, with several voice options and models.
Configuration
{
  "mcpServers": {
    "blacktop-mcp-say": {
      "command": "mcp-tts",
      "args": [],
      "env": {
        "OPENAI_API_KEY": "YOUR_OPENAI_API_KEY",
        "GOOGLE_AI_API_KEY": "YOUR_GOOGLE_API_KEY",
        "ELEVENLABS_API_KEY": "YOUR_ELEVENLABS_API_KEY",
        "ELEVENLABS_VOICE_ID": "1SM7GgM6IMuvQlz2BwM3",
        "OPENAI_TTS_INSTRUCTIONS": "Speak in a cheerful and positive tone",
        "MCP_TTS_ALLOW_CONCURRENT": "false",
        "MCP_TTS_SUPPRESS_SPEAKING_OUTPUT": "true"
      }
    }
  }
}
The MCP TTS server exposes several text-to-speech backends through the MCP protocol. Clients can request spoken audio from a range of voices and models, including macOS built-in speech, ElevenLabs, Google Gemini-based voices, and OpenAI TTS, all coordinated through a single endpoint. Configuration options control concurrency, output verbosity, and custom voice instructions for OpenAI TTS.
You host the TTS service and connect your MCP clients to it to convert text into speech. You can choose among the available tools: say_tts (macOS only), elevenlabs_tts, google_tts, and openai_tts. Each tool exposes different voices and controls, and you can adjust speed and voice behavior through environment variables or command line flags. By default, speech operations run sequentially to avoid overlapping audio, but you can enable concurrent speech if your workflow requires multiple voices speaking at once.
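The sequential-by-default behavior described above can be pictured as a lock around playback. The following is a minimal sketch under stated assumptions, not the server's actual implementation: `Speaker`, `Speak`, and `PeakOverlap` are hypothetical names, and a short sleep stands in for audio playback.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
	"time"
)

// Speaker gates speech requests. It is a hypothetical type for illustration
// and is not taken from the mcp-tts source.
type Speaker struct {
	allowConcurrent bool       // mirrors the intent of MCP_TTS_ALLOW_CONCURRENT
	mu              sync.Mutex // serializes playback when concurrency is off
}

// Speak runs play, taking the lock first unless concurrency is allowed.
func (s *Speaker) Speak(play func()) {
	if !s.allowConcurrent {
		s.mu.Lock()
		defer s.mu.Unlock()
	}
	play()
}

// PeakOverlap issues n requests at once and reports the highest number of
// playbacks that were in progress at the same time.
func PeakOverlap(allowConcurrent bool, n int) int32 {
	s := &Speaker{allowConcurrent: allowConcurrent}
	var active, peak int32
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			s.Speak(func() {
				cur := atomic.AddInt32(&active, 1)
				// Record a new peak if this playback raised it.
				for {
					old := atomic.LoadInt32(&peak)
					if cur <= old || atomic.CompareAndSwapInt32(&peak, old, cur) {
						break
					}
				}
				time.Sleep(20 * time.Millisecond) // stand-in for audio playback
				atomic.AddInt32(&active, -1)
			})
		}()
	}
	wg.Wait()
	return atomic.LoadInt32(&peak)
}

func main() {
	fmt.Println("sequential peak:", PeakOverlap(false, 4))
	fmt.Println("concurrent peak:", PeakOverlap(true, 4))
}
```

With concurrency off, at most one playback is ever in progress, so the voices never overlap. With it on, the peak depends on scheduling and may reach the number of requests.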
Prerequisites: you need a Go toolchain installed to build and install the MCP TTS server.
# Install the MCP TTS server
go install github.com/blacktop/mcp-tts@latest
# List the available flags before running the server
mcp-tts --help
Use environment variables or explicit command-line flags to tailor behavior, including concurrency, output suppression, and API keys for cloud-based voices.
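As an illustration of how boolean switches such as MCP_TTS_ALLOW_CONCURRENT and MCP_TTS_SUPPRESS_SPEAKING_OUTPUT might be read, here is a hedged Go sketch. `Settings`, `boolEnv`, and `LoadSettings` are hypothetical helpers, not part of mcp-tts. The `false` default for concurrency follows the sequential-by-default behavior described earlier, while the `false` default for output suppression is an assumption.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Settings holds the two boolean switches shown in the configuration.
// The struct itself is hypothetical and only serves this example.
type Settings struct {
	AllowConcurrent bool
	SuppressOutput  bool
}

// boolEnv returns the parsed value of key, or def when the variable is
// unset or does not parse as a boolean.
func boolEnv(lookup func(string) (string, bool), key string, def bool) bool {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	v, err := strconv.ParseBool(raw)
	if err != nil {
		return def
	}
	return v
}

// LoadSettings reads both switches through lookup, which has the same
// signature as os.LookupEnv so it can be swapped out in tests.
func LoadSettings(lookup func(string) (string, bool)) Settings {
	return Settings{
		AllowConcurrent: boolEnv(lookup, "MCP_TTS_ALLOW_CONCURRENT", false),
		// Assumed default: speaking output is shown unless suppressed.
		SuppressOutput: boolEnv(lookup, "MCP_TTS_SUPPRESS_SPEAKING_OUTPUT", false),
	}
}

func main() {
	fmt.Printf("%+v\n", LoadSettings(os.LookupEnv))
}
```

Passing the lookup function in, rather than calling os.LookupEnv directly, keeps the parsing logic easy to check without touching the real environment.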
Getting started is straightforward once you have the server installed. The following excerpts show a typical setup and the environment you would configure for different TTS backends. Use these examples as a guide and tailor them to your environment and API keys.
say_tts: Uses the macOS built-in 'say' command to speak the text with system voices (macOS only)
elevenlabs_tts: Uses the ElevenLabs API to synthesize speech with premium voices
google_tts: Uses Google's Gemini TTS models to synthesize speech with many high-quality voices
openai_tts: Uses the OpenAI TTS API with multiple voice options and models (gpt-4o-mini-tts, tts-1, tts-1-hd) and speed control
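To show how a client might invoke one of these tools, the sketch below builds an MCP `tools/call` request as a JSON-RPC 2.0 message. The envelope follows the MCP specification; the `text` argument name is an assumption, so check the input schema the server reports through `tools/list` before relying on it. `NewToolCall` and the struct types are hypothetical helpers for this example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// callParams and request model the JSON-RPC 2.0 envelope of an MCP
// tools/call message.
type callParams struct {
	Name      string         `json:"name"`
	Arguments map[string]any `json:"arguments"`
}

type request struct {
	JSONRPC string     `json:"jsonrpc"`
	ID      int        `json:"id"`
	Method  string     `json:"method"`
	Params  callParams `json:"params"`
}

// NewToolCall serializes a tools/call request for the named tool.
func NewToolCall(id int, tool string, args map[string]any) ([]byte, error) {
	return json.Marshal(request{
		JSONRPC: "2.0",
		ID:      id,
		Method:  "tools/call",
		Params:  callParams{Name: tool, Arguments: args},
	})
}

func main() {
	// "text" is an assumed argument name, not confirmed by this page.
	b, err := NewToolCall(1, "say_tts", map[string]any{"text": "Build finished"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(b))
}
```

In practice an MCP client library usually builds this message for you; the sketch only makes the wire format visible.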