Scribe MCP Server
Provides real-time and batch transcription via MCP over HTTP/WebSocket with context management.
Configuration
{
  "mcpServers": {
    "aromanstatue-mcp-elevenlab-scribe-asr": {
      "url": "https://localhost:8000/mcp",
      "headers": {
        "ELEVENLABS_API_KEY": "YOUR_API_KEY"
      }
    }
  }
}
You run an MCP server that connects to ElevenLabs Scribe and handles real-time transcription through a simple, bidirectional protocol. This server lets you stream audio from a microphone or upload files for transcription, while preserving conversation context and supporting automatic language detection. It exposes standard MCP messages to manage sessions, streaming, and results, and communicates over a local development WebSocket or HTTP endpoints.
Start by launching the MCP server locally, then connect with an MCP client that speaks the supported message types. You can stream audio from a microphone for real-time transcription or send audio files for batch results. Use the MCP flow to INIT a session, START streaming, send AUDIO data, and receive TRANSCRIPTION results until you send STOP and DONE to end the session. The server also supports WebSocket for real-time bidirectional communication and can manage conversation context to improve accuracy.
To begin a transcription session from your client, perform the standard MCP sequence: INIT to create a session, START to begin audio streaming, send AUDIO payloads, receive TRANSCRIPTION messages, and finally STOP and DONE to terminate the session. If you need to operate on media in real time, connect via WebSocket at the designated transcribe endpoint and push audio chunks as they become available. For file-based transcription, prepare a file payload and submit it through the REST interface, then monitor the TRANSCRIPTION results as they are produced.
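The message order described above can be sketched as a small client-side state tracker. This is only an illustration: the message type names come from the text, but the `State` enum, the `TRANSITIONS` table, and the `SessionTracker` class are hypothetical helpers, not part of the server's API.

```python
from enum import Enum


class State(Enum):
    NEW = "new"              # nothing sent yet
    READY = "ready"          # after INIT
    STREAMING = "streaming"  # after START, while AUDIO is flowing
    STOPPED = "stopped"      # after STOP
    CLOSED = "closed"        # after DONE


# Allowed client-side transitions: (current state, message type) -> next state.
# Assumption: the server expects exactly the order given in the text.
TRANSITIONS = {
    (State.NEW, "INIT"): State.READY,
    (State.READY, "START"): State.STREAMING,
    (State.STREAMING, "AUDIO"): State.STREAMING,
    (State.STREAMING, "STOP"): State.STOPPED,
    (State.STOPPED, "DONE"): State.CLOSED,
}


class SessionTracker:
    """Hypothetical helper that rejects out-of-order client messages."""

    def __init__(self) -> None:
        self.state = State.NEW

    def send(self, message_type: str) -> State:
        key = (self.state, message_type)
        if key not in TRANSITIONS:
            raise ValueError(
                f"{message_type} is not valid in state {self.state.value}"
            )
        self.state = TRANSITIONS[key]
        return self.state
```

Calling `send` for each outgoing message before writing it to the socket catches mistakes such as sending AUDIO before START. TRANSCRIPTION and ERROR are server-to-client messages, so they are left out of this client-side table.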
Prerequisites: you need Python 3.8 or newer and a compatible package manager installed on your system.
1) Create a project directory and navigate into it.
2) Create and activate a virtual environment.
3) Install the MCP server in editable mode.
4) Create a configuration file with your ElevenLabs API key.
Set the ELEVENLABS_API_KEY environment variable for API access. This key is required to authenticate with the ElevenLabs Scribe API.
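A minimal sketch of reading the key at startup and failing early when it is missing. The variable name comes from the text; the `load_api_key` function is a hypothetical helper, and how the server itself reads the key is not specified here.

```python
import os
from typing import Mapping, Optional


def load_api_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the ElevenLabs key, or raise if it is missing or blank.

    `env` defaults to the process environment; passing a dict makes the
    function easy to exercise without touching real settings.
    """
    source = os.environ if env is None else env
    key = source.get("ELEVENLABS_API_KEY", "").strip()
    if not key:
        raise RuntimeError("ELEVENLABS_API_KEY is not set")
    return key
```

Checking the key before opening any connection turns a missing setting into an immediate, readable error rather than an authentication failure partway through a session.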
Start the server through its Python module entry point. It listens on port 8000 by default and falls back to the next available port if 8000 is already in use.
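The port fallback could work roughly as follows. This is a sketch of one common approach (probing ports in ascending order with a throwaway socket), not the server's actual code; `find_available_port` and its parameters are hypothetical.

```python
import socket


def find_available_port(
    start: int = 8000, host: str = "127.0.0.1", max_tries: int = 100
) -> int:
    """Return the first port at or above `start` that can be bound on `host`."""
    # Never probe past the highest valid TCP port.
    end = min(start + max_tries, 65536)
    for port in range(start, end):
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
            try:
                probe.bind((host, port))
            except OSError:
                # Port is taken (or otherwise unavailable); try the next one.
                continue
            return port
    raise RuntimeError(f"no free port found in range {start}-{end - 1}")
```

Because the probe socket is closed before the real server binds, another process can still claim the port in between. A client should therefore read the actual port from the server's startup output instead of assuming 8000.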
Example client workflows are provided for both file-based and microphone-based transcription. Use the example client to test file uploads or mic input against the MCP server.
Real-time transcription runs over the bidirectional WebSocket interface using MCP messages. File-based transcription is available on demand through the REST endpoint.
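For the streaming path, a client has to cut raw audio into chunks and wrap each chunk in an AUDIO message. The sketch below assumes a JSON envelope with base64-encoded audio; the field names (`type`, `session_id`, `data`), the `audio_messages` function, and the chunk size are all assumptions for illustration, so check the server's documentation for the real wire format.

```python
import base64
import json
from typing import Iterator


def audio_messages(
    pcm: bytes, session_id: str, chunk_size: int = 3200
) -> Iterator[str]:
    """Yield JSON-encoded AUDIO messages, one per chunk of raw audio.

    The default of 3200 bytes is 100 ms of 16 kHz, 16-bit mono PCM
    (16000 samples/s * 2 bytes * 0.1 s). The audio format is an assumption.
    """
    for offset in range(0, len(pcm), chunk_size):
        chunk = pcm[offset:offset + chunk_size]
        yield json.dumps(
            {
                "type": "AUDIO",
                "session_id": session_id,  # hypothetical field name
                "data": base64.b64encode(chunk).decode("ascii"),
            }
        )
```

Each yielded string can be written to the WebSocket as it is produced. Smaller chunks lower latency at the cost of more messages per second.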
Stream audio in real time to the MCP server and receive transcription results as the audio is processed.
Upload audio files for batch transcription and receive results asynchronously.
Full MCP message handling (INIT, START, AUDIO, TRANSCRIPTION, ERROR, STOP, DONE) for session management.
WebSocket support for bidirectional real-time transcription messages.
Maintain and leverage conversation context to improve transcription accuracy and continuity.