
Baidu Xiling MCP Server

Baidu Xiling Digital Human MCP Server

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "baidu-xiling-mcp": {
      "command": "uvx",
      "args": [
        "${path/to/dh-mcp-server}"
      ],
      "env": {
        "DH_API_AK": "${API Key}",
        "DH_API_SK": "${Secret Key}"
      }
    }
  }
}

You can access Baidu Xiling Digital Human capabilities through an MCP client to generate digital portraits, synthesize videos, clone voices, and generate audio, all through MCP-compliant interfaces. The server exposes tools for integrating digital human services into your models and applications, covering end-to-end workflows from portrait creation to video and audio production.

How to use

Connect your MCP-enabled agent or client to the server using the MCP configuration above. Start by listing the available tools, then invoke the tool that matches your scenario: 2D portrait creation, video synthesis, 123 video production, speech synthesis, file upload, voice queries, figure queries, or voice cloning. For example, you can create a 2D portrait from a real video, generate a digital human video from an existing portrait and timbre, or synthesize audio from text. Generation tools return task or figure identifiers that you pass to the matching status tool and poll until the final artifact (video or audio) is available.

Typical usage patterns include:
1) Upload any required media assets (videos, audio) for your chosen workflow.
2) Choose a digital portrait or let the system generate a portrait from a source video.
3) Submit a synthesis or cloning task and poll for status using the task or figure IDs.
4) Retrieve the resulting video or audio URL when the task succeeds.
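The polling step can be sketched as a small helper. This is a minimal sketch under stated assumptions, not the server's own client code: `fetch_status` is a hypothetical callable that wraps whichever status tool you use (for example getDhVideoStatus) and returns a dict, the `status` key is an assumed field name, and `FAILED` is an assumed failure value. Only `SUCCESS` is named in this document.

```python
import time
from typing import Any, Callable, Mapping

# SUCCESS is the status named in this document; FAILED is an assumed value.
TERMINAL_STATES = frozenset({"SUCCESS", "FAILED"})


def poll_until_done(
    fetch_status: Callable[[str], Mapping[str, Any]],
    task_id: str,
    *,
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Mapping[str, Any]:
    """Call fetch_status(task_id) until a terminal state or the timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        result = fetch_status(task_id)
        # "status" is an assumed key; adjust it to the real tool output.
        if result.get("status") in TERMINAL_STATES:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task {task_id} still pending after {timeout_s} s")
        sleep(interval_s)
```

Passing the status call in as a function keeps the loop independent of any particular MCP client library, and the injectable `sleep` makes the helper easy to exercise without waiting.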

How to install

Prerequisites you need before installing the MCP server:
- Python 3.12 or higher
- API Key and Secret Key from Xiling Open Platform
- Internet access to install packages and access MCP services

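Install the required tooling (the configurations on this page launch the server with uvx, which ships with uv) and the MCP server package, then test the server with a local MCP inspector before connecting a full client.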

Additional sections

The server runs under an MCP runtime, either in a local development environment or as part of your existing toolchain. It is designed to be reached from an MCP client and supports both local (stdio) and remote (http) connection patterns. Pass your API credentials through environment variables when starting the server.

The sample configuration below wires the MCP server into a local development environment. It defines two stdio connections: one for a local digital human MCP wrapper and another for the Baidu Digital Human MCP Server package. Use your actual API credentials where placeholders appear.

{
  "mcpServers": {
    "DH-STDIO": {
      "timeout": 60,
      "type": "stdio",
      "command": "uvx",
      "args": [
        "${path/to/dh-mcp-server}"
      ],
      "env": {
        "DH_API_AK": "${API Key}",
        "DH_API_SK": "${Secret Key}"
      }
    },
    "baidu_dh": {
      "timeout": 60,
      "type": "stdio",
      "command": "uvx",
      "args": [
        "mcp-server-baidu-digitalhuman"
      ],
      "env": {
        "DH_API_AK": "${API Key}",
        "DH_API_SK": "${Secret Key}"
      }
    }
  }
}
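Because the sample uses `${...}` placeholders, it is easy to leave one unfilled. The check below is an illustrative sketch, not part of the server: `find_unfilled_placeholders` is a hypothetical helper that scans the `args` and `env` values of each `mcpServers` entry for leftover `${...}` text.

```python
import json
import re

# Matches the ${...} placeholder style used in the sample configuration.
PLACEHOLDER = re.compile(r"\$\{[^}]*\}")


def find_unfilled_placeholders(config_text: str) -> list[str]:
    """Return dotted locations whose value still contains a ${...} placeholder."""
    config = json.loads(config_text)
    problems = []
    for name, server in config.get("mcpServers", {}).items():
        # Collect env values first, then positional args.
        values = {f"env.{key}": val for key, val in server.get("env", {}).items()}
        values.update(
            {f"args[{i}]": arg for i, arg in enumerate(server.get("args", []))}
        )
        for where, value in values.items():
            if isinstance(value, str) and PLACEHOLDER.search(value):
                problems.append(f"{name}.{where}")
    return problems
```

Running it on the configuration file before starting the client gives a list such as `baidu_dh.env.DH_API_AK` for every value that still needs a real credential or path.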

Security and best practices

Keep API keys and secret keys secure. Do not expose credentials in client-side code or logs. Use environment variables and secret management practices to protect sensitive information. When deploying, prefer running the MCP server behind authentication and access controls to prevent unauthorized usage.
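One way to follow this advice is to read the keys from the environment when building the server entry, and to mask them before anything is logged. The sketch below assumes the stdio entry shape and package name shown in the sample configuration; `build_server_entry` and `redacted` are hypothetical helper names, not part of the server.

```python
import os
from typing import Mapping

# Variable names taken from the sample configuration on this page.
REQUIRED_VARS = ("DH_API_AK", "DH_API_SK")


def build_server_entry(environ: Mapping[str, str] = os.environ) -> dict:
    """Build one mcpServers entry, taking credentials from the environment."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {
        "type": "stdio",
        "command": "uvx",
        "args": ["mcp-server-baidu-digitalhuman"],
        "env": {name: environ[name] for name in REQUIRED_VARS},
    }


def redacted(entry: dict) -> dict:
    """Return a copy of the entry that is safe to print or log."""
    safe = dict(entry)
    safe["env"] = {name: "***" for name in entry.get("env", {})}
    return safe
```

Logging `redacted(entry)` instead of `entry` keeps the keys out of log files while still showing which variables were set.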

Troubleshooting and notes

If a task fails, review the provided failedCode and failedMessage to identify the root cause. Check that required inputs (files, portrait IDs, or text content) meet the specified limits, and ensure the network path to the MCP server is reachable from your client. Use the status endpoints to poll for progress and retrieve the final outputs once a status of SUCCESS is reported.
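A status result can be turned into either a usable value or a readable error. This is a sketch under assumptions: it treats the tool output as a dict with a top-level `status` key, and it assumes `failedCode` and `failedMessage` sit at the same level. The exception class and function name are hypothetical.

```python
from typing import Any, Mapping


class DigitalHumanTaskError(RuntimeError):
    """Raised when a status result does not report SUCCESS."""


def require_success(result: Mapping[str, Any]) -> Mapping[str, Any]:
    """Return the result if it reports SUCCESS, otherwise raise with details."""
    status = result.get("status")  # assumed key name
    if status == "SUCCESS":
        return result
    # failedCode and failedMessage are the fields named in this section;
    # their position in the payload is an assumption.
    code = result.get("failedCode")
    message = result.get("failedMessage")
    raise DigitalHumanTaskError(
        f"task ended with status={status!r}, failedCode={code!r}, "
        f"failedMessage={message!r}"
    )
```

Raising a single exception type with the code and message embedded keeps the root cause visible in logs without each caller having to inspect the raw payload.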

Examples and tips

Example workflows you can build with MCP:
- Create a 2D digital portrait from a video and then generate a digital human video using a chosen timbre.
- Generate a short 10-second to 4-minute live video using a pre-existing portrait and a selected voice model.
- Synthesize speech from text for audio-only outputs using a chosen timbre.
- Upload media assets once and reuse them across subsequent tasks like video production and voice cloning.

Available tools

generateLite2dGeneralVideo

Generate a digital portrait from an uploaded real-person video for basic video production using a universal lip drive.

getLite2dGeneralStatus

Query progress of digital portrait generation and list available system portraits.

generateDhVideo

Create a digital human video from a selected portrait and timbre with options for driving type, resolution, and optional subtitles.

getDhVideoStatus

Poll the status of a digital human video synthesis task and retrieve the video URL on success.

generateDh123Video

Produce a digital human video directly from a sample video and timbre without portrait generation.

getDh123VideoStatus

Query the status of a 123 digital human video synthesis task and retrieve the output URL.

generateText2Audio

Synthesize audio from text based on a chosen timbre without generating video.

getText2AudioStatus

Query the status of text-to-audio synthesis and retrieve the audio URL.

uploadFiles

Upload required media files for subsequent digital human services such as cloning or video production.

getVoices

Query available system and clone voices for selection.

getFigures

Query available 2D digital portrait figures.

generateVoiceClone

Create timbres by cloning from uploaded audio for use in synthesis and video production.

getVoiceCloneStatus

Check the status and results of a voice clone task.
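The generate and status tools above come in pairs. The lookup table below is a convenience sketch: the pairing is inferred from the tool names and descriptions on this page rather than from a documented contract, and `status_tool_for` is a hypothetical helper.

```python
# Pairing inferred from the tool names and descriptions listed above.
STATUS_TOOL_FOR = {
    "generateLite2dGeneralVideo": "getLite2dGeneralStatus",
    "generateDhVideo": "getDhVideoStatus",
    "generateDh123Video": "getDh123VideoStatus",
    "generateText2Audio": "getText2AudioStatus",
    "generateVoiceClone": "getVoiceCloneStatus",
}


def status_tool_for(generate_tool: str) -> str:
    """Return the status tool to poll after calling a generate tool."""
    try:
        return STATUS_TOOL_FOR[generate_tool]
    except KeyError:
        raise ValueError(f"no status tool known for {generate_tool!r}") from None
```

Tools without a task to poll, such as uploadFiles, getVoices, and getFigures, are deliberately left out of the table.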