StreamSets MCP Server

Provides an MCP server to manage and build StreamSets pipelines via Control Hub APIs.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "bracta-streamsets-mcp-server": {
      "command": "python",
      "args": [
        "/path/to/streamsets_server.py"
      ],
      "env": {
        "STREAMSETS_CRED_ID": "your-credential-id",
        "PIPELINE_STORAGE_PATH": "/var/lib/mcp/pipelines",
        "STREAMSETS_CRED_TOKEN": "your-auth-token",
        "STREAMSETS_HOST_PREFIX": "https://your-instance.streamsets.com"
      }
    }
  }
}

The StreamSets MCP Server lets you manage and build StreamSets pipelines through conversational interactions. It connects to the StreamSets Control Hub APIs so that you can list, monitor, start, stop, and analyze pipelines and jobs, manage connections, and build pipelines through a guided, interactive workflow. Pipeline-building sessions are persisted, so you can resume them across conversations and devices.

How to use

Use an MCP client to connect to the StreamSets MCP Server and perform both read and write operations. Supply your API credentials in the client configuration first, then choose the action you want to perform: list or monitor jobs, browse pipelines, manage connections, view metrics, or begin an interactive pipeline-building session. Pipeline-building sessions persist across conversations, so you can reload your progress later.

How to install

To run the server locally you need Python with pip, plus a StreamSets Control Hub credential ID and auth token.

Step-by-step installation and run (local)

1) Prepare the environment and install Python dependencies.

pip install -r requirements.txt

2) Configure environment variables for StreamSets access.

export STREAMSETS_HOST_PREFIX="https://your-instance.streamsets.com"
export STREAMSETS_CRED_ID="your-credential-id"
export STREAMSETS_CRED_TOKEN="your-auth-token"
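
Before starting the server, it can help to confirm that the three variables are actually visible to Python. The following is a minimal sketch, not part of the server itself; the helper name `missing_streamsets_vars` is hypothetical, and only the variable names come from the export lines above.

```python
import os

# Variable names taken from the export lines above.
REQUIRED_VARS = (
    "STREAMSETS_HOST_PREFIX",
    "STREAMSETS_CRED_ID",
    "STREAMSETS_CRED_TOKEN",
)


def missing_streamsets_vars(env=None):
    """Return the required variable names that are unset or blank.

    Hypothetical helper for illustration; pass a dict to check something
    other than the current process environment.
    """
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name, "").strip()]


missing = missing_streamsets_vars()
if missing:
    print("Missing environment variables: " + ", ".join(missing))
else:
    print("All StreamSets environment variables are set.")
```

Running this in the same shell where you exported the variables shows whether any of them was skipped or left blank.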

Run the server locally

3) Start the MCP server in the foreground to test connectivity.

python streamsets_server.py

Docker deployment options

If you prefer a containerized deployment, run the server with Docker and mount a named volume so that pipeline builder data persists between runs.

# Build the image
docker build -t streamsets-mcp-server .

# Create a persistent volume for pipeline builders
docker volume create streamsets-pipeline-data

# Run with persistence and environment variables
docker run --rm -it \
  -e STREAMSETS_HOST_PREFIX="https://your-instance.streamsets.com" \
  -e STREAMSETS_CRED_ID="your-credential-id" \
  -e STREAMSETS_CRED_TOKEN="your-auth-token" \
  -v streamsets-pipeline-data:/data \
  streamsets-mcp-server

Claude Desktop integration options

You can connect Claude Desktop to the MCP Server either through a local Python process for development or through Docker with persistence for production. For local development, use this entry:

{
  "mcpServers": {
    "streamsets": {
      "command": "python",
      "args": ["/path/to/streamsets_server.py"],
      "env": {
        "STREAMSETS_HOST_PREFIX": "https://your-instance.streamsets.com",
        "STREAMSETS_CRED_ID": "your-credential-id",
        "STREAMSETS_CRED_TOKEN": "your-auth-token"
      }
    }
  }
}

Claude Desktop integration (production)

For production, have Claude Desktop launch the server through Docker with the persistent volume mounted.

{
  "mcpServers": {
    "streamsets": {
      "command": "docker",
      "args": [
        "run", "--rm", "-i",
        "-v", "streamsets-pipeline-data:/data",
        "-e", "STREAMSETS_HOST_PREFIX=https://your-instance.streamsets.com",
        "-e", "STREAMSETS_CRED_ID=your-credential-id",
        "-e", "STREAMSETS_CRED_TOKEN=your-auth-token",
        "streamsets-mcp-server"
      ]
    }
  }
}
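
If your client configuration file already lists other MCP servers, either entry above needs to be merged into the existing `mcpServers` object rather than pasted over it. The sketch below shows one way to do that on plain dictionaries; the function name `add_mcp_server` and the `other-server` entry are hypothetical, and reading or writing the actual file is left out because its location depends on your client.

```python
import copy
import json


def add_mcp_server(config, name, entry):
    """Return a copy of config with entry stored under mcpServers[name].

    Other servers already present in the config are left untouched.
    """
    merged = copy.deepcopy(config)
    merged.setdefault("mcpServers", {})[name] = copy.deepcopy(entry)
    return merged


# The local-development entry shown above, with placeholder values.
streamsets_entry = {
    "command": "python",
    "args": ["/path/to/streamsets_server.py"],
    "env": {
        "STREAMSETS_HOST_PREFIX": "https://your-instance.streamsets.com",
        "STREAMSETS_CRED_ID": "your-credential-id",
        "STREAMSETS_CRED_TOKEN": "your-auth-token",
    },
}

# Hypothetical existing config with an unrelated server already registered.
existing = {
    "mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}
}

updated = add_mcp_server(existing, "streamsets", streamsets_entry)
print(json.dumps(updated, indent=2))
```

The function returns a new dictionary instead of mutating its input, so the original configuration stays available if you want to compare or roll back.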

Available tools

sdc_list_jobs

List all jobs for an organization, optionally filtered by status or other criteria.

sdc_get_job_details

Retrieve detailed information about a specific job by its ID.

sdc_start_job

Start a single job by its ID.

sdc_stop_job

Stop a running job by its ID.

sdc_start_multiple_jobs

Start multiple jobs in one request by listing their IDs.

sdc_search_pipelines

Search pipelines by name or query to locate specific configurations.

sdc_get_pipeline_details

Fetch detailed information about a specific pipeline.

sdc_export_pipelines

Export pipelines or commits for backup or migration.

sdc_get_job_metrics

Retrieve performance metrics for a given job.

sdc_get_job_count_by_status

Get a count of jobs by their current status.

sdc_get_executor_metrics

Access metrics for executor components like collectors.

sdc_get_security_audit_metrics

Retrieve security audit metrics and logs for an organization.
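
Your MCP client normally invokes these tools for you. For orientation, the sketch below builds the JSON-RPC 2.0 message that an MCP client sends for a `tools/call` request. The helper `build_tool_call` is hypothetical, and so is the `job_id` argument name: this page does not document the parameters each tool accepts, so check the tool schema reported by the server before relying on it.

```python
import json


def build_tool_call(request_id, tool_name, arguments=None):
    """Build an MCP tools/call request as a JSON-RPC 2.0 message."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments or {}},
    }


# "job_id" is an assumed argument name, used here only for illustration.
request = build_tool_call(1, "sdc_get_job_details", {"job_id": "your-job-id"})
print(json.dumps(request, indent=2))
```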