
Chaos Mesh MCP Server

Provides AI-assisted access to Chaos Mesh for creating, validating, and managing chaos experiments via natural language conversations.

Installation
Add the following to your MCP client configuration file.

Configuration

{
  "mcpServers": {
    "ernestolee13-chaos-mesh-mcp": {
      "command": "python",
      "args": [
        "-m",
        "chaos_mesh_mcp.server"
      ]
    }
  }
}
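If your MCP client already has a configuration file with other servers in it, the entry above has to be merged in rather than pasted over the file. A minimal sketch of that merge follows; the helper name `add_server` and the sample "other-server" entry are hypothetical, and the location of the configuration file depends on your client.

```python
import json

# Server entry copied from the configuration above.
CHAOS_MESH_ENTRY = {
    "command": "python",
    "args": ["-m", "chaos_mesh_mcp.server"],
}


def add_server(config, name="ernestolee13-chaos-mesh-mcp"):
    """Return a copy of `config` with the Chaos Mesh entry under mcpServers."""
    merged = dict(config)
    servers = dict(merged.get("mcpServers", {}))  # keep existing servers
    servers[name] = CHAOS_MESH_ENTRY
    merged["mcpServers"] = servers
    return merged


if __name__ == "__main__":
    # Hypothetical existing client configuration with one other server.
    existing = {"mcpServers": {"other-server": {"command": "node", "args": ["server.js"]}}}
    print(json.dumps(add_server(existing), indent=2))
```

The function copies the input instead of mutating it, so the original configuration can still be written back unchanged if something goes wrong.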

The Chaos Mesh MCP server lets you create, validate, and manage chaos experiments in natural language, so resilience tests and automated chaos operations can be driven from a conversational interface.

How to use

Connect an MCP client to the Chaos Mesh MCP server using the configuration entry above. The server exposes tools for creating chaos experiments, validating your environment, and managing running experiments. Once the client is configured, issue natural language requests such as creating a NetworkChaos delay, listing active experiments, or checking the requirements of a chaos type.

How to install

Prerequisites: kubectl installed and configured; a Kubernetes cluster (v1.15+) reachable through kubectl; and Chaos Mesh installed in the cluster (v2.6+ recommended). The MCP server is installed with pip and started with python, so you also need Python and pip.
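The version minimums above are easy to get wrong with plain string comparison ("1.9" sorts after "1.15" as text). A small sketch of a numeric check, plus a test for kubectl on the PATH, might look like this; the helper names are hypothetical and are not part of the MCP server.

```python
import re
import shutil


def parse_version(text):
    """Turn 'v2.8.0', '1.15', or '1.28.3-gke.100' into a tuple of three ints."""
    match = re.match(r"v?(\d+)\.(\d+)(?:\.(\d+))?", text.strip())
    if match is None:
        raise ValueError(f"not a version string: {text!r}")
    # A missing patch component is treated as 0.
    return tuple(int(group) for group in match.groups(default="0"))


def meets_minimum(found, minimum):
    """Compare versions numerically, component by component."""
    return parse_version(found) >= parse_version(minimum)


def kubectl_on_path():
    """True if a kubectl executable can be found on the PATH."""
    return shutil.which("kubectl") is not None


if __name__ == "__main__":
    print("kubectl found:", kubectl_on_path())
    print("Kubernetes 1.28.3 meets 1.15:", meets_minimum("v1.28.3", "1.15"))
    print("Chaos Mesh 2.5.1 meets 2.6:", meets_minimum("2.5.1", "2.6"))
```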

Install Chaos Mesh in your cluster if it is not already present. The following sequence uses Helm to install Chaos Mesh version 2.8.0 into the chaos-mesh namespace.

# Using Helm (Recommended)
helm repo add chaos-mesh https://charts.chaos-mesh.org
helm install chaos-mesh chaos-mesh/chaos-mesh \
  --namespace chaos-mesh \
  --create-namespace \
  --version 2.8.0

# Verify installation
kubectl get pods -n chaos-mesh
# Should see: chaos-controller-manager, chaos-daemon, chaos-dashboard
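The same check can be scripted. The sketch below takes pod names, for example the lines printed by `kubectl get pods -n chaos-mesh -o name`, and reports which of the three expected components has no matching pod. The function name and the sample pod names are made up for illustration.

```python
EXPECTED_COMPONENTS = ("chaos-controller-manager", "chaos-daemon", "chaos-dashboard")


def missing_components(pod_names):
    """Return the expected components that have no pod whose name starts with them."""
    # `kubectl get pods -o name` prefixes each entry with "pod/"; drop it if present.
    cleaned = [name.strip().split("/", 1)[-1] for name in pod_names if name.strip()]
    return [
        component
        for component in EXPECTED_COMPONENTS
        if not any(name.startswith(component) for name in cleaned)
    ]


if __name__ == "__main__":
    # Hypothetical output in which the dashboard pod is absent.
    sample = [
        "pod/chaos-controller-manager-6d6d95cd94-kl8gs",
        "pod/chaos-daemon-5shkv",
    ]
    print("missing:", missing_components(sample))
```

This only looks at pod names; it does not confirm that the pods are in the Running state.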

Install MCP Server

Install the MCP server package with pip, using one of the two options below.

Option 1: Install from GitHub (Recommended)

pip install git+https://github.com/ernestolee13/chaos-mesh-mcp.git

Option 2: Install from Source

git clone https://github.com/ernestolee13/chaos-mesh-mcp.git
cd chaos-mesh-mcp
pip install -e .

Available tools

Environment Validation

Runs comprehensive checks to ensure kubectl is available, the cluster is reachable, Chaos Mesh is installed, CRDs exist for all chaos types, and required components are running.

check_chaos_type_requirements

Verifies the prerequisites and component requirements for a specific chaos type, ensuring the requested chaos can be executed.

get_chaos_requirements

Retrieves detailed requirements for a chosen chaos type, including any external dependencies or agent requirements.

get_experiment_status

Fetches detailed status information for a given chaos experiment, including phase, start time, and observed events.
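For orientation, upstream Chaos Mesh records experiment state on the custom resource itself, under `status.conditions` and `status.experiment.desiredPhase`. The sketch below summarizes a resource as returned by `kubectl get <kind> <name> -o json`; it is an assumption that get_experiment_status reads the same fields, and the output format of the tool may differ.

```python
def summarize_status(resource):
    """Condense a Chaos Mesh custom resource (as a dict) into a few status flags."""
    status = resource.get("status", {})
    # Each condition looks like {"type": "AllInjected", "status": "True"}.
    conditions = {
        condition["type"]: condition["status"] == "True"
        for condition in status.get("conditions", [])
    }
    return {
        "name": resource.get("metadata", {}).get("name"),
        "desired_phase": status.get("experiment", {}).get("desiredPhase"),
        "injected": conditions.get("AllInjected", False),
        "recovered": conditions.get("AllRecovered", False),
        "paused": conditions.get("Paused", False),
    }


if __name__ == "__main__":
    # Hypothetical resource for a running NetworkChaos experiment.
    sample = {
        "metadata": {"name": "delay-demo"},
        "status": {
            "conditions": [
                {"type": "Selected", "status": "True"},
                {"type": "AllInjected", "status": "True"},
                {"type": "AllRecovered", "status": "False"},
                {"type": "Paused", "status": "False"},
            ],
            "experiment": {"desiredPhase": "Run"},
        },
    }
    print(summarize_status(sample))
```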

list_active_experiments

Lists all currently running chaos experiments across namespaces with their status.

delete_experiment

Deletes a specified chaos experiment from the cluster.

pause_experiment

Pauses a running chaos experiment to halt progression without deleting it.

resume_experiment

Resumes a paused chaos experiment from where it left off.
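In upstream Chaos Mesh, pausing is driven by the `experiment.chaos-mesh.org/pause` annotation on the experiment resource: setting it to "true" pauses the experiment, and removing it resumes. The sketch below builds the equivalent kubectl commands. Whether pause_experiment and resume_experiment use this mechanism internally is an assumption, and the function names here are hypothetical.

```python
PAUSE_ANNOTATION = "experiment.chaos-mesh.org/pause"


def pause_command(kind, name, namespace):
    """kubectl arguments that set the pause annotation on an experiment."""
    return [
        "kubectl", "annotate", kind, name,
        "--namespace", namespace,
        f"{PAUSE_ANNOTATION}=true",
        "--overwrite",  # succeed even if the annotation is already set
    ]


def resume_command(kind, name, namespace):
    """kubectl arguments that remove the pause annotation (trailing '-' deletes it)."""
    return [
        "kubectl", "annotate", kind, name,
        "--namespace", namespace,
        f"{PAUSE_ANNOTATION}-",
    ]


if __name__ == "__main__":
    # Hypothetical experiment named "delay-demo" in the default namespace.
    print(" ".join(pause_command("networkchaos", "delay-demo", "default")))
    print(" ".join(resume_command("networkchaos", "delay-demo", "default")))
```

The commands are returned as argument lists so they can be passed to `subprocess.run` without shell quoting issues.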

get_experiment_events

Retrieves Kubernetes events related to chaos experiments for debugging and auditing.

create_network_delay

Injects network latency to simulate delays between pods or services.

create_network_loss

Introduces packet loss to mimic unreliable network conditions.

create_network_partition

Creates network partitions to test resilience against split-brain scenarios.

create_network_corrupt

Corrupts network packets to test data integrity under adverse conditions.
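To make the network tools more concrete, here is a sketch of the kind of NetworkChaos resource a delay request corresponds to. The field names follow the upstream Chaos Mesh `chaos-mesh.org/v1alpha1` schema; the function, its parameters, and the sample values are hypothetical, and the manifest that create_network_delay actually generates may differ.

```python
import json


def network_delay_manifest(name, namespace, labels, latency="100ms", duration="60s"):
    """Build a NetworkChaos resource that delays traffic for the selected pods."""
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "NetworkChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": "delay",
            "mode": "all",  # apply to every pod matched by the selector
            "selector": {
                "namespaces": [namespace],
                "labelSelectors": labels,
            },
            "delay": {"latency": latency},
            "duration": duration,  # the experiment recovers after this time
        },
    }


if __name__ == "__main__":
    manifest = network_delay_manifest("delay-demo", "default", {"app": "web"})
    # kubectl accepts JSON manifests, e.g. piped into `kubectl apply -f -`.
    print(json.dumps(manifest, indent=2))
```

The loss, partition, and corrupt tools map to the same resource kind with a different `action` and a matching parameter block.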

create_stress_cpu

Applies CPU stress to containers to evaluate performance under load.

create_stress_memory

Applies memory pressure to containers to simulate OOM-like scenarios.

create_stress_combined

Applies both CPU and memory stress simultaneously.
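The three stress tools differ only in which stressors they set. A sketch of the corresponding StressChaos resource, using the upstream `stressors.cpu` and `stressors.memory` fields, is shown below; the function and its parameters are hypothetical and are not the server's API.

```python
import json


def stress_manifest(name, namespace, labels, cpu=None, memory=None, duration="60s"):
    """Build a StressChaos resource with a CPU stressor, a memory stressor, or both."""
    stressors = {}
    if cpu is not None:
        stressors["cpu"] = cpu  # e.g. {"workers": 2, "load": 50}
    if memory is not None:
        stressors["memory"] = memory  # e.g. {"workers": 1, "size": "256MB"}
    if not stressors:
        raise ValueError("at least one of cpu or memory must be given")
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "StressChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "mode": "one",  # stress a single pod picked from the selector
            "selector": {"namespaces": [namespace], "labelSelectors": labels},
            "stressors": stressors,
            "duration": duration,
        },
    }


if __name__ == "__main__":
    combined = stress_manifest(
        "stress-demo", "default", {"app": "web"},
        cpu={"workers": 2, "load": 50},
        memory={"workers": 1, "size": "256MB"},
    )
    print(json.dumps(combined, indent=2))
```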

create_pod_kill

Permanently kills pods to test recovery mechanisms.

create_pod_failure

Temporarily makes pods unavailable without killing them.

create_container_kill

Kills specific containers within pods to test intra-pod resilience.
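In upstream Chaos Mesh these three tools correspond to one resource kind, PodChaos, with the `action` field set to `pod-kill`, `pod-failure`, or `container-kill`; the last one also needs `containerNames`. The sketch below illustrates that difference under those assumptions. The function is hypothetical and the server's own parameters may be named differently.

```python
POD_ACTIONS = ("pod-kill", "pod-failure", "container-kill")


def pod_chaos_manifest(name, namespace, labels, action, container_names=None, duration=None):
    """Build a PodChaos resource for one of the three pod-level actions."""
    if action not in POD_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if action == "container-kill" and not container_names:
        raise ValueError("container-kill requires at least one container name")
    spec = {
        "action": action,
        "mode": "one",  # affect a single pod picked from the selector
        "selector": {"namespaces": [namespace], "labelSelectors": labels},
    }
    if container_names:
        spec["containerNames"] = list(container_names)
    if duration is not None:
        spec["duration"] = duration  # mainly meaningful for pod-failure
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "PodChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


if __name__ == "__main__":
    print(pod_chaos_manifest("fail-demo", "default", {"app": "web"}, "pod-failure", duration="30s"))
```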

create_io_latency

Injects I/O latency to simulate slow disk operations.

create_io_fault

Returns error codes for file operations to test error handling.

create_io_attr_override

Modifies file attributes such as permissions and size.

create_io_mistake

Injects data corruption into read/write operations.

create_http_abort

Aborts HTTP connections to simulate network failures.

create_http_delay

Adds latency to HTTP requests and responses.

create_http_replace

Replaces HTTP message content, including headers and body.

create_http_patch

Appends or modifies content in HTTP messages.

create_dns_error

Returns DNS errors for specified domain patterns.

create_dns_random

Returns random IP addresses for DNS queries.
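Both DNS tools correspond to the upstream DNSChaos kind, where `action` is either `error` or `random` and `patterns` lists the domain names to affect. A hedged sketch follows; the function is hypothetical, and it is an assumption that the server builds its resources this way. Upstream DNSChaos also depends on the Chaos Mesh DNS server component being deployed, which is the sort of dependency the requirement-checking tools above are meant to surface.

```python
DNS_ACTIONS = ("error", "random")


def dns_chaos_manifest(name, namespace, labels, action, patterns, duration="60s"):
    """Build a DNSChaos resource that returns errors or random IPs for matching names."""
    if action not in DNS_ACTIONS:
        raise ValueError(f"action must be one of {DNS_ACTIONS}")
    if not patterns:
        raise ValueError("at least one domain pattern is required")
    return {
        "apiVersion": "chaos-mesh.org/v1alpha1",
        "kind": "DNSChaos",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "action": action,
            "mode": "all",  # apply to every pod matched by the selector
            "patterns": list(patterns),
            "selector": {"namespaces": [namespace], "labelSelectors": labels},
            "duration": duration,
        },
    }


if __name__ == "__main__":
    print(dns_chaos_manifest("dns-demo", "default", {"app": "web"}, "error", ["example.com"]))
```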

create_physical_stress_cpu

Applies CPU stress on physical or virtual machines.

create_physical_stress_memory

Applies memory pressure on physical or virtual machines.

create_physical_disk_fill

Fills disk space on physical or virtual machines.

create_physical_process_kill

Kills processes on physical or virtual machines.

create_physical_clock_skew

Skews the system clock on physical or virtual machines.