home / skills / openclaw / skills / k8s-capi

k8s-capi skill

safe

This skill helps you manage Kubernetes clusters lifecycle with Cluster API tools, enabling provisioning, scaling, upgrading across multi-cluster environments.

npx playbooks add skill openclaw/skills --skill k8s-capi

Review the files below or copy the command above to add this skill to your agents.

Files (2)

SKILL.md

4.5 KB

---
name: k8s-capi
description: Cluster API lifecycle management for provisioning, scaling, and upgrading Kubernetes clusters. Use when managing cluster infrastructure or multi-cluster operations.
---

# Cluster API Lifecycle Management

Manage Kubernetes clusters using kubectl-mcp-server's Cluster API tools (11 tools).

## Check Installation

```python
capi_detect_tool()
```

## List Clusters

```python
# List all CAPI clusters
capi_clusters_list_tool(namespace="default")

# Shows:
# - Cluster name
# - Phase (Provisioning, Provisioned, Deleting)
# - Infrastructure ready
# - Control plane ready
```

## Get Cluster Details

```python
capi_cluster_get_tool(name="my-cluster", namespace="default")

# Shows:
# - Spec (control plane, infrastructure)
# - Status (phase, conditions)
# - Network configuration
```

## Get Cluster Kubeconfig

```python
# Get kubeconfig for workload cluster
capi_cluster_kubeconfig_tool(name="my-cluster", namespace="default")

# Returns kubeconfig to access the cluster
```

## Machines

### List Machines

```python
capi_machines_list_tool(namespace="default")

# Shows:
# - Machine name
# - Cluster
# - Phase (Running, Provisioning, Failed)
# - Provider ID
# - Version
```

### Get Machine Details

```python
capi_machine_get_tool(name="my-cluster-md-0-xxx", namespace="default")
```

## Machine Deployments

### List Machine Deployments

```python
capi_machinedeployments_list_tool(namespace="default")

# Shows:
# - Deployment name
# - Cluster
# - Replicas (ready/total)
# - Version
```

### Scale Machine Deployment

```python
# Scale worker nodes
capi_machinedeployment_scale_tool(
    name="my-cluster-md-0",
    namespace="default",
    replicas=5
)
```

## Machine Sets

```python
capi_machinesets_list_tool(namespace="default")
```

## Machine Health Checks

```python
capi_machinehealthchecks_list_tool(namespace="default")

# Health checks automatically remediate unhealthy machines
```

## Cluster Classes

```python
# List cluster templates
capi_clusterclasses_list_tool(namespace="default")

# ClusterClasses define reusable cluster configurations
```

## Create Cluster

```python
kubectl_apply(manifest="""
apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: my-cluster
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks:
      - 192.168.0.0/16
    services:
      cidrBlocks:
      - 10.96.0.0/12
  controlPlaneRef:
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: my-cluster-control-plane
  infrastructureRef:
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: AWSCluster
    name: my-cluster
""")
```

## Create Machine Deployment

```python
kubectl_apply(manifest="""
apiVersion: cluster.x-k8s.io/v1beta1
kind: MachineDeployment
metadata:
  name: my-cluster-md-0
  namespace: default
spec:
  clusterName: my-cluster
  replicas: 3
  selector:
    matchLabels:
      cluster.x-k8s.io/cluster-name: my-cluster
  template:
    spec:
      clusterName: my-cluster
      version: v1.28.0
      bootstrap:
        configRef:
          apiVersion: bootstrap.cluster.x-k8s.io/v1beta1
          kind: KubeadmConfigTemplate
          name: my-cluster-md-0
      infrastructureRef:
        apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
        kind: AWSMachineTemplate
        name: my-cluster-md-0
""")
```

## Cluster Lifecycle Workflows

### Provision New Cluster
```python
1. kubectl_apply(cluster_manifest)
2. capi_clusters_list_tool(namespace)  # Wait for Provisioned
3. capi_cluster_kubeconfig_tool(name, namespace)  # Get access
```

### Scale Workers
```python
1. capi_machinedeployments_list_tool(namespace)
2. capi_machinedeployment_scale_tool(name, namespace, replicas)
3. capi_machines_list_tool(namespace)  # Monitor
```

### Upgrade Cluster
```python
1. # Update control plane version
2. # Update machine deployment version
3. capi_machines_list_tool(namespace)  # Monitor rollout
```

## Troubleshooting

### Cluster Stuck Provisioning

```python
1. capi_cluster_get_tool(name, namespace)  # Check conditions
2. capi_machines_list_tool(namespace)  # Check machine status
3. get_events(namespace)  # Check events
4. # Check infrastructure provider logs
```

### Machine Failed

```python
1. capi_machine_get_tool(name, namespace)
2. get_events(namespace)
3. # Common issues:
   # - Cloud provider quota
   # - Invalid machine template
   # - Network issues
```

## Related Skills

- [k8s-multicluster](../k8s-multicluster/SKILL.md) - Multi-cluster operations
- [k8s-operations](../k8s-operations/SKILL.md) - kubectl operations

Overview

This skill provides Cluster API (CAPI) lifecycle management tools for provisioning, scaling, and upgrading Kubernetes clusters. It exposes routines to detect installation, list and inspect clusters, retrieve kubeconfigs, and manage machines, machine deployments, and cluster classes. Use it to automate cluster creation, day‑2 operations, and multi-cluster maintenance with reproducible manifests and control-plane awareness.

How this skill works

The skill wraps common Cluster API operations into callable tools that query CAPI resources and run kubectl apply for manifests. It inspects Cluster, Machine, MachineDeployment, MachineSet, MachineHealthCheck, and ClusterClass objects to report phases, versions, readiness, and conditions. For cluster access it returns workload kubeconfigs; for changes it supports applying manifests and scaling MachineDeployments, then monitors status and events for remediation.

When to use it

Provision a new Kubernetes workload cluster using CAPI manifests
Scale worker pools or roll out node upgrades via MachineDeployments
Retrieve kubeconfig for cluster access and automated testing
Troubleshoot stuck provisioning or failed machines with targeted resource inspection
Manage reusable cluster templates using ClusterClasses for consistent environments

Best practices

Keep Cluster and MachineDeployment manifests in source control and apply via CI/CD pipelines
Validate infrastructure provider compatibility and quotas before provisioning
Use ClusterClasses and templates to reduce drift and enforce standards
Monitor MachineHealthChecks and events to automate remediation and reduce manual intervention
Perform upgrades in staged rollouts: control plane first, then machine deployments

Example use cases

Provision a new AWS-backed cluster with a Cluster manifest and KubeadmControlPlane reference
Scale a worker node pool from 3 to 5 replicas by invoking the MachineDeployment scale tool
Upgrade node OS or Kubernetes version by updating MachineDeployment and monitoring machine rollouts
Investigate a cluster stuck in Provisioning by checking cluster conditions, machine states, and namespace events
Recover from a failed machine by inspecting Machine details, checking provider logs, and letting health checks remediate

FAQ

How do I get kubeconfig for a workload cluster?

Use the capi_cluster_kubeconfig_tool with the cluster name and namespace; it returns the kubeconfig to access the workload cluster.

What do I check when a cluster is stuck provisioning?

Inspect the Cluster conditions with capi_cluster_get_tool, list machines with capi_machines_list_tool, and review namespace events and infrastructure provider logs for errors.