---
name: k8s-autoscaling
description: Configure Kubernetes autoscaling with HPA, VPA, and KEDA. Use for horizontal/vertical pod autoscaling, event-driven scaling, and capacity management.
---
# Kubernetes Autoscaling
Comprehensive autoscaling using HPA, VPA, and KEDA with kubectl-mcp-server tools.
## Quick Reference
### HPA (Horizontal Pod Autoscaler)
Basic CPU-based scaling:
```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```
Apply and verify:
```
apply_manifest(hpa_yaml, namespace)
get_hpa(namespace)
```
### VPA (Vertical Pod Autoscaler)
Right-size resource requests:
```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
```
## KEDA (Event-Driven Autoscaling)
### Detect KEDA Installation
```
keda_detect_tool()
```
### List ScaledObjects
```
keda_scaledobjects_list_tool(namespace)
keda_scaledobject_get_tool(name, namespace)
```
### List ScaledJobs
```
keda_scaledjobs_list_tool(namespace)
```
### Trigger Authentication
```
keda_triggerauths_list_tool(namespace)
keda_triggerauth_get_tool(name, namespace)
```
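For orientation, a TriggerAuthentication returned by the tools above might look like the following sketch. The resource name `aws-credentials`, the Secret name `aws-secrets`, and the Secret keys are hypothetical placeholders; the two `parameter` names are the ones KEDA's AWS scalers accept for static credentials.

```yaml
# Hypothetical example: resource and Secret names are placeholders.
apiVersion: keda.sh/v1alpha1
kind: TriggerAuthentication
metadata:
  name: aws-credentials
spec:
  secretTargetRef:
    - parameter: awsAccessKeyID        # scaler parameter to populate
      name: aws-secrets                # Secret in the same namespace
      key: AWS_ACCESS_KEY_ID           # key inside that Secret
    - parameter: awsSecretAccessKey
      name: aws-secrets
      key: AWS_SECRET_ACCESS_KEY
```

A trigger opts in by adding `authenticationRef` with `name: aws-credentials` next to its `metadata` block.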
### KEDA-Managed HPAs
```
keda_hpa_list_tool(namespace)
```
See [KEDA-TRIGGERS.md](KEDA-TRIGGERS.md) for trigger configurations.
## Common KEDA Triggers
### Queue-Based Scaling (AWS SQS)
```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: sqs-scaler
spec:
  scaleTargetRef:
    name: queue-processor
  minReplicaCount: 0  # Scale to zero!
  maxReplicaCount: 100
  triggers:
    - type: aws-sqs-queue
      metadata:
        queueURL: https://sqs.region.amazonaws.com/...
        queueLength: "5"
```
### Cron-Based Scaling
```yaml
triggers:
  - type: cron
    metadata:
      timezone: America/New_York
      start: 0 8 * * 1-5   # 8 AM weekdays
      end: 0 18 * * 1-5    # 6 PM weekdays
      desiredReplicas: "10"
```
### Prometheus Metrics
```yaml
triggers:
  - type: prometheus
    metadata:
      serverAddress: http://prometheus:9090
      metricName: http_requests_total
      query: sum(rate(http_requests_total{app="myapp"}[2m]))
      threshold: "100"
```
## Scaling Strategies
| Strategy | Tool | Use Case |
|----------|------|----------|
| CPU/Memory | HPA | Steady traffic patterns |
| Custom metrics | HPA v2 | Business metrics |
| Event-driven | KEDA | Queue processing, cron |
| Vertical | VPA | Right-size requests |
| Scale to zero | KEDA | Cost savings, idle workloads |
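As an illustration of the "Custom metrics" row, the sketch below extends the basic HPA with a second, per-pod metric. The metric name `http_requests_per_second` and its target value are hypothetical, and the example assumes a custom metrics adapter (for instance, a Prometheus adapter) already exposes that metric to the cluster.

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: my-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  minReplicas: 2
  maxReplicas: 10
  metrics:
    # Resource metric, served by metrics-server
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    # Per-pod custom metric (hypothetical name), served by a metrics adapter
    - type: Pods
      pods:
        metric:
          name: http_requests_per_second
        target:
          type: AverageValue
          averageValue: "100"
```

When several metrics are listed, the HPA computes a desired replica count for each and uses the largest.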
## Cost-Optimized Autoscaling
### Scale to Zero with KEDA
Reduce costs for idle workloads:
```
keda_scaledobjects_list_tool(namespace)
# ScaledObjects with minReplicaCount: 0 can scale to zero
```
### Right-Size with VPA
Get recommendations and apply:
```
get_resource_recommendations(namespace)
# Apply VPA recommendations
```
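If the recommendations should be reviewed before anything is changed, one option is a VPA in recommendation-only mode. This is a sketch using the same hypothetical `my-app` Deployment as earlier; whether `get_resource_recommendations` reads from VPA objects is not stated here, so treat the pairing as an assumption.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Off"   # compute recommendations only; never evict or resize pods
```

In this mode the recommender writes its suggestions to `status.recommendation.containerRecommendations` on the VPA object, and the requests in the Deployment stay under manual control.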
### Predictive Scaling
Use cron triggers for known patterns:
```yaml
# Scale up before traffic spike
triggers:
  - type: cron
    metadata:
      timezone: America/New_York   # required by the cron scaler; zone reused from the example above
      start: 0 7 * * *             # 7 AM
      end: 0 9 * * *               # 9 AM
      desiredReplicas: "20"
```
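A cron trigger is often paired with a metric trigger so that the schedule sets a floor while real load can still push the replica count higher. The following is a minimal sketch of that combination; the ScaledObject name, the target name `my-app`, the replica bounds, and the time zone are assumptions chosen for illustration.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-app-scaler          # hypothetical name
spec:
  scaleTargetRef:
    name: my-app               # hypothetical Deployment
  minReplicaCount: 2
  maxReplicaCount: 50
  triggers:
    # Pre-warm to 20 replicas during the expected morning spike
    - type: cron
      metadata:
        timezone: America/New_York
        start: 0 7 * * *
        end: 0 9 * * *
        desiredReplicas: "20"
    # React to actual request rate at any time of day
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        query: sum(rate(http_requests_total{app="myapp"}[2m]))
        threshold: "100"
```

Because the generated HPA follows whichever trigger asks for the most replicas, the workload holds at least 20 replicas inside the window and scales on request rate outside it.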
## Multi-Cluster Autoscaling
Inspect KEDA resources in each cluster by passing the kube context:
```
keda_scaledobjects_list_tool(namespace, context="production")
keda_scaledobjects_list_tool(namespace, context="staging")
```
## Troubleshooting
### HPA Not Scaling
```
get_hpa(namespace)
get_pod_metrics(name, namespace) # Metrics available?
describe_pod(name, namespace) # Resource requests set?
```
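The last check matters because a `Utilization` target is a percentage of the container's request. A minimal sketch of a Deployment with requests set follows; the image and the request sizes are placeholders.

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-app
spec:
  selector:
    matchLabels:
      app: my-app
  template:
    metadata:
      labels:
        app: my-app
    spec:
      containers:
        - name: my-app
          image: example.com/my-app:1.0   # placeholder image
          resources:
            requests:
              cpu: 250m       # utilization is measured against this value
              memory: 256Mi
```

With a 250m CPU request and a 70% utilization target, the HPA aims for an average of roughly 175m CPU per pod.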
### KEDA Not Triggering
```
keda_scaledobject_get_tool(name, namespace) # Check status
get_events(namespace) # Check events
```
### Common Issues
| Symptom | Check | Resolution |
|---------|-------|------------|
| HPA targets show `<unknown>` | Metrics server | Install metrics-server |
| KEDA no scale | Trigger auth | Check TriggerAuthentication |
| VPA not updating | Update mode | Set updateMode: Auto |
| Scale down slow | Stabilization | Adjust stabilizationWindowSeconds |
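For the last row, the setting lives under `behavior` in an `autoscaling/v2` HPA spec. The fragment below is a sketch with illustrative values, not tuned recommendations.

```yaml
# Fragment of a HorizontalPodAutoscaler (autoscaling/v2) spec
spec:
  behavior:
    scaleDown:
      stabilizationWindowSeconds: 120   # Kubernetes default is 300
      policies:
        - type: Percent
          value: 50          # remove at most 50% of current replicas...
          periodSeconds: 60  # ...per 60-second period
```

For a KEDA-managed HPA, the same `behavior` block goes under `spec.advanced.horizontalPodAutoscalerConfig` in the ScaledObject.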
## Best Practices
1. **Always Set Resource Requests**
- HPA requires requests to calculate utilization
2. **Use Multiple Metrics**
- Combine CPU + custom metrics for accuracy
3. **Stabilization Windows**
- Prevent flapping with scaleDown stabilization
4. **Scale to Zero Carefully**
- Consider cold start time
- Use activation threshold
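Point 4 can be sketched as a ScaledObject that scales to zero but only wakes up once the metric passes an activation threshold. The resource names and all numeric values are hypothetical; `activationThreshold` and `cooldownPeriod` are standard KEDA fields.

```yaml
apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: myapp-scaler             # hypothetical name
spec:
  scaleTargetRef:
    name: myapp                  # hypothetical Deployment
  minReplicaCount: 0
  maxReplicaCount: 20
  cooldownPeriod: 600            # seconds of inactivity before scaling to zero (default 300)
  triggers:
    - type: prometheus
      metadata:
        serverAddress: http://prometheus:9090
        query: sum(rate(http_requests_total{app="myapp"}[2m]))
        threshold: "100"             # target value per replica once active
        activationThreshold: "5"     # stay at zero until the query exceeds this
```

A longer `cooldownPeriod` trades some idle cost for fewer cold starts when traffic arrives in bursts.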
## Related Skills
- [k8s-cost](../k8s-cost/SKILL.md) - Cost optimization
- [k8s-troubleshoot](../k8s-troubleshoot/SKILL.md) - Debug scaling issues
## Overview
This skill configures Kubernetes autoscaling using HPA, VPA, and KEDA to manage pod counts and resource requests across workloads. It helps implement horizontal CPU/custom-metric scaling, vertical right-sizing, and event-driven or cron-based scaling with scale-to-zero support. Use it to reduce cost, improve responsiveness, and automate capacity management.

The skill applies and inspects HPA, VPA, and KEDA resources and retrieves metrics and recommendations from the cluster. It can create and list ScaledObjects/ScaledJobs, check KEDA trigger and authentication status, and fetch VPA recommendations for right-sizing. It also provides quick troubleshooting checks for missing metrics, trigger failures, and stabilization settings.
## FAQ
### Do I need metrics-server for HPA to work?
Yes. HPA requires metrics-server (or a compatible metrics pipeline) to read pod resource utilization.
### Can VPA and HPA be used together?
Use caution: VPA adjusts resource requests, and HPA utilization targets are computed against those requests, so the two can work against each other when they act on the same resource. Prefer running VPA in recommendation mode (`updateMode: "Off"`), or limit VPA to workloads where HPA is not required.
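One commonly described way to run both on a single workload is to let HPA scale on CPU while VPA manages only memory. The sketch below shows that split as an assumption, not as something this skill prescribes; it reuses the hypothetical `my-app` Deployment.

```yaml
apiVersion: autoscaling.k8s.io/v1
kind: VerticalPodAutoscaler
metadata:
  name: my-app-vpa
spec:
  targetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: my-app
  updatePolicy:
    updateMode: "Auto"
  resourcePolicy:
    containerPolicies:
      - containerName: "*"               # apply to every container in the pod
        controlledResources: ["memory"]  # leave CPU requests to the HPA's calculations
```

With this split, CPU requests stay fixed, so a CPU utilization target on the HPA keeps a stable baseline.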