This skill helps generate SLA/SLO/SLI monitoring configurations to track reliability, error budgets, and alerting for service health.
To add this skill to your agents, run `npx playbooks add skill dexploarer/hyper-forge --skill sla-monitor-generator`.
---
name: sla-monitor-generator
description: Generate SLA/SLO/SLI monitoring configurations for reliability tracking and error budget management. Activates for SLO setup, reliability targets, and error budget configuration.
allowed-tools: [Read, Write, Edit, Bash, Grep, Glob]
---
# SLA Monitor Generator
Define and monitor Service Level Objectives (SLOs) and track error budgets.
## SLO Definition Example
```yaml
slos:
  - name: api-availability
    sli:
      metric: http_requests_total
      filter: status < 500
    target: 99.9  # 99.9% availability
    window: 30d

  - name: api-latency
    sli:
      metric: http_request_duration_seconds
      percentile: 99
    target: 200  # 200 ms at p99 (the metric is in seconds, so 0.2)
    window: 30d

  - name: error-rate
    sli:
      metric: http_requests_total
      filter: status >= 500
    target: 0.1  # < 0.1% error rate
    window: 30d
```
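To make the `target` and `window` fields concrete, the sketch below converts an availability target into the downtime it tolerates per window. The function name `error_budget_minutes` is hypothetical and not part of the skill; it assumes the budget is simply `1 - target` applied to the whole window.

```python
def error_budget_minutes(target_percent: float, window_days: int) -> float:
    """Minutes of full unavailability an availability SLO tolerates per window."""
    budget_fraction = 1 - target_percent / 100  # e.g. 99.9 -> 0.001
    return budget_fraction * window_days * 24 * 60


# Compare how quickly the budget shrinks as the target gains a "nine".
for target in (99.0, 99.9, 99.99):
    minutes = error_budget_minutes(target, 30)
    print(f"{target}% over 30d -> {minutes:.1f} min")
```

For the `api-availability` SLO above, a 99.9% target over 30 days leaves roughly 43.2 minutes of error budget.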
## Prometheus Alerting Rules
```yaml
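groups:
  - name: slo-alerts
    rules:
      - alert: SLOBudgetBurnRate
        # Error ratio over the last 5m, compared with the 0.1% budget
        # of a 99.9% SLO scaled by a 14.4x burn-rate multiplier.
        expr: |
          (
            1 - (sum(rate(http_requests_total{status!~"5.."}[5m]))
                 / sum(rate(http_requests_total[5m])))
          ) > 0.001 * 14.4
        for: 2m
        labels:
          severity: critical
        annotations:
          summary: "Fast burn rate detected: 2% of the budget consumed per hour at this rate"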
```
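The `14.4` multiplier in the rule is not arbitrary. A minimal sketch of the arithmetic, assuming a 30-day window and the policy "page when 2% of the budget would be spent in 1 hour" (the helper names are hypothetical):

```python
def burn_rate_multiplier(budget_fraction: float, alert_hours: float,
                         window_days: int = 30) -> float:
    """Burn rate that spends `budget_fraction` of the budget in `alert_hours`."""
    return budget_fraction * window_days * 24 / alert_hours


def error_ratio_threshold(target_percent: float, multiplier: float) -> float:
    """Error ratio at which the given burn rate is reached."""
    return (1 - target_percent / 100) * multiplier


multiplier = burn_rate_multiplier(0.02, alert_hours=1)
print(round(multiplier, 2))                               # 14.4
print(round(error_ratio_threshold(99.9, multiplier), 4))  # 0.0144
```

The same formula gives the slower-burn multipliers commonly paired with this one: 5% of the budget in 6 hours yields 6, and 10% in 3 days yields 1.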
## Best Practices
- ✅ Define SLIs based on user experience
- ✅ Set realistic SLO targets (99.9% not 100%)
- ✅ Track error budgets continuously
- ✅ Alert on burn rate, not just breaches
- ✅ Review and adjust SLOs quarterly
This skill generates SLA, SLO, and SLI monitoring configurations to track reliability and manage error budgets. It produces concrete SLO definitions and alerting rules suitable for systems that expose metrics (e.g., Prometheus). Use it to translate reliability targets into actionable monitoring and burn-rate alerts.
The skill takes the desired reliability targets and SLI definitions (metrics, filters, percentiles, windows) and outputs YAML configurations for SLOs and alert rules. It builds expressions for availability, latency, and error-rate SLIs and creates Prometheus-style alerting rules for budget burn rate and breaches. The output can be pasted into your monitoring pipeline once you have adjusted the metric names and windows to match your setup.
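One way to picture the translation from an SLO definition to an alert expression is the sketch below. It is an illustration only, not the skill's actual implementation: `burn_rate_expr` is a hypothetical helper, and it assumes a counter labelled with `status` as in the examples above.

```python
def burn_rate_expr(metric: str, target_percent: float, multiplier: float,
                   range_: str = "5m") -> str:
    """Render a PromQL burn-rate condition for an availability SLO."""
    # Round away float noise so 99.9 renders as a clean 0.001 budget.
    budget = round(1 - target_percent / 100, 10)
    good = f'sum(rate({metric}{{status!~"5.."}}[{range_}]))'
    total = f"sum(rate({metric}[{range_}]))"
    return f"(1 - ({good} / {total})) > {budget} * {multiplier}"


print(burn_rate_expr("http_requests_total", 99.9, 14.4))
```

With these inputs the rendered condition matches the `expr` of the `SLOBudgetBurnRate` rule shown earlier, apart from whitespace.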
Can this output be used directly with Prometheus Alertmanager?
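Yes, with adjustments. The alerting rules follow the Prometheus rule-file format: Prometheus evaluates them and sends firing alerts to Alertmanager, which handles routing and notification. The SLO YAML describes targets and is not itself a Prometheus configuration format. In both cases, adjust metric names and label filters to match your instrumentation.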
How are error budgets and burn rates calculated?
The skill defines burn rate as the error ratio observed over a lookback interval divided by the error budget (1 minus the SLO target). It then emits PromQL that compares the recent error ratio against the budget scaled by a burn-rate multiplier, for example `0.001 * 14.4` for a 99.9% SLO.
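As a worked example of that definition, the sketch below computes a burn rate from raw request counts and estimates how long the budget would last at that pace. Both function names are hypothetical, and the numbers are made up for illustration.

```python
def burn_rate(errors: int, total: int, target_percent: float) -> float:
    """Observed error ratio divided by the error budget (1 - target)."""
    observed_error_ratio = errors / total
    budget = 1 - target_percent / 100
    return observed_error_ratio / budget


def hours_to_exhaustion(rate: float, window_days: int = 30) -> float:
    """Hours until a full budget is spent if `rate` is sustained."""
    return window_days * 24 / rate


# Illustrative counts: 144 failed requests out of 10,000 against a 99.9% SLO.
rate = burn_rate(errors=144, total=10_000, target_percent=99.9)
print(round(rate, 1))                       # 14.4
print(round(hours_to_exhaustion(rate), 1))  # 50.0
```

A burn rate of 1 means the budget lasts exactly the full window; anything above 1 exhausts it early.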