
This skill guides you through deploying Retell AI integrations to production, ensuring readiness, monitoring, and safe rollback procedures.

npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill retellai-prod-checklist


SKILL.md
---
name: retellai-prod-checklist
description: |
  Execute Retell AI production deployment checklist and rollback procedures.
  Use when deploying Retell AI integrations to production, preparing for launch,
  or implementing go-live procedures.
  Trigger with phrases like "retellai production", "deploy retellai",
  "retellai go-live", "retellai launch checklist".
allowed-tools: Read, Bash(kubectl:*), Bash(curl:*), Grep
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
---

# Retell AI Production Checklist

## Overview
Complete checklist for deploying Retell AI integrations to production.

## Prerequisites
- Staging environment tested and verified
- Production API keys available
- Deployment pipeline configured
- Monitoring and alerting ready

## Instructions

### Step 1: Pre-Deployment Configuration
- [ ] Production API keys in secure vault
- [ ] Environment variables set in deployment platform
- [ ] API key scopes are minimal (least privilege)
- [ ] Webhook endpoints configured with HTTPS
- [ ] Webhook secrets stored securely
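The webhook-secret item above can be verified in code with a constant-time HMAC signature check. This is a minimal sketch: the HMAC-SHA256 scheme and the shape of the signature header are assumptions here, so confirm the exact signing mechanism and header name against the Retell AI webhook documentation before relying on it.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Verify a raw webhook body against a shared secret using HMAC-SHA256.
// The signing scheme is an assumption -- check Retell AI's webhook docs
// for the actual header name and payload format.
function verifyWebhookSignature(
  rawBody: string,
  signatureHeader: string,
  secret: string
): boolean {
  const expected = createHmac("sha256", secret).update(rawBody).digest("hex");
  const a = Buffer.from(expected);
  const b = Buffer.from(signatureHeader);
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return a.length === b.length && timingSafeEqual(a, b);
}
```

Reject any request that fails this check before doing further processing, and make sure you verify against the raw request body, not a re-serialized version of it.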

### Step 2: Code Quality Verification
- [ ] All tests passing (`npm test`)
- [ ] No hardcoded credentials
- [ ] Error handling covers all Retell AI error types
- [ ] Rate limiting/backoff implemented
- [ ] Logging is production-appropriate
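For the rate-limiting/backoff item, one common approach is exponential backoff with full jitter on transient errors. This sketch assumes the thrown error exposes an HTTP status code on `err.status` or `err.response.status`; adapt the check to whatever error shape your Retell AI client actually throws.

```typescript
// Retry a call with exponential backoff and full jitter on transient
// failures (429 rate limits and 5xx responses). Non-retryable errors
// and exhausted attempts are rethrown to the caller.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxAttempts = 5,
  baseDelayMs = 200
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await fn();
    } catch (err: any) {
      const status = err?.status ?? err?.response?.status;
      const retryable = status === 429 || (status >= 500 && status < 600);
      if (!retryable || attempt >= maxAttempts) throw err;
      // Full jitter: random delay in [0, base * 2^(attempt-1)).
      const delay = Math.random() * baseDelayMs * 2 ** (attempt - 1);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```

Wrap outbound Retell AI calls with `withBackoff(() => client.someCall(...))` rather than retrying inside the client itself, so the retry policy stays visible at the call site.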

### Step 3: Infrastructure Setup
- [ ] Health check endpoint includes Retell AI connectivity
- [ ] Monitoring/alerting configured
- [ ] Circuit breaker pattern implemented
- [ ] Graceful degradation configured
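The circuit-breaker and graceful-degradation items above can be combined in one small wrapper: after a run of consecutive failures the circuit opens and calls short-circuit to a fallback until a cooldown elapses. The threshold and cooldown values below are illustrative, not recommendations.

```typescript
// Minimal circuit breaker: after `threshold` consecutive failures the
// circuit opens and exec() returns the fallback (graceful degradation)
// until `cooldownMs` has elapsed, after which one trial call is allowed.
class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;

  constructor(private threshold = 5, private cooldownMs = 30_000) {}

  async exec<T>(fn: () => Promise<T>, fallback: () => T): Promise<T> {
    if (this.isOpen()) return fallback();
    try {
      const result = await fn();
      this.failures = 0; // any success closes the circuit
      return result;
    } catch (err) {
      if (++this.failures >= this.threshold) this.openedAt = Date.now();
      throw err;
    }
  }

  private isOpen(): boolean {
    return (
      this.failures >= this.threshold &&
      Date.now() - this.openedAt < this.cooldownMs
    );
  }
}
```

A sensible fallback for a voice integration might be a static "service temporarily unavailable" response; the point is that callers keep getting a fast, well-formed answer while the upstream recovers.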

### Step 4: Documentation Requirements
- [ ] Incident runbook created
- [ ] Key rotation procedure documented
- [ ] Rollback procedure documented
- [ ] On-call escalation path defined

### Step 5: Deploy with Gradual Rollout
```bash
# Pre-flight checks
curl -f https://staging.example.com/health
curl -fsS -o /dev/null https://status.retellai.com && echo "Retell AI status page reachable"

# Staged rollout: update the image, then pause while the first new pods take traffic.
# Note: pause/resume does not give exact traffic percentages -- the share of
# updated pods depends on the deployment's maxSurge/maxUnavailable settings.
kubectl apply -f k8s/production.yaml
kubectl set image deployment/retellai-integration app=image:new
# --record is deprecated; use the change-cause annotation instead
kubectl annotate deployment/retellai-integration kubernetes.io/change-cause="deploy image:new"
kubectl rollout pause deployment/retellai-integration

# Observe the partially rolled-out pods for 10 minutes
sleep 600
# Check error rates and latency before continuing

# If healthy, advance the rollout further, then pause for a second window
kubectl rollout resume deployment/retellai-integration
kubectl rollout pause deployment/retellai-integration
sleep 300
# Re-check error rates and latency

# Complete the rollout
kubectl rollout resume deployment/retellai-integration
kubectl rollout status deployment/retellai-integration
```

## Output
- Deployed Retell AI integration
- Health checks passing
- Monitoring active
- Rollback procedure documented

## Error Handling
| Alert | Condition | Severity |
|-------|-----------|----------|
| API Down | 5xx errors > 10/min | P1 |
| High Latency | p99 > 5000ms | P2 |
| Rate Limited | 429 errors > 5/min | P2 |
| Auth Failures | 401/403 errors > 0 | P1 |
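The alert table above can be expressed as a small classification function for wiring into a metrics pipeline. `classifyAlert` is a hypothetical helper, not part of any SDK; the thresholds simply mirror the table.

```typescript
type Severity = "P1" | "P2" | "OK";

interface AlertMetrics {
  serverErrorsPerMin: number; // 5xx responses per minute
  p99LatencyMs: number;       // p99 latency in milliseconds
  rateLimitedPerMin: number;  // 429 responses per minute
  authFailures: number;       // 401/403 responses in the window
}

// Map observed metrics to the severities in the alert table.
// P1 conditions are checked first so they take precedence over P2.
function classifyAlert(m: AlertMetrics): Severity {
  if (m.serverErrorsPerMin > 10 || m.authFailures > 0) return "P1";
  if (m.p99LatencyMs > 5000 || m.rateLimitedPerMin > 5) return "P2";
  return "OK";
}
```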

## Examples

### Health Check Implementation
```typescript
interface RetellAiHealth {
  connected: boolean;
  latencyMs: number;
}

async function healthCheck(): Promise<{ status: string; retellai: RetellAiHealth }> {
  const start = Date.now();
  try {
    // ping() stands in for any cheap, authenticated Retell AI call.
    await retellaiClient.ping();
    return { status: 'healthy', retellai: { connected: true, latencyMs: Date.now() - start } };
  } catch (error) {
    return { status: 'degraded', retellai: { connected: false, latencyMs: Date.now() - start } };
  }
}
```

### Immediate Rollback
```bash
kubectl rollout undo deployment/retellai-integration
kubectl rollout status deployment/retellai-integration
```

## Resources
- [Retell AI Status](https://status.retellai.com)
- [Retell AI Support](https://docs.retellai.com/support)

## Next Steps
For version upgrades, see `retellai-upgrade-migration`.

## Overview

This skill executes a concise production deployment and rollback checklist for Retell AI integrations. It guides teams through pre-deployment checks, code and infrastructure verification, gradual rollout steps, and documented rollback procedures. Use it to reduce launch risk and confirm operational readiness.

## How this skill works

The skill inspects preconditions like secure production API keys, environment variables, and webhook configuration, then validates code quality and infrastructure readiness (health checks, monitoring, circuit breakers). It outlines a staged deployment process with canary rollout commands, monitoring windows, and explicit rollback commands. It also produces or validates documentation items such as runbooks, key rotation, and escalation paths.

## When to use it

- Deploying Retell AI integrations to production
- Preparing for a public launch or go-live event
- Promoting changes from staging to production
- Introducing new API scopes or rotating keys
- Implementing or verifying rollback capability

## Best practices

- Store production API keys in a secure vault and use least-privilege scopes
- Run the full test suite and static analysis before promoting code
- Ensure the health check includes Retell AI connectivity and latency metrics
- Implement rate limiting, exponential backoff, and circuit breakers
- Roll out gradually (canary → partial → full) and monitor key KPIs before advancing

## Example use cases

- Pre-launch checklist validation for a marketing release integrating Retell AI
- Canary rollout of a new Retell AI client version with a small traffic share and a 10-minute observation window
- Emergency rollback after detecting increased 5xx or 429 rates using `kubectl rollout undo`
- On-call playbook creation: document the incident runbook, key rotation, and escalation path

## FAQ

### What immediate actions should I take if Retell AI returns 5xx errors?

Treat it as a P1: pause the rollout, revert to the last stable revision (`kubectl rollout undo`), and follow the incident runbook. Notify on-call and check the Retell AI status page.

### How long should each canary observation window be?

Start with 10 minutes for the initial canary, then 5 minutes at intermediate stages; extend the windows if traffic volume is low or metrics are noisy. Adjust based on your service SLAs and risk tolerance.