replit-prod-checklist skill

/plugins/saas-packs/replit-pack/skills/replit-prod-checklist

This skill walks you through a production deployment checklist and rollback steps for Replit integrations.

npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill replit-prod-checklist

SKILL.md

---
name: replit-prod-checklist
description: |
  Execute Replit production deployment checklist and rollback procedures.
  Use when deploying Replit integrations to production, preparing for launch,
  or implementing go-live procedures.
  Trigger with phrases like "replit production", "deploy replit",
  "replit go-live", "replit launch checklist".
allowed-tools: Read, Bash(kubectl:*), Bash(curl:*), Grep
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
---

# Replit Production Checklist

## Overview
Complete checklist for deploying Replit integrations to production.

## Prerequisites
- Staging environment tested and verified
- Production API keys available
- Deployment pipeline configured
- Monitoring and alerting ready

## Instructions

### Step 1: Pre-Deployment Configuration
- [ ] Production API keys in secure vault
- [ ] Environment variables set in deployment platform
- [ ] API key scopes are minimal (least privilege)
- [ ] Webhook endpoints configured with HTTPS
- [ ] Webhook secrets stored securely
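
One way to enforce the items above is to validate configuration at startup, before the service accepts traffic. The sketch below is illustrative only: the variable names `REPLIT_API_KEY`, `REPLIT_WEBHOOK_SECRET`, and `REPLIT_WEBHOOK_URL` are hypothetical and should be replaced with whatever your deployment platform actually injects.

```typescript
// Hypothetical variable names; adjust to match your own deployment configuration.
const REQUIRED_VARS = ['REPLIT_API_KEY', 'REPLIT_WEBHOOK_SECRET', 'REPLIT_WEBHOOK_URL'];

type Env = Record<string, string | undefined>;

// Returns a list of human-readable problems; an empty list means the config looks usable.
function validateConfig(env: Env): string[] {
  const problems: string[] = [];
  for (const name of REQUIRED_VARS) {
    const value = env[name];
    if (value === undefined || value.trim() === '') {
      problems.push(`${name} is missing or empty`);
    }
  }
  // The checklist requires webhook endpoints to be served over HTTPS.
  const url = (env['REPLIT_WEBHOOK_URL'] ?? '').trim();
  if (url !== '' && !url.startsWith('https://')) {
    problems.push('REPLIT_WEBHOOK_URL must use HTTPS');
  }
  return problems;
}
```

At startup you could call `validateConfig(process.env)` and exit with a non-zero status if the list is non-empty, so a misconfigured release fails fast instead of serving traffic.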

### Step 2: Code Quality Verification
- [ ] All tests passing (`npm test`)
- [ ] No hardcoded credentials
- [ ] Error handling covers all Replit error types
- [ ] Rate limiting/backoff implemented
- [ ] Logging is production-appropriate
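
The rate limiting/backoff item can be met in several ways; a minimal sketch of retry with exponential backoff and full jitter follows. Everything here is an assumption for illustration: the numeric `status` field on thrown errors, the set of retryable statuses (429 and 5xx), and the option names are not taken from any Replit API documentation.

```typescript
interface RetryOptions {
  maxAttempts: number; // total tries, including the first
  baseDelayMs: number; // delay ceiling for the first retry
  maxDelayMs: number;  // upper bound on any single delay
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

const defaultSleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumption: errors carry an HTTP-like numeric `status`; 429 and 5xx are treated as retryable.
function isRetryable(error: unknown): boolean {
  const status = (error as { status?: number } | null)?.status;
  return status === 429 || (typeof status === 'number' && status >= 500 && status < 600);
}

async function withBackoff<T>(operation: () => Promise<T>, opts: RetryOptions): Promise<T> {
  const sleep = opts.sleep ?? defaultSleep;
  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Give up on the last attempt or on errors that retrying cannot fix (e.g. 401).
      if (attempt >= opts.maxAttempts || !isRetryable(error)) throw error;
      // Exponential ceiling (base * 2^(attempt - 1)), capped, with full jitter.
      const ceiling = Math.min(opts.maxDelayMs, opts.baseDelayMs * 2 ** (attempt - 1));
      await sleep(Math.random() * ceiling);
    }
  }
}
```

Full jitter spreads retries from many clients across the whole delay window, which helps avoid synchronized retry bursts after a shared outage.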

### Step 3: Infrastructure Setup
- [ ] Health check endpoint includes Replit connectivity
- [ ] Monitoring/alerting configured
- [ ] Circuit breaker pattern implemented
- [ ] Graceful degradation configured
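
The circuit breaker and graceful degradation items fit together: when calls to Replit keep failing, stop calling for a while and serve a fallback instead. The class below is a minimal sketch under stated assumptions; the class name, thresholds, and the fallback mechanism are hypothetical, and a production service may prefer an established library.

```typescript
type BreakerState = 'closed' | 'open' | 'half-open';

class CircuitBreaker {
  private failures = 0;
  private openedAt = 0;
  private state: BreakerState = 'closed';
  private failureThreshold: number; // consecutive failures before opening
  private resetTimeoutMs: number;   // how long to stay open before a trial call
  private now: () => number;        // injectable clock for tests

  constructor(failureThreshold: number, resetTimeoutMs: number, now: () => number = () => Date.now()) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;
  }

  getState(): BreakerState {
    return this.state;
  }

  // Runs `operation` unless the breaker is open, in which case `fallback` is used instead.
  async call<T>(operation: () => Promise<T>, fallback: () => T): Promise<T> {
    if (this.state === 'open') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) return fallback();
      this.state = 'half-open'; // timeout elapsed: let a trial call through
    }
    try {
      const result = await operation();
      this.failures = 0;
      this.state = 'closed';
      return result;
    } catch {
      this.failures++;
      // A failed trial call, or too many consecutive failures, opens the breaker.
      if (this.state === 'half-open' || this.failures >= this.failureThreshold) {
        this.state = 'open';
        this.openedAt = this.now();
      }
      return fallback(); // degrade gracefully instead of surfacing the error
    }
  }
}
```

A caller might wrap each Replit request as `breaker.call(() => fetchFromReplit(), () => cachedValue)`, where both callbacks are placeholders for your own code.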

### Step 4: Documentation Requirements
- [ ] Incident runbook created
- [ ] Key rotation procedure documented
- [ ] Rollback procedure documented
- [ ] On-call escalation path defined

### Step 5: Deploy with Gradual Rollout
```bash
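# Pre-flight checks: staging must be healthy (-f makes curl exit non-zero on HTTP errors)
curl -f https://staging.example.com/health
# Fetch the Replit status page and review it for active incidents
curl -s https://status.replit.com

# Start the rollout, then pause it so only some pods run the new image.
# Note: pause/resume does not target an exact share (10%, 50%); the fraction of updated
# pods depends on timing and the Deployment's maxSurge/maxUnavailable settings.
# (--record is deprecated and has been dropped here.)
kubectl apply -f k8s/production.yaml
kubectl set image deployment/replit-integration app=image:new
kubectl rollout pause deployment/replit-integration

# Monitor canary traffic for 10 minutes
sleep 600
# Check error rates and latency before continuing; if unhealthy, roll back instead

# If healthy, let the rollout progress further, then pause again
kubectl rollout resume deployment/replit-integration
kubectl rollout pause deployment/replit-integration
sleep 300

# Complete rollout to 100%
kubectl rollout resume deployment/replit-integration
kubectl rollout status deployment/replit-integration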
```
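
The `sleep` calls above wait for a fixed time regardless of what the canary is doing. As a sketch of an alternative, the function below polls a health probe for the length of the monitoring window and returns a verdict as soon as a failure budget is exceeded. The probe, the option names, and the failure budget are all hypothetical; wire the probe to whatever health endpoint or metrics query your service exposes.

```typescript
interface CanaryOptions {
  windowMs: number;    // total observation window, e.g. 600000 for 10 minutes
  intervalMs: number;  // delay between probes; must be positive
  maxFailures: number; // failed probes tolerated before aborting
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

type CanaryVerdict = 'continue' | 'rollback';

async function monitorCanary(
  probe: () => Promise<boolean>,
  opts: CanaryOptions,
): Promise<CanaryVerdict> {
  if (opts.intervalMs <= 0) throw new Error('intervalMs must be positive');
  const sleep =
    opts.sleep ?? ((ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms)));
  let failures = 0;
  for (let elapsed = 0; elapsed < opts.windowMs; elapsed += opts.intervalMs) {
    // A probe that rejects counts as a failure, same as one that resolves to false.
    const healthy = await probe().catch(() => false);
    if (!healthy) {
      failures++;
      if (failures > opts.maxFailures) return 'rollback';
    }
    await sleep(opts.intervalMs);
  }
  return 'continue';
}
```

A `continue` verdict corresponds to the next `kubectl rollout resume`; a `rollback` verdict corresponds to the commands under Immediate Rollback.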

## Output
- Deployed Replit integration
- Health checks passing
- Monitoring active
- Rollback procedure documented

## Error Handling
| Alert | Condition | Severity |
|-------|-----------|----------|
| API Down | 5xx errors > 10/min | P1 |
| High Latency | p99 > 5000ms | P2 |
| Rate Limited | 429 errors > 5/min | P2 |
| Auth Failures | 401/403 errors > 0 | P1 |
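
The thresholds in the table can be encoded directly, which makes them easy to test. This is a sketch only: the `WindowMetrics` shape and its field names are hypothetical, and in a real deployment these rules would more likely live in your monitoring system's alert configuration.

```typescript
// Hypothetical one-minute metrics snapshot; field names are illustrative.
interface WindowMetrics {
  serverErrorsPerMin: number; // count of 5xx responses
  p99LatencyMs: number;
  rateLimitedPerMin: number;  // count of 429 responses
  authFailures: number;       // count of 401/403 responses
}

interface Alert {
  name: string;
  severity: 'P1' | 'P2';
}

// Mirrors the alert table: strict "greater than" comparisons throughout.
function evaluateAlerts(m: WindowMetrics): Alert[] {
  const alerts: Alert[] = [];
  if (m.serverErrorsPerMin > 10) alerts.push({ name: 'API Down', severity: 'P1' });
  if (m.p99LatencyMs > 5000) alerts.push({ name: 'High Latency', severity: 'P2' });
  if (m.rateLimitedPerMin > 5) alerts.push({ name: 'Rate Limited', severity: 'P2' });
  if (m.authFailures > 0) alerts.push({ name: 'Auth Failures', severity: 'P1' });
  return alerts;
}
```

Note that values exactly at a threshold (for example, 10 server errors in a minute) do not fire, matching the `>` conditions in the table.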

## Examples

### Health Check Implementation
```typescript
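// `replitClient` is assumed to be your integration's own client wrapper; `ping()` stands in
// for any cheap request that proves connectivity.
interface ReplitHealth {
  connected: boolean;
  latencyMs: number;
}

async function healthCheck(): Promise<{ status: 'healthy' | 'degraded'; replit: ReplitHealth }> {
  const start = Date.now();
  try {
    await replitClient.ping();
    return { status: 'healthy', replit: { connected: true, latencyMs: Date.now() - start } };
  } catch {
    // Report degraded rather than throwing so the endpoint stays reachable during an outage.
    return { status: 'degraded', replit: { connected: false, latencyMs: Date.now() - start } };
  }
}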
```

### Immediate Rollback
```bash
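# kubectl will not undo a paused Deployment; skip the resume if the rollout is not paused
kubectl rollout resume deployment/replit-integration
kubectl rollout undo deployment/replit-integration
kubectl rollout status deployment/replit-integration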
```

## Resources
- [Replit Status](https://status.replit.com)
- [Replit Support](https://docs.replit.com/support)

## Next Steps
For version upgrades, see `replit-upgrade-migration`.

Overview

This skill executes a production deployment checklist and rollback procedures for Replit integrations. It guides teams through pre-deployment configuration, code and infrastructure verification, gradual rollout steps, and documented rollback. The outcome is a validated production deployment with monitoring and incident procedures in place.

How this skill works

The skill checks prerequisites (staging verification, API keys, pipeline) and walks through five steps: pre-deployment configuration, code quality checks, infrastructure setup, documentation, and a gradual rollout. It provides concrete commands for canary releases, monitoring windows, and immediate rollback. It also lists alert thresholds and the expected outputs, such as passing health checks and active monitoring.

When to use it

- Before launching a Replit integration to production
- When preparing a canary or phased rollout for a Replit-related service
- During go-live readiness reviews and pre-flight checks
- When verifying production-grade observability and incident runbooks
- Before rotating production API keys or changing webhook endpoints

Best practices

- Store production API keys in a secure vault and apply least-privilege scopes
- Run the full test suite and remove hardcoded credentials before deploying
- Implement health checks that verify Replit connectivity and latency
- Roll out gradually (canary → 50% → 100%) and monitor for a defined window
- Document incident runbooks, key rotation, and a clear rollback path

Example use cases

- Execute a canary deployment for a Replit integration using `kubectl rollout` with a 10-minute canary window
- Validate production readiness by running health checks against staging and Replit status
- Trigger an immediate rollback when API 5xx or auth failures exceed thresholds
- Onboard on-call teams using the incident runbook and escalation path
- Add circuit breaker and graceful degradation to handle Replit outages

FAQ

What immediate steps should I take if the canary shows high errors?

Pause the rollout, analyze logs and metrics, then either fix and re-deploy the canary or run an immediate kubectl rollout undo to revert to the previous stable version.

Which alert thresholds are recommended for production?

Use P1 for sustained 5xx errors (>10/min) or any 401/403 auth failures; use P2 for p99 latency > 5000ms or 429 rate limits >5/min.