This skill helps you implement robust Replit reliability patterns such as circuit breakers, idempotency, bulkheads, and graceful degradation to improve fault

npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill replit-reliability-patterns

---
name: replit-reliability-patterns
description: |
  Implement Replit reliability patterns including circuit breakers, idempotency, and graceful degradation.
  Use when building fault-tolerant Replit integrations, implementing retry strategies,
  or adding resilience to production Replit services.
  Trigger with phrases like "replit reliability", "replit circuit breaker",
  "replit idempotent", "replit resilience", "replit fallback", "replit bulkhead".
allowed-tools: Read, Write, Edit
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
---

# Replit Reliability Patterns

## Overview
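Production-grade reliability patterns for Replit integrations: circuit breakers, idempotency keys, bulkheads, a timeout hierarchy, graceful degradation, a dead letter queue (DLQ), and health checks.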

## Prerequisites
- Understanding of circuit breaker pattern
- `opossum` (or a similar circuit breaker library), `p-queue`, and `uuid` installed
- Queue infrastructure for DLQ
- Caching layer for fallbacks

## Circuit Breaker

```typescript
import CircuitBreaker from 'opossum';

const replitBreaker = new CircuitBreaker(
  async (operation: () => Promise<any>) => operation(),
  {
    timeout: 30000,
    errorThresholdPercentage: 50,
    resetTimeout: 30000,
    volumeThreshold: 10,
  }
);

// Events
replitBreaker.on('open', () => {
  console.warn('Replit circuit OPEN - requests failing fast');
  // alertOps is a placeholder for your own paging/alerting hook
  alertOps('Replit circuit breaker opened');
});

replitBreaker.on('halfOpen', () => {
  console.info('Replit circuit HALF-OPEN - testing recovery');
});

replitBreaker.on('close', () => {
  console.info('Replit circuit CLOSED - normal operation');
});

// Usage
async function safeReplitCall<T>(fn: () => Promise<T>): Promise<T> {
  return replitBreaker.fire(fn);
}
```

## Idempotency Keys

```typescript
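import { v4 as uuidv4 } from 'uuid';
import crypto from 'crypto';

// Serialize with object keys sorted at every level, so that
// { a: 1, b: 2 } and { b: 2, a: 1 } produce the same string.
function canonicalize(value: unknown): string {
  return JSON.stringify(value, (_key, val) =>
    val && typeof val === 'object' && !Array.isArray(val)
      ? Object.fromEntries(
          Object.entries(val).sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0))
        )
      : val
  );
}

// Generate deterministic idempotency key from input
function generateIdempotencyKey(
  operation: string,
  params: Record<string, any>
): string {
  const data = canonicalize({ operation, params });
  return crypto.createHash('sha256').update(data).digest('hex');
}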

// Or use random key with storage
class IdempotencyManager {
  private store: Map<string, { key: string; expiresAt: Date }> = new Map();

  getOrCreate(operationId: string): string {
    const existing = this.store.get(operationId);
    if (existing && existing.expiresAt > new Date()) {
      return existing.key;
    }

    const key = uuidv4();
    this.store.set(operationId, {
      key,
      expiresAt: new Date(Date.now() + 24 * 60 * 60 * 1000),
    });
    return key;
  }
}
```
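An idempotency key only prevents duplicates if every retry of the same logical operation sends the same key. The sketch below shows one way to pair a key with a capped exponential backoff. The names `backoffDelayMs`, `retryWithKey`, and the `send` callback are hypothetical, and the default delays are assumptions to tune for your workload.

```typescript
// Exponential backoff delay, capped: baseMs * 2^attempt, never above capMs.
function backoffDelayMs(attempt: number, baseMs = 200, capMs = 5000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

const sleep = (ms: number) =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries `send` up to maxAttempts times, passing the SAME idempotency key on
// every attempt so the receiving side can deduplicate. `send` is a
// hypothetical callback standing in for whatever client call you make.
async function retryWithKey<T>(
  idempotencyKey: string,
  send: (key: string) => Promise<T>,
  maxAttempts = 3
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await send(idempotencyKey);
    } catch (error) {
      lastError = error;
      // Wait before the next attempt, but not after the final one.
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelayMs(attempt));
      }
    }
  }
  throw lastError;
}
```

Generate the key once, outside the retry loop, and pass it in. Creating a fresh key per attempt would defeat deduplication.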

## Bulkhead Pattern

```typescript
import PQueue from 'p-queue';

// Separate queues for different operations
const replitQueues = {
  critical: new PQueue({ concurrency: 10 }),
  normal: new PQueue({ concurrency: 5 }),
  bulk: new PQueue({ concurrency: 2 }),
};

async function prioritizedReplitCall<T>(
  priority: 'critical' | 'normal' | 'bulk',
  fn: () => Promise<T>
): Promise<T> {
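  // Some p-queue versions type add() as Promise<T | void>; the cast narrows it back to T
  return replitQueues[priority].add(fn) as Promise<T>;
}

// Usage (replitClient stands in for your own Replit API wrapper)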
await prioritizedReplitCall('critical', () =>
  replitClient.processPayment(order)
);

await prioritizedReplitCall('bulk', () =>
  replitClient.syncCatalog(products)
);
```

## Timeout Hierarchy

```typescript
const TIMEOUT_CONFIG = {
  connect: 5000,      // Initial connection
  request: 30000,     // Standard requests
  upload: 120000,     // File uploads
  longPoll: 300000,   // Webhook long-polling
};

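async function timedoutReplitCall<T>(
  operation: 'connect' | 'request' | 'upload' | 'longPoll',
  fn: () => Promise<T>
): Promise<T> {
  const timeout = TIMEOUT_CONFIG[operation];
  let timer: ReturnType<typeof setTimeout> | undefined;

  try {
    return await Promise.race([
      fn(),
      new Promise<never>((_, reject) => {
        timer = setTimeout(
          () => reject(new Error(`Replit ${operation} timeout`)),
          timeout
        );
      }),
    ]);
  } finally {
    // Clear the timer so it does not linger after fn() settles.
    // Note: the race only stops waiting; it does not cancel fn() itself.
    clearTimeout(timer);
  }
}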
```

## Graceful Degradation

```typescript
interface ReplitFallback {
  enabled: boolean;
  data: any;
  staleness: 'fresh' | 'stale' | 'very_stale';
}

async function withReplitFallback<T>(
  fn: () => Promise<T>,
  fallbackFn: () => Promise<T>
): Promise<{ data: T; fallback: boolean }> {
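  try {
    const data = await fn();
    // Update cache for future fallback (updateFallbackCache is your own async
    // cache writer); a cache failure must not discard a good result
    await updateFallbackCache(data).catch((cacheError: unknown) =>
      console.warn('Replit fallback cache update failed:', cacheError)
    );
    return { data, fallback: false };
  } catch (error) {
    const message = error instanceof Error ? error.message : String(error);
    console.warn('Replit failed, using fallback:', message);
    const data = await fallbackFn();
    return { data, fallback: true };
  }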
}
```
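The `ReplitFallback` interface carries a `staleness` field, but nothing above shows how it might be set. A minimal sketch, assuming staleness is derived from the age of the cached value: the thresholds (one minute, one hour) and the names `CachedValue`, `classifyStaleness`, and `toFallback` are illustrative assumptions.

```typescript
type Staleness = 'fresh' | 'stale' | 'very_stale';

interface CachedValue<T> {
  data: T;
  storedAt: number; // epoch milliseconds when the value was cached
}

// Thresholds are illustrative assumptions; tune them per data type.
const FRESH_MS = 60_000; // up to 1 minute old
const STALE_MS = 60 * 60_000; // up to 1 hour old

function classifyStaleness(
  storedAt: number,
  now: number = Date.now()
): Staleness {
  const age = now - storedAt;
  if (age <= FRESH_MS) return 'fresh';
  if (age <= STALE_MS) return 'stale';
  return 'very_stale';
}

// Builds a fallback response in the shape of the ReplitFallback interface.
function toFallback<T>(
  cached: CachedValue<T> | undefined,
  now: number = Date.now()
) {
  if (!cached) {
    // Nothing cached: signal that no fallback is available.
    return { enabled: false, data: undefined, staleness: 'very_stale' as Staleness };
  }
  return {
    enabled: true,
    data: cached.data,
    staleness: classifyStaleness(cached.storedAt, now),
  };
}
```

Callers can then decide per use case whether `very_stale` data is still acceptable or whether to surface an error instead.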

## Dead Letter Queue

```typescript
interface DeadLetterEntry {
  id: string;
  operation: string;
  payload: any;
  error: string;
  attempts: number;
  lastAttempt: Date;
}

class ReplitDeadLetterQueue {
  private queue: DeadLetterEntry[] = [];

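  add(entry: Omit<DeadLetterEntry, 'id' | 'lastAttempt'>): void {
    this.queue.push({
      ...entry,
      id: uuidv4(), // uuidv4 is imported from 'uuid', as in the idempotency example
      lastAttempt: new Date(),
    });
  }

  // Number of entries waiting for reprocessing (used by the health check)
  size(): number {
    return this.queue.length;
  }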

  async processOne(): Promise<boolean> {
    const entry = this.queue.shift();
    if (!entry) return false;

    try {
      await replitClient[entry.operation](entry.payload);
      console.log(`DLQ: Successfully reprocessed ${entry.id}`);
      return true;
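    } catch (error) {
      entry.attempts++;
      entry.lastAttempt = new Date();
      // Keep the most recent failure reason for later inspection
      entry.error = error instanceof Error ? error.message : String(error);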

      if (entry.attempts < 5) {
        this.queue.push(entry);
      } else {
        console.error(`DLQ: Giving up on ${entry.id} after 5 attempts`);
        await alertOnPermanentFailure(entry);
      }
      return false;
    }
  }
}
```

## Health Check with Degraded State

```typescript
type HealthStatus = 'healthy' | 'degraded' | 'unhealthy';

async function replitHealthCheck(): Promise<{
  status: HealthStatus;
  details: Record<string, any>;
}> {
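  const checks = {
    api: await checkApiConnectivity(),
    // opossum exposes state as boolean getters (opened, halfOpen, closed)
    // and rolling-window counters through the `stats` property
    circuitOpen: replitBreaker.opened,
    circuitStats: replitBreaker.stats,
    // deadLetterQueue is a ReplitDeadLetterQueue instance; size() must
    // return the number of queued entries
    dlqSize: deadLetterQueue.size(),
  };

  const status: HealthStatus =
    !checks.api.connected ? 'unhealthy' :
    checks.circuitOpen ? 'degraded' :
    checks.dlqSize > 100 ? 'degraded' :
    'healthy';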

  return { status, details: checks };
}
```
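One way to expose the three-state result to a load balancer or uptime monitor is to map it to an HTTP status code. The sketch below assumes a policy in which a degraded service keeps receiving traffic and only an unhealthy one reports failure. `healthToHttpStatus` and `healthResponse` are hypothetical helper names.

```typescript
type HealthStatus = 'healthy' | 'degraded' | 'unhealthy';

// Assumed policy: a degraded service still takes traffic (200) but stays
// visible in monitoring; only an unhealthy one reports 503.
function healthToHttpStatus(status: HealthStatus): number {
  return status === 'unhealthy' ? 503 : 200;
}

// Framework-agnostic response shape; adapt to your HTTP server of choice.
function healthResponse(
  status: HealthStatus,
  details: Record<string, unknown>
) {
  return {
    statusCode: healthToHttpStatus(status),
    body: JSON.stringify({ status, details }),
  };
}
```

Keeping `degraded` at 200 avoids pulling an instance out of rotation while it is still serving fallback data. Choose 503 for `degraded` instead if stale responses are not acceptable for your callers.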

## Instructions

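### Step 1: Implement Circuit Breaker
Route every Replit call through `safeReplitCall` so repeated failures open the circuit and later calls fail fast.

### Step 2: Add Idempotency Keys
Generate a deterministic key per operation with `generateIdempotencyKey` and reuse it on every retry.

### Step 3: Configure Bulkheads
Assign each call to the `critical`, `normal`, or `bulk` queue so low-priority work cannot starve critical traffic.

### Step 4: Set Up Dead Letter Queue
Send operations that keep failing to the DLQ, retry them up to 5 times, then alert on permanent failure.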

## Output
- Circuit breaker protecting Replit calls
- Idempotency preventing duplicates
- Bulkhead isolation implemented
- DLQ for failed operations

## Error Handling
| Issue | Cause | Solution |
|-------|-------|----------|
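| Circuit stays open | Threshold too low or upstream still failing | Check upstream health, then raise `errorThresholdPercentage` or `volumeThreshold` |
| Duplicate operations | Missing idempotency | Add idempotency key |
| Queue backlog growing | Arrival rate exceeds queue concurrency | Raise concurrency for that queue or shed low-priority work |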
| DLQ growing | Persistent failures | Investigate root cause |

## Examples

### Quick Circuit Check
```typescript
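// opossum reports state through boolean getters rather than a state string
const state = replitBreaker.opened
  ? 'open'
  : replitBreaker.halfOpen
    ? 'halfOpen'
    : 'closed';
console.log('Replit circuit:', state);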
```

## Resources
- [Circuit Breaker Pattern](https://martinfowler.com/bliki/CircuitBreaker.html)
- [Opossum Documentation](https://nodeshift.dev/opossum/)
- [Replit Reliability Guide](https://docs.replit.com/reliability)

## Next Steps
For policy enforcement, see `replit-policy-guardrails`.

Overview

This skill implements production-grade reliability patterns for Replit integrations, including circuit breakers, idempotency, bulkheads, graceful degradation, and a dead letter queue. It provides concrete code patterns and configuration guidance to make Replit-backed services resilient under load and partial failure. Use it to protect calls, prevent duplicates, and ensure controlled degradation when upstream systems falter.

How this skill works

The skill wraps Replit operations with a circuit breaker to fail fast and recover safely, and uses idempotency keys to avoid duplicate side effects. It separates work into priority queues (bulkhead) and applies tailored timeouts per operation. When requests fail, it falls back to cached data or a fallback path and routes hard failures into a dead letter queue for retries and operational inspection.
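The layering described above can be sketched as a chain of wrappers around a single call. The ordering used here (bulkhead outermost, then circuit breaker, then timeout) is an assumption rather than something the skill prescribes, and `layer`, `traced`, and `guardedCall` are hypothetical names. The stand-in wrappers only record when they run; in real code each would delegate to the matching helper from the sections above.

```typescript
type Call<T> = () => Promise<T>;
type Wrapper = <T>(fn: Call<T>) => Call<T>;

// Composes wrappers so the first one listed is the outermost layer.
function layer(...wrappers: Wrapper[]): Wrapper {
  return <T>(fn: Call<T>): Call<T> =>
    wrappers.reduceRight<Call<T>>((inner, wrap) => wrap(inner), fn);
}

// Stand-in wrappers that only record the order in which they run.
const trace: string[] = [];
const traced = (name: string): Wrapper =>
  <T>(fn: Call<T>): Call<T> =>
  () => {
    trace.push(name);
    return fn();
  };

const resilient = layer(
  traced('bulkhead'),
  traced('breaker'),
  traced('timeout')
);

// The innermost function represents the actual Replit request.
const guardedCall = resilient(async () => {
  trace.push('replit call');
  return 'ok';
});
```

Invoking `guardedCall()` passes through the layers from the outside in. With this ordering, time spent waiting in the queue is not counted against the request timeout, which is one reason to keep the timeout as the innermost layer.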

When to use it

  • Building integrations that call Replit APIs from production services
  • Protecting critical flows from cascading failures or high latency
  • Ensuring operations remain idempotent across retries
  • Prioritizing traffic with limited upstream capacity
  • Adding recoverable fallback behavior and backlog processing

Best practices

  • Tune circuit breaker thresholds (error percentage, volume, reset timeout) based on real traffic patterns
  • Generate deterministic idempotency keys from operation name and payload to avoid duplicates
  • Use separate queues with conservative concurrency for noncritical work to limit blast radius
  • Keep a short, explicit timeout hierarchy for connect/request/upload/long-poll operations
  • Persist fallback cache and DLQ entries to durable storage and monitor DLQ growth

Example use cases

  • Protecting a payment or provisioning call to Replit with a circuit breaker and idempotency key
  • Routing high-priority customer requests through a critical queue while syncing bulk data on a low-priority queue
  • Serving stale but acceptable cached responses when Replit is partially degraded
  • Enqueuing failed deployments into a DLQ for controlled retries and operator alerts
  • Adding health checks that expose degraded state when circuit is open or DLQ grows

FAQ

How do I choose circuit breaker thresholds?

Start with conservative volume and error thresholds that reflect baseline traffic, monitor failures, then gradually adjust `errorThresholdPercentage`, `volumeThreshold`, and `resetTimeout` to balance sensitivity and availability.
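As a rough illustration of how `volumeThreshold` and `errorThresholdPercentage` interact, the sketch below models the trip decision for one rolling window. It is a simplified model for reasoning about settings, not opossum's actual implementation, and `wouldTrip` and the interface names are hypothetical.

```typescript
interface BreakerWindow {
  fires: number; // calls attempted in the current rolling window
  failures: number; // calls that failed or timed out in that window
}

interface BreakerThresholds {
  errorThresholdPercentage: number;
  volumeThreshold: number;
}

// Simplified model: the circuit can only trip once enough calls have been
// observed, and then only if the failure percentage exceeds the threshold.
function wouldTrip(w: BreakerWindow, t: BreakerThresholds): boolean {
  if (w.fires < t.volumeThreshold) return false;
  const errorRate = (w.failures / w.fires) * 100;
  return errorRate > t.errorThresholdPercentage;
}

// The settings used in the Circuit Breaker section above.
const settings: BreakerThresholds = {
  errorThresholdPercentage: 50,
  volumeThreshold: 10,
};
```

Under this model, nine failures out of nine calls do not trip the circuit because the volume threshold has not been reached, while six failures out of ten calls do. A low `volumeThreshold` therefore makes the breaker more sensitive to short bursts of errors on low-traffic services.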

When should I use deterministic vs random idempotency keys?

Use deterministic keys when operations are repeatable from the same input (prevents duplicates reliably). Use random keys with stored mapping when the operation identity is external or ephemeral.