This skill helps you implement resilient Shopify integrations by managing rate limits, retry strategies, queues, and circuit breakers.
```bash
npx playbooks add skill toilahuongg/shopify-agents-kit --skill resilience-engineering
```
---
name: resilience-engineering
description: Strategies for handling Shopify API Rate Limits (429), retry policies, and circuit breakers. Essential for high-traffic apps.
---
# Resilience Engineering for Shopify Apps
Shopify rate-limits its APIs with a leaky bucket algorithm: each request adds to a bucket that drains at a fixed rate, and if you pour requests in faster than it drains, the bucket overflows and Shopify responds with `429 Too Many Requests`. Your app must handle this gracefully.
## 1. Handling Rate Limits (429)
### The "Retry-After" Header
When Shopify returns a 429, the response includes a `Retry-After` header telling you how many seconds to wait before retrying.
**Implementation (custom delay with bounded retries)**:
```typescript
async function fetchWithRetry(
  url: string,
  options: RequestInit = {},
  retries = 3
): Promise<Response> {
  try {
    const res = await fetch(url, options);

    if (res.status === 429 && retries > 0) {
      // Shopify says how many seconds to wait before the next attempt
      const wait = parseFloat(res.headers.get("Retry-After") ?? "1.0");
      await new Promise((resolve) => setTimeout(resolve, wait * 1000));
      return fetchWithRetry(url, options, retries - 1);
    }

    return res;
  } catch (err) {
    // Network-level failure (DNS, reset, timeout): retry after a short fixed delay
    if (retries > 0) {
      await new Promise((resolve) => setTimeout(resolve, 1000));
      return fetchWithRetry(url, options, retries - 1);
    }
    throw err;
  }
}
```
*Note: The official `@shopify/shopify-api` client handles retries automatically if configured.*
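For reference, a minimal sketch of what that configuration can look like with the library's REST client; `shopify` (your configured `shopifyApi()` instance) and `session` are assumed to come from your own setup, and `tries` is the client option that enables automatic retries:
```typescript
// Sketch only: `shopify` is a configured shopifyApi() instance and
// `session` comes from your OAuth/session storage (both assumed here).
const client = new shopify.clients.Rest({ session });

// With tries > 1 the client retries throttled (429) requests for you,
// waiting out the Retry-After interval between attempts.
const response = await client.get({ path: "products", tries: 3 });
```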
## 2. Queues & Throttling
For bulk operations (e.g., syncing 10,000 products), you cannot simply loop and `await` each request; you will exhaust the bucket almost immediately.
### Using `bottleneck`
```bash
npm install bottleneck
```
```typescript
import Bottleneck from "bottleneck";

const limiter = new Bottleneck({
  minTime: 500, // wait 500ms between requests (2 req/sec)
  maxConcurrent: 5,
});

const products = await limiter.schedule(() =>
  shopify.rest.Product.list({ ... })
);
```
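For a bulk sync, schedule every call through the same limiter; Bottleneck queues the calls and releases them at the configured rate. `productIds` and `syncProduct` below are hypothetical stand-ins for your own data and per-product sync logic:
```typescript
// Hypothetical helpers: productIds is your list of IDs to process and
// syncProduct makes one Admin API call for a single product.
await Promise.all(
  productIds.map((id) => limiter.schedule(() => syncProduct(id)))
);
```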
### Background Jobs (BullMQ)
Move heavy lifting (large catalog syncs, bulk tag updates) to a background worker so webhook handlers and user-facing requests stay fast; a minimal sketch follows below. (The concept lives here; a dedicated `redis-bullmq` skill could go deeper.)
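A minimal BullMQ sketch of this idea; the queue name, connection details, and job payload are illustrative:
```typescript
import { Queue, Worker } from "bullmq";

const connection = { host: "127.0.0.1", port: 6379 }; // adjust to your Redis

// Producer: enqueue a job instead of syncing inline in the request handler.
const syncQueue = new Queue("product-sync", { connection });
await syncQueue.add("sync-products", { shop: "my-store.myshopify.com" });

// Consumer: the worker's limiter caps throughput (~2 jobs/sec here) so the
// leaky bucket never overflows, no matter how many jobs pile up.
const worker = new Worker(
  "product-sync",
  async (job) => {
    // call the Shopify Admin API for job.data.shop here
  },
  { connection, limiter: { max: 2, duration: 1000 } }
);
```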
## 3. Circuit Breaker
If an external service (e.g., your own backend API or a shipping carrier) goes down, stop calling it to prevent cascading failures.
### Using `cockatiel`
```bash
npm install cockatiel
```
```typescript
import {
  circuitBreaker,
  ConsecutiveBreaker,
  ExponentialBackoff,
  handleAll,
  retry,
} from "cockatiel";

// Retry policy: up to 3 attempts with exponential backoff
const retryPolicy = retry(handleAll, {
  maxAttempts: 3,
  backoff: new ExponentialBackoff(),
});

// Circuit breaker: open after 5 consecutive failures, try again after 10s
const breakerPolicy = circuitBreaker(handleAll, {
  halfOpenAfter: 10 * 1000,
  breaker: new ConsecutiveBreaker(5),
});

// Execute: retries wrap the breaker, so attempts fail fast while the circuit is open
const result = await retryPolicy.execute(() =>
  breakerPolicy.execute(() => fetchMyService())
);
```
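cockatiel can also compose the two policies into one with `wrap`, so call sites only deal with a single policy object; a small sketch building on the policies above:
```typescript
import { wrap } from "cockatiel";

// Leftmost policy is outermost: retries wrap the breaker, so each retry
// attempt still fails fast whenever the circuit is open.
const resilient = wrap(retryPolicy, breakerPolicy);
const response = await resilient.execute(() => fetchMyService());
```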
## 4. Webhook Idempotency
Shopify guarantees "at least once" delivery. You might receive the same `orders/create` webhook twice.
**Fix**: Store `X-Shopify-Webhook-Id` in Redis/DB with a short TTL (e.g., 24h). If the ID already exists, acknowledge the webhook (return 200) but skip processing.
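A minimal sketch of that check using `ioredis` (the key prefix and TTL are illustrative; swap in your own Redis client or database):
```typescript
import Redis from "ioredis";

const redis = new Redis();

// SET ... NX only writes when the key is absent, so a duplicate delivery
// gets null back and can be skipped. The 24h TTL keeps the key set bounded.
async function isFirstDelivery(webhookId: string): Promise<boolean> {
  const result = await redis.set(
    `webhook:${webhookId}`, // illustrative key prefix
    "1",
    "EX",
    60 * 60 * 24,
    "NX"
  );
  return result === "OK";
}

// In the webhook handler: read X-Shopify-Webhook-Id, and if
// isFirstDelivery() returns false, return 200 without reprocessing.
```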
## Summary
This skill teaches practical resilience engineering for Shopify apps handling API rate limits (429), retries, throttling, and circuit breakers, geared toward high-traffic apps and large sync jobs. The core patterns: read the `Retry-After` header on 429 responses, apply exponential backoff and bounded retry policies, schedule bulk work through a limiter or background queue, wrap unstable dependencies in a circuit breaker, and enforce webhook idempotency by tracking unique webhook delivery IDs.
## FAQ
**How many retries should I allow after a 429?**
Use a small number (2–4) and always wait the `Retry-After` interval; unlimited retries can worsen congestion.

**Should I use fixed delays or exponential backoff?**
Prefer exponential backoff with jitter for transient network errors (see the sketch below), but honor the exact `Retry-After` value when Shopify provides one.

**When should a circuit breaker open?**
Open after a short burst of consecutive failures (e.g., 3–5) and keep it open for a measured cooldown (seconds to minutes) before allowing half-open trial requests.
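For the backoff question above, a minimal sketch of "full jitter" exponential backoff; the base delay and cap are illustrative:
```typescript
// Full-jitter exponential backoff: the delay ceiling doubles with each
// attempt, and a random factor spreads retries out so clients don't stampede.
function backoffDelay(attempt: number, baseMs = 500, capMs = 30_000): number {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * ceiling;
}
```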