
This skill implements Exa rate limiting with exponential backoff and idempotency to optimize API throughput and reliability.

npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill exa-rate-limits

SKILL.md
---
name: exa-rate-limits
description: |
  Implement Exa rate limiting, backoff, and idempotency patterns.
  Use when handling rate limit errors, implementing retry logic,
  or optimizing API request throughput for Exa.
  Trigger with phrases like "exa rate limit", "exa throttling",
  "exa 429", "exa retry", "exa backoff".
allowed-tools: Read, Write, Edit
version: 1.0.0
license: MIT
author: Jeremy Longshore <[email protected]>
---

# Exa Rate Limits

## Overview
Handle Exa rate limits gracefully with exponential backoff and idempotency.

## Prerequisites
- Exa SDK installed
- Understanding of async/await patterns
- Access to rate limit headers

## Instructions

### Step 1: Understand Rate Limit Tiers

| Tier | Requests/min | Requests/day | Burst |
|------|-------------|--------------|-------|
| Free | 60 | 1,000 | 10 |
| Pro | 300 | 10,000 | 50 |
| Enterprise | 1,000 | 100,000 | 200 |
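As a rough sketch, the table can be expressed as a lookup with two derived numbers. The names `TIER_LIMITS`, `minSpacingMs`, and `minutesToDailyCap` are hypothetical helpers for illustration, and the figures are copied from the table above, so confirm them against your own plan before relying on them.

```typescript
type Tier = 'free' | 'pro' | 'enterprise';

// Values copied from the tier table above (hypothetical constant name).
const TIER_LIMITS: Record<Tier, { perMinute: number; perDay: number; burst: number }> = {
  free: { perMinute: 60, perDay: 1000, burst: 10 },
  pro: { perMinute: 300, perDay: 10000, burst: 50 },
  enterprise: { perMinute: 1000, perDay: 100000, burst: 200 },
};

// Average gap between requests that keeps a steady stream under the per-minute limit.
function minSpacingMs(tier: Tier): number {
  return Math.ceil(60000 / TIER_LIMITS[tier].perMinute);
}

// Minutes of traffic at the full per-minute rate before the daily cap is used up.
function minutesToDailyCap(tier: Tier): number {
  const { perMinute, perDay } = TIER_LIMITS[tier];
  return perDay / perMinute;
}
```

With these figures, a client running at the full per-minute rate would use up the daily quota in under two hours on every tier, so the daily cap is usually the tighter constraint for sustained workloads.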

### Step 2: Implement Exponential Backoff with Jitter

```typescript
async function withExponentialBackoff<T>(
  operation: () => Promise<T>,
  config = { maxRetries: 5, baseDelayMs: 1000, maxDelayMs: 32000, jitterMs: 500 }
): Promise<T> {
  for (let attempt = 0; attempt <= config.maxRetries; attempt++) {
    try {
      return await operation();
    } catch (error: any) {
      if (attempt === config.maxRetries) throw error;
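      const status: number | undefined = error.status || error.response?.status;
      // Retry on 429 and 5xx. Errors without an HTTP status (e.g. network
      // failures) are also treated as retryable.
      const retryable = status === undefined || status === 429 || (status >= 500 && status < 600);
      if (!retryable) throw error;

      // Exponential delay with jitter to prevent thundering herd
      const exponentialDelay = config.baseDelayMs * Math.pow(2, attempt);
      const jitter = Math.random() * config.jitterMs;
      const delay = Math.min(exponentialDelay + jitter, config.maxDelayMs);

      console.log(`Request failed (status ${status ?? 'unknown'}). Retrying in ${delay.toFixed(0)}ms...`);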
      await new Promise(r => setTimeout(r, delay));
    }
  }
  throw new Error('Unreachable');
}
```
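To make the timing concrete, the sketch below computes the delay schedule for the default configuration with jitter left out. `backoffSchedule` is a hypothetical helper written for this illustration; it mirrors the arithmetic of the loop above rather than replacing it.

```typescript
// Base delays (without jitter) that the retry loop sleeps between attempts.
// The loop makes maxRetries + 1 calls in total, so it sleeps maxRetries times.
function backoffSchedule(maxRetries = 5, baseDelayMs = 1000, maxDelayMs = 32000): number[] {
  const delays: number[] = [];
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    delays.push(Math.min(baseDelayMs * Math.pow(2, attempt), maxDelayMs));
  }
  return delays;
}

// Total time spent sleeping if every attempt fails (jitter excluded).
function totalBackoffMs(delays: number[]): number {
  return delays.reduce((sum, d) => sum + d, 0);
}
```

With the defaults this yields 1000, 2000, 4000, 8000, and 16000 ms, about 31 seconds of waiting in the worst case before the final error is rethrown. The 32000 ms cap only takes effect if `maxRetries` is raised above 5.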

### Step 3: Add Idempotency Keys

```typescript
import crypto from 'crypto';

// Minimal shape this example needs from the client. `ExaClient` is a
// placeholder for your own HTTP wrapper, not an export of the Exa SDK.
interface ExaClient {
  request<T>(options: Record<string, any>): Promise<T>;
}

// Generate deterministic key from operation params (for safe retries)
function generateIdempotencyKey(operation: string, params: Record<string, any>): string {
  const data = JSON.stringify({ operation, params });
  return crypto.createHash('sha256').update(data).digest('hex');
}

async function idempotentRequest<T>(
  client: ExaClient,
  params: Record<string, any>,
  idempotencyKey?: string  // Pass existing key for retries
): Promise<T> {
  // Use provided key (for retries) or generate deterministic key from params
  const key = idempotencyKey || generateIdempotencyKey(params.method || 'POST', params);
  return client.request<T>({
    ...params,
    // Spread caller headers first so they cannot overwrite the idempotency key
    headers: { ...params.headers, 'Idempotency-Key': key },
  });
}
```
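One caveat with hashing `JSON.stringify` output is that it depends on property order, so two logically identical parameter objects can produce different keys. A minimal sketch of a canonical serializer follows; `canonicalJson` is a hypothetical helper that assumes the parameters are plain JSON data (no `Date`, `Map`, or class instances).

```typescript
// Serialize JSON-like data with object keys sorted, so property order
// does not change the resulting string. Assumes plain JSON values only.
function canonicalJson(value: unknown): string {
  if (value === undefined) return 'null'; // same as JSON.stringify inside arrays
  if (Array.isArray(value)) {
    return '[' + value.map((item) => canonicalJson(item)).join(',') + ']';
  }
  if (value !== null && typeof value === 'object') {
    const obj = value as Record<string, unknown>;
    const entries = Object.keys(obj)
      .sort()
      .filter((key) => obj[key] !== undefined) // JSON.stringify drops these too
      .map((key) => JSON.stringify(key) + ':' + canonicalJson(obj[key]));
    return '{' + entries.join(',') + '}';
  }
  return JSON.stringify(value);
}
```

Passing `canonicalJson({ operation, params })` to the SHA-256 hash instead of `JSON.stringify(...)` would then give the same key regardless of how the caller built the object.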

## Output
- Reliable API calls with automatic retry
- Idempotent requests preventing duplicates
- Rate limit headers properly handled

## Error Handling
| Header | Description | Action |
|--------|-------------|--------|
| X-RateLimit-Limit | Max requests | Monitor usage |
| X-RateLimit-Remaining | Remaining requests | Throttle if low |
| X-RateLimit-Reset | Reset timestamp | Wait until reset |
| Retry-After | Seconds to wait | Honor this value |
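A sketch of how the `Retry-After` row might be honored is shown below. `retryAfterMs` is a hypothetical helper; it accepts the two forms the HTTP specification allows for this header (a number of seconds or an HTTP-date) and makes no assumption about which one Exa actually sends.

```typescript
// Convert a Retry-After header value into a wait time in milliseconds.
// Returns undefined when the header is absent or cannot be parsed, so the
// caller can fall back to its own backoff delay.
function retryAfterMs(headerValue: string | null, nowMs: number = Date.now()): number | undefined {
  if (headerValue === null) return undefined;
  const trimmed = headerValue.trim();
  // Form 1: delay in whole seconds, e.g. "120"
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  // Form 2: HTTP-date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
  const dateMs = Date.parse(trimmed);
  if (isNaN(dateMs)) return undefined;
  return Math.max(0, dateMs - nowMs);
}
```

Inside a retry loop, one reasonable policy is to wait for `retryAfterMs(...)` when it is defined and otherwise use the exponential delay.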

## Examples

### Queue-Based Rate Limiting
```typescript
import PQueue from 'p-queue';

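const queue = new PQueue({
  concurrency: 5,   // at most 5 requests in flight at once
  interval: 1000,   // length of each rate window in ms
  intervalCap: 10,  // at most 10 requests started per window (600/min); tune to your tier
});

async function queuedRequest<T>(operation: () => Promise<T>): Promise<T> {
  // Some p-queue versions type add() as Promise<T | void>; narrow it back to T
  return queue.add(operation) as Promise<T>;
}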
```

### Monitor Rate Limit Usage
```typescript
class RateLimitMonitor {
  private remaining: number = 60;
  private resetAt: Date = new Date();

  updateFromHeaders(headers: Headers) {
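    const remaining = parseInt(headers.get('X-RateLimit-Remaining') ?? '', 10);
    // Keep the previous value if the header is missing or malformed
    if (!Number.isNaN(remaining)) this.remaining = remaining;
    const resetTimestamp = parseInt(headers.get('X-RateLimit-Reset') ?? '', 10);
    if (!Number.isNaN(resetTimestamp)) {
      // Assumes the header carries a Unix timestamp in seconds
      this.resetAt = new Date(resetTimestamp * 1000);
    }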
  }

  shouldThrottle(): boolean {
    // Only throttle if low remaining AND reset hasn't happened yet
    return this.remaining < 5 && new Date() < this.resetAt;
  }

  getWaitTime(): number {
    return Math.max(0, this.resetAt.getTime() - Date.now());
  }
}
```
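The monitor only reports state, so something still has to act on it. The sketch below shows one possible way to gate a call; `ThrottleSource`, `throttleDelayMs`, and `gated` are hypothetical names, and the interface lists only the two methods of `RateLimitMonitor` that the gate needs.

```typescript
// The subset of RateLimitMonitor this sketch depends on.
interface ThrottleSource {
  shouldThrottle(): boolean;
  getWaitTime(): number;
}

// How long to pause before the next request: 0 when quota is healthy.
function throttleDelayMs(monitor: ThrottleSource): number {
  return monitor.shouldThrottle() ? monitor.getWaitTime() : 0;
}

// Wait out the rate-limit window if needed, then run the operation.
async function gated<T>(monitor: ThrottleSource, operation: () => Promise<T>): Promise<T> {
  const delay = throttleDelayMs(monitor);
  if (delay > 0) {
    await new Promise<void>((resolve) => setTimeout(resolve, delay));
  }
  return operation();
}
```

A caller would pass a `RateLimitMonitor` instance as `monitor` and call `updateFromHeaders` on it after each response so the next gated call sees fresh values.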

## Resources
- [Exa Rate Limits](https://docs.exa.com/rate-limits)
- [p-queue Documentation](https://github.com/sindresorhus/p-queue)

## Next Steps
For security configuration, see `exa-security-basics`.

Overview

This skill implements Exa rate limiting, exponential backoff with jitter, and idempotency patterns to make API usage reliable and predictable. It provides reusable patterns for retries, idempotency keys, queue-based throughput control, and rate-limit monitoring. Use it to prevent duplicate side effects, avoid thundering-herd retries, and respect Exa rate-limit headers.

How this skill works

The skill wraps API operations in an exponential-backoff loop that retries on 429 and 5xx errors, adding random jitter to spread retry attempts. It supports deterministic idempotency keys derived from operation parameters so retries do not create duplicate effects. It also reads rate-limit headers (X-RateLimit-*, Retry-After) and optionally gates requests through a queue to enforce concurrency and interval caps.

When to use it

  • Handling Exa 429 responses or intermittent 5xx errors
  • Implementing safe retry logic for operations with side effects
  • Ensuring requests are idempotent across retries and client restarts
  • Throttling high-throughput applications to stay within tier limits
  • Scheduling backoff when Retry-After or reset headers indicate wait time

Best practices

  • When the response includes Retry-After or X-RateLimit-Reset, wait for that long instead of the computed backoff delay
  • Use deterministic idempotency keys for stateful operations (hash operation + params)
  • Add jitter to exponential backoff to avoid synchronized retries
  • Combine a request queue (concurrency + intervalCap) with header monitoring
  • Log retry attempts and remaining quota to detect usage spikes early

Example use cases

  • A job runner that retries failed Exa API calls with exponential backoff and idempotency keys to avoid duplicate jobs
  • An interactive app that reads X-RateLimit-Remaining and temporarily throttles UI-triggered requests when quota is low
  • A batch uploader that uses a PQueue-style queue to cap requests per second and spread bursts
  • A webhook consumer that respects Retry-After headers and automatically reschedules processing
  • A monitoring agent that updates internal counters from X-RateLimit-* headers and alerts before limits are reached

FAQ

What errors trigger retries?

Retry on 429 (rate limited) and server errors (5xx). Do not retry other client errors (4xx): the same request will fail again until it is corrected.
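
That rule can be written as a small predicate. `isRetryableStatus` is a hypothetical helper; treating a missing status (for example, a network failure with no HTTP response) as retryable mirrors how the backoff loop in Step 2 behaves and is an assumption you may want to tighten.

```typescript
// Decide whether a failed request is worth retrying, based on HTTP status.
function isRetryableStatus(status: number | undefined): boolean {
  // No HTTP status at all (e.g. connection reset): assumed transient.
  if (status === undefined) return true;
  // 429 = rate limited; 500-599 = server-side errors.
  return status === 429 || (status >= 500 && status < 600);
}
```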

How should I generate idempotency keys?

Create a deterministic hash (e.g., SHA-256) of the operation name and canonicalized parameters. Allow passing an explicit key for controlled retry attempts.