
This skill helps you implement robust API rate limiting using token bucket, sliding window, and Redis-based strategies to protect endpoints.

npx playbooks add skill secondsky/claude-skills --skill api-rate-limiting

Review the files below or copy the command above to add this skill to your agents.

Files (1): SKILL.md (1.9 KB)
---
name: api-rate-limiting
description: Implements API rate limiting using token bucket, sliding window, and Redis-based algorithms to protect against abuse. Use when securing public APIs, implementing tiered access, or preventing denial-of-service attacks.
---

# API Rate Limiting

Protect APIs from abuse using rate limiting algorithms with per-user and per-endpoint strategies.

## Algorithms

| Algorithm | Pros | Cons |
|-----------|------|------|
| Token Bucket | Handles bursts, smooth | Memory per user |
| Sliding Window | Accurate | Memory intensive |
| Fixed Window | Simple | Boundary spikes |
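
The "accurate but memory intensive" trade-off of the sliding window comes from storing one timestamp per accepted request. A minimal sketch of a sliding-window log limiter (the class and method names here are illustrative, not part of the skill's files):

```javascript
// Sliding-window log: keeps a timestamp per accepted request and
// counts only those inside the rolling window.
class SlidingWindowLog {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = []; // one entry per accepted request
  }

  allow(now = Date.now()) {
    // Drop entries that have aged out of the window
    const cutoff = now - this.windowMs;
    this.timestamps = this.timestamps.filter((t) => t > cutoff);
    if (this.timestamps.length < this.limit) {
      this.timestamps.push(now);
      return true;
    }
    return false;
  }
}
```

Passing `now` explicitly makes the limiter easy to unit-test; in production the default `Date.now()` is used.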

## Token Bucket (Node.js)

```javascript
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }

  consume() {
    this.refill();
    if (this.tokens >= 1) {
      this.tokens--;
      return true;
    }
    return false;
  }

  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}
```
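
To apply this per user, one bucket can be kept per client key. The `Map`-based registry below is a sketch (the key scheme and lack of eviction are simplifications); `TokenBucket` is the class above, repeated so the snippet runs standalone:

```javascript
// TokenBucket as defined above, inlined so this snippet is self-contained.
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.tokens = capacity;
    this.refillRate = refillRate; // tokens per second
    this.lastRefill = Date.now();
  }
  consume() {
    this.refill();
    if (this.tokens >= 1) { this.tokens--; return true; }
    return false;
  }
  refill() {
    const now = Date.now();
    const elapsed = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsed * this.refillRate);
    this.lastRefill = now;
  }
}

// One bucket per client key (illustrative; eviction of idle buckets not shown)
const buckets = new Map();

function allowRequest(clientKey, capacity = 10, refillRate = 1) {
  if (!buckets.has(clientKey)) {
    buckets.set(clientKey, new TokenBucket(capacity, refillRate));
  }
  return buckets.get(clientKey).consume();
}
```

In a real deployment the buckets would need an eviction policy (or a Redis backend, below) so the map does not grow without bound.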

## Express Middleware

```javascript
const rateLimit = require('express-rate-limit');

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15 minutes
  max: 100,
  standardHeaders: true,
  message: { error: 'Too many requests, try again later' }
});

app.use('/api/', limiter);
```

## Response Headers

```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 45
X-RateLimit-Reset: 1705320000
Retry-After: 60
```

These `X-RateLimit-*` names are the widely used de facto convention; note that with `standardHeaders: true`, express-rate-limit instead emits the IETF draft-standard `RateLimit-Limit`, `RateLimit-Remaining`, and `RateLimit-Reset` headers.
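
When rejecting a request outside of express-rate-limit, these headers can be set by hand before sending the 429. A sketch (the helper name and error body are illustrative):

```javascript
// Reject with standard rate-limit headers on an Express-style response object.
function sendRateLimited(res, limit, resetEpochSeconds) {
  const retryAfter = Math.max(0, resetEpochSeconds - Math.floor(Date.now() / 1000));
  res.set({
    'X-RateLimit-Limit': String(limit),
    'X-RateLimit-Remaining': '0',
    'X-RateLimit-Reset': String(resetEpochSeconds),
    'Retry-After': String(retryAfter),
  });
  res.status(429).json({ error: 'Too many requests, try again later' });
}
```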

## Tiered Limits

| Tier | Requests/Hour |
|------|---------------|
| Free | 100 |
| Pro | 1,000 |
| Enterprise | 10,000 |
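
The tier table above can be expressed as a simple lookup when configuring limiters (a sketch; the tier names mirror the table, the fallback-to-free behavior is an assumption):

```javascript
// Hourly quotas matching the tier table above
const TIER_LIMITS = { free: 100, pro: 1000, enterprise: 10000 };

function hourlyLimitFor(tier) {
  return TIER_LIMITS[tier] ?? TIER_LIMITS.free; // unknown tiers fall back to free
}
```

The returned value would feed the limiter's `max` (express-rate-limit) or a bucket's capacity.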

## Best Practices

- Use Redis for distributed rate limiting
- Include proper headers in responses
- Return 429 status with Retry-After
- Implement tiered limits for different plans
- Monitor rate limit metrics
- Test under load

Overview

This skill implements robust API rate limiting using token bucket, sliding window, and Redis-backed algorithms to protect services from abuse. It provides per-user and per-endpoint controls, tiered limits, and standard response headers so integrators can enforce fair usage and prevent denial-of-service conditions. The examples are written in JavaScript for Node.js and integrate with common frameworks like Express.

How this skill works

The skill offers multiple algorithms: token bucket for burst tolerance, sliding window for accurate rolling limits, and fixed-window for simple scenarios. Redis is used for distributed coordination so limits remain consistent across multiple instances. Middleware hooks inject X-RateLimit headers and return 429 responses with Retry-After when limits are exceeded.

When to use it

  • Protect public APIs from automated abuse or scraping
  • Enforce tiered quotas for free, pro, and enterprise plans
  • Prevent denial-of-service and traffic spikes from affecting stability
  • Throttle specific heavy endpoints (uploads, search) per-user or per-key
  • Coordinate limits across multiple servers using Redis

Best practices

  • Prefer Redis-based algorithms for multi-instance deployments
  • Expose X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset, and Retry-After headers
  • Return HTTP 429 with a clear Retry-After value and structured error body
  • Use token bucket for bursty traffic and sliding window for precise control
  • Monitor metrics and test under realistic load to tune capacities
  • Apply tiered limits per API key or account and document quotas clearly

Example use cases

  • Apply a token-bucket limiter to login and authentication endpoints to allow brief bursts but stop rapid retries
  • Use sliding-window limits on search APIs to enforce per-minute rolling quotas for paid plans
  • Attach Redis-backed middleware across all API servers to keep client counters consistent
  • Implement tiered hourly quotas: Free 100/hr, Pro 1,000/hr, Enterprise 10,000/hr
  • Return standard rate-limit headers so client SDKs can gracefully back off and retry

FAQ

Which algorithm should I choose for my public API?

Use token bucket for endpoints that need burst tolerance, sliding window for precise rolling limits, and fixed window only for very simple cases where spikes are acceptable.

Do I need Redis for rate limiting?

For single-instance apps you can use in-memory buckets, but use Redis for distributed deployments to ensure counters are consistent across instances.
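
A distributed fixed-window counter reduces to Redis `INCR` plus `EXPIRE`. A minimal sketch (the function and key scheme are illustrative; it assumes a connected node-redis v4 client is passed in, which also lets tests substitute a stub):

```javascript
// Fixed-window counter over Redis: one counter key per (client, window) pair.
// `client` is assumed to be a connected node-redis v4 client (or a compatible stub).
async function allowFixedWindow(client, key, limit, windowSeconds) {
  const window = Math.floor(Date.now() / 1000 / windowSeconds);
  const windowKey = `rl:${key}:${window}`;
  const count = await client.incr(windowKey);
  if (count === 1) {
    // First hit in this window: expire the counter when the window ends
    await client.expire(windowKey, windowSeconds);
  }
  return count <= limit;
}
```

Because `INCR` is atomic, all app instances sharing the Redis node see a consistent count; the known weakness is the boundary-spike problem noted in the algorithm table above.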