rate-limiter-designer skill

/plugins/babysitter/skills/babysit/process/specializations/sdk-platform-development/skills/rate-limiter-designer

This skill designs and implements rate limiting strategies (token bucket, sliding window, and quotas) that protect APIs while keeping access fair.

To add this skill to your agents, run:

npx playbooks add skill a5c-ai/babysitter --skill rate-limiter-designer

SKILL.md

---
name: rate-limiter-designer
description: Design and implement rate limiting strategies
allowed-tools:
  - Read
  - Write
  - Edit
  - Glob
  - Grep
  - Bash
---

# Rate Limiter Designer Skill

## Overview

This skill designs and implements rate limiting strategies including token bucket, sliding window, and quota systems to protect APIs while providing fair access.

## Capabilities

- Implement token bucket and leaky bucket algorithms
- Configure per-key and per-user limits
- Design quota and usage systems
- Generate rate limit HTTP headers
- Implement distributed rate limiting
- Configure burst allowances
- Design rate limit tiers
- Handle rate limit exceeded responses

## Target Processes

- Platform API Gateway Design
- Authentication and Authorization Patterns
- Developer Portal Implementation

## Integration Points

- Redis for distributed state
- Rate limiting middleware
- API gateway plugins
- CDN rate limiting
- Database-backed quotas

## Input Requirements

- Rate limit requirements
- Tier definitions
- Burst allowances
- Distribution strategy
- Header conventions

## Output Artifacts

- Rate limiting implementation
- Quota management system
- Rate limit headers
- Tier configuration
- Admin management API
- Usage tracking

## Usage Example

```yaml
skill:
  name: rate-limiter-designer
  context:
    algorithm: sliding-window
    storage: redis
    tiers:
      - name: free
        requests: 100
        window: 1h
        burst: 10
      - name: pro
        requests: 10000
        window: 1h
        burst: 100
    headers:
      limit: X-RateLimit-Limit
      remaining: X-RateLimit-Remaining
      reset: X-RateLimit-Reset
    responses:
      exceeded:
        status: 429
        retryAfter: true
```
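The tier entries above can be turned into typed objects before any limiter consumes them. The following is a minimal sketch, not the skill's actual loader: `parse_window`, `Tier`, and `load_tiers` are hypothetical names, and the unit suffixes other than `h` are assumptions, since the example only shows `1h`.

```python
from dataclasses import dataclass

# Assumed unit suffixes; the YAML example above only demonstrates "h".
_UNIT_SECONDS = {"s": 1, "m": 60, "h": 3600, "d": 86400}


def parse_window(text: str) -> int:
    """Convert a window string such as "1h" into a number of seconds."""
    if len(text) < 2 or text[-1] not in _UNIT_SECONDS or not text[:-1].isdigit():
        raise ValueError(f"unsupported window: {text!r}")
    return int(text[:-1]) * _UNIT_SECONDS[text[-1]]


@dataclass(frozen=True)
class Tier:
    name: str
    requests: int        # requests allowed per window
    window_seconds: int  # window length in seconds
    burst: int           # burst allowance, copied as-is from the config


def load_tiers(raw_tiers: list[dict]) -> dict[str, Tier]:
    """Build a name -> Tier lookup from the `tiers` list of the config."""
    tiers: dict[str, Tier] = {}
    for raw in raw_tiers:
        tier = Tier(
            name=raw["name"],
            requests=int(raw["requests"]),
            window_seconds=parse_window(raw["window"]),
            burst=int(raw["burst"]),
        )
        if tier.requests <= 0 or tier.burst < 0:
            raise ValueError(f"invalid limits for tier {tier.name!r}")
        tiers[tier.name] = tier
    return tiers


# The same values as the YAML example, written as plain Python data.
TIERS = load_tiers([
    {"name": "free", "requests": 100, "window": "1h", "burst": 10},
    {"name": "pro", "requests": 10000, "window": "1h", "burst": 100},
])
```

Validating at load time means a misconfigured tier fails at startup rather than during request handling.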

## Best Practices

1. Use sliding window for accuracy
2. Include burst allowances
3. Return standard rate limit headers
4. Provide clear Retry-After values
5. Implement distributed limiting
6. Design fair quota systems

Overview

This skill designs and implements production-ready rate limiting strategies to protect APIs while preserving fair access. It covers common algorithms (token bucket, sliding window, leaky bucket), quota systems, burst handling, and distributed deployments. The focus is on pragmatic, testable artifacts: middleware, headers, admin APIs, and storage integrations like Redis.

How this skill works

You provide requirements such as per-key or per-user limits, tier definitions, burst allowances, header conventions, and a storage choice. The skill selects and configures algorithms (e.g., sliding window for accuracy, token bucket for smooth bursts), generates enforcement middleware or plugins, and produces headers and responses for clients. For distributed systems, it designs state synchronization strategies (Redis, database-backed counters, or coordinator-free algorithms) and produces admin and usage endpoints along with tests.
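As an illustration of the token bucket option mentioned above, here is a minimal single-process sketch. It is not the code the skill generates: the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for the example, and the class is not thread-safe.

```python
import time
from typing import Callable


class TokenBucket:
    """Single-process token bucket (illustrative sketch, not thread-safe)."""

    def __init__(
        self,
        capacity: float,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.capacity = capacity                    # maximum burst size
        self.refill_per_second = refill_per_second  # sustained rate
        self._clock = clock
        self._tokens = capacity                     # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill based on elapsed time, then try to spend `cost` tokens."""
        now = self._clock()
        elapsed = now - self._last
        self._last = now
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

The bucket capacity bounds the size of a burst, while the refill rate bounds the sustained request rate. Passing a fake `clock` makes the behavior deterministic in tests.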

When to use it

- Protect public or internal APIs from overload and abuse
- Implement tiered plans (free, pro, enterprise) with different quotas and bursts
- Add fair per-user or per-key limits in multi-tenant platforms
- Move from single-node limits to distributed rate limiting using Redis or DB
- Expose standard rate limit headers and Retry-After behavior for client UX

Best practices

- Prefer sliding-window or hybrid approaches for accurate short- and long-term limits
- Include configurable burst allowances and smoothing so that clients are not surprised by sudden rejections
- Return standard X-RateLimit headers and clear Retry-After values for retry guidance
- Design for distributed state early: use atomic Redis scripts, or consistent hashing that routes each key to a single owning node, to avoid race conditions
- Provide admin controls and usage reporting so operators can tune tiers and troubleshoot

Example use cases

- Design a free/pro API tier: 100 req/hour with 10-request bursts for free, 10k req/hour with 100-request bursts for pro
- Implement per-user sliding-window limits stored in Redis with Lua scripts to ensure atomic updates
- Add middleware that emits X-RateLimit-Limit, X-RateLimit-Remaining, and X-RateLimit-Reset headers and 429 responses with Retry-After
- Create quota management and billing hooks that decrement monthly allowances and expose admin dashboards
- Migrate single-node token bucket logic to a distributed implementation using a central Redis counter and local leaky bucket smoothing
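To make the per-user sliding-window case concrete, here is a minimal in-memory sketch of a sliding-window log. It keeps state in a local dictionary rather than Redis, so it only shows the logic; in a Redis-backed version the same prune, count, and record steps would have to run atomically (for example inside a Lua script, as noted above). The class name `SlidingWindowLimiter` and the injectable `clock` are assumptions for the example.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Per-key sliding-window log kept in process memory (illustrative sketch)."""

    def __init__(
        self,
        limit: int,
        window_seconds: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self._clock = clock
        # One deque of accepted-request timestamps per key, oldest first.
        self._hits: dict[str, deque] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        now = self._clock()
        hits = self._hits[key]
        cutoff = now - self.window_seconds
        # Prune timestamps that have slid out of the window.
        while hits and hits[0] <= cutoff:
            hits.popleft()
        # Reject without recording if the window is already full.
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

This variant stores one timestamp per accepted request, so memory grows with the limit and the number of active keys. A counter-based approximation trades some accuracy for lower memory use.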

FAQ

Which algorithm should I choose for API gateways?

Use a sliding window when accuracy across window boundaries matters; a token bucket works well when burst smoothing is important. Hybrid approaches can combine both.

How do I handle distributed enforcement?

Either keep counters in a central store such as Redis and update them atomically (for example with Lua scripts), or use consistent hashing to route each key to a single owning node. In both cases, keep clocks and TTLs consistent across nodes.
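The consistent hashing option can be sketched as a small hash ring that maps each rate limit key to one owning node. This is an illustrative sketch under assumptions: the class name `HashRing`, the node names, and the choice of 100 virtual points per node are not from the skill itself.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit hash (Python's built-in hash() varies between processes)."""
    return int.from_bytes(hashlib.sha256(value.encode("utf-8")).digest()[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes (illustrative sketch)."""

    def __init__(self, nodes: list[str], replicas: int = 100) -> None:
        if not nodes:
            raise ValueError("at least one node is required")
        # Each node is placed at `replicas` points on the ring to even out load.
        self._ring = sorted(
            (_hash(f"{node}#{i}"), node) for node in nodes for i in range(replicas)
        )
        self._points = [point for point, _ in self._ring]

    def owner(self, key: str) -> str:
        """Return the node at the first ring point clockwise from the key's hash."""
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[index][1]


# Hypothetical node names, used only for illustration.
ring = HashRing(["limiter-a", "limiter-b", "limiter-c"])
owner = ring.owner("user:42")
```

Because every request for a given key reaches the same node, that node can keep the counter locally without cross-node races. When a node is removed, only the keys it owned move to other nodes.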

What headers should I return to clients?

Return limit, remaining, and reset headers (e.g., X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) and include a Retry-After when returning 429.
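A small helper can build these headers. The sketch below rests on assumptions: the function name `rate_limit_headers` is hypothetical, and `X-RateLimit-Reset` is emitted as a Unix timestamp in seconds, although some APIs send the number of seconds remaining instead. `Retry-After` is emitted as a delay in whole seconds.

```python
import math


def rate_limit_headers(
    limit: int,
    remaining: int,
    reset_at: float,
    now: float,
    allowed: bool,
) -> dict[str, str]:
    """Build rate limit headers; the caller sets status 429 when `allowed` is False.

    `reset_at` and `now` are Unix timestamps in seconds (an assumption of this sketch).
    """
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),  # never report a negative value
        "X-RateLimit-Reset": str(math.ceil(reset_at)),
    }
    if not allowed:
        # Round up and wait at least one second so clients do not retry too early.
        headers["Retry-After"] = str(max(1, math.ceil(reset_at - now)))
    return headers


rejected = rate_limit_headers(
    limit=100, remaining=0, reset_at=1700003600, now=1700003570.2, allowed=False
)
```

Rounding `Retry-After` up rather than down avoids telling a client to retry just before the window actually resets.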