This skill helps you implement robust rate limiting for APIs using sliding windows, token buckets, and quotas to protect resources.
npx playbooks add skill jeremylongshore/claude-code-plugins-plus-skills --skill rate-limiting-apis
Review the files below or copy the command above to add this skill to your agents.
---
name: rate-limiting-apis
description: |
Implement sophisticated rate limiting with sliding windows, token buckets, and quotas.
Use when protecting APIs from excessive requests.
Trigger with phrases like "add rate limiting", "limit API requests", or "implement rate limits".
allowed-tools: Read, Write, Edit, Grep, Glob, Bash(api:ratelimit-*)
version: 1.0.0
author: Jeremy Longshore <[email protected]>
license: MIT
---
# Rate Limiting APIs
## Overview
This skill provides automated assistance for API rate-limiting tasks: implementing sliding windows, token buckets, and quotas to protect APIs from excessive requests.
## Prerequisites
Before using this skill, ensure you have:
- API design specifications or requirements documented
- Development environment with necessary frameworks installed
- Database or backend services accessible for integration
- Authentication and authorization strategies defined
- Testing tools and environments configured
## Instructions
1. Use the Read tool to examine existing API specifications from {baseDir}/api-specs/
2. Define resource models, endpoints, and HTTP methods
3. Document request/response schemas and data types
4. Identify authentication and authorization requirements
5. Plan error handling and validation strategies
6. Generate boilerplate code using Bash(api:ratelimit-*) with framework scaffolding
7. Implement endpoint handlers with business logic
8. Add input validation and schema enforcement
9. Integrate authentication and authorization middleware
10. Configure database connections and ORM models
11. Write integration tests covering all endpoints
See `{baseDir}/references/implementation.md` for a detailed implementation guide.
## Output
- `{baseDir}/src/routes/` - Endpoint route definitions
- `{baseDir}/src/controllers/` - Business logic handlers
- `{baseDir}/src/models/` - Data models and schemas
- `{baseDir}/src/middleware/` - Authentication, validation, logging
- `{baseDir}/src/config/` - Configuration and environment variables
- OpenAPI 3.0 specification with complete endpoint definitions
## Error Handling
See `{baseDir}/references/errors.md` for comprehensive error-handling guidance.
## Examples
See `{baseDir}/references/examples.md` for detailed examples.
## Resources
- Express.js and Fastify for Node.js APIs
- Flask and FastAPI for Python APIs
- Spring Boot for Java APIs
- Gin and Echo for Go APIs
- OpenAPI Specification 3.0+ for API documentation
This skill helps implement sophisticated API rate limiting strategies including sliding windows, token buckets, and quota enforcement. It provides practical guidance, code scaffolding patterns, and integration steps so you can protect endpoints from excessive requests without degrading legitimate traffic. The goal is predictable throttling, fair usage enforcement, and easy observability of limits.
The skill inspects API design and traffic requirements, then recommends and scaffolds appropriate rate limiting primitives (sliding window counters, token buckets, fixed windows) and quota models. It shows where to apply middleware, how to persist counters (in-memory, Redis, or a database), and how to surface limits in responses and metrics. It also outlines testing and error handling patterns to validate behavior under load.
Which store should I use for counters in a multi-instance API?
Use a fast shared store such as Redis with atomic operations (INCR, Lua scripts) so that counters stay consistent across instances.
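As a rough illustration, a shared fixed-window counter might look like the sketch below. It assumes a store that offers Redis-like `incr` and `expire` operations. `InMemoryStore` and `allow_request` are hypothetical names, and the in-memory class is only a single-process stand-in so the example runs without a server. Against a real shared store, the increment and the expiry would ideally be combined in one atomic step (for example a Lua script), which this sketch does not do.

```python
import time
from typing import Callable, Dict, Tuple


class InMemoryStore:
    """Hypothetical stand-in that mimics the semantics of Redis INCR and EXPIRE."""

    def __init__(self, clock: Callable[[], float] = time.time) -> None:
        self._clock = clock
        # key -> (value, expires_at)
        self._data: Dict[str, Tuple[int, float]] = {}

    def incr(self, key: str) -> int:
        value, expires_at = self._data.get(key, (0, float("inf")))
        if self._clock() >= expires_at:
            # Expired keys behave as if they were absent.
            value, expires_at = 0, float("inf")
        value += 1
        self._data[key] = (value, expires_at)
        return value

    def expire(self, key: str, seconds: float) -> None:
        if key in self._data:
            value, _ = self._data[key]
            self._data[key] = (value, self._clock() + seconds)


def allow_request(store, client_id: str, limit: int,
                  window_seconds: int, now: float) -> bool:
    """Fixed-window check: count requests per client per window."""
    window_index = int(now // window_seconds)
    key = f"ratelimit:{client_id}:{window_index}"
    count = store.incr(key)
    if count == 1:
        # First hit in this window: let the key clean itself up later.
        store.expire(key, window_seconds * 2)
    return count <= limit
```

Because the window index is part of the key, every API instance that talks to the same store increments the same counter for a given client and window.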
How do I choose between token bucket and sliding window?
Use a token bucket when you need burst capability and refill behavior; choose a sliding window when you want smoother enforcement and a more even distribution of requests.
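To make the difference concrete, here is a minimal single-process sketch of both approaches. The class names `TokenBucket` and `SlidingWindowLog` are illustrative, not part of this skill, and the caller passes the current time explicitly so the behavior is easy to test. The sliding window is implemented as a timestamp log, which is one of several possible variants.

```python
from collections import deque


class TokenBucket:
    """Allows bursts up to `capacity`, then refills at `refill_rate` tokens/second."""

    def __init__(self, capacity: float, refill_rate: float, now: float = 0.0) -> None:
        self.capacity = capacity
        self.refill_rate = refill_rate
        self.tokens = capacity  # start full so an initial burst is allowed
        self.last_refill = now

    def allow(self, now: float) -> bool:
        # Add the tokens earned since the last call, capped at capacity.
        elapsed = max(0.0, now - self.last_refill)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_rate)
        self.last_refill = max(self.last_refill, now)
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


class SlidingWindowLog:
    """Allows at most `limit` requests in any trailing `window_seconds` interval."""

    def __init__(self, limit: int, window_seconds: float) -> None:
        self.limit = limit
        self.window_seconds = window_seconds
        self.timestamps = deque()  # times of accepted requests, oldest first

    def allow(self, now: float) -> bool:
        # Drop accepted requests that have fallen out of the trailing window.
        cutoff = now - self.window_seconds
        while self.timestamps and self.timestamps[0] <= cutoff:
            self.timestamps.popleft()
        if len(self.timestamps) < self.limit:
            self.timestamps.append(now)
            return True
        return False
```

The token bucket stores only two numbers per client, while the log stores one timestamp per accepted request, so the log costs more memory in exchange for an exact trailing-window count.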
How should limits be communicated to clients?
Return the widely used rate-limit headers (e.g., X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset) and a clear 429 Too Many Requests response with a Retry-After header.
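A framework-agnostic sketch of building such a response is shown below. The function name `rate_limit_response` and its return shape are assumptions made for illustration. The sketch also assumes that `X-RateLimit-Reset` carries a Unix timestamp in seconds; some APIs send the number of seconds remaining instead, so check what your clients expect.

```python
import math
from typing import Dict, Tuple


def rate_limit_response(limit: int, remaining: int, reset_epoch: float,
                        now: float) -> Tuple[int, Dict[str, str], str]:
    """Return (status, headers, body) for a request that was just counted.

    `remaining` is the number of requests left after this one; a negative
    value means this request exceeded the limit.
    """
    headers = {
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": str(max(0, remaining)),
        # Assumption: reset is expressed as Unix epoch seconds.
        "X-RateLimit-Reset": str(int(reset_epoch)),
    }
    if remaining >= 0:
        return 200, headers, "ok"
    # Retry-After is a delay in whole seconds, rounded up, at least 1.
    retry_after = max(1, math.ceil(reset_epoch - now))
    headers["Retry-After"] = str(retry_after)
    return 429, headers, "rate limit exceeded; retry later"
```

In a real middleware, the same headers would be attached to the framework's response object rather than returned as a tuple.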