Rate Limiting & Throttling
Why rate limiting exists
Rate limiting protects your API from abuse, buggy clients, and traffic spikes, and it keeps expensive endpoints from being overloaded. It also makes capacity planning more predictable.
Common limiting strategies
- Per API key / per user
- Per IP (coarse but useful)
- Per endpoint (stricter limits for expensive routes)
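The strategies above can be combined by choosing what the limiter counts against. The following is a minimal sketch, not a definitive implementation: the `Limit` type, the `limit_key` and `limit_for` helpers, the `/reports/export` route, and all numeric limits are hypothetical names and values chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Limit:
    max_requests: int
    window_seconds: int


# Hypothetical values: a general default plus a stricter limit for an expensive route.
DEFAULT_LIMIT = Limit(max_requests=100, window_seconds=60)
ENDPOINT_LIMITS = {
    "/reports/export": Limit(max_requests=5, window_seconds=60),
}


def limit_key(endpoint: str, api_key: Optional[str], client_ip: str) -> str:
    """Build the counter key: prefer the API key, fall back to the coarser client IP."""
    identity = f"key:{api_key}" if api_key else f"ip:{client_ip}"
    return f"{identity}|{endpoint}"


def limit_for(endpoint: str) -> Limit:
    """Return the per-endpoint limit if one is configured, else the default."""
    return ENDPOINT_LIMITS.get(endpoint, DEFAULT_LIMIT)
```

Including the endpoint in the key means each route is counted separately, so a burst against an expensive route does not consume the budget for cheaper ones.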
Return the right status code
Use 429 Too Many Requests when the client is over the limit.
Retry guidance
Include a Retry-After header when possible so clients know how long to wait before retrying. The value is either a number of seconds or an HTTP date.
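On the client side, honoring Retry-After means turning the header value into a wait time. Here is a small sketch using only the Python standard library; the function name `retry_after_seconds` and the choice to return `None` for unparseable values are assumptions made for this example.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(value: str, now: Optional[datetime] = None) -> Optional[float]:
    """Convert a Retry-After header value into seconds to wait.

    Accepts either a whole number of seconds or an HTTP date.
    Returns None if the value cannot be parsed.
    """
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return None
    if when.tzinfo is None:
        # Assumption: treat a date without a zone as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (when - now).total_seconds())
```

When the header is missing or unparseable, a client would typically fall back to its own backoff policy rather than retrying immediately.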
Example: 429 response
HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: application/json

{
  "title": "Rate limit exceeded",
  "status": 429,
  "detail": "Too many requests. Try again in 30 seconds."
}
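A response like the one above could be assembled on the server along these lines. This is a framework-neutral sketch: the helper name `too_many_requests` and the `(status, headers, body)` return shape are assumptions, and you would adapt them to whatever response object your framework uses.

```python
import json
import math
from typing import Dict, Tuple


def too_many_requests(retry_after: float) -> Tuple[int, Dict[str, str], str]:
    """Build the status, headers, and JSON body for a 429 response."""
    # Round up so clients never retry slightly too early; send at least 1 second.
    seconds = max(1, math.ceil(retry_after))
    headers = {
        "Retry-After": str(seconds),
        "Content-Type": "application/json",
    }
    body = json.dumps({
        "title": "Rate limit exceeded",
        "status": 429,
        "detail": f"Too many requests. Try again in {seconds} seconds.",
    })
    return 429, headers, body
```

Deriving the header and the `detail` text from the same value keeps the two from drifting apart.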
Throttling vs hard limits
- Hard limit: reject requests immediately once the threshold is exceeded.
- Throttling: slow down responses or queue requests instead of rejecting them (more complex to implement).
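To make the difference concrete, here is a sketch of both behaviors built on one fixed-window counter. Fixed window is only one of several possible algorithms and is used here for brevity. The class and function names are hypothetical, the state is in-memory and not thread-safe, and the sketch assumes `max_requests` is at least 1.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowLimiter:
    """Counts requests per key within a fixed time window."""

    def __init__(self, max_requests: int, window_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.clock = clock
        self._windows: Dict[str, Tuple[float, int]] = {}  # key -> (window start, count)

    def check(self, key: str) -> Tuple[bool, float]:
        """Return (allowed, seconds until the current window resets)."""
        now = self.clock()
        start, count = self._windows.get(key, (now, 0))
        if now - start >= self.window_seconds:
            start, count = now, 0  # the old window expired, start a new one
        remaining = self.window_seconds - (now - start)
        if count >= self.max_requests:
            self._windows[key] = (start, count)
            return False, remaining
        self._windows[key] = (start, count + 1)
        return True, remaining


def hard_limit(limiter: FixedWindowLimiter, key: str) -> Tuple[bool, float]:
    """Hard limit: report a rejection immediately; the caller responds with 429."""
    return limiter.check(key)


def throttle(limiter: FixedWindowLimiter, key: str,
             sleep: Callable[[float], None] = time.sleep) -> float:
    """Throttling: wait until the request is allowed; return the seconds waited."""
    waited = 0.0
    while True:
        allowed, wait = limiter.check(key)
        if allowed:
            return waited
        sleep(wait)
        waited += wait
```

With a hard limit, the second value returned by `check` can be used as the Retry-After value. With throttling, the same value becomes the delay, which ties up a worker while it waits; that is part of why throttling is the more complex option.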
Common mistakes
- No limits at all on expensive endpoints
- Returning 500 instead of 429
- No Retry-After, forcing clients to guess
Checklist
- 429 is used for limits.
- Retry-After (or equivalent) is provided.
- Limits are documented and measurable.