Rate Limiting & Throttling

Protect your API with clear limits and retry hints.

Why rate limiting exists

Rate limiting protects your API from abuse, buggy clients, traffic spikes, and overuse of expensive endpoints. It also makes capacity planning more predictable, because the maximum load any single client can generate is known in advance.

Common limiting strategies

Per API key / per user
Per IP (coarse but useful)
Per endpoint (stricter limits for expensive routes)
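
As a rough illustration of how these strategies can combine, the sketch below keeps one fixed-window counter per strategy and allows a request only if every applicable limit still has room. The class name, the limit values, and the /reports/export route are all hypothetical, and a fixed window is only one of several possible counting schemes.

```python
import time


class FixedWindowLimiter:
    """Counts requests per key in fixed time windows (in-memory, single process)."""

    def __init__(self, limit, window_seconds, clock=time.monotonic):
        self.limit = limit
        self.window = window_seconds
        self.clock = clock
        # (key, window index) -> request count. Old windows are never evicted
        # in this sketch; a real service would expire them.
        self.counts = {}

    def allow(self, key):
        window_index = int(self.clock() // self.window)
        bucket = (key, window_index)
        count = self.counts.get(bucket, 0)
        if count >= self.limit:
            return False
        self.counts[bucket] = count + 1
        return True


# Hypothetical limits, one limiter per strategy from the list above.
per_key = FixedWindowLimiter(limit=100, window_seconds=60)
per_ip = FixedWindowLimiter(limit=300, window_seconds=60)
per_expensive_route = FixedWindowLimiter(limit=5, window_seconds=60)

EXPENSIVE_ROUTES = {"/reports/export"}  # hypothetical route name


def is_allowed(api_key, ip, route):
    """Return True only if every applicable limit still has room."""
    checks = [(per_key, f"key:{api_key}"), (per_ip, f"ip:{ip}")]
    if route in EXPENSIVE_ROUTES:
        checks.append((per_expensive_route, f"route:{api_key}:{route}"))
    # all() stops at the first rejected check, so later counters are not touched.
    return all(limiter.allow(key) for limiter, key in checks)
```

The per-IP limit is deliberately looser than the per-key limit here, since many legitimate clients can share one address.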

Return the right status code

Use 429 Too Many Requests when the client is over the limit.

Retry guidance

Include a Retry-After header when possible so clients know how long to wait before retrying. The value is either a number of seconds or an HTTP date.
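
On the client side, a Retry-After value can be either a delay in seconds or an HTTP date, so it helps to normalize both forms into a single delay. The following is a minimal sketch using only the standard library; the function name and the one-second fallback are assumptions, not part of any particular client.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay_seconds(header_value, now=None, default=1.0):
    """Convert a Retry-After header value into a non-negative delay in seconds."""
    if header_value is None:
        return default  # no hint from the server: fall back to an assumed default
    value = header_value.strip()
    if value.isdigit():
        return float(value)  # delay-seconds form, e.g. "30"
    try:
        retry_at = parsedate_to_datetime(value)  # HTTP-date form
    except (TypeError, ValueError):
        return default  # unparseable value: treat it like a missing header
    if retry_at.tzinfo is None:
        retry_at = retry_at.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (retry_at - now).total_seconds())
```

A caller would typically sleep for the returned delay before retrying, and cap the number of retries.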

Example: 429 response

HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: application/json

{
  "title": "Rate limit exceeded",
  "status": 429,
  "detail": "Too many requests. Try again in 30 seconds."
}
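
A response like the one above could be assembled on the server roughly as follows. This is a framework-agnostic sketch: the function name and the (status, headers, body) return shape are assumptions, and a real handler would pass these values to whatever web framework is in use.

```python
import json
import math


def build_429_response(seconds_until_reset):
    """Build the status, headers, and JSON body for an over-limit response."""
    # Round up so the client never retries before the limit has actually reset.
    retry_after = max(1, math.ceil(seconds_until_reset))
    body = {
        "title": "Rate limit exceeded",
        "status": 429,
        "detail": f"Too many requests. Try again in {retry_after} seconds.",
    }
    headers = {
        "Retry-After": str(retry_after),
        "Content-Type": "application/json",
    }
    return 429, headers, json.dumps(body)
```

Deriving the header and the detail message from the same value keeps the two from drifting apart.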

Throttling vs hard limits

Hard limit: reject requests immediately once the threshold is exceeded.
Throttling: slow down responses or queue requests until capacity is available (more complex to implement).
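
The difference between the two can be sketched with a single token bucket: the hard-limit handler rejects when no token is available, while the throttled handler waits for one. The token bucket is a technique chosen here for illustration, not something this page prescribes, and all names below are hypothetical. The sketch is single-threaded and does no locking.

```python
import time


class TokenBucket:
    """Refills at a steady rate up to a fixed capacity (not thread-safe)."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self):
        """Take one token if available; never blocks."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def delay_until_token(self):
        """Seconds until at least one token will be available."""
        self._refill()
        if self.tokens >= 1:
            return 0.0
        return (1 - self.tokens) / self.rate


def handle_hard(bucket):
    # Hard limit: reject immediately when the bucket is empty.
    return 200 if bucket.try_acquire() else 429


def handle_throttled(bucket, sleep=time.sleep):
    # Throttling: delay the request until a token is available, then serve it.
    while not bucket.try_acquire():
        sleep(bucket.delay_until_token())
    return 200
```

Note that the throttled handler holds the request open while it waits, which is part of why throttling is harder to operate than a hard limit.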

Common mistakes

No limits at all on expensive endpoints
Returning 500 instead of 429
No Retry-After, forcing clients to guess

Checklist

429 is returned whenever a client exceeds a limit.
Retry-After (or equivalent) is provided.
Limits are documented and measurable.