Rate Limiting & Throttling

Protect your API with clear limits and retry hints.

Why rate limiting exists

Rate limiting protects your API from abuse, buggy clients, and traffic spikes, and keeps expensive endpoints from being overloaded. It also makes capacity planning more predictable.

Common limiting strategies

  • Per API key / per user
  • Per IP (coarse, because clients behind a NAT or proxy share one address, but useful for unauthenticated traffic)
  • Per endpoint (stricter limits for expensive routes)
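
As a rough illustration of how these strategies combine, the sketch below counts requests per (identity, route) pair in fixed time windows. The identity can be an API key, a user ID, or an IP address. The class name FixedWindowLimiter, the route names, and the limit values are all assumptions made for this example, and fixed windows are only one of several possible counting schemes.

```python
import math
import time


class FixedWindowLimiter:
    """Hypothetical fixed-window counter keyed by (identity, route)."""

    def __init__(self, limits, window_seconds, clock=time.monotonic):
        self.limits = limits            # route -> max requests per window
        self.window = window_seconds
        self.clock = clock              # injectable so it can be tested
        self.counters = {}              # (identity, route) -> (window index, count)

    def allow(self, identity, route):
        """Return (allowed, retry_after_seconds)."""
        now = self.clock()
        window_index = int(now // self.window)
        # Expensive routes get their own stricter entry; others share "default".
        limit = self.limits.get(route, self.limits["default"])
        key = (identity, route)
        stored_index, count = self.counters.get(key, (window_index, 0))
        if stored_index != window_index:
            count = 0                   # a new window has started
        if count >= limit:
            self.counters[key] = (window_index, count)
            # Seconds until the current window ends, rounded up.
            retry_after = math.ceil((window_index + 1) * self.window - now)
            return False, retry_after
        self.counters[key] = (window_index, count + 1)
        return True, 0


# Illustrative values only: 5 requests/minute on /search, 60 elsewhere.
limiter = FixedWindowLimiter({"/search": 5, "default": 60}, window_seconds=60)
allowed, retry_after = limiter.allow("api-key-123", "/search")
```

This sketch keeps counters in process memory and never evicts them. A deployment with several servers would likely need a shared store and an expiry policy.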

Return the right status code

Return 429 Too Many Requests (defined in RFC 6585) when a client exceeds its limit.

Retry guidance

Include a Retry-After header when possible so clients know when to retry. Its value is either a number of seconds to wait or an HTTP date.
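
On the client side, honoring this header might look like the sketch below. The function name retry_delay_seconds and the one-second fallback are assumptions for illustration. The parsing uses Python's standard email.utils module, which handles the HTTP date format.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay_seconds(header_value, now=None, default=1.0):
    """Turn a Retry-After header value into a delay in seconds.

    Falls back to `default` (an arbitrary choice for this sketch) when the
    header is missing or cannot be parsed.
    """
    if header_value is None:
        return default
    value = header_value.strip()
    if value.isdigit():
        return float(value)             # delay given directly in seconds
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):     # which one depends on the Python version
        return default
    if when.tzinfo is None:
        when = when.replace(tzinfo=timezone.utc)
    if now is None:
        now = datetime.now(timezone.utc)
    # A date in the past means the client may retry immediately.
    return max(0.0, (when - now).total_seconds())
```

A client would typically sleep for the returned delay before retrying, ideally with a cap and some random jitter so that many clients do not retry at the same instant.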

Example: 429 response

HTTP/1.1 429 Too Many Requests
Retry-After: 30
Content-Type: application/json

{
  "title": "Rate limit exceeded",
  "status": 429,
  "detail": "Too many requests. Try again in 30 seconds."
}
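
A response like the one above could be assembled by a small helper such as the following sketch. The function name too_many_requests and the (status, headers, body) return shape are assumptions chosen to stay framework-neutral. Adapt them to whatever your web framework expects.

```python
import json


def too_many_requests(retry_after_seconds):
    """Build (status, headers, body) for a rate-limited request."""
    body = {
        "title": "Rate limit exceeded",
        "status": 429,
        "detail": f"Too many requests. Try again in {retry_after_seconds} seconds.",
    }
    headers = {
        # Header values are strings; the delay is expressed in whole seconds.
        "Retry-After": str(retry_after_seconds),
        "Content-Type": "application/json",
    }
    return 429, headers, json.dumps(body)


status, headers, body = too_many_requests(30)
```

Keeping the Retry-After value and the human-readable detail derived from the same number avoids the two drifting apart.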

Throttling vs hard limits

  • Hard limit: reject requests immediately once the threshold is reached.
  • Throttling: delay or queue requests instead of rejecting them (more complex to build and operate).
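
To make the difference concrete, the sketch below uses a token bucket, which is one common technique and not the only option. The class and method names are assumptions for this example. try_acquire models a hard limit by refusing a request when no token is available, while reserve models throttling by always admitting the request and reporting how long the caller should wait first.

```python
import time


class TokenBucket:
    """Hypothetical token bucket supporting both behaviours."""

    def __init__(self, rate_per_second, capacity, clock=time.monotonic):
        self.rate = rate_per_second     # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def _refill(self):
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self):
        """Hard limit: True if the request may proceed, False to reject it."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False

    def reserve(self):
        """Throttling: admit the request, return seconds to delay it."""
        self._refill()
        self.tokens -= 1                # may go negative: a backlog of delayed work
        if self.tokens >= 0:
            return 0.0
        return -self.tokens / self.rate
```

With a hard limit, a False result maps to a 429 response. With throttling, the server sleeps or queues for the returned delay, so a real implementation would also need an upper bound on that delay to keep the backlog from growing without limit.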

Common mistakes

  • No limits at all on expensive endpoints
  • Returning 500 instead of 429
  • No Retry-After, forcing clients to guess

Checklist

  • 429 is used for limits.
  • Retry-After (or equivalent) is provided.
  • Limits are documented and measurable.