Server-Side Rate Limiting (Fairness & Abuse)
Production incident
A botnet hits your login and search endpoints. Even without a massive bandwidth attack, your CPU and DB melt because the requests are expensive. You have no server-side rate limiting, no per-identity limits, and no cost-based protection. The site becomes unusable for real users. You later add a naive global limiter and accidentally rate limit health checks and internal service calls, causing a different outage. Rate limiting is a surgical tool, not a hammer.
Symptoms
- CPU spikes, DB query volume spikes, cache miss ratio worsens.
- Latency and error rate increase even though bandwidth is not saturated.
- Hot endpoints (login, search, OTP) dominate traffic.
- After the naive limiter is deployed: internal calls and legitimate users get blocked.
Root causes
- No protective limits on expensive endpoints.
- Using IP-only limiting behind NAT or proxies (punishes many users, misses real attackers).
- No differentiation for authenticated users vs anonymous, or for trusted internal traffic.
- Not returning actionable responses (429 without Retry-After, no clear policy).
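The last root cause, a bare 429, can be addressed in the limiter's rejection callback. The following is a minimal sketch, assuming ASP.NET Core 7+ minimal hosting; the "otp" policy name, its limits, and the /otp route are placeholders chosen for illustration, not values from a real deployment.

```csharp
using System.Globalization;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Runs once for every rejected request.
    options.OnRejected = async (context, cancellationToken) =>
    {
        // Window-based and token bucket limiters attach a RetryAfter hint
        // to the failed lease; not every limiter type provides one.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            var seconds = Math.Max(1, (int)Math.Ceiling(retryAfter.TotalSeconds));
            context.HttpContext.Response.Headers.RetryAfter =
                seconds.ToString(NumberFormatInfo.InvariantInfo);
        }

        await context.HttpContext.Response.WriteAsync(
            "Too many requests. Please retry later.", cancellationToken);
    };

    // Hypothetical policy: 5 OTP requests per minute, no queueing.
    options.AddFixedWindowLimiter("otp", o =>
    {
        o.PermitLimit = 5;
        o.Window = TimeSpan.FromMinutes(1);
        o.QueueLimit = 0;
    });
});

var app = builder.Build();
app.UseRateLimiter();
app.MapPost("/otp", () => Results.Ok()).RequireRateLimiting("otp");
app.Run();
```

Rounding the hint up to whole seconds keeps the header a valid delay value and avoids telling clients to retry after 0 seconds.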
Diagnosis
# Identify top endpoints by RPS and cost
# Use logs/metrics; in codebase, find rate limiter wiring
grep -R "AddRateLimiter" -n .
grep -R "RequireRateLimiting" -n .
# Look for proxy headers trust config for correct client IP
grep -R "UseForwardedHeaders" -n .
Anti-pattern
- One global rate limiter for everything.
- Limiting purely by IP without considering NAT and proxies.
- Limiting health checks, readiness probes, or internal service calls.
- Not monitoring rejected requests and not tuning thresholds.
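One way to avoid the third anti-pattern is to keep a coarse global limiter but explicitly opt probes and internal routes out of it. This is a sketch under assumptions: the /healthz path, the /internal route group, and the limit of 300 requests per minute are hypothetical, and the /internal group is assumed to be reachable only from the private network.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Coarse safety net applied to every endpoint that does not opt out.
    options.GlobalLimiter = PartitionedRateLimiter.Create<HttpContext, string>(httpContext =>
    {
        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return RateLimitPartition.GetFixedWindowLimiter("ip:" + ip, _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 300, // hypothetical threshold
            Window = TimeSpan.FromMinutes(1),
            QueueLimit = 0
        });
    });
});

var app = builder.Build();
app.UseRateLimiter();

// Probes must keep answering during an attack, so they skip all limiters.
app.MapHealthChecks("/healthz").DisableRateLimiting();

// Assumption: this group is only exposed to trusted internal callers.
var internalApi = app.MapGroup("/internal").DisableRateLimiting();
internalApi.MapGet("/status", () => Results.Ok());

// Public endpoints fall under the global limiter.
app.MapGet("/", () => "hello");
app.Run();
```

If internal traffic still needs a ceiling, a separate named policy with a generous limit is a safer choice than disabling limiting entirely.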
Correct pattern
Apply different policies per route category, and key limits by the right identity signal: user id for authenticated, IP for anonymous, and separate buckets for internal traffic. Protect expensive endpoints harder.
// .NET 7+ example: rate limiter middleware
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

// services is the IServiceCollection (builder.Services in minimal hosting)
services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddPolicy("login", httpContext =>
    {
        // Prefer user-based key when available; fall back to IP
        var userId = httpContext.User.Identity?.IsAuthenticated == true
            ? httpContext.User.FindFirst("sub")?.Value
            : null;
        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        // Only use the user bucket when a subject claim is actually present
        var key = userId is not null ? "user:" + userId : "ip:" + ip;

        return RateLimitPartition.GetTokenBucketLimiter(key, _ => new TokenBucketRateLimiterOptions
        {
            TokenLimit = 10,
            TokensPerPeriod = 10,
            ReplenishmentPeriod = TimeSpan.FromMinutes(1),
            QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
            QueueLimit = 0 // reject immediately instead of queueing
        });
    });

    options.AddPolicy("search", httpContext =>
    {
        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";

        return RateLimitPartition.GetFixedWindowLimiter("ip:" + ip, _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 60,
            Window = TimeSpan.FromMinutes(1),
            QueueLimit = 0
        });
    });
});

app.UseRateLimiter();

// Login and Search are the application's own handlers
app.MapPost("/login", Login).RequireRateLimiting("login");
app.MapGet("/search", Search).RequireRateLimiting("search");
Operational reality
- Correct client IP: behind proxies, you must configure forwarded headers correctly. Otherwise every request appears to come from the proxy IP, all clients share one bucket, and a single abuser can get everyone blocked.
- Cost-based limits: some requests are expensive. Limit them harder or add caching/async processing.
- Retry-After: include it when possible so that well-behaved clients know when to retry instead of hammering the endpoint.
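The client IP point above depends on the forwarded headers middleware running before the rate limiter. The following sketch assumes a single trusted reverse proxy in front of the app; the address 10.0.0.10 and the /whoami route are placeholders, so substitute your own proxy or load balancer addresses.

```csharp
using System.Net;
using Microsoft.AspNetCore.HttpOverrides;

var builder = WebApplication.CreateBuilder(args);

builder.Services.Configure<ForwardedHeadersOptions>(options =>
{
    options.ForwardedHeaders =
        ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto;

    // Assumption: exactly one proxy hop, at this placeholder address.
    // Headers from any other sender are ignored, so clients cannot spoof their IP.
    options.KnownProxies.Add(IPAddress.Parse("10.0.0.10"));
    options.ForwardLimit = 1;
});

var app = builder.Build();

// Must run before any middleware that reads RemoteIpAddress,
// including the rate limiter.
app.UseForwardedHeaders();

// Debug endpoint (hypothetical) to verify which IP the app sees.
app.MapGet("/whoami", (HttpContext ctx) =>
    ctx.Connection.RemoteIpAddress?.ToString() ?? "unknown");

app.Run();
```

Restricting trust to known proxies matters as much as enabling the headers: if X-Forwarded-For is accepted from anyone, an attacker can rotate fake addresses and bypass IP-keyed limits.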
Security and performance impact
- Security: reduces brute force, credential stuffing, and application-layer DoS.
- Performance: protects downstreams and keeps tail latency stable under abuse or traffic spikes.
Operational notes
- Monitoring: 429 rate by route, rejected requests by key type, queueing time (if any), and top offenders.
- Rollout: start in log-only mode with soft limits on low-risk routes to validate thresholds, then enforce on the most abused endpoints.
- Rollback: keep policies configurable. If you block legit traffic, relax thresholds immediately.
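The monitoring and rollback notes can be sketched together: read thresholds from configuration and count 429 responses per route. This is a hedged illustration; the configuration keys, the meter name "MyApp.RateLimiting", and the counter name are hypothetical, and the limits here are read once at startup, so changing them requires a restart or redeploy.

```csharp
using System.Diagnostics.Metrics;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;
using Microsoft.AspNetCore.Routing;

var builder = WebApplication.CreateBuilder(args);

// Hypothetical config keys, with the values from the policy above as defaults.
var permitLimit = builder.Configuration.GetValue("RateLimits:Search:PermitLimit", 60);
var windowSeconds = builder.Configuration.GetValue("RateLimits:Search:WindowSeconds", 60);

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;
    options.AddFixedWindowLimiter("search", o =>
    {
        o.PermitLimit = permitLimit;
        o.Window = TimeSpan.FromSeconds(windowSeconds);
        o.QueueLimit = 0;
    });
});

// Hypothetical meter and counter names; export them with your metrics pipeline.
var meter = new Meter("MyApp.RateLimiting");
var rejected = meter.CreateCounter<long>("rate_limit.rejected");

var app = builder.Build();

// Registered before the rate limiter so it observes the final status code.
app.Use(async (HttpContext context, RequestDelegate next) =>
{
    await next(context);
    if (context.Response.StatusCode == StatusCodes.Status429TooManyRequests)
    {
        // Tag by route template, not raw path, to keep cardinality low.
        var route = (context.GetEndpoint() as RouteEndpoint)?.RoutePattern.RawText ?? "unknown";
        rejected.Add(1, new KeyValuePair<string, object?>("route", route));
    }
});

app.UseRateLimiter();
app.MapGet("/search", () => Results.Ok()).RequireRateLimiting("search");
app.Run();
```

Counting in a wrapping middleware rather than in the rejection callback leaves that callback free for response shaping, and it also captures 429s produced by other components.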
Checklist
- Policies are per-route and cost-aware.
- Keying strategy is correct (user id vs IP vs internal).
- Forwarded headers are configured so client IP is accurate.
- 429 responses are monitored and tuned.
- Health/readiness endpoints are not accidentally rate limited.