Server-Side Rate Limiting (Fairness & Abuse)

Server-side rate limiting is not about being nice. It is about survival: protecting CPU, memory, thread pools, and downstream dependencies. Done wrong, it blocks legitimate traffic and still lets abuse through.


Production incident

A botnet hits your login and search endpoints. Even without a massive bandwidth attack, your CPU and DB melt because the requests are expensive. You have no server-side rate limiting, no per-identity limits, and no cost-based protection. The site becomes unusable for real users. You later add a naive global limiter and accidentally rate limit health checks and internal service calls, causing a different outage. Rate limiting is a surgical tool, not a hammer.

Symptoms

CPU spikes, DB query volume spikes, cache miss ratio worsens.
Latency and error rate increase even though bandwidth is not saturated.
Hot endpoints (login, search, OTP) dominate traffic.
After naive limiter: internal calls and real users get blocked.

Root causes

No protective limits on expensive endpoints.
Using IP-only limiting behind NAT or proxies (punishes many users, misses real attackers).
No differentiation for authenticated users vs anonymous, or for trusted internal traffic.
Not returning actionable responses (429 without Retry-After, no clear policy).

Diagnosis

# Identify top endpoints by RPS and cost
# Use logs/metrics; in codebase, find rate limiter wiring
grep -R "AddRateLimiter" -n .
grep -R "RequireRateLimiting" -n .

# Look for proxy headers trust config for correct client IP
grep -R "UseForwardedHeaders" -n .

Anti-patterns

One global rate limiter for everything.
Limiting purely by IP without considering NAT and proxies.
Limiting health checks, readiness probes, or internal service calls.
Not monitoring rejected requests and not tuning thresholds.

Correct pattern

Apply different policies per route category, and key each limit by the right identity signal: user ID for authenticated callers, client IP for anonymous ones, and separate buckets for internal traffic. Give expensive endpoints tighter limits than cheap ones.

// .NET 7+ example: rate limiter middleware
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.AddPolicy("login", httpContext =>
    {
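        // Prefer a user-based key when available; fall back to client IP.
        // On the login route most callers are still anonymous, so the IP branch is the common case.
        var userId = httpContext.User.Identity?.IsAuthenticated == true
            ? httpContext.User.FindFirst("sub")?.Value
            : null;
        // String concatenation never yields null, so handle a missing claim or IP explicitly.
        var key = userId is not null
            ? "user:" + userId
            : "ip:" + (httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown");

        return RateLimitPartition.GetTokenBucketLimiter(key, _ => new TokenBucketRateLimiterOptions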
        {
            TokenLimit = 10,
            TokensPerPeriod = 10,
            ReplenishmentPeriod = TimeSpan.FromMinutes(1),
            QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
            QueueLimit = 0
        });
    });

    options.AddPolicy("search", httpContext =>
    {
        // Keyed by client IP; requests with no resolvable IP share one "unknown" bucket.
        var key = "ip:" + (httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown");
        return RateLimitPartition.GetFixedWindowLimiter(key, _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 60,
            Window = TimeSpan.FromMinutes(1),
            QueueLimit = 0
        });
    });
});

app.UseRateLimiter();

app.MapPost("/login", Login).RequireRateLimiting("login");
app.MapGet("/search", Search).RequireRateLimiting("search");
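
The two policies above cover anonymous and authenticated callers, but not the "separate buckets for internal traffic" part or the health probes that the naive limiter broke in the incident. The following is a minimal sketch of one way to handle both, not a definitive setup. The policy name "api", the "client_type" claim, the /healthz and /orders routes, and the 120-per-minute limit are all assumptions made for illustration.

```csharp
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHealthChecks();

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    // Hypothetical "api" policy: internal callers get an unlimited partition,
    // everyone else gets a per-IP fixed window.
    options.AddPolicy("api", httpContext =>
    {
        // ASSUMPTION: your auth layer marks internal services with a
        // "client_type" claim. Do not trust a raw request header for this.
        // In a real app, authentication middleware must run before UseRateLimiter
        // for the claim to be populated.
        if (httpContext.User.HasClaim("client_type", "internal"))
        {
            return RateLimitPartition.GetNoLimiter("internal");
        }

        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return RateLimitPartition.GetFixedWindowLimiter("ip:" + ip, _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 120,
            Window = TimeSpan.FromMinutes(1),
            QueueLimit = 0
        });
    });
});

var app = builder.Build();

app.UseRateLimiter();

// Probes opt out explicitly, so a later global or group-level policy cannot throttle them.
app.MapHealthChecks("/healthz").DisableRateLimiting();

app.MapGet("/orders", () => Results.Ok()).RequireRateLimiting("api");

app.Run();
```

If internal traffic should be capped rather than unlimited, the same branch can return its own generous limiter instead of `GetNoLimiter`, which keeps a misbehaving internal client from taking the service down.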

Operational reality

Correct client IP: behind proxies, you must configure forwarded headers correctly. Otherwise every request appears to come from the proxy's IP, all users share one bucket, and a single heavy client gets everyone rejected.
Cost-based limits: some requests are expensive. Limit them harder or add caching/async processing.
Retry-After: include it when possible to help clients behave.
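
The first and third notes above can be sketched together in one Program.cs. This is an illustration under stated assumptions rather than a drop-in configuration: the proxy address 10.0.0.10 is a placeholder, and the "search" policy simply mirrors the limits used earlier on this page.

```csharp
using System.Globalization;
using System.Net;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.HttpOverrides;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

builder.Services.Configure<ForwardedHeadersOptions>(options =>
{
    options.ForwardedHeaders = ForwardedHeaders.XForwardedFor | ForwardedHeaders.XForwardedProto;
    // ASSUMPTION: 10.0.0.10 stands in for your load balancer. Only proxies listed
    // here are trusted to supply X-Forwarded-For; replace with your real addresses.
    options.KnownProxies.Add(IPAddress.Parse("10.0.0.10"));
});

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.OnRejected = (context, cancellationToken) =>
    {
        // Built-in limiters attach RetryAfter metadata when they can estimate it.
        if (context.Lease.TryGetMetadata(MetadataName.RetryAfter, out var retryAfter))
        {
            var seconds = (int)Math.Ceiling(retryAfter.TotalSeconds);
            context.HttpContext.Response.Headers.RetryAfter =
                seconds.ToString(CultureInfo.InvariantCulture);
        }
        return ValueTask.CompletedTask;
    };

    options.AddPolicy("search", httpContext =>
    {
        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return RateLimitPartition.GetFixedWindowLimiter("ip:" + ip, _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = 60,
            Window = TimeSpan.FromMinutes(1),
            QueueLimit = 0
        });
    });
});

var app = builder.Build();

// Order matters: resolve the real client IP before the limiter reads RemoteIpAddress.
app.UseForwardedHeaders();
app.UseRateLimiter();

app.MapGet("/search", () => Results.Ok()).RequireRateLimiting("search");

app.Run();
```

Trusting X-Forwarded-For from any source would let an attacker pick their own rate limit key, which is why the sketch restricts it to known proxies.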

Security and performance impact

Security: reduces brute force, credential stuffing, and application-layer DoS.
Performance: protects downstreams and keeps tail latency stable under abuse or traffic spikes.
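
One way to make limits reflect request cost, rather than request count, is to charge more tokens for expensive routes against the same bucket. The sketch below uses `System.Threading.RateLimiting` directly, because the ASP.NET Core middleware takes one permit per request. The cost table, route names, and helper names (`CostOf`, `CreateLimiter`, `TryAdmit`) are hypothetical, and a real service would keep one bucket per caller instead of a single shared one.

```csharp
using System;
using System.Threading.RateLimiting;

// Hypothetical cost table: tokens charged per request, by route.
static int CostOf(string route) => route switch
{
    "/search" => 5,   // assumed expensive: fan-out query
    "/login"  => 3,   // assumed moderately expensive: password hashing
    _         => 1    // everything else is treated as cheap
};

// Same bucket shape as the "login" policy on this page: 10 tokens, refilled each minute.
// Costs must not exceed TokenLimit, or AttemptAcquire throws.
static TokenBucketRateLimiter CreateLimiter() => new(new TokenBucketRateLimiterOptions
{
    TokenLimit = 10,
    TokensPerPeriod = 10,
    ReplenishmentPeriod = TimeSpan.FromMinutes(1),
    QueueLimit = 0
});

// Charges the route's cost against the bucket; false means "respond with 429".
static bool TryAdmit(RateLimiter limiter, string route)
{
    using RateLimitLease lease = limiter.AttemptAcquire(CostOf(route));
    return lease.IsAcquired;
}

using var limiter = CreateLimiter();
Console.WriteLine(TryAdmit(limiter, "/search"));  // True  (10 -> 5 tokens)
Console.WriteLine(TryAdmit(limiter, "/search"));  // True  (5 -> 0 tokens)
Console.WriteLine(TryAdmit(limiter, "/profile")); // False (bucket is empty)
```

With this shape, two searches exhaust the same budget that would otherwise allow ten cheap reads, so abuse of the expensive route is cut off first.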

Operational notes

Monitoring: 429 rate by route, rejected requests by key type, queueing time (if any), and top offenders.
Rollout: start with logging thresholds and soft limits for low-risk routes, then enforce on the most abused endpoints.
Rollback: keep policies configurable. If you block legit traffic, relax thresholds immediately.
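
As a rough sketch of the monitoring and rollback notes above, rejections can be counted with `System.Diagnostics.Metrics` and thresholds can be read from configuration instead of being hard-coded. The configuration keys, the meter and counter names, and the tag names here are assumptions, not an established convention. Values are read once at startup in this sketch, so relaxing a threshold means changing configuration and restarting.

```csharp
using System.Diagnostics.Metrics;
using System.Threading.RateLimiting;
using Microsoft.AspNetCore.RateLimiting;

var builder = WebApplication.CreateBuilder(args);

// ASSUMPTION: hypothetical config keys; the defaults apply when they are absent.
var searchPermits = builder.Configuration.GetValue("RateLimits:Search:PermitLimit", 60);
var searchWindowSeconds = builder.Configuration.GetValue("RateLimits:Search:WindowSeconds", 60);

// ASSUMPTION: meter and counter names are placeholders for your own conventions.
var meter = new Meter("MyApp.RateLimiting");
var rejectedCounter = meter.CreateCounter<long>("ratelimit.rejected");

builder.Services.AddRateLimiter(options =>
{
    options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

    options.OnRejected = (context, cancellationToken) =>
    {
        var http = context.HttpContext;
        var route = http.GetEndpoint()?.DisplayName ?? "unknown";
        var keyType = http.User.Identity?.IsAuthenticated == true ? "user" : "ip";

        // Low-cardinality tags only; log individual offenders instead of tagging them.
        rejectedCounter.Add(1,
            new KeyValuePair<string, object?>("route", route),
            new KeyValuePair<string, object?>("key_type", keyType));
        return ValueTask.CompletedTask;
    };

    options.AddPolicy("search", httpContext =>
    {
        var ip = httpContext.Connection.RemoteIpAddress?.ToString() ?? "unknown";
        return RateLimitPartition.GetFixedWindowLimiter("ip:" + ip, _ => new FixedWindowRateLimiterOptions
        {
            PermitLimit = searchPermits,
            Window = TimeSpan.FromSeconds(searchWindowSeconds),
            QueueLimit = 0
        });
    });
});

var app = builder.Build();
app.UseRateLimiter();
app.MapGet("/search", () => Results.Ok()).RequireRateLimiting("search");
app.Run();
```

A sudden rise in the rejected counter for one route right after a threshold change is the signal to relax that threshold before investigating further.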

Checklist

Policies are per-route and cost-aware.
Keying strategy is correct (user id vs IP vs internal).
Forwarded headers are configured so client IP is accurate.
429 responses are monitored and tuned.
Health/readiness endpoints are not accidentally rate limited.
