Timeouts & Deadlines (Stop Hanging Threads)
Production incident
An upstream endpoint becomes slow but does not fail. Requests hang for minutes. Your service keeps accepting traffic and waiting. The thread pool queue grows, memory climbs due to buffered responses, and the whole API becomes unresponsive. The root cause is simple: no explicit timeouts, no cancellation propagation, and an HttpClient.Timeout that is left at its 100-second default, set to infinite, or otherwise misaligned with your SLO.
Symptoms
- Latency climbs gradually, then falls off a cliff under concurrency.
- Many requests stuck in "in flight" state with no response.
- Thread pool starvation signals, increased queue length, and rising memory due to pending tasks and buffers.
- The upstream looks "alive" but slow, which produces a Slowloris-style failure at the application level.
Causes
- No deadline: outbound calls wait indefinitely.
- Wrong timeout layer: setting only HttpClient.Timeout or only the server request timeout, without aligning budgets across the stack.
- Cancellation not wired: CancellationToken is ignored, so aborting the request does not abort the outbound call.
- Connect vs request: a connect timeout bounds only connection establishment, while a total request deadline bounds the whole call. They are not the same, and you need both.
Diagnosis
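# Find outbound calls, then check whether each one passes a CancellationToken
grep -rn --include="*.cs" -E "(GetAsync|PostAsync|SendAsync)\(" .
# See where CancellationToken is used at all, to compare coverage
grep -rn --include="*.cs" "CancellationToken" . | head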
# Look for long running outbound spans in tracing
# If you do not have traces, log outbound duration and status codes
Confirm whether aborted inbound requests continue to call upstream. This is a classic leak of work and money in production.
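If you have no tracing yet, a thin wrapper around the outbound call can produce the duration and status data mentioned above. The following is a minimal sketch, not a definitive implementation: `TimedSendAsync` and the `log` callback are hypothetical names introduced for illustration, and `log` stands in for whatever logger your service already uses.

```csharp
using System;
using System.Diagnostics;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: times one outbound call and reports its outcome.
// Because ResponseHeadersRead is used, the measured duration is time to
// response headers, not time to the end of the body.
static async Task<HttpResponseMessage> TimedSendAsync(
    HttpClient http,
    HttpRequestMessage request,
    Action<string> log,
    CancellationToken ct)
{
    var sw = Stopwatch.StartNew();
    try
    {
        var response = await http.SendAsync(
            request, HttpCompletionOption.ResponseHeadersRead, ct);
        log($"outbound {request.Method} {request.RequestUri} " +
            $"status={(int)response.StatusCode} ms={sw.ElapsedMilliseconds}");
        return response;
    }
    catch (OperationCanceledException)
    {
        // Covers caller cancellation as well as HttpClient.Timeout expiry.
        log($"outbound {request.Method} {request.RequestUri} " +
            $"status=canceled ms={sw.ElapsedMilliseconds}");
        throw;
    }
}
```

Comparing the logged "canceled" lines against inbound aborts is one way to see whether outbound calls stop when the inbound request does.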
Anti-pattern
// No cancellation, no deadline. This will accumulate hanging calls.
public Task<HttpResponseMessage> CallUpstream()
{
    return _http.GetAsync("api/slow");
}
// Misleading: one global timeout that does not match per-endpoint budgets
_http.Timeout = TimeSpan.FromMinutes(5);
Correct pattern
Set budgets per call, propagate CancellationToken, and use a linked token source to enforce a hard deadline.
public async Task<HttpResponseMessage> CallUpstreamAsync(CancellationToken requestAborted)
{
    // Example budget: total 2 seconds for this downstream call
    using var cts = CancellationTokenSource.CreateLinkedTokenSource(requestAborted);
    cts.CancelAfter(TimeSpan.FromSeconds(2));
    using var req = new HttpRequestMessage(HttpMethod.Get, "api/slow");
    // ResponseHeadersRead prevents buffering the entire response before returning.
    // Note: the deadline above then covers the call only up to the response headers.
    // The caller must pass its own token when reading the body and must dispose the response.
    var resp = await _http.SendAsync(
        req,
        HttpCompletionOption.ResponseHeadersRead,
        cts.Token
    );
    return resp;
}
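With a linked token, a deadline expiry and a client disconnect both surface as OperationCanceledException, so the caller cannot tell them apart without extra work. One possible approach, sketched below under stated assumptions, is to check the inbound token in an exception filter and translate only the deadline case into a TimeoutException. `WithDeadlineAsync` is a hypothetical helper name, and the choice of TimeoutException is an illustration, not a requirement of HttpClient.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: runs `call` under a hard deadline linked to the inbound token.
static async Task<T> WithDeadlineAsync<T>(
    Func<CancellationToken, Task<T>> call,
    TimeSpan budget,
    CancellationToken requestAborted)
{
    using var cts = CancellationTokenSource.CreateLinkedTokenSource(requestAborted);
    cts.CancelAfter(budget);
    try
    {
        return await call(cts.Token);
    }
    catch (OperationCanceledException) when (!requestAborted.IsCancellationRequested)
    {
        // The inbound request is still alive, so the cancellation came from a
        // timeout. Surface it as a timeout so it can be counted and mapped
        // to an error response separately from client aborts.
        throw new TimeoutException(
            $"Upstream call exceeded its {budget.TotalMilliseconds} ms budget.");
    }
}
```

If the client aborted, the original OperationCanceledException propagates unchanged, which keeps client aborts out of your timeout metrics.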
Connect timeout and handshake considerations
Some hangs are connect or TLS handshake stalls. Prefer configuring handler-level timeouts where available and keep them tighter than your total budget.
services.AddHttpClient("UpstreamB")
    .ConfigurePrimaryHttpMessageHandler(() => new SocketsHttpHandler
    {
        ConnectTimeout = TimeSpan.FromSeconds(1),
        PooledConnectionLifetime = TimeSpan.FromMinutes(5)
    });
Budget alignment
- Inbound server budget: if your endpoint SLO is 1 second, your downstream calls cannot each take 1 second. You need a budget split.
- Per-hop budgets: allocate time for retries, queueing, and serialization.
- Hard stop: always have a total deadline, even if you allow some retries inside.
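The three points above can be combined into one helper: a total deadline that covers every attempt, plus a tighter per-attempt budget inside it. This is a sketch under assumptions, not a full retry policy: `RetryWithinBudgetAsync` is a hypothetical name, it retries only when an attempt times out, and it omits backoff and retry on error responses.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper: retries timed-out attempts without ever exceeding totalBudget.
static async Task<T> RetryWithinBudgetAsync<T>(
    Func<CancellationToken, Task<T>> attempt,
    TimeSpan totalBudget,
    TimeSpan perAttemptBudget,
    int maxAttempts,
    CancellationToken requestAborted)
{
    // Hard stop: one token that covers all attempts and the inbound abort.
    using var total = CancellationTokenSource.CreateLinkedTokenSource(requestAborted);
    total.CancelAfter(totalBudget);

    for (var i = 1; ; i++)
    {
        // Per-attempt budget, linked so that the hard stop always wins.
        using var perAttempt = CancellationTokenSource.CreateLinkedTokenSource(total.Token);
        perAttempt.CancelAfter(perAttemptBudget);
        try
        {
            return await attempt(perAttempt.Token);
        }
        catch (OperationCanceledException)
            when (!total.IsCancellationRequested && i < maxAttempts)
        {
            // Only this attempt's budget expired and attempts remain: try again.
        }
    }
}
```

As a purely illustrative split for a 1-second endpoint SLO, you might give the downstream call a total budget of 800 ms with two attempts of 350 ms each, leaving the rest for queueing and serialization. The right numbers depend on your own latency measurements.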
Security and performance impact
- Performance: hung calls are a resource leak. Timeouts protect capacity and reduce tail latency.
- Security: slow upstream dependencies can be used to degrade availability. Proper deadlines reduce the impact of application-level DoS.
Operational notes
- Monitoring: timeout rate per upstream and per endpoint, duration histograms, cancelled request counts, and concurrent outbound calls.
- Rollout: introduce timeouts gradually. Budgets that are too aggressive increase the error rate. The goal is controlled failure, not random failure.
- Rollback: keep budgets configurable so you can relax them temporarily during upstream incidents, without a redeploy.
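For the monitoring item, the built-in System.Diagnostics.Metrics API (available from .NET 6) is one way to record timeout counts and duration histograms per upstream. The sketch below assumes you export these instruments through whatever metrics pipeline you already run. The meter name, instrument names, and the `RecordOutbound` helper are all hypothetical.

```csharp
using System.Collections.Generic;
using System.Diagnostics.Metrics;

// Hypothetical meter and instrument names; pick ones that fit your conventions.
var meter = new Meter("MyService.Outbound");
var timeouts = meter.CreateCounter<long>("outbound.timeouts");
var canceled = meter.CreateCounter<long>("outbound.canceled");
var durationMs = meter.CreateHistogram<double>("outbound.duration", unit: "ms");

// Call once per finished outbound call, tagged by upstream name.
void RecordOutbound(string upstream, double elapsedMs, bool timedOut, bool clientAborted)
{
    var tag = new KeyValuePair<string, object?>("upstream", upstream);
    durationMs.Record(elapsedMs, tag);
    if (timedOut) timeouts.Add(1, tag);
    if (clientAborted) canceled.Add(1, tag);
}
```

Timeout rate per upstream is then the timeout counter divided by the number of histogram samples for the same tag.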
Checklist
- Every outbound call has a total deadline.
- CancellationToken is propagated from inbound request to outbound call.
- Connect timeout is configured where appropriate.
- Response buffering is controlled (ResponseHeadersRead for large payloads).
- Dashboards show timeout rate and long-tail durations.