HttpClient keeps getting 429s: why retries amplify rate limiting in .NET

Jan 30, 2026 · 9 min read

Category: .NET


When retries multiply 429 errors instead of fixing them: how retry amplification happens, how to prove it, and how to honor Retry-After with budgets.

Download available. Jump to the shipped asset.

Paid pack available. Jump to the Axiom pack.

The incident pattern is familiar: a vendor starts throttling, you get a burst of 429s, and then your own service becomes unstable. Latency spikes, queues grow, and on-call starts scaling out a system that is not down. It is being told to slow down.

The cost is not the 429 itself. The cost is what happens when you treat 429 like a transient error and your retry code ignores backpressure. You multiply load against the throttled dependency, you burn your thread pool on waiting, and you create a retry backlog that hits the vendor the moment it recovers.

This post is the production playbook for .NET: how to treat 429 as backpressure, honor Retry-After correctly, and make the behavior provable in logs.

Rescuing a .NET service in production? Start at the .NET Production Rescue hub and the .NET category.

If you only do three things
  • Treat 429 as backpressure, not a transient failure.
  • Honor Retry-After (seconds and date forms) and cap your total budget.
  • Log retry decisions (reason, wait, endpoint, correlation ID) so you can prove the policy is working.

Why 429s multiply when you retry: backpressure vs transient errors

A 429 is not the same as a 503 or a timeout. It is an explicit statement: the upstream is protecting itself and you are above your allowed rate or concurrency.

If you immediately retry a 429, you have not improved your odds of success. You have increased upstream pressure at the exact moment the upstream asked for less. Under load, that turns into an amplifier.

The failure is usually not one request. It is the shape:

  • 429 rate rises
  • retries rise faster than original traffic
  • latency rises because calls wait on backoff (or worse, retry immediately)
  • your own queues fill because threads are tied up waiting and retrying

Teams get stuck because the dashboards look like a vendor outage, so they scale out. Scaling out increases concurrency and makes the throttling worse.

How to prove retries are amplifying 429s: diagnosis checklist

Start with the shortest path to truth. The goal is to prove whether throttling is real and whether your client is respecting it.

  1. Confirm the response is actually a 429 from the dependency you think it is.
  • Log the upstream host, route, and status code.
  • If you have multiple dependencies behind one HttpClient, separate them. A single misbehaving vendor can poison unrelated calls.
  2. Check whether Retry-After is present and what form it is in.

Retry-After can be:

  • a delta in seconds, like Retry-After: 10
  • an HTTP date, like Retry-After: Wed, 21 Oct 2015 07:28:00 GMT

If you only support one form, you are only sometimes honoring backpressure.

  3. Measure amplification.

If your original request rate to the vendor is 1,000 rpm and your retry attempts add another 2,000 rpm, you are not just being throttled. You are attacking yourself with three times the intended load (a measurement sketch follows this checklist).

  4. Look for the common foot-guns.
  • retry policies that treat any non-2xx as retryable
  • retry loops that ignore Retry-After
  • retries without a per-attempt timeout (stacked waits)
  • no total budget (one call can burn a worker for minutes)

If you see those, you have a policy problem. Not a vendor problem. If threads are tied up waiting on 429 retries and you see requests timing out with normal CPU, you have thread pool starvation from retry backlog.
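
If you do not already have that number, one way to get it is to count original attempts and retry attempts separately, per upstream host. The sketch below assumes your retry code stamps an attempt number onto each re-issued request; the meter name, counter names, and the "attempt" option key are illustrative conventions, not a standard API.

csharp
// Counts original requests vs retry attempts per upstream host so you can
// chart amplification (retry_attempts / original_requests) per dependency.
using System.Collections.Generic;
using System.Diagnostics.Metrics;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

sealed class AttemptCountingHandler : DelegatingHandler
{
    private static readonly Meter Meter = new("MyApp.Http"); // illustrative meter name
    private static readonly Counter<long> Originals = Meter.CreateCounter<long>("http_original_requests");
    private static readonly Counter<long> Retries = Meter.CreateCounter<long>("http_retry_attempts");

    protected override Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken ct)
    {
        // Convention (illustrative): the retry policy sets "attempt" > 1 on re-issued requests.
        var isRetry = request.Options.TryGetValue(new HttpRequestOptionsKey<int>("attempt"), out var attempt)
                      && attempt > 1;
        var host = new KeyValuePair<string, object?>("host", request.RequestUri?.Host);

        if (isRetry) Retries.Add(1, host);
        else Originals.Add(1, host);

        return base.SendAsync(request, ct);
    }
}

If the retry counter grows faster than the original counter during an incident, you have your proof of amplification in one chart.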

How to stop retry amplification: honor Retry-After with budgets

The safe goal is not "never see a 429". The goal is to fail fast when you must, retry when it is safe, and slow down when the upstream asks.

1) Gate 429 retries behind idempotency and a total budget

A 429 retry is only safe if a duplicate attempt does not create a duplicate side effect.

  • Safe: GETs, idempotent POSTs with idempotency keys, retries that hit a cache read
  • Unsafe: payment capture, shipment creation, "send email" endpoints without dedupe

Also, treat timeouts and retries as one policy. A retry policy without a budget is a slow leak that becomes a queue pileup. See retry logic anti-patterns for why retrying without classification causes cascading failures.
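
A minimal sketch of that gate, assuming a 3-attempt cap, a 20 second total budget, and an Idempotency-Key header as your own idempotency convention (all illustrative):

csharp
using System;
using System.Net.Http;

static class RetryGate
{
    static readonly TimeSpan TotalBudget = TimeSpan.FromSeconds(20); // illustrative budget

    // Returns true only when a duplicate attempt is safe AND we are inside our caps.
    public static bool ShouldRetry429(HttpRequestMessage request, int attempt, TimeSpan elapsed)
    {
        bool idempotent =
            request.Method == HttpMethod.Get ||
            request.Method == HttpMethod.Head ||
            request.Headers.Contains("Idempotency-Key"); // your own convention, not a guarantee

        if (!idempotent) return false;            // fail fast: surface the 429 to the caller
        if (attempt >= 3) return false;           // attempt cap
        if (elapsed >= TotalBudget) return false; // total budget: stop burning a worker on waits

        return true;
    }
}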

2) Honor Retry-After, but cap it

Retry-After is advice. In production you still need boundaries.

  • honor it when it is within your total time budget
  • cap extremely large waits (for example, if the upstream says 600 seconds, you may want to stop and surface an actionable failure)
  • add jitter when you have many callers so you do not synchronize
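
Those three rules fit in one small function. The sketch below is under assumptions: the 30 second cap, the 8 second fallback backoff, and the 0-1000 ms jitter range are placeholders to tune, not recommendations.

csharp
using System;

static class RetryDelay
{
    static readonly TimeSpan MaxHonoredDelay = TimeSpan.FromSeconds(30); // illustrative cap

    // Returns null when the advised wait is outside your boundaries and you
    // should fail fast with an actionable error instead of sleeping.
    public static TimeSpan? Compute(TimeSpan? retryAfter, int attempt, TimeSpan remainingBudget, Random jitter)
    {
        // Missing header: fall back to a small bounded exponential backoff.
        var advised = retryAfter ?? TimeSpan.FromSeconds(Math.Min(Math.Pow(2, attempt), 8));

        // Honor the advice only when it fits inside your own limits.
        if (advised > MaxHonoredDelay || advised > remainingBudget) return null;

        // Jitter spreads out callers that all received the same Retry-After.
        return advised + TimeSpan.FromMilliseconds(jitter.Next(0, 1000));
    }
}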

3) Reduce concurrency, not just delay

Delaying an individual request helps, but if you have 200 in-flight callers, you can still overload the upstream.

If throttling is sustained:

  • cap concurrency for that dependency (bulkhead)
  • shed low priority calls
  • cache where safe
  • degrade features instead of stacking retries

That is the difference between stabilization and "we waited longer before we failed". Learn more about why retries amplify outages when they lack backoff and jitter.
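
The bulkhead from that list does not need a framework. One hedged sketch is a SemaphoreSlim inside a DelegatingHandler; the concurrency limit is whatever you negotiate with the dependency, and registering one instance per named client (for example via AddHttpMessageHandler) keeps it from limiting unrelated calls.

csharp
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Caps in-flight requests to one dependency, no matter how many callers retry.
sealed class BulkheadHandler : DelegatingHandler
{
    private readonly SemaphoreSlim _slots;

    public BulkheadHandler(int maxConcurrent) => _slots = new SemaphoreSlim(maxConcurrent, maxConcurrent);

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken ct)
    {
        // The wait is bounded by the caller's cancellation token / timeout, so a
        // throttled dependency cannot queue unbounded in-flight work.
        await _slots.WaitAsync(ct).ConfigureAwait(false);
        try
        {
            return await base.SendAsync(request, ct).ConfigureAwait(false);
        }
        finally
        {
            _slots.Release();
        }
    }
}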

Parse Retry-After correctly: handle seconds and HTTP date formats

The goal is not a perfect policy framework. The goal is to stop doing the wrong thing by default.

csharp
// Works in .NET Framework and modern .NET.
// Use as a DelegatingHandler or inside your retry policy.
 
static bool TryGetRetryAfterDelay(HttpResponseMessage response, DateTimeOffset now, out TimeSpan delay)
{
    delay = TimeSpan.Zero;
 
    // Prefer typed header parsing when available.
    var ra = response.Headers.RetryAfter;
    if (ra == null) return false;
 
    if (ra.Delta.HasValue)
    {
        delay = ra.Delta.Value;
        return delay > TimeSpan.Zero;
    }
 
    if (ra.Date.HasValue)
    {
        var target = ra.Date.Value;
        delay = target > now ? (target - now) : TimeSpan.Zero;
        return delay > TimeSpan.Zero;
    }
 
    return false;
}

This should be paired with:

  • per-attempt timeout
  • attempt cap
  • total budget cap
  • logging of the decision

Do not ship this alone and call it fixed.
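
For orientation only, here is a hedged sketch of how those pieces can sit around TryGetRetryAfterDelay in a DelegatingHandler, assuming Microsoft.Extensions.Logging is available. The attempt cap, budget values, GET-only idempotency gate, and log shape are illustrative; jitter and timeout exception handling are left out for brevity.

csharp
using System;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Extensions.Logging;

sealed class ThrottleAwareHandler : DelegatingHandler
{
    private const int MaxAttempts = 3;                                       // attempt cap
    private static readonly TimeSpan PerAttemptTimeout = TimeSpan.FromSeconds(5);
    private static readonly TimeSpan TotalBudget = TimeSpan.FromSeconds(20); // total budget cap
    private readonly ILogger<ThrottleAwareHandler> _log;

    public ThrottleAwareHandler(ILogger<ThrottleAwareHandler> log) => _log = log;

    protected override async Task<HttpResponseMessage> SendAsync(HttpRequestMessage request, CancellationToken ct)
    {
        var started = DateTimeOffset.UtcNow;

        for (var attempt = 1; ; attempt++)
        {
            using var attemptCts = CancellationTokenSource.CreateLinkedTokenSource(ct);
            attemptCts.CancelAfter(PerAttemptTimeout);                       // per-attempt timeout

            var response = await base.SendAsync(request, attemptCts.Token).ConfigureAwait(false);
            if ((int)response.StatusCode != 429) return response;

            var now = DateTimeOffset.UtcNow;
            var elapsed = now - started;

            // TryGetRetryAfterDelay is the helper from the snippet above.
            if (!TryGetRetryAfterDelay(response, now, out var delay))
                delay = TimeSpan.FromSeconds(Math.Min(Math.Pow(2, attempt), 8)); // fallback backoff

            var canRetry = request.Method == HttpMethod.Get                  // idempotency gate, simplified
                           && attempt < MaxAttempts
                           && elapsed + delay < TotalBudget;

            // Log the decision so the policy is provable after the incident.
            _log.LogWarning(
                "http.retry.decision status=429 attempt={Attempt} retry_delay_ms={RetryDelayMs} decision={Decision}",
                attempt, (long)delay.TotalMilliseconds, canRetry ? "delay-and-retry" : "fail-fast");

            if (!canRetry) return response;                                  // fail fast: let the caller decide

            response.Dispose();
            await Task.Delay(delay, ct).ConfigureAwait(false);
        }
    }
}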

What to log so throttling becomes provable

You need enough fields to answer one question in one query: "Did we honor backpressure or did we amplify it?"

Log at the retry decision point:

  • dependency: vendor name or host
  • route: normalized route or operation name
  • status: 429
  • retry_after_ms: parsed value (or null)
  • retry_delay_ms: what you actually waited (post-cap, post-jitter)
  • attempt: attempt number
  • total_elapsed_ms: time spent so far
  • budget_ms: max allowed
  • decision: delay-and-retry | fail-fast | degrade | escalate
  • correlation_id: request correlation id

Example log line:

json
{
  "event": "http.retry.decision",
  "dependency": "vendor-x",
  "route": "GET /v2/orders",
  "status": 429,
  "retry_after_ms": 10000,
  "retry_delay_ms": 11234,
  "attempt": 2,
  "total_elapsed_ms": 15300,
  "budget_ms": 20000,
  "decision": "delay-and-retry",
  "correlation_id": "01H..."
}

If you cannot answer that question, the next throttling incident will look like mystery latency. Use correlation IDs to trace which original request spawned which 429 retries.
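
If you log through Microsoft.Extensions.Logging, one way to emit those fields is a single warning with named placeholders, so a JSON console or OpenTelemetry exporter produces a line shaped like the example above. The field names mirror the list; the helper itself is illustrative.

csharp
using Microsoft.Extensions.Logging;

static class RetryDecisionLog
{
    // One structured event per retry decision, with the fields from the list above.
    public static void Write(
        ILogger log, string dependency, string route, long? retryAfterMs, long retryDelayMs,
        int attempt, long totalElapsedMs, long budgetMs, string decision, string correlationId)
    {
        log.LogWarning(
            "http.retry.decision dependency={Dependency} route={Route} status=429 " +
            "retry_after_ms={RetryAfterMs} retry_delay_ms={RetryDelayMs} attempt={Attempt} " +
            "total_elapsed_ms={TotalElapsedMs} budget_ms={BudgetMs} decision={Decision} correlation_id={CorrelationId}",
            dependency, route, retryAfterMs, retryDelayMs, attempt, totalElapsedMs, budgetMs, decision, correlationId);
    }
}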

Shipped asset

Download
Free

HttpClient 429 + Retry-After package

Copy/paste-ready handler, runbook, and logging fields to stop retry amplification when a dependency is throttling. Safe for legacy .NET systems.

Use this when a vendor starts returning 429 and your retry logic is making latency and queueing worse.

What you get (4 files)

  • RetryAfterDelegatingHandler.cs
  • 429-retry-after-runbook.md
  • retry-after-logging-fields.md
  • README.md

How to use

  • On call: use the runbook to confirm throttling source and contain blast radius.
  • Tech lead: standardize the handler + budgets across services that call the dependency.
  • CTO: use the logging fields to make throttling measurable and reduce repeat incidents.
Axiom Pack
$49

Retry Policy Kit: Battle-Tested Resilience for Production

Managing retries across multiple services? Get pre-configured Polly policies with monitoring integration, circuit breaker patterns, and incident runbooks. Stop debugging retry storms in production.

  • 10+ production-grade Polly policies for HTTP, gRPC, and database calls
  • Circuit breaker + retry coordination patterns
  • Monitoring integration (Prometheus, OpenTelemetry, Application Insights)
  • Incident runbooks for retry storm diagnosis and mitigation
Get Retry Policy Kit →


FAQ

What if 429s persist even after I honor Retry-After?

If 429s persist after honoring Retry-After, check concurrency. Many instances retrying at once still overwhelm the upstream even with delays. Add a bulkhead (concurrency cap per dependency) and add jitter so instances don't retry in synchronized waves.

How do I tell whether retries are amplifying load?

Compare request rate vs retry rate. If original requests = 1000/min but retries add 3000/min, you're amplifying load 4x. Check logs for: retry attempts per endpoint, the 429 rate trend, and whether latency spikes correlate with retry rate spikes.

Does Retry-After mean it is safe to retry?

No. Retry-After tells you WHEN to retry, not IF it's safe. Only retry if the operation is idempotent. Non-idempotent operations (payments, orders, emails) should fail fast or use an idempotency key, not blind retry.

Should I parse Retry-After as seconds or as an HTTP date?

Both. Retry-After can be delta seconds (10) or an HTTP date (Wed, 21 Oct 2015 07:28:00 GMT). If you only parse one format, you're only sometimes honoring backpressure. Parse both or you'll amplify load when the vendor switches formats.

Does Polly handle Retry-After for me?

Polly can express retry policies, but it doesn't parse Retry-After by default. You need a custom DelegatingHandler or a Polly policy that reads the header and calculates the wait time. Polly provides the retry framework; you provide the backpressure logic.
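
For example, a Polly v7-style sketch where the sleep duration provider reads the header; the retry count and fallback backoff are illustrative, and you should check the API of the Polly version you actually run:

csharp
using System;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;
using Polly;

var retry429 = Policy
    .HandleResult<HttpResponseMessage>(r => r.StatusCode == (HttpStatusCode)429)
    .WaitAndRetryAsync(
        retryCount: 3,
        sleepDurationProvider: (attempt, outcome, context) =>
        {
            // Backpressure logic: honor both forms of Retry-After when present.
            var ra = outcome.Result?.Headers.RetryAfter;
            if (ra?.Delta is TimeSpan delta) return delta;
            if (ra?.Date is DateTimeOffset date && date > DateTimeOffset.UtcNow)
                return date - DateTimeOffset.UtcNow;
            return TimeSpan.FromSeconds(Math.Pow(2, attempt)); // fallback when the header is missing
        },
        onRetryAsync: (outcome, delay, attempt, context) => Task.CompletedTask);

You still own the budget, the idempotency gate, and the bulkhead; Polly only schedules the waits you compute.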

Why does scaling out make 429s worse?

Scaling out increases total concurrency to the throttled dependency. If you had 5 instances making 100 req/sec (500 total) and scale to 10 instances, you're now making 1000 req/sec. If the upstream throttle is 600 req/sec, more instances = more 429s = more retries = worse amplification.

What if Retry-After tells me to wait 10 minutes?

Cap it. If your total request budget is 30 seconds, waiting 10 minutes isn't viable. Honor the signal (slow down), but fail fast within your budget. Log the uncapped value so you can discuss reasonable limits with the vendor.


Additional Questions

Should I always retry a 429?

No. A 429 is a request to slow down, not a promise that retrying will work. Retry only when the operation is safe to repeat (idempotent) and only within a budget that protects your own thread pool. If the upstream is throttling for minutes, the right move is usually to reduce concurrency and degrade features, not to keep retrying.

What if the 429 response has no Retry-After header?

Treat that as a weak signal, not permission to hammer. Use a small bounded backoff with jitter, cap attempts, and log that Retry-After was missing. If throttling is sustained, you still need a concurrency cap and a fail-fast path that produces an actionable error for the caller.

How do I stop many instances from retrying at the same moment?

Add jitter to the delay you honor, and cap concurrency per dependency. Without jitter, 100 instances that all see Retry-After: 10 will all retry at the same moment. Without a bulkhead, you will still have too much in-flight work even if each call is delayed.

Is using Polly enough to make retries safe?

Polly can express the policy, but it does not make the policy safe by default. The safety comes from classification (what is retryable), budgets (per-attempt and total), and observability (logging decisions). Many incidents happen because a policy exists but nobody can prove what it did under load.

Why not just scale out when a dependency throttles?

Scaling out increases concurrency. If the upstream is throttling, more concurrency creates more 429s and more retries, which creates more backlog and more waiting. In these incidents, the fix is often the opposite: cap concurrency, shed low priority calls, and keep the rest within a budget.

Coming soon

If this incident pattern feels familiar, the fastest win is a consistent set of defaults across services: budgets, backpressure handling, and logging fields. Axiom is where these packages live so you do not have to re-derive them during an incident.


Axiom (Coming Soon)

Get notified when we ship real operational assets (runbooks, templates, schemas), not generic tutorials.

Key takeaways

  • 429 is backpressure. Treat it as a request for less concurrency.
  • Honor Retry-After correctly, but keep boundaries (attempt caps and total budget caps).
  • If you cannot prove behavior in logs, you will relive the incident with better dashboards but the same broken defaults.

Recommended resources

Download the shipped checklist/templates for this post.

A copy/paste handler that parses Retry-After (seconds and HTTP date) plus a 429 runbook and logging fields so throttling becomes bounded, observable, and non-amplifying in .NET.

