
Jan 31, 2026 · 8 min read
Category: Crypto Automation
Crypto exchange rate limiting: fixed window vs leaky bucket (stop 429s)
A production-first playbook to stop 429 storms: diagnose the limiter type, add guardrails, and log the signals you need to stop guessing.
Download available. Jump to the shipped asset.
429 is not a glitch. In production it becomes a retry storm: orders fail, your bot misses fills, and your deploy window turns into an incident because five instances all hit the same rate limit at once.
This is not a tutorial. It is a playbook for operators running trading bots against exchange APIs. You will leave with a decision framework, stop rules, and logging fields that let you prove you fixed the problem.
This post is in the Crypto Automation hub and the Crypto Automation category.
Mini incident: the 429 storm after deploy
It is 14:03 UTC and a deploy finishes. Five bot instances restart, each does a full resync, and each strategy starts polling the same endpoints.
At 14:04 UTC, you see clusters of 429 responses. By 14:05 UTC, retries are synchronized and the bot is spending more capacity retrying than trading. By 14:07 UTC, the exchange escalates and you start seeing longer cooloffs.
Nothing is "down". Your system is. Rate limiting is backpressure, and your client behavior decides whether backpressure is a small slowdown or a full incident.
What fixed window and leaky bucket really change
Most advice treats "rate limiting" as one thing. It is not. The limiter model affects what failure looks like, how you should pace requests, and what signals confirm progress.
Two common models show up in exchange APIs:
- Fixed window counters. You get a budget for a window, like 1200 weight per 60 seconds. When you hit the cap, you get hard 429s until the window resets.
- Leaky bucket style pacing. Requests drain at a steady rate. Bursts get rejected or delayed, and constant pacing tends to succeed.
The operational difference is the pattern of 429s. Fixed window tends to produce sharp bursts. Leaky bucket tends to spread failures across time as burst pressure drains.
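To make the difference concrete, here is a minimal client-side sketch of both models in Python. The budgets and drain rates are illustrative, not any exchange's real numbers.

```python
import time

class FixedWindowBudget:
    """Hard cap per window: calls succeed until the budget is spent, then all fail until the reset."""

    def __init__(self, limit: int = 1200, window_seconds: float = 60.0):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.used = 0

    def allow(self, weight: int = 1) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window_seconds:
            self.window_start = now  # reset: the whole budget refills at once
            self.used = 0
        if self.used + weight > self.limit:
            return False  # hard 429s until the window resets
        self.used += weight
        return True


class LeakyBucketPacer:
    """Steady drain: bursts fill the bucket and get rejected; constant pacing succeeds."""

    def __init__(self, drain_per_second: float = 20.0, capacity: float = 40.0):
        self.drain_per_second = drain_per_second
        self.capacity = capacity
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self, weight: float = 1.0) -> bool:
        now = time.monotonic()
        self.level = max(0.0, self.level - (now - self.last) * self.drain_per_second)
        self.last = now
        if self.level + weight > self.capacity:
            return False  # burst pressure: callers must wait for the bucket to drain
        self.level += weight
        return True
```

Fire a burst of allow() calls at each and you see the two 429 shapes described above: the fixed window fails in one block until the reset, the leaky bucket fails intermittently as it drains.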
Diagnosis ladder (fast checks first)
Do these in order. The goal is to identify whether you are over budget, misclassifying errors, or amplifying retries.
- Confirm it is truly 429. Some exchanges embed rate limiting in a JSON error body or custom code even when the HTTP status is 200.
- Capture response headers. Log Retry-After and any vendor headers that expose remaining budget or reset time.
- Identify the budget key. Is it per IP, per API key, per account, or per endpoint group? This determines whether scaling out helps or hurts.
- Measure request weight, not request count. If the exchange uses weights, a single call can cost 5-20 units.
- Compare patterns over time. A cluster at the top of the minute suggests fixed window. A smoother bleed suggests leaky bucket or server-side queueing.
- Check concurrency after deploy. Instance count, reconnect logic, and "catch-up" jobs are the usual source of surprise bursts.
If you cannot answer budget key + weight + concurrency, you are still guessing.
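A small sketch of the header-capture step, assuming the requests library. Retry-After is standard; the other header names are examples of vendor-specific patterns (the Binance weight header is one real case) and will differ per exchange, so check the docs before relying on them. The endpoint URL is a placeholder.

```python
import requests

def inspect_rate_limit(resp: requests.Response) -> dict:
    """Pull the fields worth logging off a response; missing headers come back as None."""
    h = resp.headers
    return {
        "http_status": resp.status_code,
        "retry_after_seconds": h.get("Retry-After"),        # standard header
        "used_weight_1m": h.get("X-MBX-USED-WEIGHT-1M"),     # example: Binance weight header
        "remaining_budget": h.get("X-RateLimit-Remaining"),  # common vendor pattern, not universal
        "reset_at": h.get("X-RateLimit-Reset"),              # common vendor pattern, not universal
    }

# Placeholder endpoint; swap in a cheap public call on your exchange:
# resp = requests.get("https://api.example-exchange.com/api/v3/time", timeout=5)
# print(inspect_rate_limit(resp))
```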
Decision framework: what strategy matches the limiter
Do not pick a limiter strategy because it is popular. Pick it because it matches the exchange behavior and your bot architecture.
- If you see fixed window bursts, you need burst smoothing. Token bucket at the client edge is a good fit, but you must also coordinate across processes.
- If you see leaky bucket behavior, you need pacing. A steady queue with backpressure can eliminate most 429s without aggressive backoff.
- If the exchange returns Retry-After, it is telling you the window. Your policy should follow it.
In both cases, your biggest risk is synchronized retry. A bot that retries in lockstep is effectively a self-inflicted denial of service.
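For the fixed-window case, a client-edge token bucket might look like the sketch below. The rate and burst values are assumptions; derive them from the exchange's documented budget and leave headroom.

```python
import threading
import time

class TokenBucket:
    """Client-edge token bucket: smooths bursts into a steady request rate."""

    def __init__(self, rate_per_second: float, burst: float):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = burst
        self.updated = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self, weight: float = 1.0) -> None:
        """Block until the bucket holds enough tokens for this request's weight."""
        while True:
            with self.lock:
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= weight:
                    self.tokens -= weight
                    return
                wait = (weight - self.tokens) / self.rate
            time.sleep(wait)  # sleep outside the lock so other threads can refill and check

# Example: a 1200-per-60s budget kept at ~80% utilization with a small burst allowance
bucket = TokenBucket(rate_per_second=1200 / 60 * 0.8, burst=20)
```

A bucket like this only protects a single process. It still has to be coordinated across instances, which is what the centralization and shared-counter steps below are for.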
Prevention plan: stop 429 from becoming a repeat incident
Your goal is not "never see 429". Your goal is "429 never triggers a retry storm".
1. Centralize rate limiting per exchange and credential
Do not let each caller own its own retry loop. The limiter should live in one place so it can enforce budgets, caps, and stop rules.
Partition by:
- exchange
- account or api key hash
- endpoint group (public, private, trading)
This prevents one noisy strategy from starving everything else.
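A minimal sketch of that partitioning: one registry owns every limiter, keyed by exchange, credential hash, and endpoint group. The budget numbers and the limiter_factory hook are assumptions; the factory could be the token bucket from the earlier sketch.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Tuple

@dataclass(frozen=True)
class LimiterKey:
    exchange: str          # e.g. "binance"
    account_key_hash: str  # store a hash, never the raw API key
    endpoint_group: str    # "public" | "private" | "trading"

class LimiterRegistry:
    """Single owner of rate limiting policy: every caller asks this registry, nobody rolls their own."""

    def __init__(self, budgets: Dict[Tuple[str, str], Tuple[float, float]], limiter_factory: Callable):
        self._budgets = budgets          # (exchange, endpoint_group) -> (rate_per_second, burst)
        self._factory = limiter_factory  # e.g. the TokenBucket from the earlier sketch
        self._limiters: Dict[LimiterKey, object] = {}

    def get(self, key: LimiterKey):
        """Callers sharing an exchange, credential, and endpoint group share one limiter instance."""
        if key not in self._limiters:
            rate, burst = self._budgets[(key.exchange, key.endpoint_group)]
            self._limiters[key] = self._factory(rate, burst)
        return self._limiters[key]

# registry = LimiterRegistry({("binance", "trading"): (16.0, 10.0)}, limiter_factory=TokenBucket)
# registry.get(LimiterKey("binance", "k_7c9b", "trading")).acquire(weight=1)
```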
2. Add a queue, then apply backpressure
If you have multiple strategies, you need a queue even if it is in-memory. The queue gives you a place to apply policy: limit concurrency, drop low value work, and prioritize trading over metrics.
Backpressure rules that work:
- hard cap concurrency per key
- enforce a minimum spacing between requests when budget is low
- reject or defer non-critical calls when remaining budget is under a threshold
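A sketch of those rules as a per-key queue, assuming asyncio. The thresholds and the budget_low flag are placeholders you would wire to your limiter's remaining-budget signal.

```python
import asyncio
import time

class RequestQueue:
    """Per-key queue that enforces the backpressure rules: concurrency cap, spacing, deferral."""

    def __init__(self, max_concurrency: int = 4, min_spacing_when_low: float = 0.25):
        self._sem = asyncio.Semaphore(max_concurrency)  # hard cap on in-flight requests for this key
        self._min_spacing = min_spacing_when_low        # seconds between sends when budget is low
        self._last_send = 0.0

    async def submit(self, call, *, critical: bool, budget_low: bool):
        if budget_low and not critical:
            raise RuntimeError("deferred: non-critical call while remaining budget is low")
        async with self._sem:
            if budget_low:
                wait = self._min_spacing - (time.monotonic() - self._last_send)
                if wait > 0:
                    await asyncio.sleep(wait)  # enforce minimum spacing under pressure
            self._last_send = time.monotonic()
            return await call()  # call is an async function that performs the actual request
```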
3. Retry policy with jitter and stop rules
Retry is a tool, not a default.
Policy that usually holds up:
- 2-3 attempts max on 429
- exponential backoff with jitter
- respect Retry-After when present
- circuit break when consecutive 429s exceed a threshold
Stop rules that keep you safe:
- If Retry-After is large (example: 60+ seconds), enter degrade mode and stop trading actions.
- If 429s persist across multiple windows, stop and page. You are not recovering, you are being rate limited by design.
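Put together, the policy and stop rules might look like this sketch. It assumes a requests-style response, treats Retry-After as seconds, and leaves degrade() and page() as hooks for your own escalation paths.

```python
import random
import time

MAX_ATTEMPTS = 3            # 2-3 attempts max on 429
BASE_DELAY = 1.0            # seconds; conservative base for exponential backoff
RETRY_AFTER_DEGRADE = 60.0  # seconds; beyond this, stop trading actions instead of waiting

def call_with_retry(send, degrade, page):
    """send() returns a requests-style response; degrade() and page() are your escalation hooks."""
    for attempt in range(1, MAX_ATTEMPTS + 1):
        resp = send()
        if resp.status_code != 429:
            return resp
        retry_after = float(resp.headers.get("Retry-After", 0) or 0)  # assumes the seconds form
        if retry_after >= RETRY_AFTER_DEGRADE:
            degrade("large Retry-After: enter degrade mode, stop trading actions")
            return resp
        if attempt == MAX_ATTEMPTS:
            page("429 persisted across retries: stop and escalate")
            return resp
        backoff = BASE_DELAY * (2 ** (attempt - 1))
        delay = max(retry_after, random.uniform(0, backoff))  # full jitter, never below Retry-After
        time.sleep(delay)
```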
4. Make deploy behavior boring
Most 429 incidents happen right after deploy.
Guardrails:
- random startup delay per instance
- singleflight resync (only one instance performs heavy catch-up)
- warm-up mode that ramps request rate over 2-5 minutes
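A sketch of the first and third guardrails (startup jitter and warm-up ramp). The 30-second jitter cap and 3-minute ramp are assumptions to tune per bot; singleflight resync is left to whatever leader election or locking you already run.

```python
import random
import time

STARTUP_JITTER_MAX = 30.0  # seconds of random delay so restarted instances do not move in lockstep
WARMUP_SECONDS = 180.0     # ramp from 10% to 100% of the normal request rate after startup

_started_at = None

def startup_delay() -> None:
    """Call once at boot, before any resync or polling starts."""
    global _started_at
    time.sleep(random.uniform(0, STARTUP_JITTER_MAX))
    _started_at = time.monotonic()

def warmup_rate(full_rate_per_second: float) -> float:
    """Allowed request rate right now, ramping linearly during the warm-up window."""
    if _started_at is None:
        return full_rate_per_second * 0.1
    elapsed = time.monotonic() - _started_at
    ramp = min(1.0, max(0.1, elapsed / WARMUP_SECONDS))
    return full_rate_per_second * ramp
```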
5. Validate with a burst test
Before you ship changes, run a burst test and record the pattern.
Example procedure:
- send a short burst to a known endpoint group
- observe the shape of 429s
- confirm your limiter spreads retries and settles
Your acceptance criterion is not "no 429". It is "no synchronized retry and no prolonged cooloff".
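A minimal burst-test sketch, assuming the requests library. The URL and burst size are placeholders; point it at a cheap public endpoint and plot the offsets and statuses it records to see the 429 shape.

```python
import time
import requests

def burst_test(url: str, burst_size: int = 50) -> list:
    """Fire a short burst and record when each response landed and how it was limited."""
    start = time.monotonic()
    results = []
    for _ in range(burst_size):
        resp = requests.get(url, timeout=5)
        results.append({
            "offset_s": round(time.monotonic() - start, 3),  # position in the burst
            "status": resp.status_code,
            "retry_after": resp.headers.get("Retry-After"),
        })
    return results

# Placeholder endpoint; use a cheap public call on your exchange:
# print(burst_test("https://api.example-exchange.com/api/v3/ping"))
```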
What to log
If you cannot prove the limiter is working, the incident will repeat.
Log enough fields to answer:
- what budget did we exceed
- what did the limiter decide
- did retries synchronize
- did we respect exchange guidance
{
"ts": "2026-01-27T14:04:22.481Z",
"event": "exchange_rate_limit",
"exchange": "binance",
"account_key_hash": "k_7c9b...",
"endpoint": "/api/v3/order",
"endpoint_group": "trading",
"http_status": 429,
"retry_after_seconds": 5,
"request_weight": 1,
"window_type": "fixed",
"window_seconds": 60,
"limiter_decision": "delay_then_retry",
"attempt": 1,
"backoff_ms": 1200,
"jitter_ms": 430,
"next_retry_at": "2026-01-27T14:04:28Z",
"consecutive_429": 3,
"breaker_state": "closed",
"instance_id": "bot-03",
"request_id": "req_4f1f..."
}With this you can build simple dashboards:
- 429 rate by endpoint group
- consecutive 429 and breaker transitions
- retries per instance during deploy windows
Shipped asset: exchange rate limiting package
Exchange rate limiting package
Config template, decision checklist, and a 429 logging schema for trading bots. The download is a real local zip, not a placeholder.
This is intentionally compact here. Full package details are on the resource page.
Included files:
- rate-limit-config-template.yaml
- rate-limit-decision-checklist.md
- rate-limit-429-logging-schema.json
- README.md
Preview (config excerpt):
rate_limits:
  binance:
    trading:
      limit: 1200
      window_seconds: 60
      window_type: fixed
      retry_strategy: exponential_backoff
      max_retries: 3
      jitter_enabled: true
      respect_retry_after: true
Tradeoffs and failure modes to plan for
Rate limiting policy has costs. Naming them up front makes rollout safer.
- You will delay work. That is the point. But it can cause missed opportunities if you have no degrade mode.
- A strict limiter can hide upstream degradation by slowing everything. That is why breaker state and 429 rate must be visible.
- Multiple instances need coordination. If each instance has its own limiter, you can still exceed a shared IP budget.
The clean solution is boring: centralize policy, cap concurrency, add stop rules, and make deploy traffic predictable.
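One way to get that coordination, sketched below, is a shared fixed-window counter in Redis so every instance draws from the same budget. This assumes the redis-py client and a reachable Redis; the key naming and the 1200-per-60s budget are illustrative, not a recommendation.

```python
import time
import redis

r = redis.Redis(host="localhost", port=6379)

def try_consume(exchange: str, endpoint_group: str, weight: int = 1,
                limit: int = 1200, window_seconds: int = 60) -> bool:
    """Return True if this instance may send; False means wait for the next window."""
    window = int(time.time() // window_seconds)
    key = f"rl:{exchange}:{endpoint_group}:{window}"
    used = r.incrby(key, weight)           # atomic across every bot instance sharing this Redis
    if used == weight:                     # first writer in the window sets the expiry
        r.expire(key, window_seconds * 2)
    return used <= limit
```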
Resources
This is intentionally compact. Full package details are on the resource page.
- Exchange rate limiting package
- Crypto Automation hub
- Axiom (Coming Soon)
- Backoff + jitter: the simplest reliability win
- Exchange API bans: how to prevent them
FAQ
How do I tell whether an exchange uses fixed window or leaky bucket limiting?
Look at the shape of 429s.
Fixed window often looks like clusters, then clean success right after a reset. Leaky bucket style limiting often looks like failures that spread across time as bursts drain.
Do not rely on intuition. Log headers, record timestamps, and compare patterns around deploy windows.
Should each strategy or caller own its own retry loop?
No. That is how 429 becomes a storm.
Centralize the retry policy and make concurrency part of the policy. If you retry with the same concurrency that caused the limit, you will keep hitting the limit.
What if the exchange does not send Retry-After?
Use your own backoff policy and make it visible.
Start with a conservative base delay, add jitter, cap attempts, then escalate to degrade mode if 429 continues.
Does adding more API keys or IP addresses avoid the limits?
Do not assume it helps.
Some exchanges count limits per account, not per key. Others use IP-based limits. If you get it wrong you can still be blocked, and you may violate terms.
Why did 429s start right after a deploy?
Deploys change traffic shape.
Restarts create resync bursts, reconnect logic can stampede, and instance count can increase overnight. Put guardrails around startup and ramp request rate.
Will these changes eliminate 429s entirely?
Not always.
The target is that 429 is contained. A single request might be delayed, but the bot keeps operating and the operator has clear signals.
Coming soon
If this kind of post is useful, the Axiom waitlist is where we ship operational templates (runbooks, decision trees, defaults) that keep automation out of incident mode.
Axiom (Coming Soon)
Get notified when we ship real operational assets (runbooks, templates, benchmarks), not generic tutorials.
Key takeaways
- 429 is backpressure. Your client decides whether it becomes an incident.
- Fixed window and leaky bucket produce different 429 patterns. Diagnose before you tune.
- Centralize rate limiting policy per exchange and credential.
- Add jitter and stop rules. Never retry in lockstep.
- Log limiter decisions so you can prove the fix.
Related posts

Why exchange APIs "randomly" ban bots (and how to prevent it)
A production-first playbook to avoid bans: permissions, rate limits, auth hygiene, and traffic patterns that keep trading bots alive.

Retry backoff and jitter: safe defaults to prevent retry storms
An incident-ready retry policy for production automation: stop rules, exponential backoff + jitter, caps, budgets, and the logs operators need.

Why agents loop forever (and how to stop it)
A production playbook for preventing infinite loops: bounded retries, stop conditions, error classification, and escalation that actually helps humans.