
Jan 31, 2026 · 10 min read
Category: Automation, Crypto
Trading bot keeps getting 429s after deploy: stop rate limit storms
When deploys trigger 429 storms: why synchronized restarts amplify rate limits, how to diagnose fixed window vs leaky bucket, and guardrails that stop repeat incidents.
Download available. Jump to the shipped asset.
429 is not a glitch. In production it becomes a retry storm: orders fail, your bot misses fills, and your deploy window turns into an incident because five instances all hit the same rate limit at once.
This is not a tutorial. It is a playbook for operators running trading bots against exchange APIs. You will leave with a decision framework, stop rules, and logging fields that let you prove you fixed the problem.
This post is in the Crypto Automation hub and the Crypto Automation category.
Mini incident: the 429 storm after deploy
It is 14:03 UTC and a deploy finishes. Five bot instances restart, each does a full resync, and each strategy starts polling the same endpoints.
At 14:04 UTC, you see clusters of 429 responses. By 14:05 UTC, retries are synchronized and the bot is spending more capacity retrying than trading. By 14:07 UTC, the exchange escalates and you start seeing longer cooloffs.
Nothing is "down". Your system is. Rate limiting is backpressure, and your client behavior decides whether backpressure is a small slowdown or a full incident.
Fixed window vs leaky bucket: why 429 patterns change after deploy
Most advice treats "rate limiting" as one thing. It is not. The limiter model affects what failure looks like, how you should pace requests, and what signals confirm progress.
Two common models show up in exchange APIs:
- Fixed window counters. You get a budget for a window, like 1200 weight per 60 seconds. When you hit the cap, you get hard 429s until the window resets.
- Leaky bucket style pacing. Requests drain at a steady rate. Bursts get rejected or delayed, and constant pacing tends to succeed.
The operational difference is the pattern of 429s. Fixed window tends to produce sharp bursts. Leaky bucket tends to spread failures across time as burst pressure drains.
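To make that difference concrete, here is a minimal client-side sketch of both models in Python. It is illustrative only, not any exchange's implementation, and the limits, rates, and weights are placeholders.

import time

class FixedWindowCounter:
    """Budget per window; hard rejections until the window resets."""
    def __init__(self, limit: int, window_seconds: int):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.used = 0

    def allow(self, weight: int = 1) -> bool:
        now = time.monotonic()
        if now - self.window_start >= self.window_seconds:
            self.window_start = now   # hard reset at the window boundary
            self.used = 0
        if self.used + weight > self.limit:
            return False              # rejected until the next window
        self.used += weight
        return True

class LeakyBucket:
    """Steady drain; bursts above capacity are rejected until pressure drains."""
    def __init__(self, drain_per_second: float, capacity: float):
        self.drain = drain_per_second
        self.capacity = capacity
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self, weight: float = 1.0) -> bool:
        now = time.monotonic()
        self.level = max(0.0, self.level - (now - self.last) * self.drain)
        self.last = now
        if self.level + weight > self.capacity:
            return False              # constant pacing succeeds, bursts do not
        self.level += weight
        return True

Run a burst through each and the 429 shapes above fall out: the fixed window rejects everything over the cap until the reset, while the leaky bucket rejects only what exceeds the drain rate.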
How to diagnose 429 storms: is it fixed window or leaky bucket?
Do these in order. The goal is to identify whether you are over budget, misclassifying errors, or amplifying retries.
- Confirm it is truly 429. Some exchanges embed rate limiting in a JSON error body or custom code even when the HTTP status is 200.
- Capture response headers. Log Retry-After and any vendor headers that expose remaining budget or reset time.
- Identify the budget key. Is it per IP, per API key, per account, or per endpoint group? This determines whether scaling out helps or hurts.
- Measure request weight, not request count. If the exchange uses weights, a single call can cost 5-20 units.
- Compare patterns over time. A cluster at the top of the minute suggests fixed window. A smoother bleed suggests leaky bucket or server-side queueing.
- Check concurrency after deploy. Instance count, reconnect logic, and "catch-up" jobs are the usual sources of surprise bursts.
If you cannot answer budget key + weight + concurrency, you are still guessing.
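As a sketch of the header-capture step, something like this works with the Python requests library. The Binance-style used-weight header is only an example of a vendor header; names differ per exchange, so verify them against the docs.

import requests  # assumes the requests library; adapt to your HTTP client

def capture_rate_limit_signals(resp: requests.Response) -> dict:
    """Collect the fields needed to identify budget key, weight, and window."""
    signals = {
        "status": resp.status_code,
        "retry_after": resp.headers.get("Retry-After"),
        # example vendor header (Binance-style used weight); verify for your exchange
        "used_weight_1m": resp.headers.get("X-MBX-USED-WEIGHT-1M"),
    }
    # some exchanges signal rate limiting in the JSON body even on HTTP 200
    if resp.headers.get("Content-Type", "").startswith("application/json"):
        try:
            body = resp.json()
            signals["body_code"] = body.get("code") if isinstance(body, dict) else None
        except ValueError:
            signals["body_code"] = None
    return signals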
Which rate limiting strategy stops 429 storms: token bucket vs pacing
Do not pick a limiter strategy because it is popular. Pick it because it matches the exchange behavior and your bot architecture.
- If you see fixed window bursts, you need burst smoothing. Token bucket at the client edge is a good fit, but you must also coordinate across processes.
- If you see leaky bucket behavior, you need pacing. A steady queue with backpressure can eliminate most 429s without aggressive backoff.
- If the exchange returns Retry-After, it is telling you the window. Your policy should follow it.
In both cases, your biggest risk is synchronized retry. A bot that retries in lockstep is effectively a self-inflicted denial of service.
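For burst smoothing at the client edge, a minimal token bucket sketch looks like the following. Capacity and refill rate are assumptions you tune to sit safely under the exchange budget, and this is single-process; you still need coordination across instances.

import threading
import time

class ClientTokenBucket:
    """Client-side token bucket: requests spend tokens, tokens refill steadily."""
    def __init__(self, capacity: float, refill_per_second: float):
        self.capacity = capacity
        self.refill = refill_per_second
        self.tokens = capacity
        self.last = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self, weight: float = 1.0) -> None:
        """Block until enough tokens exist, pacing the caller instead of bursting."""
        while True:
            with self.lock:
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.refill)
                self.last = now
                if self.tokens >= weight:
                    self.tokens -= weight
                    return
                wait = (weight - self.tokens) / self.refill
            time.sleep(wait)

The same acquire-before-send shape also covers the pacing case: set capacity low and the refill rate at the steady drain you observed, and bursts turn into short waits instead of 429s.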
How to prevent 429 storms: guardrails for multi-instance trading bots
Your goal is not "never see 429". Your goal is "429 never triggers a retry storm".
1. Centralize rate limiting per exchange and credential
Do not let each caller own its own retry loop. The limiter should live in one place so it can enforce budgets, caps, and stop rules.
Partition by:
- exchange
- account or api key hash
- endpoint group (public, private, trading)
This prevents one noisy strategy from starving everything else.
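A minimal sketch of that partitioning, reusing the ClientTokenBucket sketch from the previous section (the budgets in the factory are placeholders):

class LimiterRegistry:
    """One limiter per (exchange, key hash, endpoint group) partition."""
    def __init__(self, limiter_factory):
        self._limiters = {}
        self._factory = limiter_factory

    def limiter_for(self, exchange: str, key_hash: str, endpoint_group: str):
        partition = (exchange, key_hash, endpoint_group)
        if partition not in self._limiters:
            self._limiters[partition] = self._factory(exchange, endpoint_group)
        return self._limiters[partition]

# every caller goes through the registry; no caller owns its own retry loop
registry = LimiterRegistry(lambda ex, grp: ClientTokenBucket(capacity=20, refill_per_second=15))
registry.limiter_for("binance", "k_7c9b", "trading").acquire(weight=1)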
2. Add a queue, then apply backpressure
If you have multiple strategies, you need a queue even if it is in-memory. The queue gives you a place to apply policy: limit concurrency, drop low value work, and prioritize trading over metrics.
Backpressure rules that work:
- hard cap concurrency per key
- enforce a minimum spacing between requests when budget is low
- reject or defer non-critical calls when remaining budget is under a threshold
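A minimal asyncio sketch of such a queue, with a hard concurrency cap and drop-on-pressure for non-critical work (priorities, depths, and names are illustrative):

import asyncio
import itertools

TRADING, ACCOUNT, METRICS = 0, 1, 2   # lower number = higher priority
_seq = itertools.count()              # tiebreaker so equal priorities never compare payloads

class RequestQueue:
    def __init__(self, max_concurrency: int, max_depth: int):
        self.queue = asyncio.PriorityQueue(maxsize=max_depth)
        self.semaphore = asyncio.Semaphore(max_concurrency)  # hard cap per key

    async def submit(self, priority: int, coro_factory) -> bool:
        if priority == METRICS and self.queue.full():
            return False                  # drop or defer non-critical work under pressure
        await self.queue.put((priority, next(_seq), coro_factory))
        return True

    async def worker(self):
        while True:
            _, _, coro_factory = await self.queue.get()
            async with self.semaphore:    # the concurrency cap is the backpressure
                await coro_factory()
            self.queue.task_done()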
3. Retry policy with jitter and stop rules
Retry is a tool, not a default.
Policy that usually holds up:
- 2-3 attempts max on 429
- exponential backoff with jitter
- respect Retry-After when present
- circuit break when consecutive 429s exceed a threshold
For .NET HttpClient specifics, see how to honor Retry-After correctly.
Stop rules that keep you safe:
- If Retry-After is large (example: 60+ seconds), enter degrade mode and stop trading actions.
- If 429s persist across multiple windows, stop and page. You are not recovering, you are being rate limited by design.
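A Python sketch that combines the retry policy and the stop rules (the thresholds mirror the illustrative defaults above, not exchange-mandated values):

import random
import time

MAX_ATTEMPTS = 3             # 2-3 attempts max on 429
BASE_DELAY_S = 1.0
DEGRADE_THRESHOLD_S = 60     # a Retry-After this large means stop, not retry

def backoff_with_jitter(attempt: int) -> float:
    """Full jitter: random delay in [0, base * 2^attempt]."""
    return random.uniform(0, BASE_DELAY_S * (2 ** attempt))

def on_429(attempt: int, retry_after: float | None) -> str:
    if retry_after is not None and retry_after >= DEGRADE_THRESHOLD_S:
        return "degrade"                 # enter degrade mode, stop trading actions
    if attempt >= MAX_ATTEMPTS:
        return "break"                   # open the circuit breaker and page
    # respect Retry-After when present, otherwise use jittered backoff
    delay = retry_after if retry_after is not None else backoff_with_jitter(attempt)
    time.sleep(delay)
    return "retry"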
4. Make deploy behavior boring
Most 429 incidents happen right after deploy.
Guardrails:
- random startup delay per instance
- singleflight resync (only one instance performs heavy catch-up)
- warm-up mode that ramps request rate over 2-5 minutes
Startup bursts also affect background jobs that resync after restart.
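A sketch of those startup guardrails; the jitter range, ramp duration, and target rate are assumptions to tune for your fleet.

import random
import time

STARTUP_JITTER_S = (0, 30)    # random startup delay per instance
WARMUP_SECONDS = 180          # ramp to full request rate over ~3 minutes
TARGET_RPS = 10.0

def startup_delay() -> None:
    """Desynchronize restarted instances before the first request."""
    time.sleep(random.uniform(*STARTUP_JITTER_S))

def allowed_rps(started_at: float) -> float:
    """Warm-up mode: linear ramp from 10% to 100% of the target rate."""
    elapsed = time.monotonic() - started_at
    return TARGET_RPS * min(1.0, max(0.1, elapsed / WARMUP_SECONDS))

# singleflight resync: elect one instance (for example via a shared lock) to do the
# heavy catch-up; the others skip it. Omitted here for brevity.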
5. Validate with a burst test
Before you ship changes, run a burst test and record the pattern.
Example procedure:
- send a short burst to a known endpoint group
- observe the shape of 429s
- confirm your limiter spreads retries and settles
Your acceptance criterion is not "no 429". It is "no synchronized retry and no prolonged cooloff".
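A minimal burst-test sketch; the endpoint, burst size, and timing are placeholders, and it should target a low-weight public endpoint, not a trading endpoint.

import time
import requests  # assumes the requests library

def burst_test(url: str, burst: int = 30) -> list[tuple[float, int]]:
    """Fire a short burst and record (elapsed_seconds, status) per request,
    so you can see whether 429s cluster at a window boundary or bleed over time."""
    start = time.monotonic()
    samples = []
    for _ in range(burst):
        resp = requests.get(url, timeout=5)
        samples.append((round(time.monotonic() - start, 2), resp.status_code))
    return samples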
What to log
If you cannot prove the limiter is working, the incident will repeat.
Log enough fields to answer:
- what budget did we exceed
- what did the limiter decide
- did retries synchronize
- did we respect exchange guidance
{
"ts": "2026-01-27T14:04:22.481Z",
"event": "exchange_rate_limit",
"exchange": "binance",
"account_key_hash": "k_7c9b...",
"endpoint": "/api/v3/order",
"endpoint_group": "trading",
"http_status": 429,
"retry_after_seconds": 5,
"request_weight": 1,
"window_type": "fixed",
"window_seconds": 60,
"limiter_decision": "delay_then_retry",
"attempt": 1,
"backoff_ms": 1200,
"jitter_ms": 430,
"next_retry_at": "2026-01-27T14:04:28Z",
"consecutive_429": 3,
"breaker_state": "closed",
"instance_id": "bot-03",
"request_id": "req_4f1f..."
}
With this you can build simple dashboards:
- 429 rate by endpoint group
- consecutive 429 and breaker transitions
- retries per instance during deploy windows
Add correlation IDs to trace which requests triggered rate limit escalation.
Shipped asset: exchange rate limiting package
Exchange Rate Limiting Package
Config template, decision checklist, and a 429 logging schema for trading bots. The download is a real local zip, not a placeholder.
This is intentionally compact here. Full package details are on the resource page.
Included files:
- rate-limit-config-template.yaml
- rate-limit-decision-checklist.md
- rate-limit-429-logging-schema.json
- README.md
Preview (config excerpt):
rate_limits:
  binance:
    trading:
      limit: 1200
      window_seconds: 60
      window_type: fixed
      retry_strategy: exponential_backoff
      max_retries: 3
      jitter_enabled: true
      respect_retry_after: true
Trading Bot Hardening Suite: Production-Ready Crypto Infrastructure
Running production trading bots? Get exchange-specific rate limiters, signature validation, and incident recovery playbooks. Stop losing money to preventable API failures.
- ✓ Exchange-specific rate limiting (Binance, Coinbase, Kraken, Bybit)
- ✓ Signature validation & timestamp drift detection
- ✓ API ban prevention patterns & key rotation strategies
- ✓ Incident runbooks for 429s, signature errors, and reconnection storms
Tradeoffs and failure modes to plan for
Rate limiting policy has costs. Naming them up front makes rollout safer.
- You will delay work. That is the point. But it can cause missed opportunities if you have no degrade mode.
- A strict limiter can hide upstream degradation by slowing everything. That is why breaker state and 429 rate must be visible.
- Multiple instances need coordination. If each instance has its own limiter, you can still exceed a shared IP budget.
The clean solution is boring: centralize policy, cap concurrency, add stop rules, and make deploy traffic predictable.
Resources
This is intentionally compact. Full package details are on the resource page.
- Exchange rate limiting package
- Crypto Automation hub
- Axiom (Coming Soon)
- Backoff + jitter: the simplest reliability win
- Exchange API bans: how to prevent them
Troubleshooting Questions Engineers Search
Why does my bot get 429s right after every deploy?
Deploys restart all instances at once, creating synchronized traffic bursts. Each instance does a full resync, reconnect, and catch-up, hitting the same endpoints simultaneously. Add random startup delays (0-30s per instance) and ramp request rate over 2-5 minutes after restart.
How do I tell whether the exchange uses a fixed window or a leaky bucket?
Check the 429 pattern over time. Fixed window: sharp clusters of 429s, then clean success right after reset (top of minute). Leaky bucket: 429s spread across time as burst pressure drains steadily. Log timestamps and response headers to see the pattern.
Why do retries make 429s worse?
Retries without jitter synchronize. If 100 requests get 429 at the same moment and all retry after 5 seconds, you send 100 synchronized requests again. Add jitter (random 0-2s) to spread retries over time and prevent thundering herd.
Does scaling out to more instances help with rate limits?
Only if rate limits are per API key, not per IP or account. Scaling out with shared IP limits makes 429s worse (more instances = more total requests). Check exchange docs for limit scope before scaling. Add centralized rate limiting if shared.
Why does my bot hit limits with only a few requests?
Some exchanges (Binance, Bybit) use weighted rate limiting. One API call can cost 1-20 weight units depending on complexity. A simple balance check might cost 1, while fetching all open orders costs 10. Track weight, not just request count, or you'll hit limits early.
How many times should I retry after a 429?
2-3 max with exponential backoff and jitter. If Retry-After header says wait 60+ seconds, don't retry - enter degrade mode and stop non-critical actions. If 429s persist across multiple windows, stop completely and alert. You're not recovering, you're being throttled by design.
Can I use multiple API keys to get around rate limits?
Don't assume it works. Some exchanges rate limit by account (multiple keys = same limit), others by IP (multiple keys on same server = same limit). Check exchange terms - bypassing limits with multiple keys may violate ToS and get all keys banned.
Additional Questions
How can I tell which limiter model the exchange uses?
Look at the shape of 429s.
Fixed window often looks like clusters, then clean success right after a reset. Leaky bucket style limiting often looks like failures that spread across time as bursts drain.
Do not rely on intuition. Log headers, record timestamps, and compare patterns around deploy windows.
Should each strategy or caller retry its own 429s?
No. That is how 429 becomes a storm.
Centralize the retry policy and make concurrency part of the policy. If you retry with the same concurrency that caused the limit, you will keep hitting the limit.
What if the exchange does not send Retry-After?
Use your own backoff policy and make it visible.
Start with a conservative base delay, add jitter, cap attempts, then escalate to degrade mode if 429 continues.
Will adding more API keys raise my limits?
Do not assume it helps.
Some exchanges count limits per account, not per key. Others use IP-based limits. If you get it wrong you can still be blocked, and you may violate terms.
Why do rate limit problems cluster around deploys?
Deploys change traffic shape.
Restarts create resync bursts, reconnect logic can stampede, and instance count can increase overnight. Put guardrails around startup and ramp request rate.
Does a good limiter mean I never see a 429?
Not always.
The target is that 429 is contained. A single request might be delayed, but the bot keeps operating and the operator has clear signals.
Coming soon
If this kind of post is useful, the Axiom waitlist is where we ship operational templates (runbooks, decision trees, defaults) that keep automation out of incident mode.
Axiom (Coming Soon)
Get notified when we ship real operational assets (runbooks, templates, benchmarks), not generic tutorials.
Key takeaways
- 429 is backpressure. Your client decides whether it becomes an incident.
- Fixed window and leaky bucket produce different 429 patterns. Diagnose before you tune.
- Centralize rate limiting policy per exchange and credential.
- Add jitter and stop rules. Never retry in lockstep.
- Log limiter decisions so you can prove the fix.
Recommended resources
Download the shipped checklist/templates for this post.
YAML config templates for Binance, Kraken, Coinbase, Bybit + decision checklist + 429 logging schema. Know when you're being rate-limited before your bot crashes.
Related posts

API key suddenly forbidden: why exchange APIs ban trading bots without warning
When API key flips from working to 403 forbidden after bot runs for hours: why exchange APIs ban trading bots for traffic bursts, retry storms, and auth failures, and the client behavior that prevents it.

Retries amplify failures: why exponential backoff without jitter creates storms
When retries make dependency failures worse and 429s multiply: why exponential backoff without jitter creates synchronized waves, and the bounded retry policy that stops amplification.

Agent keeps calling same tool: why autonomous agents loop forever in production
When agent loops burn tokens calling same tool repeatedly and cost spikes: why autonomous agents loop without stop rules, and the guardrails that prevent repeat execution and duplicate side effects.