Trading bot keeps getting 429s after deploy: stop rate limit storms

Jan 31, 2026 · 10 min read

Category: Automation, Crypto

When deploys trigger 429 storms: why synchronized restarts amplify rate limits, how to diagnose fixed window vs leaky bucket, and guardrails that stop repeat incidents.

Download available. Jump to the shipped asset.

429 is not a glitch. In production it becomes a retry storm: orders fail, your bot misses fills, and your deploy window turns into an incident because five instances all hit the same rate limit at once.

This is not a tutorial. It is a playbook for operators running trading bots against exchange APIs. You will leave with a decision framework, stop rules, and logging fields that let you prove you fixed the problem.

Mini incident: the 429 storm after deploy

It is 14:03 UTC and a deploy finishes. Five bot instances restart, each does a full resync, and each strategy starts polling the same endpoints.

At 14:04 UTC, you see clusters of 429 responses. By 14:05 UTC, retries are synchronized and the bot is spending more capacity retrying than trading. By 14:07 UTC, the exchange escalates and you start seeing longer cooloffs.

Nothing is "down". Your system is. Rate limiting is backpressure, and your client behavior decides whether backpressure is a small slowdown or a full incident.

Fixed window vs leaky bucket: why 429 patterns change after deploy

Most advice treats "rate limiting" as one thing. It is not. The limiter model affects what failure looks like, how you should pace requests, and what signals confirm progress.

Two common models show up in exchange APIs:

  • Fixed window counters. You get a budget for a window, like 1200 weight per 60 seconds. When you hit the cap, you get hard 429s until the window resets.
  • Leaky bucket style pacing. Requests drain at a steady rate. Bursts get rejected or delayed, and constant pacing tends to succeed.

The operational difference is the pattern of 429s. Fixed window tends to produce sharp bursts. Leaky bucket tends to spread failures across time as burst pressure drains.
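
Here is what that difference looks like from the client side. This is a minimal sketch, not any exchange's real limiter: the 1200-weight budget, the 60 second window, and the 20 requests per second pacing rate are placeholders for your venue's documented values.

python
import time

class FixedWindowBudget:
    """Fixed window model: a weight budget that resets at the top of each window."""

    def __init__(self, limit=1200, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.used = 0

    def allow(self, weight=1):
        now = time.monotonic()
        if now - self.window_start >= self.window_seconds:
            self.window_start = now
            self.used = 0                        # budget snaps back at the reset
        if self.used + weight > self.limit:
            return False                         # hard 429 territory until the window resets
        self.used += weight
        return True


class SteadyPacer:
    """Leaky bucket model: space requests so they drain at a constant rate."""

    def __init__(self, rate_per_second=20.0):
        self.min_interval = 1.0 / rate_per_second
        self.next_allowed = time.monotonic()

    def wait(self):
        now = time.monotonic()
        if now < self.next_allowed:
            time.sleep(self.next_allowed - now)  # constant pacing succeeds, bursts do not
        self.next_allowed = max(now, self.next_allowed) + self.min_interval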

How to diagnose 429 storms: is it fixed window or leaky bucket?

Do these in order. The goal is to identify whether you are over budget, misclassifying errors, or amplifying retries.

  1. Confirm it is truly 429. Some exchanges embed rate limiting in a JSON error body or custom code even when the HTTP status is 200.
  2. Capture response headers. Log Retry-After and any vendor headers that expose remaining budget or reset time (see the sketch after this list).
  3. Identify the budget key. Is it per IP, per API key, per account, or per endpoint group? This determines whether scaling out helps or hurts.
  4. Measure request weight, not request count. If the exchange uses weights, a single call can cost 5-20 units.
  5. Compare patterns over time. A cluster at the top of the minute suggests fixed window. A smoother bleed suggests leaky bucket or server-side queueing.
  6. Check concurrency after deploy. Instance count, reconnect logic, and "catch-up" jobs are the usual sources of surprise bursts.

If you cannot answer budget key + weight + concurrency, you are still guessing.
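
A helper like the one below makes steps 1 and 2 concrete. It is a hypothetical sketch: the vendor header names and the -1003 body code are Binance-style examples, so swap in whatever your exchange documents.

python
def extract_rate_limit_signals(status_code, headers, body_json):
    """Turn one response into structured evidence for the diagnosis steps above."""
    # Header names vary by exchange; these are Binance-style examples.
    vendor_headers = ["X-MBX-USED-WEIGHT-1M", "X-RateLimit-Remaining", "X-RateLimit-Reset"]
    signals = {
        "http_status": status_code,
        "retry_after": headers.get("Retry-After"),
        "vendor": {h: headers[h] for h in vendor_headers if h in headers},
    }
    # Step 1: some venues report throttling in the body even when the status is 200.
    if isinstance(body_json, dict):
        msg = str(body_json.get("msg", "")).lower()
        signals["body_rate_limited"] = (
            "rate limit" in msg
            or "too many" in msg
            or body_json.get("code") == -1003   # example: Binance's TOO_MANY_REQUESTS code
        )
    return signals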

Which rate limiting strategy stops 429 storms: token bucket vs pacing

Do not pick a limiter strategy because it is popular. Pick it because it matches the exchange behavior and your bot architecture.

  • If you see fixed window bursts, you need burst smoothing. Token bucket at the client edge is a good fit, but you must also coordinate across processes.
  • If you see leaky bucket behavior, you need pacing. A steady queue with backpressure can eliminate most 429s without aggressive backoff.
  • If the exchange returns Retry-After, it is telling you the window. Your policy should follow it.

In both cases, your biggest risk is synchronized retry. A bot that retries in lockstep is effectively a self-inflicted denial of service.
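
A single-process token bucket is small enough to sketch here. Treat the capacity and refill rate as placeholders, and note that the cross-process coordination mentioned above (for example a shared store) is deliberately left out.

python
import threading
import time

class TokenBucket:
    """Smooth bursts: allow up to `capacity` weight at once, refilled at `rate` per second."""

    def __init__(self, capacity=20, rate=20.0):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.updated = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self, weight=1):
        """Block until `weight` tokens are available, then consume them."""
        while True:
            with self.lock:
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= weight:
                    self.tokens -= weight
                    return
                shortfall = (weight - self.tokens) / self.rate
            time.sleep(shortfall)  # sleep outside the lock so other callers can refill or consume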

How to prevent 429 storms: guardrails for multi-instance trading bots

Your goal is not "never see 429". Your goal is "429 never triggers a retry storm".

1. Centralize rate limiting per exchange and credential

Do not let each caller own its own retry loop. The limiter should live in one place so it can enforce budgets, caps, and stop rules.

Partition by:

  • exchange
  • account or api key hash
  • endpoint group (public, private, trading)

This prevents one noisy strategy from starving everything else.
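
One way to express that partitioning, assuming the token bucket sketch above (or any limiter with the same interface). The registry and key shape are illustrative, not a specific library's API.

python
from collections import defaultdict

class LimiterRegistry:
    """One limiter per (exchange, credential, endpoint group) partition."""

    def __init__(self, limiter_factory):
        self._limiters = defaultdict(limiter_factory)  # limiters are created lazily per key

    def for_request(self, exchange, api_key_hash, endpoint_group):
        return self._limiters[(exchange, api_key_hash, endpoint_group)]

# Usage sketch: a noisy public-data strategy drains its own bucket, not the trading one.
# registry = LimiterRegistry(lambda: TokenBucket(capacity=20, rate=20.0))
# registry.for_request("binance", "k_7c9b", "trading").acquire(weight=1)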

2. Add a queue, then apply backpressure

If you have multiple strategies, you need a queue even if it is in-memory. The queue gives you a place to apply policy: limit concurrency, drop low value work, and prioritize trading over metrics.

Backpressure rules that work (see the sketch after this list):

  • hard cap concurrency per key
  • enforce a minimum spacing between requests when budget is low
  • reject or defer non-critical calls when remaining budget is under a threshold
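
A minimal asyncio sketch of those rules, assuming one dispatcher per partition key. The spacing values and the budget_low flag are placeholders; wire the flag to whatever tracks remaining budget from response headers.

python
import asyncio
import time

class KeyedDispatcher:
    """Queue plus backpressure for one (exchange, key, endpoint group) partition."""

    def __init__(self, max_concurrency=2, min_spacing_s=0.05, low_budget_spacing_s=0.5):
        self.queue = asyncio.Queue()
        self.semaphore = asyncio.Semaphore(max_concurrency)  # hard cap on in-flight requests
        self.min_spacing_s = min_spacing_s
        self.low_budget_spacing_s = low_budget_spacing_s
        self.budget_low = False          # set True when remaining budget drops under a threshold
        self._last_send = 0.0

    async def submit(self, request_coro_factory, critical=True):
        if self.budget_low and not critical:
            return False                 # defer or drop non-critical work under pressure
        await self.queue.put(request_coro_factory)
        return True

    async def run(self):
        while True:
            factory = await self.queue.get()
            spacing = self.low_budget_spacing_s if self.budget_low else self.min_spacing_s
            wait = self._last_send + spacing - time.monotonic()
            if wait > 0:
                await asyncio.sleep(wait)        # enforce minimum spacing between sends
            self._last_send = time.monotonic()
            asyncio.create_task(self._send(factory))

    async def _send(self, factory):
        async with self.semaphore:               # concurrency cap applies even to queued work
            await factory()                      # the actual HTTP call lives here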

3. Retry policy with jitter and stop rules

Retry is a tool, not a default.

Policy that usually holds up:

  • 2-3 attempts max on 429
  • exponential backoff with jitter
  • respect Retry-After when present
  • circuit break when consecutive 429 exceeds a threshold

For .NET HttpClient specifics, see how to honor Retry-After correctly.

Stop rules that keep you safe (see the sketch after this list):

  • If Retry-After is large (example: 60+ seconds), enter degrade mode and stop trading actions.
  • If 429s persist across multiple windows, stop and page. You are not recovering, you are being rate limited by design.
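
The whole policy, both lists above, fits in one decision function so every outcome can be logged. This is a sketch with illustrative thresholds, not exchange guidance.

python
import random

MAX_ATTEMPTS = 3                 # 2-3 attempts max on 429
DEGRADE_RETRY_AFTER_S = 60       # long cooloff: stop trading actions
BREAKER_CONSECUTIVE_429 = 10     # circuit break threshold

def decide_on_429(attempt, retry_after_s, consecutive_429):
    """Return (action, delay_seconds) where action is 'retry', 'degrade', or 'break'."""
    if consecutive_429 >= BREAKER_CONSECUTIVE_429:
        return "break", None                          # stop and page: throttled by design
    if retry_after_s is not None and retry_after_s >= DEGRADE_RETRY_AFTER_S:
        return "degrade", float(retry_after_s)        # enter degrade mode, halt trading actions
    if attempt >= MAX_ATTEMPTS:
        return "break", None
    if retry_after_s is not None:
        delay = float(retry_after_s)                  # the exchange told us the window: follow it
    else:
        delay = min(30.0, 2.0 ** attempt)             # exponential backoff, capped
    return "retry", delay + random.uniform(0.0, 2.0)  # jitter so instances never retry in lockstep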

4. Make deploy behavior boring

Most 429 incidents happen right after deploy.

Guardrails (see the sketch below):

  • random startup delay per instance
  • singleflight resync (only one instance performs heavy catch-up)
  • warm-up mode that ramps request rate over 2-5 minutes

Startup bursts also affect background jobs that resync after restart.
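
The first and third guardrails are a few lines each; the singleflight resync needs a lock or leader election and is omitted here. The delay range and warm-up length mirror the numbers above and are tunable.

python
import random
import time

STARTUP_DELAY_RANGE_S = (0, 30)   # random per-instance delay to desynchronize restarts
WARMUP_SECONDS = 180              # ramp request rate over roughly 2-5 minutes

def startup_delay():
    """Sleep a random amount at boot so five instances do not resync at once."""
    time.sleep(random.uniform(*STARTUP_DELAY_RANGE_S))

def warmup_rate(full_rate_per_s, started_at):
    """Scale the allowed request rate from 10% to 100% of normal during warm-up."""
    elapsed = time.monotonic() - started_at
    fraction = min(1.0, max(0.1, elapsed / WARMUP_SECONDS))
    return full_rate_per_s * fraction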

5. Validate with a burst test

Before you ship changes, run a burst test and record the pattern.

Example procedure:

  • send a short burst to a known endpoint group
  • observe the shape of 429s
  • confirm your limiter spreads retries and settles

Your acceptance criterion is not "no 429". It is "no synchronized retry and no prolonged cooloff".
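
A throwaway harness is enough to record the pattern. This sketch assumes a `send` callable you provide that issues one request to the endpoint group under test and returns the HTTP status code.

python
import time

def burst_test(send, burst_size=50):
    """Fire a short burst and record (elapsed_seconds, status) so the 429 shape is visible."""
    results = []
    start = time.monotonic()
    for _ in range(burst_size):
        status = send()
        results.append((round(time.monotonic() - start, 3), status))
    rejected = [t for t, s in results if s == 429]
    first = rejected[0] if rejected else None
    print(f"{len(rejected)}/{burst_size} rejected; first 429 at t={first}s")
    return results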

What to log

If you cannot prove the limiter is working, the incident will repeat.

Log enough fields to answer:

  • what budget did we exceed
  • what did the limiter decide
  • did retries synchronize
  • did we respect exchange guidance

json
{
  "ts": "2026-01-27T14:04:22.481Z",
  "event": "exchange_rate_limit",
  "exchange": "binance",
  "account_key_hash": "k_7c9b...",
  "endpoint": "/api/v3/order",
  "endpoint_group": "trading",
  "http_status": 429,
  "retry_after_seconds": 5,
  "request_weight": 1,
  "window_type": "fixed",
  "window_seconds": 60,
  "limiter_decision": "delay_then_retry",
  "attempt": 1,
  "backoff_ms": 1200,
  "jitter_ms": 430,
  "next_retry_at": "2026-01-27T14:04:28Z",
  "consecutive_429": 3,
  "breaker_state": "closed",
  "instance_id": "bot-03",
  "request_id": "req_4f1f..."
}

With this you can build simple dashboards:

  • 429 rate by endpoint group
  • consecutive 429 and breaker transitions
  • retries per instance during deploy windows

Add correlation IDs to trace which requests triggered rate limit escalation.

Shipped asset: exchange rate limiting package

Download
Free

Exchange Rate Limiting Package

Config template, decision checklist, and a 429 logging schema for trading bots. The download is a real local zip, not a placeholder.

This is intentionally compact here. Full package details are on the resource page.

Included files:

  • rate-limit-config-template.yaml
  • rate-limit-decision-checklist.md
  • rate-limit-429-logging-schema.json
  • README.md

Preview (config excerpt):

yaml
rate_limits:
  binance:
    trading:
      limit: 1200
      window_seconds: 60
      window_type: fixed
      retry_strategy: exponential_backoff
      max_retries: 3
      jitter_enabled: true
      respect_retry_after: true

Axiom Pack
$99

Trading Bot Hardening Suite: Production-Ready Crypto Infrastructure

Running production trading bots? Get exchange-specific rate limiters, signature validation, and incident recovery playbooks. Stop losing money to preventable API failures.

  • Exchange-specific rate limiting (Binance, Coinbase, Kraken, Bybit)
  • Signature validation & timestamp drift detection
  • API ban prevention patterns & key rotation strategies
  • Incident runbooks for 429s, signature errors, and reconnection storms

Coming soon

Tradeoffs and failure modes to plan for

Rate limiting policy has costs. Naming them up front makes rollout safer.

  • You will delay work. That is the point. But it can cause missed opportunities if you have no degrade mode.
  • A strict limiter can hide upstream degradation by slowing everything. That is why breaker state and 429 rate must be visible.
  • Multiple instances need coordination. If each instance has its own limiter, you can still exceed a shared IP budget.

The clean solution is boring: centralize policy, cap concurrency, add stop rules, and make deploy traffic predictable.


Resources

This is intentionally compact. Full package details are on the resource page.

Frequently asked questions

Why do 429 storms hit right after a deploy?

Deploys restart all instances at once, creating synchronized traffic bursts. Each instance does a full resync, reconnect, and catch-up, hitting the same endpoints simultaneously. Add random startup delays (0-30s per instance) and ramp request rate over 2-5 minutes after restart.

How can I tell fixed window from leaky bucket?

Check the 429 pattern over time. Fixed window: sharp clusters of 429s, then clean success right after reset (top of minute). Leaky bucket: 429s spread across time as burst pressure drains steadily. Log timestamps and response headers to see the pattern.

Why do retries need jitter?

Retries without jitter synchronize. If 100 requests get 429 at the same moment and all retry after 5 seconds, you send 100 synchronized requests again. Add jitter (random 0-2s) to spread retries over time and prevent a thundering herd.

Does scaling out to more instances help with rate limits?

Only if rate limits are per API key, not per IP or account. Scaling out with shared IP limits makes 429s worse (more instances = more total requests). Check exchange docs for limit scope before scaling. Add centralized rate limiting if shared.

What is weighted rate limiting?

Some exchanges (Binance, Bybit) use weighted rate limiting. One API call can cost 1-20 weight units depending on complexity. A simple balance check might cost 1, while fetching all open orders costs 10. Track weight, not just request count, or you'll hit limits early.

How many times should I retry on 429?

2-3 max with exponential backoff and jitter. If the Retry-After header says wait 60+ seconds, don't retry: enter degrade mode and stop non-critical actions. If 429s persist across multiple windows, stop completely and alert. You're not recovering, you're being throttled by design.

Can I use multiple API keys to get around rate limits?

Don't assume it works. Some exchanges rate limit by account (multiple keys = same limit), others by IP (multiple keys on same server = same limit). Check exchange terms: bypassing limits with multiple keys may violate ToS and get all keys banned.


Additional Questions

How do I tell which limiter model an exchange uses?

Look at the shape of 429s.

Fixed window often looks like clusters, then clean success right after a reset. Leaky bucket style limiting often looks like failures that spread across time as bursts drain.

Do not rely on intuition. Log headers, record timestamps, and compare patterns around deploy windows.

Should each strategy own its own retry loop?

No. That is how 429 becomes a storm.

Centralize the retry policy and make concurrency part of the policy. If you retry with the same concurrency that caused the limit, you will keep hitting the limit.

What if the exchange does not send Retry-After?

Use your own backoff policy and make it visible.

Start with a conservative base delay, add jitter, cap attempts, then escalate to degrade mode if 429 continues.

Does adding more API keys raise the limit?

Do not assume it helps.

Some exchanges count limits per account, not per key. Others use IP-based limits. If you get it wrong you can still be blocked, and you may violate terms.

Why do incidents cluster around deploys?

Deploys change traffic shape.

Restarts create resync bursts, reconnect logic can stampede, and instance count can increase overnight. Put guardrails around startup and ramp request rate.

Does a working limiter mean you never see 429?

Not always.

The target is that 429 is contained. A single request might be delayed, but the bot keeps operating and the operator has clear signals.


Coming soon

If this kind of post is useful, the Axiom waitlist is where we ship operational templates (runbooks, decision trees, defaults) that keep automation out of incident mode.

Axiom (Coming Soon)

Get notified when we ship real operational assets (runbooks, templates, benchmarks), not generic tutorials.


Key takeaways

  • 429 is backpressure. Your client decides whether it becomes an incident.
  • Fixed window and leaky bucket produce different 429 patterns. Diagnose before you tune.
  • Centralize rate limiting policy per exchange and credential.
  • Add jitter and stop rules. Never retry in lockstep.
  • Log limiter decisions so you can prove the fix.

Recommended resources

Download the shipped checklist/templates for this post.

YAML config templates for Binance, Kraken, Coinbase, Bybit + decision checklist + 429 logging schema. Know when you're being rate-limited before your bot crashes.
