Crypto exchange rate limiting: fixed window vs leaky bucket (stop 429s)

Jan 31, 2026 · 8 min read

Category: Automation, Crypto


A production-first playbook to stop 429 storms: diagnose the limiter type, add guardrails, and log the signals you need to stop guessing.

Download available. Jump to the shipped asset.

429 is not a glitch. In production it becomes a retry storm: orders fail, your bot misses fills, and your deploy window turns into an incident because five instances all hit the same rate limit at once.

This is not a tutorial. It is a playbook for operators running trading bots against exchange APIs. You will leave with a decision framework, stop rules, and logging fields that let you prove you fixed the problem.

Mini incident: the 429 storm after deploy

It is 14:03 UTC and a deploy finishes. Five bot instances restart, each does a full resync, and each strategy starts polling the same endpoints.

At 14:04 UTC, you see clusters of 429 responses. By 14:05 UTC, retries are synchronized and the bot is spending more capacity retrying than trading. By 14:07 UTC, the exchange escalates and you start seeing longer cooloffs.

Nothing is "down". Your system is. Rate limiting is backpressure, and your client behavior decides whether backpressure is a small slowdown or a full incident.

What fixed window and leaky bucket really change

Most advice treats "rate limiting" as one thing. It is not. The limiter model affects what failure looks like, how you should pace requests, and what signals confirm progress.

Two common models show up in exchange APIs:

  • Fixed window counters. You get a budget for a window, like 1200 weight per 60 seconds. When you hit the cap, you get hard 429s until the window resets.
  • Leaky bucket style pacing. Requests drain at a steady rate. Bursts get rejected or delayed, and constant pacing tends to succeed.

The operational difference is the pattern of 429s. Fixed window tends to produce sharp bursts. Leaky bucket tends to spread failures across time as burst pressure drains.
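
To make the difference concrete, here is a minimal sketch of the two models as a server might enforce them. The class names, budgets, and drain rate are illustrative assumptions, not any exchange's real implementation.

python
import time

class FixedWindowLimiter:
    """Budget per window; hard rejections once the budget is spent, clean reset at the boundary."""
    def __init__(self, limit=1200, window_seconds=60):
        self.limit = limit
        self.window_seconds = window_seconds
        self.window_start = time.monotonic()
        self.used = 0

    def allow(self, weight=1):
        now = time.monotonic()
        if now - self.window_start >= self.window_seconds:
            self.window_start = now  # window resets: pressure disappears all at once
            self.used = 0
        if self.used + weight > self.limit:
            return False  # rejections arrive as a sharp cluster until the reset
        self.used += weight
        return True

class LeakyBucketLimiter:
    """Requests fill a bucket that drains at a steady rate; bursts overflow and get rejected."""
    def __init__(self, capacity=50, drain_per_second=10):
        self.capacity = capacity
        self.drain_per_second = drain_per_second
        self.level = 0.0
        self.last = time.monotonic()

    def allow(self, weight=1):
        now = time.monotonic()
        self.level = max(0.0, self.level - (now - self.last) * self.drain_per_second)
        self.last = now
        if self.level + weight > self.capacity:
            return False  # rejections spread out over time as the bucket drains
        self.level += weight
        return True

Push the same burst through both and the failure shapes above fall out: the fixed window rejects everything until the reset, the leaky bucket recovers gradually as it drains.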

Diagnosis ladder (fast checks first)

Do these in order. The goal is to identify whether you are over budget, misclassifying errors, or amplifying retries.

  1. Confirm it is truly 429. Some exchanges embed rate limiting in a JSON error body or custom code even when the HTTP status is 200.
  2. Capture response headers. Log Retry-After and any vendor headers that expose remaining budget or reset time (a capture sketch follows this list).
  3. Identify the budget key. Is it per IP, per API key, per account, or per endpoint group? This determines whether scaling out helps or hurts.
  4. Measure request weight, not request count. If the exchange uses weights, a single call can cost 5-20 units.
  5. Compare patterns over time. A cluster at the top of the minute suggests fixed window. A smoother bleed suggests leaky bucket or server-side queueing.
  6. Check concurrency after deploy. Instance count, reconnect logic, and "catch-up" jobs are the usual source of surprise bursts.

If you cannot answer budget key + weight + concurrency, you are still guessing.
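
To make steps 1 to 3 concrete, here is a minimal capture sketch using the requests library. The header-name filters are deliberately generic guesses, because vendor budget headers differ by exchange.

python
import requests

def capture_rate_limit_evidence(resp: requests.Response) -> dict:
    """Pull out the fields needed to classify the limiter, even when the status is 200."""
    evidence = {
        "http_status": resp.status_code,
        "retry_after": resp.headers.get("Retry-After"),
        # Vendor budget headers vary; keep anything that looks like one.
        "limit_headers": {
            k: v for k, v in resp.headers.items()
            if any(s in k.lower() for s in ("ratelimit", "used-weight", "retry-after"))
        },
    }
    # Some exchanges return 200 with a rate-limit error code in the JSON body.
    try:
        body = resp.json()
        if isinstance(body, dict):
            evidence["body_error"] = {k: body[k] for k in ("code", "msg", "error") if k in body}
    except ValueError:
        pass
    return evidence

# Log this dict (timestamped) on every non-2xx response and on suspicious 200s,
# then compare the patterns around deploy windows.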

Decision framework: what strategy matches the limiter

Do not pick a limiter strategy because it is popular. Pick it because it matches the exchange behavior and your bot architecture.

  • If you see fixed window bursts, you need burst smoothing. Token bucket at the client edge is a good fit (a sketch follows below), but you must also coordinate across processes.
  • If you see leaky bucket behavior, you need pacing. A steady queue with backpressure can eliminate most 429s without aggressive backoff.
  • If the exchange returns Retry-After, it is telling you the window. Your policy should follow it.

In both cases, your biggest risk is synchronized retry. A bot that retries in lockstep is effectively a self-inflicted denial of service.
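
A minimal client-edge token bucket sketch for the burst-smoothing case. The rate and burst values are placeholders you would derive from the exchange's published budget, not recommendations.

python
import threading
import time

class TokenBucket:
    """Client-side smoothing: refill steadily, spend per request weight, wait briefly when empty."""
    def __init__(self, rate_per_second: float, burst: float):
        self.rate = rate_per_second
        self.capacity = burst
        self.tokens = burst
        self.updated = time.monotonic()
        self.lock = threading.Lock()

    def acquire(self, weight: float = 1.0) -> None:
        while True:
            with self.lock:
                now = time.monotonic()
                self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
                self.updated = now
                if self.tokens >= weight:
                    self.tokens -= weight
                    return
                wait = (weight - self.tokens) / self.rate
            time.sleep(wait)  # sleep outside the lock so other threads are not blocked

# Example: a 1200 weight / 60 s budget paced at ~18 weight/s with headroom for small bursts.
bucket = TokenBucket(rate_per_second=18, burst=40)
# bucket.acquire(weight=5)  # call before each request, using the endpoint's documented weight

The same class covers the leaky bucket case: set burst close to the largest single request weight and it degenerates into steady pacing.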

Prevention plan: stop 429 from becoming a repeat incident

Your goal is not "never see 429". Your goal is "429 never triggers a retry storm".

1. Centralize rate limiting per exchange and credential

Do not let each caller own its own retry loop. The limiter should live in one place so it can enforce budgets, caps, and stop rules.

Partition by:

  • exchange
  • account or api key hash
  • endpoint group (public, private, trading)

This prevents one noisy strategy from starving everything else.
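
A registry sketch that enforces the partitioning: every caller gets the same limiter instance for a given (exchange, key hash, endpoint group). It reuses the TokenBucket class from the sketch above, and the budget numbers would come from config rather than constants.

python
import threading
from typing import Dict, Tuple

class LimiterRegistry:
    """One shared limiter per (exchange, account/key hash, endpoint group)."""
    def __init__(self):
        self._limiters: Dict[Tuple[str, str, str], "TokenBucket"] = {}
        self._lock = threading.Lock()

    def get(self, exchange: str, key_hash: str, endpoint_group: str) -> "TokenBucket":
        partition = (exchange, key_hash, endpoint_group)
        with self._lock:
            if partition not in self._limiters:
                # Assumes the TokenBucket sketch above; budgets belong in config.
                self._limiters[partition] = TokenBucket(rate_per_second=18, burst=40)
            return self._limiters[partition]

registry = LimiterRegistry()
# Every caller goes through the registry, so a noisy strategy cannot starve the others:
# registry.get("binance", "k_7c9b", "trading").acquire(weight=1)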

2. Add a queue, then apply backpressure

If you have multiple strategies, you need a queue even if it is in-memory. The queue gives you a place to apply policy: limit concurrency, drop low value work, and prioritize trading over metrics.

Backpressure rules that work:

  • hard cap concurrency per key
  • enforce a minimum spacing between requests when budget is low
  • reject or defer non-critical calls when remaining budget is under a threshold
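
A sketch of that queue with the rules applied; the priority labels and the budget threshold are illustrative choices, not recommendations.

python
import heapq
import itertools

# Lower number wins; trading beats metrics when budget is tight.
PRIORITY = {"trading": 0, "account": 1, "metrics": 2}

class RequestQueue:
    """In-memory priority queue per limiter partition: drop or defer low-value work first."""
    def __init__(self, low_budget_threshold=0.2):
        self.low_budget_threshold = low_budget_threshold
        self._heap = []
        self._seq = itertools.count()  # tie-breaker keeps FIFO order within a priority

    def submit(self, kind, call):
        heapq.heappush(self._heap, (PRIORITY.get(kind, 2), next(self._seq), kind, call))

    def drain(self, remaining_budget_fraction):
        """Run what policy allows; a real version would also cap in-flight requests per key."""
        results = []
        while self._heap:
            _, _, kind, call = heapq.heappop(self._heap)
            if remaining_budget_fraction < self.low_budget_threshold and kind != "trading":
                continue  # defer or drop non-critical calls when budget is low
            results.append(call())
        return results

q = RequestQueue()
q.submit("metrics", lambda: "poll metrics")
q.submit("trading", lambda: "place order")
print(q.drain(remaining_budget_fraction=0.1))  # only the trading call runs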

3. Retry policy with jitter and stop rules

Retry is a tool, not a default.

Policy that usually holds up:

  • 2-3 attempts max on 429
  • exponential backoff with jitter
  • respect Retry-After when present
  • circuit break when consecutive 429 exceeds a threshold

Stop rules that keep you safe:

  • If Retry-After is large (example: 60+ seconds), enter degrade mode and stop trading actions.
  • If 429s persist across multiple windows, stop and page. You are not recovering, you are being rate limited by design.
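
A sketch of that retry policy in one place, assuming the standard Retry-After header; the thresholds mirror the rules above and the breaker is intentionally crude.

python
import random
import time

MAX_ATTEMPTS = 3          # 2-3 attempts max on 429
BASE_DELAY = 1.0          # seconds, before exponential growth
DEGRADE_RETRY_AFTER = 60  # enter degrade mode if the exchange asks for this long
BREAKER_THRESHOLD = 5     # consecutive 429s before we stop and page

consecutive_429 = 0

def call_with_retry(send):
    """`send` performs one request and returns an object with .status_code and .headers."""
    global consecutive_429
    for attempt in range(1, MAX_ATTEMPTS + 1):
        resp = send()
        if resp.status_code != 429:
            consecutive_429 = 0
            return resp
        consecutive_429 += 1
        if consecutive_429 >= BREAKER_THRESHOLD:
            raise RuntimeError("circuit open: persistent 429s, stop and page")
        retry_after = float(resp.headers.get("Retry-After", 0) or 0)
        if retry_after >= DEGRADE_RETRY_AFTER:
            raise RuntimeError("degrade mode: exchange asked for a long cooloff")
        # Exponential backoff with full jitter; never retry in lockstep with other instances.
        backoff = BASE_DELAY * (2 ** (attempt - 1))
        time.sleep(max(retry_after, random.uniform(0, backoff)))
    raise RuntimeError("gave up after repeated 429s")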

4. Make deploy behavior boring

Most 429 incidents happen right after deploy.

Guardrails:

  • random startup delay per instance
  • singleflight resync (only one instance performs heavy catch-up)
  • warm-up mode that ramps request rate over 2-5 minutes
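
A sketch of the first and third guardrails; the delay range and ramp duration are placeholders, and the singleflight lock is only named in a comment because the right mechanism (Redis, your database, your orchestrator) depends on your stack.

python
import random
import time

def startup_delay(max_seconds=30):
    """Random per-instance delay so restarted bots do not resync in lockstep."""
    time.sleep(random.uniform(0, max_seconds))

def warmup_rate(elapsed_seconds, target_rps, ramp_seconds=180):
    """Linearly ramp the allowed request rate over the first few minutes after a deploy."""
    if elapsed_seconds >= ramp_seconds:
        return target_rps
    return max(0.5, target_rps * elapsed_seconds / ramp_seconds)

start = time.monotonic()
startup_delay()
# Singleflight resync would go here: only the instance holding a shared lock
# runs the heavy catch-up job; everyone else skips it.
current_rps = warmup_rate(time.monotonic() - start, target_rps=18)
print(f"allowed request rate right now: {current_rps:.1f}/s")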

5. Validate with a burst test

Before you ship changes, run a burst test and record the pattern.

Example procedure (a test sketch follows below):

  • send a short burst to a known endpoint group
  • observe the shape of 429s
  • confirm your limiter spreads retries and settles

Your acceptance criterion is not "no 429". It is "no synchronized retry and no prolonged cooloff".
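
A burst-test sketch following that procedure. The URL, burst size, and settle time are placeholders; point it at a low-impact public endpoint, never at order placement.

python
import time
import requests

def burst_test(url, burst=30, settle_seconds=120):
    """Send a short burst, record status and Retry-After per request, then let things settle."""
    samples = []
    for i in range(burst):
        resp = requests.get(url, timeout=5)
        samples.append({
            "i": i,
            "t": time.monotonic(),
            "status": resp.status_code,
            "retry_after": resp.headers.get("Retry-After"),
        })
    time.sleep(settle_seconds)  # confirm the limiter settles without manual intervention
    return samples

# Acceptance check: no synchronized retries, no prolonged cooloff.
# samples = burst_test("https://api.example-exchange.com/api/v3/time")
# print(sum(1 for s in samples if s["status"] == 429), "429s in the burst")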

What to log

If you cannot prove the limiter is working, the incident will repeat.

Log enough fields to answer:

  • what budget did we exceed
  • what did the limiter decide
  • did retries synchronize
  • did we respect exchange guidance

json
{
  "ts": "2026-01-27T14:04:22.481Z",
  "event": "exchange_rate_limit",
  "exchange": "binance",
  "account_key_hash": "k_7c9b...",
  "endpoint": "/api/v3/order",
  "endpoint_group": "trading",
  "http_status": 429,
  "retry_after_seconds": 5,
  "request_weight": 1,
  "window_type": "fixed",
  "window_seconds": 60,
  "limiter_decision": "delay_then_retry",
  "attempt": 1,
  "backoff_ms": 1200,
  "jitter_ms": 430,
  "next_retry_at": "2026-01-27T14:04:28Z",
  "consecutive_429": 3,
  "breaker_state": "closed",
  "instance_id": "bot-03",
  "request_id": "req_4f1f..."
}

With this you can build simple dashboards:

  • 429 rate by endpoint group
  • consecutive 429 and breaker transitions
  • retries per instance during deploy windows
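
If the events land as JSON lines, the dashboards can start as a one-file aggregation; the file name is a placeholder, and the field names match the log example above.

python
import json
from collections import Counter

def summarize(log_path="rate_limit_events.jsonl"):
    """429 counts by endpoint group and instance, plus breaker-open events."""
    by_group, by_instance = Counter(), Counter()
    breaker_open = 0
    with open(log_path) as f:
        for line in f:
            event = json.loads(line)
            if event.get("event") != "exchange_rate_limit":
                continue
            by_group[event.get("endpoint_group", "unknown")] += 1
            by_instance[event.get("instance_id", "unknown")] += 1
            if event.get("breaker_state") == "open":
                breaker_open += 1
    return {
        "429_by_group": dict(by_group),
        "retries_by_instance": dict(by_instance),
        "breaker_open_events": breaker_open,
    }

# print(summarize())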

Shipped asset: exchange rate limiting package

Config template, decision checklist, and a 429 logging schema for trading bots. The download is a real local zip, not a placeholder.

This is intentionally compact here. Full package details are on the resource page.

Included files:

  • rate-limit-config-template.yaml
  • rate-limit-decision-checklist.md
  • rate-limit-429-logging-schema.json
  • README.md

Preview (config excerpt):

yaml
rate_limits:
  binance:
    trading:
      limit: 1200
      window_seconds: 60
      window_type: fixed
      retry_strategy: exponential_backoff
      max_retries: 3
      jitter_enabled: true
      respect_retry_after: true
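
A sketch of turning that config into limiter settings at startup, assuming PyYAML as the loader; the headroom factor is an illustrative choice, not part of the template.

python
import yaml  # PyYAML, assumed here as the config loader

def load_rate_limits(path="rate-limit-config-template.yaml"):
    """Read per-exchange, per-endpoint-group budgets and derive client-side pacing."""
    with open(path) as f:
        config = yaml.safe_load(f)
    limiters = {}
    for exchange, groups in config["rate_limits"].items():
        for group, policy in groups.items():
            # e.g. 1200 weight / 60 s -> pace just under 20 weight per second
            rate = policy["limit"] / policy["window_seconds"]
            limiters[(exchange, group)] = {
                "rate_per_second": rate * 0.9,  # keep headroom below the published budget
                "window_type": policy["window_type"],
                "max_retries": policy["max_retries"],
                "respect_retry_after": policy["respect_retry_after"],
            }
    return limiters

# limiters = load_rate_limits()
# limiters[("binance", "trading")] -> pacing and retry settings for that endpoint group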

Tradeoffs and failure modes to plan for

Rate limiting policy has costs. Naming them up front makes rollout safer.

  • You will delay work. That is the point. But it can cause missed opportunities if you have no degrade mode.
  • A strict limiter can hide upstream degradation by slowing everything. That is why breaker state and 429 rate must be visible.
  • Multiple instances need coordination. If each instance has its own limiter, you can still exceed a shared IP budget.

The clean solution is boring: centralize policy, cap concurrency, add stop rules, and make deploy traffic predictable.


FAQ

How can I tell whether an exchange uses a fixed window or a leaky bucket limiter?

Look at the shape of 429s.

Fixed window often looks like clusters, then clean success right after a reset. Leaky bucket style limiting often looks like failures that spread across time as bursts drain.

Do not rely on intuition. Log headers, record timestamps, and compare patterns around deploy windows.

Should each strategy or instance manage its own retries?

No. That is how 429 becomes a storm.

Centralize the retry policy and make concurrency part of the policy. If you retry with the same concurrency that caused the limit, you will keep hitting the limit.

What if the exchange does not return Retry-After?

Use your own backoff policy and make it visible.

Start with a conservative base delay, add jitter, cap attempts, then escalate to degrade mode if 429 continues.

Will adding more API keys or IPs raise my effective limit?

Do not assume it helps.

Some exchanges count limits per account, not per key. Others use IP-based limits. If you get it wrong you can still be blocked, and you may violate terms.

Why do 429s spike right after a deploy?

Deploys change traffic shape.

Restarts create resync bursts, reconnect logic can stampede, and instance count can increase overnight. Put guardrails around startup and ramp request rate.

Once the limiter is in place, will every request go through without delay?

Not always.

The target is that 429 is contained. A single request might be delayed, but the bot keeps operating and the operator has clear signals.


Coming soon

If this kind of post is useful, the Axiom waitlist is where we ship operational templates (runbooks, decision trees, defaults) that keep automation out of incident mode.

Axiom (Coming Soon)

Get notified when we ship real operational assets (runbooks, templates, benchmarks), not generic tutorials.


Key takeaways

  • 429 is backpressure. Your client decides whether it becomes an incident.
  • Fixed window and leaky bucket produce different 429 patterns. Diagnose before you tune.
  • Centralize rate limiting policy per exchange and credential.
  • Add jitter and stop rules. Never retry in lockstep.
  • Log limiter decisions so you can prove the fix.
