
Crash Recovery: Reconciliation Loops That Prevent Double Orders
Build crash-proof trading bots with reconciliation loops that detect and correct out-of-sync state on restart—preventing double orders and orphan positions.
Your trading bot crashed at 3 AM. When it restarts, it doesn't know if that last order went through. Place it again? Maybe. But if the previous order succeeded, you've just doubled your position size.
This is the crash recovery problem. Reconciliation loops solve it: they are recovery routines that compare local state to exchange reality and fix any drift before trading resumes.
This post is in the Crypto Automation hub and the Crypto Automation category.
- Run reconciliation on every startup before enabling trading.
- Detect all three failure modes: orphan orders, ghost orders, and stale fills.
- Trust the exchange as source of truth. Your local state is just a cache.
Fast Triage: Recovery Pattern Selection
| Scenario | Pattern | Complexity | Recovery Time |
|---|---|---|---|
| Order sent, ack unknown | Idempotency key lookup | Low | < 100ms |
| Bot crashed mid-order | Order status reconciliation | Low | 200-500ms |
| Position drift detected | Position reconciliation | Medium | 500ms-2s |
| Fill notifications missed | Fill backfill loop | Medium | 1-5s |
| Full state corruption | Complete state rebuild | High | 10-30s |
Start with order status reconciliation. It handles 80% of crash scenarios.
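The first row of the table, idempotency key lookup, can be sketched as a pure planning step that runs before anything is resent. This is a minimal sketch, not a definitive implementation: `OrderIntent`, `RecoveryPlan`, and `planRecovery` are hypothetical names, and it assumes each order intent was persisted with its client order ID before the request was sent.

```typescript
// Hypothetical types for illustration; not from any exchange SDK.
type IntentStatus = 'pending' | 'acked';

interface OrderIntent {
  clientOrderId: string;
  status: IntentStatus; // 'pending' = request may have been sent, ack never recorded
}

interface RecoveryPlan {
  confirm: string[];  // exchange knows the order: mark it acked, do NOT resend
  resubmit: string[]; // exchange has no record: resending with the same ID is safe
}

// Assumption: remoteClientIds contains the client order IDs the exchange reports
// for open orders AND recent order history, since a pending order may already have filled.
function planRecovery(
  intents: OrderIntent[],
  remoteClientIds: ReadonlySet<string>
): RecoveryPlan {
  const plan: RecoveryPlan = { confirm: [], resubmit: [] };
  for (const intent of intents) {
    if (intent.status !== 'pending') continue; // acknowledged before the crash
    if (remoteClientIds.has(intent.clientOrderId)) {
      plan.confirm.push(intent.clientOrderId);
    } else {
      plan.resubmit.push(intent.clientOrderId);
    }
  }
  return plan;
}

const examplePlan = planRecovery(
  [
    { clientOrderId: 'mm-BTCUSDT-1', status: 'acked' },
    { clientOrderId: 'mm-BTCUSDT-2', status: 'pending' },
    { clientOrderId: 'mm-BTCUSDT-3', status: 'pending' },
  ],
  new Set(['mm-BTCUSDT-1', 'mm-BTCUSDT-2'])
);
console.log(examplePlan); // confirm: ['mm-BTCUSDT-2'], resubmit: ['mm-BTCUSDT-3']
```

Whether an order in `resubmit` should actually be resent is a strategy decision. The plan only says that resending would not create a duplicate.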
The Three Failure Modes
Every crash creates one of three state mismatches:
1. Orphan Orders
What happened: Bot crashed after order reached exchange but before recording locally.
Risk: Double orders on restart (you place again, now have 2).
Detection:
async function findOrphanOrders(exchange: Exchange, localOrderIds: Set<string>) {
  // Any open order on the exchange whose client ID we never recorded is an orphan
  const exchangeOrders = await exchange.fetchOpenOrders();
  return exchangeOrders.filter(o => !localOrderIds.has(o.clientOrderId));
}
2. Ghost Orders
What happened: Bot recorded order locally but request never reached exchange.
Risk: Bot thinks position is X, but it's actually Y.
Detection:
async function findGhostOrders(exchange: Exchange, localOrders: Order[]) {
  const ghostOrders: Order[] = [];
  for (const local of localOrders) {
    // Look up each locally recorded order by its client order ID
    const remote = await exchange.fetchOrder(local.clientOrderId);
    if (!remote || remote.status === 'not_found') {
      ghostOrders.push(local);
    }
  }
  return ghostOrders;
}
3. Stale Fills
What happened: Order filled on exchange, but fill notification never processed.
Risk: Position tracking completely wrong, risk calculations invalid.
Detection:
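async function findStaleFills(exchange: Exchange, localOrders: Order[]) {
  // Each entry pairs the local record with the exchange's view of the same order
  // (assuming fetchOrder returns the same Order shape used locally)
  const staleFills: { local: Order; remote: Order }[] = [];
  for (const local of localOrders) {
    if (local.status !== 'filled') {
      const remote = await exchange.fetchOrder(local.clientOrderId);
      if (remote?.status === 'filled') {
        staleFills.push({ local, remote });
      }
    }
  }
  return staleFills;
}
The Reconciliation Loop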
Run this on every startup, before any new trading activity:
async function reconcileOnStartup(
  exchange: Exchange,
  state: TradingState
): Promise<ReconciliationResult> {
  const result: ReconciliationResult = {
    orphansFound: 0,
    ghostsRemoved: 0,
    fillsBackfilled: 0,
    positionCorrected: false,
  };
  // Phase 1: Detect orphan orders (exchange has, we don't)
  const orphans = await findOrphanOrders(exchange, state.orderIds);
  for (const orphan of orphans) {
    // Option A: Cancel if strategy no longer valid
    // Option B: Adopt and track
    await state.adoptOrder(orphan);
    result.orphansFound++;
  }
  // Phase 2: Remove ghost orders (we have, exchange doesn't)
  const ghosts = await findGhostOrders(exchange, state.openOrders);
  for (const ghost of ghosts) {
    await state.removeOrder(ghost.clientOrderId);
    result.ghostsRemoved++;
  }
  // Phase 3: Backfill stale fills
  const staleFills = await findStaleFills(exchange, state.openOrders);
  for (const { local, remote } of staleFills) {
    await state.processFill(local.clientOrderId, remote.filled, remote.price);
    result.fillsBackfilled++;
  }
  // Phase 4: Verify position accuracy
  const exchangePosition = await exchange.fetchPosition(state.symbol);
  if (Math.abs(exchangePosition.size - state.position.size) > 0.0001) {
    state.position.size = exchangePosition.size;
    result.positionCorrected = true;
  }
  return result;
}
Idempotency Keys Are Your Safety Net
Order reconciliation depends on client order IDs (idempotency keys). Without them, you can't ask the exchange "did this specific order go through?"
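function generateClientOrderId(
  strategy: string,
  symbol: string,
  timestamp: number
): string {
  // Deterministic: same inputs always produce the same ID, so pass the persisted
  // decision time as `timestamp`, not the current time at retry.
  // Unique as long as a strategy emits at most one order per symbol per timestamp.
  // A random suffix would make the ID impossible to reproduce after a crash.
  return `${strategy}-${symbol}-${timestamp}`;
}
Every order must include this ID. Most exchanges support it: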
| Exchange | Field Name | Max Length |
|---|---|---|
| Binance | newClientOrderId | 36 chars |
| Bybit | orderLinkId | 36 chars |
| OKX | clOrdId | 32 chars |
| Kraken | userref | 32-bit int |
Check your exchange's API docs for the exact field.
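The length limits in the table are easy to exceed once strategy names get long, so it may help to validate IDs before sending. The sketch below is illustrative only: `CLIENT_ID_MAX_LENGTH` and `checkClientOrderId` are hypothetical names, the limits are copied from the table above rather than from live documentation, and Kraken is left out because its field is an integer rather than a string.

```typescript
// Limits copied from the table above; verify against current exchange docs.
const CLIENT_ID_MAX_LENGTH: Record<string, number> = {
  binance: 36,
  bybit: 36,
  okx: 32,
};

// Hypothetical helper: returns an error message, or null when the ID fits.
// Only length is checked; allowed character sets also differ by exchange.
function checkClientOrderId(exchange: string, id: string): string | null {
  const max = CLIENT_ID_MAX_LENGTH[exchange];
  if (max === undefined) return `no known limit for exchange "${exchange}"`;
  if (id.length === 0) return 'client order ID is empty';
  if (id.length > max) return `ID is ${id.length} chars, ${exchange} allows ${max}`;
  return null;
}

const longId = 'meanreversion-BTCUSDT-1700000000000'; // 35 characters
console.log(checkClientOrderId('binance', longId)); // null: fits within 36
console.log(checkClientOrderId('okx', longId));     // error: exceeds 32
```

Running this check when a strategy is configured, rather than at order time, surfaces an oversized ID before it can cause a rejected order.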
Position Reconciliation: The Final Check
Order-level reconciliation handles individual order drift. But position can still be wrong if:
- Fills came through WebSocket that crashed
- Manual trades were placed outside the bot
- Exchange corrections adjusted fills retroactively
Always reconcile position as the final step:
async function reconcilePosition(
  exchange: Exchange,
  state: TradingState
): Promise<void> {
  const remote = await exchange.fetchPosition(state.symbol);
  // Copy the size now: state.position is mutated below, so reading it
  // later for the audit log would report the corrected value.
  const localSizeBefore = state.position.size;
  const driftSize = Math.abs(remote.size - localSizeBefore);
  // Relative drift; the denominator falls back to 1 when the local position is flat
  const driftPct = (driftSize / Math.abs(localSizeBefore || 1)) * 100;
  if (driftPct > 0.1) { // More than 0.1% drift
    console.warn(`Position drift detected: local=${localSizeBefore}, remote=${remote.size}`);
    // Update local state to match exchange reality
    state.position.size = remote.size;
    state.position.entryPrice = remote.entryPrice;
    // Log for audit
    await auditLog({
      event: 'position_reconciled',
      drift: driftSize,
      localBefore: localSizeBefore,
      remoteTruth: remote.size,
    });
  }
}
Exchange always wins. Your local state is just a cache.
Startup Sequence: Order Matters
1. Load local state from disk/database
2. Run reconciliation loop (the code above)
3. Update risk limits based on corrected position
4. Resume WebSocket streams for live updates
5. Enable trading only after steps 1-4 complete
async function startupSequence(config: BotConfig): Promise<void> {
  // 1. Load state
  const state = await loadState(config.stateFile);
  // 2. Create exchange connection (but don't trade yet)
  const exchange = createExchange(config, { tradingEnabled: false });
  // 3. Reconcile
  const reconciled = await reconcileOnStartup(exchange, state);
  console.log('Reconciliation complete:', reconciled);
  // 4. Verify risk is within limits post-reconciliation
  if (!isWithinRiskLimits(state.position, config.limits)) {
    throw new Error('Position exceeds risk limits after reconciliation');
  }
  // 5. Connect live data so fills are tracked from the first order
  await exchange.connectWebSocket();
  // 6. Enable trading last
  exchange.enableTrading();
  console.log('Trading resumed');
}
Never skip the risk check (step 4 in the code). A position that drifted during downtime might exceed your risk limits.
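One way to make "no trading before reconciliation" hard to bypass is to route every order through a gate that stays closed until startup finishes. The class below is a hypothetical sketch: `TradingGate` and `submit` are illustrative names, not part of any exchange client.

```typescript
// Hypothetical gate: every order submission goes through `submit`.
class TradingGate {
  private enabled = false;

  enable(): void {
    this.enabled = true;
  }

  disable(): void {
    this.enabled = false;
  }

  // Runs `send` only when trading is enabled; otherwise throws
  // before any request is made.
  submit<T>(send: () => T): T {
    if (!this.enabled) {
      throw new Error('Trading disabled: reconciliation has not completed');
    }
    return send();
  }
}

const gate = new TradingGate();
// ... load state, reconcile, check risk limits, connect streams ...
gate.enable();
const orderRef = gate.submit(() => 'order-sent'); // stand-in for a real order call
console.log(orderRef);
```

Calling `disable()` again when reconciliation fails or a stream drops keeps the same rule in force during operation, not just at startup.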
Shipped asset: Crash Recovery Reconciliation Kit
TypeScript reconciliation loop template and startup sequence checklist for trading bots. Detects orphan orders, ghost orders, and stale fills on every restart.
Use it when:
- Your bot stores order state locally.
- You run strategies where a crash would make state ambiguous.
- Position accuracy is critical (it always is).
Skip it when:
- Stateless order placement (fire and forget).
- You query exchange state on every decision anyway.
- Development or paper trading only.
Included files:
- reconciliation-loop-template.ts - Full TypeScript implementation
- startup-sequence-checklist.md - Step-by-step startup verification
- README.md - Integration guide
Error Handling: When Reconciliation Fails
Reconciliation itself can fail. Handle these cases:
async function safeReconciliation(
  exchange: Exchange,
  state: TradingState
): Promise<ReconciliationResult | null> {
  try {
    return await reconcileOnStartup(exchange, state);
  } catch (error) {
    // `error` is `unknown` under strict TypeScript, so read the code defensively
    const code = (error as { code?: string } | null)?.code;
    if (code === 'RATE_LIMITED') {
      // Wait and retry once; a second failure propagates to the caller
      await delay(60_000);
      return await reconcileOnStartup(exchange, state);
    }
    if (code === 'EXCHANGE_DOWN') {
      // Cannot reconcile, so refuse to start
      console.error('Cannot reconcile: exchange unreachable');
      process.exit(1);
    }
    // Unknown error: refuse to trade
    console.error('Reconciliation failed:', error);
    return null;
  }
}
If reconciliation fails, don't trade. Operating with unknown state is how you blow up accounts.
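A single fixed wait is the simplest retry policy. If rate limits persist, a bounded retry loop with exponential backoff is a common alternative. The sketch below is an assumption-laden illustration, not the kit's implementation: `retryDelayMs` and `withRetries` are hypothetical helpers, and the 1-second base and 60-second cap are arbitrary defaults.

```typescript
// Delay before retry number `attempt` (0-based): base * 2^attempt, capped.
function retryDelayMs(attempt: number, baseMs = 1_000, capMs = 60_000): number {
  return Math.min(capMs, baseMs * 2 ** attempt);
}

// Hypothetical wrapper: retries `run` while `isRetryable(error)` is true,
// then rethrows so the caller can refuse to trade.
async function withRetries<T>(
  run: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 5,
  sleep: (ms: number) => Promise<void> = ms =>
    new Promise<void>(resolve => setTimeout(resolve, ms))
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await run();
    } catch (error) {
      const lastAttempt = attempt >= maxAttempts - 1;
      if (lastAttempt || !isRetryable(error)) throw error;
      await sleep(retryDelayMs(attempt));
    }
  }
}

console.log([0, 1, 2, 3].map(a => retryDelayMs(a))); // [1000, 2000, 4000, 8000]
```

The loop rethrows after the last attempt instead of returning a fallback value, which keeps the rule intact: no successful reconciliation, no trading.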
Scheduled Reconciliation: Not Just Startup
Run reconciliation periodically during operation too:
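// Run light reconciliation every 5 minutes
setInterval(async () => {
  try {
    const drift = await checkPositionDrift(exchange, state);
    if (drift > config.driftThreshold) {
      await reconcilePosition(exchange, state);
    }
  } catch (error) {
    // A failed check must not become an unhandled promise rejection
    console.error('Periodic drift check failed:', error);
  }
}, 5 * 60 * 1000);
This catches drift from: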
- WebSocket message loss
- Network partitions you didn't notice
- Exchange corrections
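One practical detail with timer-driven reconciliation: if a run takes longer than the interval, the next tick can start a second run while the first is still writing state. A small guard avoids that. This is a sketch under that assumption, and `singleFlight` is a hypothetical helper name.

```typescript
// Hypothetical guard: wraps an async task so overlapping calls are skipped.
// The returned function resolves to true if the task ran, false if it was skipped.
function singleFlight(task: () => Promise<void>): () => Promise<boolean> {
  let running = false;
  return async () => {
    if (running) return false; // previous run still in progress; skip this tick
    running = true;
    try {
      await task();
      return true;
    } finally {
      running = false; // always release, even if the task throws
    }
  };
}

// Usage sketch: the task body stands in for a periodic drift check.
const guardedCheck = singleFlight(async () => {
  // await checkPositionDrift(...); await reconcilePosition(...);
});
void guardedCheck();
```

The guarded function can be passed to `setInterval` in place of the raw callback; a skipped tick simply waits for the next interval.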
Checklist (copy/paste)
Idempotency setup:
- All orders include client order ID
- IDs are deterministic (reproducible from order params)
- IDs are stored before order placed
Reconciliation loop:
- Orphan order detection implemented
- Ghost order cleanup implemented
- Stale fill backfill implemented
- Position reconciliation as final step
Startup sequence:
- State loaded before exchange connection
- Reconciliation completes before trading enabled
- Risk limits checked after reconciliation
- WebSocket connected after reconciliation
Periodic maintenance:
- Position drift checked every N minutes
- Full reconciliation on any WebSocket reconnect
- Drift logged for monitoring
Failure handling:
- Reconciliation timeout defined
- Exchange-down scenario handled
- Manual intervention trigger defined
Reconciliation should finish in under 10 seconds for most accounts. With hundreds of open orders it can take longer because of rate limits, so batch your order queries and stay within the exchange's limits. If reconciliation consistently takes more than 30 seconds, you likely have too many open orders.
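Batching order queries might look like the sketch below. It makes several assumptions: `chunk` and `fetchInBatches` are hypothetical helpers, `fetchOne` stands in for whatever order-status call your exchange client exposes, and the batch size and pause have to be chosen from your exchange's published rate limits.

```typescript
// Splits items into consecutive batches of at most `size` elements.
function chunk<T>(items: readonly T[], size: number): T[][] {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Hypothetical helper: requests within a batch run concurrently,
// batches run one after another with a pause in between.
async function fetchInBatches<T, R>(
  items: readonly T[],
  batchSize: number,
  pauseMs: number,
  fetchOne: (item: T) => Promise<R>
): Promise<R[]> {
  const results: R[] = [];
  const batches = chunk(items, batchSize);
  for (let i = 0; i < batches.length; i++) {
    const batchResults = await Promise.all(batches[i].map(item => fetchOne(item)));
    results.push(...batchResults);
    if (i < batches.length - 1) {
      await new Promise<void>(resolve => setTimeout(resolve, pauseMs));
    }
  }
  return results;
}

console.log(chunk(['a', 'b', 'c', 'd', 'e'], 2)); // [['a','b'], ['c','d'], ['e']]
```

Results come back in input order, so they can be paired with the local orders that produced the queries.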
Exchange state is the source of truth for positions and orders. Your local state might track things the exchange doesn't (strategy metadata, alerts, etc.), but for anything the exchange knows about, trust the exchange. If the exchange is wrong, that's a support ticket, not a code fix.
Whether to cancel or adopt orphan orders depends on your strategy's time horizon. If orders are good for seconds (scalping), cancel orphans, because the opportunity has passed. If orders are good for hours (swing), adopt them and let strategy logic decide. Default to adoption with position recalculation.
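The cancel-or-adopt rule can be written as a small policy function. This is one possible encoding, assuming the exchange reports an order creation time; `decideOrphanAction` and `maxOrderAgeMs` are hypothetical names.

```typescript
type OrphanAction = 'cancel' | 'adopt';

// Hypothetical policy: `maxOrderAgeMs` is how long an order stays valid
// for the strategy (seconds for scalping, hours for swing trading).
function decideOrphanAction(
  orderCreatedAtMs: number,
  nowMs: number,
  maxOrderAgeMs: number
): OrphanAction {
  const ageMs = nowMs - orderCreatedAtMs;
  // Adoption is the default; cancel only when the order has outlived its horizon
  return ageMs > maxOrderAgeMs ? 'cancel' : 'adopt';
}

const now = 1_700_000_060_000;
const createdOneMinuteAgo = now - 60_000;
console.log(decideOrphanAction(createdOneMinuteAgo, now, 5_000));          // 'cancel' (5 s horizon)
console.log(decideOrphanAction(createdOneMinuteAgo, now, 4 * 3_600_000));  // 'adopt' (4 h horizon)
```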
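Reconciliation is still needed when you run WebSocket streams. Connections drop, and messages sent while you were disconnected are lost. TCP keeps data in order within a single connection, but it cannot replay what you missed across a reconnect. Reconciliation catches what WebSocket missed. Think of WebSocket as the fast path and REST reconciliation as the verification path.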