Error Index

Production incident resolution database. Search by error code, vendor, or symptom, or browse the entries below. Each entry includes the root cause, a quick fix, related tools, and step-by-step troubleshooting.

13 errors

Crypto Automation · Feb 25, 2026

WebSocket disconnects cause stale bot state

Connections recover, but position and order state drift from reality, leading to unsafe decisions. Reliable reconnection, gap detection, and replay boundaries are needed to keep bot state consistent with the exchange.

websocket disconnects trading bot · bot reconnects but misses events · stale state after reconnect
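
Gap detection can be illustrated with a small sketch. It assumes the feed attaches a monotonically increasing sequence number to each event, which not every exchange does; `SequenceTracker` and its method names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SequenceTracker:
    """Tracks the last seen sequence number for one stream (hypothetical helper)."""

    last_seq: Optional[int] = None
    gaps: List[Tuple[int, int]] = field(default_factory=list)

    def observe(self, seq: int) -> str:
        """Classify an incoming event as 'apply', 'duplicate', or 'gap'."""
        if self.last_seq is None or seq == self.last_seq + 1:
            self.last_seq = seq
            return "apply"
        if seq <= self.last_seq:
            # Already seen, for example replayed after a reconnect.
            return "duplicate"
        # Events were missed: record the missing range so the caller can
        # resync from a snapshot before trusting local state again.
        self.gaps.append((self.last_seq + 1, seq - 1))
        self.last_seq = seq
        return "gap"
```

On a `"gap"` result the caller would typically pause trading decisions and rebuild positions and open orders from a REST snapshot rather than apply the event blindly.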

.NET · Feb 24, 2026

Database write succeeded but event was never published

Teams observe data committed, but downstream systems never receive the event. This is a classic transactional boundary failure, and adopting the outbox pattern closes the gap.

write succeeded event missing · event lost after db commit · outbox pattern needed
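
A minimal sketch of the outbox pattern, shown in Python with the standard-library `sqlite3` module for brevity rather than in .NET. The table layout, the `OrderPlaced` event, and the function names are assumptions for illustration only.

```python
import json
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a business table and an outbox table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id TEXT PRIMARY KEY, amount INTEGER)")
    conn.execute(
        "CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
        "payload TEXT, published INTEGER)"
    )
    return conn


def place_order(conn: sqlite3.Connection, order_id: str, amount: int) -> None:
    """Write the business row and its event in one transaction."""
    with conn:  # commits on success, rolls back if either insert fails
        conn.execute(
            "INSERT INTO orders (id, amount) VALUES (?, ?)", (order_id, amount)
        )
        conn.execute(
            "INSERT INTO outbox (payload, published) VALUES (?, 0)",
            (json.dumps({"type": "OrderPlaced", "order_id": order_id}),),
        )


def publish_pending(conn: sqlite3.Connection, publish) -> int:
    """Relay unpublished events; mark each one only after publish() returns."""
    rows = conn.execute(
        "SELECT id, payload FROM outbox WHERE published = 0 ORDER BY id"
    ).fetchall()
    for row_id, payload in rows:
        publish(json.loads(payload))  # if this raises, the row stays pending
        with conn:
            conn.execute("UPDATE outbox SET published = 1 WHERE id = ?", (row_id,))
    return len(rows)
```

Because a crash between `publish()` and the `UPDATE` causes a resend, delivery in this sketch is at-least-once, so consumers still need to tolerate duplicates.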

Crypto Automation · Feb 23, 2026

Bot restart causes duplicate orders or orphan state

After crash or restart, reconciliation is incomplete and live state diverges from exchange truth. Without startup guards and idempotent reconciliation loops, duplicates are likely.

double orders after bot restart · orphan position after crash · startup reconciliation missing

.NET · Feb 04, 2026

Duplicate orders and emails from retried API calls

Under retry pressure, side effects execute more than once and teams lose trust in automation. Idempotency keys and contract rules are required before scaling retries.

duplicate orders after retry · duplicate emails api · idempotency missing for writes
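
The idea behind idempotency keys can be sketched in a few lines. This is an in-memory illustration, shown in Python rather than .NET; `IdempotentExecutor` is a hypothetical name, and a real service would need a durable store with an atomic uniqueness check.

```python
from typing import Callable, Dict


class IdempotentExecutor:
    """Runs a side effect at most once per idempotency key (in-memory sketch)."""

    def __init__(self) -> None:
        self._results: Dict[str, object] = {}

    def run(self, key: str, action: Callable[[], object]) -> object:
        if key in self._results:
            # A retry with the same key returns the stored result
            # instead of executing the side effect again.
            return self._results[key]
        result = action()
        self._results[key] = result
        return result
```

The client generates the key once per logical operation and resends it unchanged on every retry, so a retried "create order" call maps to the same stored result.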

Crypto Automation · Jan 31, 2026

Trading bot 429 storm immediately after deploy

Multiple instances restart together, resync bursts hit the same budget, and retries amplify load. The path out is coordinated rate limiting, Retry-After handling, and startup ramp controls.

trading bot keeps getting 429 after deploy · rate limit storm after restart · scaling out made 429 worse

.NET · Jan 30, 2026

HttpClient 429 spikes after deploy in .NET

Deploy finishes, traffic looks normal, then 429 errors spike and retries make it worse. This usually means Retry-After is ignored, concurrency ramps too fast, or retry budgets are too loose.

httpclient keeps getting 429 · dotnet retries amplify rate limiting · retry-after not honored
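
Honoring `Retry-After` starts with parsing it correctly, since the header may carry either a number of seconds or an HTTP date. The sketch below is in Python rather than .NET; the function name, the 1-second default, and the 60-second cap are assumptions, not recommended values.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime
from typing import Optional


def retry_after_seconds(
    header: Optional[str],
    now: datetime,
    default: float = 1.0,
    cap: float = 60.0,
) -> float:
    """Convert a Retry-After header value into a bounded delay in seconds."""
    if header is None:
        return default
    value = header.strip()
    if value.isdigit():
        delay = float(value)  # delta-seconds form, e.g. "120"
    else:
        try:
            when = parsedate_to_datetime(value)  # HTTP-date form
        except (TypeError, ValueError):
            return default  # unparseable header: fall back to the default
        if when.tzinfo is None:
            when = when.replace(tzinfo=timezone.utc)
        delay = (when - now).total_seconds()
    # Never return a negative delay, and never wait longer than the cap.
    return max(0.0, min(delay, cap))
```

The returned delay is a floor for the next attempt; adding jitter on top keeps multiple instances from retrying at the same instant.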

.NET · Jan 28, 2026

Cannot trace a request across services and jobs

Teams cannot answer where a request failed because IDs are lost at service boundaries. Correlation discipline and propagation rules are usually the missing control.

cannot trace request across services · correlation id lost · no end to end request visibility
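
One propagation rule is "reuse the incoming ID if present, otherwise mint one, and attach it to every outgoing call". A minimal Python sketch follows; the header name `X-Correlation-ID` and both function names are assumptions, so substitute whatever convention your services already share.

```python
import contextvars
import uuid
from typing import Dict, Mapping

# Assumed header name; not a standard, just a common convention.
HEADER = "X-Correlation-ID"

# Holds the ID for the current request or job context.
_correlation_id = contextvars.ContextVar("correlation_id", default=None)


def begin_request(incoming: Mapping[str, str]) -> str:
    """Reuse the caller's correlation ID, or create one at the edge."""
    cid = incoming.get(HEADER) or uuid.uuid4().hex
    _correlation_id.set(cid)
    return cid


def outgoing_headers() -> Dict[str, str]:
    """Headers to attach to downstream HTTP calls or queued job messages."""
    cid = _correlation_id.get()
    return {HEADER: cid} if cid else {}
```

The lookup here is case-sensitive because it uses a plain mapping; most HTTP frameworks provide a case-insensitive header collection that should be used instead.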

.NET · Jan 26, 2026

Retries making outages worse in .NET services

Service owners add retries to recover, but outage duration increases and queue pressure grows. The hidden issue is usually layered retries, missing stop rules, and no total time budget.

retry storm during outage · retries increase latency · resilience policy causing cascade
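
A stop rule plus a total time budget can be sketched as a single retry loop. This is an illustration in Python rather than a .NET resilience policy; `call_with_budget` and its default values are hypothetical.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_budget(
    action: Callable[[], T],
    max_attempts: int = 3,
    total_budget_s: float = 2.0,
    base_delay_s: float = 0.1,
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry with exponential backoff, bounded by attempts and total time."""
    deadline = clock() + total_budget_s
    attempt = 0
    while True:
        attempt += 1
        try:
            return action()
        except Exception:  # real code should retry only transient errors
            delay = base_delay_s * (2 ** (attempt - 1))
            # Stop rules: out of attempts, or the next wait would
            # exceed the total time budget for this operation.
            if attempt >= max_attempts or clock() + delay > deadline:
                raise
            sleep(delay)
```

Only one layer of the call stack should own a loop like this; wrapping it in another retrying caller multiplies the attempt count and recreates the storm.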

.NET · Jan 24, 2026

Requests time out but CPU is normal (thread pool starvation)

Latency climbs, requests queue, and CPU looks deceptively normal. This usually points to blocked worker threads, sync-over-async hotspots, or retry and time-budget behavior that pins resources.

requests timing out cpu normal · aspnet thread pool starvation · queue grows but cpu low

.NET · Jan 21, 2026

Requests hang forever in .NET (missing or unsafe timeouts)

Incidents repeat because calls wait too long and cancellation never propagates. The fix is usually a timeout matrix and total-budget discipline across HTTP, DB, and jobs.

requests hang forever dotnet · missing timeout budget · infinite waits recurring outage

.NET · Jan 19, 2026

Background jobs stuck but dashboards look healthy

Workers look alive but business tasks silently stop progressing. This often comes from missing progress telemetry, weak liveness checks, and unclear stop/restart rules.

background jobs hang forever · workers alive but no progress · stuck jobs no alert
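
Progress telemetry differs from a liveness check in that it records when useful work last completed, not whether the process is up. A small Python sketch, with `ProgressWatchdog` and the stall threshold as hypothetical choices:

```python
import time
from typing import Callable


class ProgressWatchdog:
    """Flags a worker as stalled when no work completes within a window."""

    def __init__(
        self,
        stall_after_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._stall_after_s = stall_after_s
        self._clock = clock
        self._last_progress = clock()
        self.completed = 0

    def record_progress(self, items: int = 1) -> None:
        """Call after each business task finishes, not on every loop tick."""
        self.completed += items
        self._last_progress = self._clock()

    def is_stalled(self) -> bool:
        """True when the last completed task is older than the threshold."""
        return self._clock() - self._last_progress > self._stall_after_s
```

A health endpoint or alert rule would read `is_stalled()`; an idle queue needs separate handling so that "no work available" is not reported as a stall.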

Crypto Automation · Jan 11, 2026

Exchange API key banned after bot deploy

Automation appears healthy, then keys are throttled or blocked with little warning. Typical causes are bursty startup behavior, weak limiter coordination, and policy drift across instances.

exchange api key suddenly forbidden · bot banned after deploy · api forbidden without warning

Crypto Automation · Jan 09, 2026

Signature invalid after bot worked for hours

Private endpoints suddenly fail with timestamp or signature errors even though no signing code changed. Clock drift and startup calibration gaps are usually the real cause.

signature invalid bot was working · timestamp out of range recvwindow · bot 401 after working hours
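
Clock calibration can be sketched as estimating the offset between local and exchange time and applying it when signing. The sketch assumes the exchange exposes a server-time endpoint that returns milliseconds; the function names are hypothetical and no specific exchange API is implied.

```python
def estimate_offset_ms(
    local_send_ms: int, server_time_ms: int, local_recv_ms: int
) -> int:
    """Estimate server-minus-local clock offset from one time request.

    Assumes the server stamped its time roughly halfway through the
    round trip, so the local midpoint is the fairest comparison point.
    """
    midpoint_ms = (local_send_ms + local_recv_ms) // 2
    return server_time_ms - midpoint_ms


def signed_timestamp_ms(local_now_ms: int, offset_ms: int) -> int:
    """Timestamp to place in a signed request, corrected for clock offset."""
    return local_now_ms + offset_ms


def drift_exceeds(old_offset_ms: int, new_offset_ms: int, limit_ms: int) -> bool:
    """True when the offset moved enough that recalibration should be logged."""
    return abs(new_offset_ms - old_offset_ms) > limit_ms
```

Running the estimate only at startup is what lets drift accumulate over hours; repeating it on a timer and after any timestamp rejection keeps the corrected timestamp inside the exchange's accepted window.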