Question 1

How is this different from a documentation runbook?

Accepted Answer

Documentation runbooks are typically written once and then go stale. This builder generates runbooks with decision trees: if/then logic that guides the responder step by step. Each runbook maps to a specific incident type with concrete detection criteria, not vague 'check the logs' instructions. The output is copy-paste ready for Slack, PagerDuty, or Notion.
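To make "concrete detection criteria" tangible, here is a minimal sketch of how one criterion could be represented. The class name, the metric name, and the threshold values are all hypothetical and are not taken from the builder's actual output.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectionCriterion:
    """One measurable trigger for an incident type (hypothetical structure)."""
    metric: str          # assumed metric name, e.g. a counter your bot exports
    threshold: float     # value at or above which the criterion fires
    window_seconds: int  # observation window the metric is measured over

    def is_met(self, observed: float) -> bool:
        # Fires when the observed value reaches the threshold.
        return observed >= self.threshold


# Assumed example: 3 or more rejected orders within 60 seconds.
ORDER_FAILURES = DetectionCriterion("rejected_orders", threshold=3, window_seconds=60)

print(ORDER_FAILURES.is_met(1))  # one rejection: below the threshold
print(ORDER_FAILURES.is_met(5))  # five rejections: criterion fires
```

A criterion like this gives the responder a yes/no answer, which is what separates it from a 'check the logs' instruction.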

Question 2

What severity should I assign to each incident type?

Accepted Answer

P1 (Critical): revenue-impacting, such as order failures and reconciliation drift. P2 (High): degraded service, such as WebSocket drops and rate limits. P3 (Medium): minor impact, such as elevated latency and intermittent errors. P4 (Low): cosmetic. The builder sets sensible defaults, but you should tune them based on your bot's PnL sensitivity and trading frequency.
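The default mapping above, plus per-bot tuning, could be sketched as follows. The incident-type keys and the `severity_for` helper are hypothetical names chosen for illustration; only the P1 to P4 groupings come from the answer.

```python
from enum import IntEnum
from typing import Dict, Optional


class Severity(IntEnum):
    P1 = 1  # Critical: revenue-impacting
    P2 = 2  # High: degraded service
    P3 = 3  # Medium: minor impact
    P4 = 4  # Low: cosmetic


# Defaults mirroring the groupings in the answer (key names are assumptions).
DEFAULT_SEVERITY: Dict[str, Severity] = {
    "order_failure": Severity.P1,
    "reconciliation_drift": Severity.P1,
    "websocket_drop": Severity.P2,
    "rate_limit": Severity.P2,
    "elevated_latency": Severity.P3,
    "intermittent_error": Severity.P3,
    "cosmetic": Severity.P4,
}


def severity_for(incident_type: str,
                 overrides: Optional[Dict[str, Severity]] = None) -> Severity:
    """Look up a severity, letting per-bot overrides win over the defaults."""
    merged = {**DEFAULT_SEVERITY, **(overrides or {})}
    return merged[incident_type]


# A latency-sensitive bot might treat WebSocket drops as critical.
print(severity_for("websocket_drop").name)
print(severity_for("websocket_drop", {"websocket_drop": Severity.P1}).name)
```

Keeping overrides separate from the defaults makes it easy to see, in a diff, exactly where a given bot deviates from the baseline.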

Question 3

How do I keep runbooks updated?

Accepted Answer

Re-run the builder after every production incident to incorporate lessons learned. The post-incident checklist items should feed back into the runbook. Store runbooks in version control (git) alongside your bot code. Review and update quarterly even if no incidents occurred.
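One way to enforce this cadence is a small check that flags a runbook as due for review. This is a sketch under assumptions: "quarterly" is approximated as 90 days, and the function name and its date arguments are hypothetical (in practice they might come from git history or a front-matter field).

```python
from datetime import date, timedelta
from typing import Optional

# Assumption: "quarterly" approximated as 90 days.
REVIEW_INTERVAL = timedelta(days=90)


def needs_review(last_reviewed: date,
                 today: date,
                 last_incident: Optional[date] = None) -> bool:
    """Return True if the runbook should be revisited.

    A review is due when an incident happened after the last review,
    or when more than REVIEW_INTERVAL has passed since the last review.
    """
    if last_incident is not None and last_incident > last_reviewed:
        return True
    return today - last_reviewed > REVIEW_INTERVAL


print(needs_review(date(2024, 1, 1), today=date(2024, 2, 1)))
print(needs_review(date(2024, 1, 1), today=date(2024, 6, 1)))
print(needs_review(date(2024, 1, 1), today=date(2024, 2, 1),
                   last_incident=date(2024, 1, 15)))
```

A check like this could run in CI next to the bot code, so a stale runbook shows up as a failing job rather than being discovered mid-incident.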

Question 4

Can I customize the runbook for my specific bot?

Accepted Answer

Yes. Copy the Markdown output and edit it. Add your specific alert thresholds, Slack channel names, on-call rotation details, and any bot-specific checks. The builder gives you the structure and decision logic; you customize the specifics.

Question 5

What's the most important part of an incident runbook?

Accepted Answer

The decision tree. When an incident hits, the responder is often stressed and sleep-deprived. A clear if/then tree prevents panic-driven decisions, such as restarting the bot without checking state or retrying orders without checking idempotency. The first decision should always be: 'Stop the bleeding, then investigate.'
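As an illustration of such a tree, the sketch below models yes/no decisions as nested nodes and walks them to a single instruction. The node types, the questions, and the actions are hypothetical examples, not the builder's real output; the only idea taken from the answer is that halting comes before investigating or restarting.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass
class Action:
    """Leaf node: what the responder should do."""
    instruction: str


@dataclass
class Decision:
    """Branch node: a yes/no question with a subtree for each answer."""
    question: str
    if_yes: "Node"
    if_no: "Node"


Node = Union[Action, Decision]

# Hypothetical questions; the first branch encodes "stop the bleeding".
Q_SENDING = "Is the bot still sending orders?"
Q_STATE = "Does local state match the exchange?"

TREE = Decision(
    Q_SENDING,
    if_yes=Action("Halt order submission, then investigate."),
    if_no=Decision(
        Q_STATE,
        if_yes=Action("Restart the bot and monitor."),
        if_no=Action("Reconcile state before restarting."),
    ),
)


def walk(node: Node, answers: Dict[str, bool]) -> str:
    """Follow the responder's yes/no answers until an Action is reached."""
    while isinstance(node, Decision):
        node = node.if_yes if answers[node.question] else node.if_no
    return node.instruction


print(walk(TREE, {Q_SENDING: True}))
print(walk(TREE, {Q_SENDING: False, Q_STATE: False}))
```

Because every path ends in exactly one instruction, the responder never has to decide what to do next from memory.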

Incident Runbook Builder

Builder

Detection Criteria

Response Steps

Decision Tree

Escalation Path

Post-Incident Checklist
