# Retry budget template

Fill this in per dependency and per call path. The budget is the contract.

## 1) Define the call path

- Call path: [web request | message handler | scheduled job step]
- Dependency: [name]
- Criticality: [high | medium | low]

## 2) Choose the total time budget

The total time budget is the maximum wall-clock time you will spend on the call, including every attempt and every backoff delay.

- totalBudgetMs: [ ]

Guidance:

- Web requests usually have a budget of single-digit seconds end to end.
- Background jobs can be longer, but still bounded.

## 3) Choose the per-attempt timeout

The per-attempt timeout is the maximum time allowed for a single call attempt.

- timeoutMs: [ ]

Rule of thumb:

- timeoutMs should be less than totalBudgetMs, with enough headroom that a slow first attempt still leaves time for a retry.
- Do not set timeoutMs to infinity. It turns retries into stacked waits.
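
One way to keep a late attempt from overrunning the budget is to clamp its timeout to whatever budget remains. The sketch below is illustrative only; `effective_timeout_ms` and its snake_case parameters are hypothetical names that mirror the template fields.

```python
def effective_timeout_ms(timeout_ms: float, total_budget_ms: float, elapsed_ms: float) -> float:
    """Timeout to use for the next attempt; 0 means the budget is already spent."""
    remaining_ms = total_budget_ms - elapsed_ms
    # Never wait longer than the configured timeout, and never past the budget.
    return max(0, min(timeout_ms, remaining_ms))
```

With timeoutMs = 1000 and totalBudgetMs = 3000, an attempt that starts 2500 ms in would get a 500 ms timeout instead of the full 1000 ms.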

## 4) Choose max attempts

- maxAttempts: [ ]

Guidance:

- 2-3 total attempts (the first call plus 1-2 retries) is a common safe baseline.
- If you often need more attempts, fix the cause instead: tune timeouts, reduce fan-out, add caching, or reduce concurrency.
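
A quick way to sanity-check a choice of maxAttempts is to compute the worst case it implies and compare that with totalBudgetMs. The sketch below assumes the exponential backoff from section 5 with jitter of at most 1.3x; `worst_case_ms` is a hypothetical helper, not part of the template.

```python
def worst_case_ms(timeout_ms: float, max_attempts: int,
                  base_delay_ms: float, max_delay_ms: float,
                  jitter_max: float = 1.3) -> float:
    """Upper bound on elapsed time if every attempt times out."""
    attempts_ms = max_attempts * timeout_ms
    # A delay follows each failed attempt except the last one.
    delays_ms = sum(
        min(max_delay_ms, base_delay_ms * 2 ** (k - 1)) * jitter_max
        for k in range(1, max_attempts)
    )
    return attempts_ms + delays_ms
```

For example, timeoutMs = 1000, maxAttempts = 3, baseDelayMs = 100, and maxDelayMs = 2000 give a worst case of about 3390 ms, so a totalBudgetMs of 3000 would cut the third attempt short.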

## 5) Backoff and jitter

Use exponential backoff with jitter to avoid synchronized retries.

Template:

- baseDelayMs: [ ]
- maxDelayMs: [ ]

Delay function (example):

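- delay = min(maxDelayMs, baseDelayMs * 2^(attempt-1)), where attempt is the 1-based number of the attempt that just failed
- jitteredDelay = delay * random(0.7, 1.3), so the actual wait can exceed maxDelayMs by up to 30%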
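
The delay function above could be written roughly as follows. This is a minimal sketch: `backoff_delay_ms` and the injectable `rng` parameter are hypothetical, and `attempt` is assumed to be the 1-based number of the attempt that just failed.

```python
import random


def backoff_delay_ms(attempt: int, base_delay_ms: float, max_delay_ms: float,
                     rng=random) -> float:
    """Exponential backoff capped at max_delay_ms, then scaled by +/-30% jitter."""
    capped_ms = min(max_delay_ms, base_delay_ms * 2 ** (attempt - 1))
    # Jitter is applied after the cap, so the result may exceed max_delay_ms by up to 30%.
    return capped_ms * rng.uniform(0.7, 1.3)
```

Passing a seeded `random.Random` instance as `rng` makes the delays reproducible in tests.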

## 6) Classification rules

Write these as explicit rules so they are testable.

STOP examples:

- 400, 401, 403
- 409, 412
- 404 when it means "does not exist"

RETRY examples:

- 429 with Retry-After
- 5xx, connection reset
- timeouts when the dependency is known to recover quickly

ESCALATE examples:

- budget exhausted
- max attempts reached
- sustained throttling
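
Written as code, the rules above might look like the sketch below. It makes several assumptions that are not in the template: every 404 means "does not exist", anything not covered by a RETRY rule is stopped, and sustained throttling is detected elsewhere. `classify`, `Decision`, and the parameter names are hypothetical.

```python
from enum import Enum
from typing import Optional, Tuple


class Decision(Enum):
    STOP = "stop"
    RETRY = "retry"
    ESCALATE = "escalate"


# Assumption: every 404 from this dependency means "does not exist".
STOP_STATUSES = frozenset({400, 401, 403, 404, 409, 412})
RETRYABLE_EXCEPTIONS = frozenset({"ConnectionResetError"})


def classify(*, status_code: Optional[int] = None,
             exception_type: Optional[str] = None,
             has_retry_after: bool = False,
             timeouts_recover_quickly: bool = False,
             attempt: int, max_attempts: int,
             elapsed_ms: float, total_budget_ms: float) -> Tuple[Decision, str]:
    """Return (decision, reason) for one failed attempt."""
    if status_code in STOP_STATUSES:
        return Decision.STOP, f"non-retryable status {status_code}"
    retryable = (
        (status_code == 429 and has_retry_after)
        or (status_code is not None and 500 <= status_code <= 599)
        or exception_type in RETRYABLE_EXCEPTIONS
        or (exception_type == "TimeoutError" and timeouts_recover_quickly)
    )
    if not retryable:
        # Assumption: unknown failures are stopped rather than retried blindly.
        return Decision.STOP, "not covered by a RETRY rule"
    if elapsed_ms >= total_budget_ms:
        return Decision.ESCALATE, "budget exhausted"
    if attempt >= max_attempts:
        return Decision.ESCALATE, "max attempts reached"
    return Decision.RETRY, "transient failure"
```

Because the function is pure, each rule can be covered by a one-line unit test.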

## 7) Invariants (what must be true)

- total elapsed time never exceeds totalBudgetMs
- each attempt never exceeds timeoutMs
- decisions are logged (stop, retry, escalate)
- Retry-After is respected when present
- retries at other layers (client library, proxy, caller) are disabled so there is exactly one retry authority
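
A single retry loop that tries to uphold these invariants could look like the sketch below. It is not a definitive implementation: `call_with_budget`, `RetryableError`, and `Escalate` are hypothetical names, the classification is reduced to "raise `RetryableError` to retry, anything else stops", and the budget invariant only holds if `attempt_fn` honors the timeout it receives.

```python
import random
import time


class RetryableError(Exception):
    """Hypothetical marker for failures classified as RETRY."""

    def __init__(self, message="", retry_after_ms=None):
        super().__init__(message)
        self.retry_after_ms = retry_after_ms  # parsed Retry-After, if any


class Escalate(Exception):
    """Raised when the budget or the attempt limit is used up."""


def call_with_budget(attempt_fn, *, total_budget_ms, timeout_ms, max_attempts,
                     base_delay_ms, max_delay_ms,
                     clock=time.monotonic, sleep=time.sleep, log=print):
    start = clock()

    def elapsed_ms():
        return (clock() - start) * 1000

    for attempt in range(1, max_attempts + 1):
        remaining_ms = total_budget_ms - elapsed_ms()
        if remaining_ms <= 0:
            log({"attempt": attempt, "decision": "escalate", "reason": "budget exhausted"})
            raise Escalate("budget exhausted")
        try:
            # Each attempt gets at most timeout_ms and never more than the remaining budget.
            return attempt_fn(min(timeout_ms, remaining_ms))
        except RetryableError as exc:
            if attempt == max_attempts:
                log({"attempt": attempt, "decision": "escalate", "reason": "max attempts reached"})
                raise Escalate("max attempts reached") from exc
            delay_ms = min(max_delay_ms, base_delay_ms * 2 ** (attempt - 1))
            delay_ms *= random.uniform(0.7, 1.3)
            if exc.retry_after_ms is not None:
                # Respect Retry-After: never retry sooner than the dependency asked.
                delay_ms = max(delay_ms, exc.retry_after_ms)
            if elapsed_ms() + delay_ms >= total_budget_ms:
                log({"attempt": attempt, "decision": "escalate", "reason": "budget exhausted"})
                raise Escalate("budget exhausted") from exc
            log({"attempt": attempt, "decision": "retry", "delayMs": delay_ms, "reason": str(exc)})
            sleep(delay_ms / 1000)
        except Exception as exc:
            # Anything not marked retryable is a STOP: log it and let it propagate.
            log({"attempt": attempt, "decision": "stop", "reason": type(exc).__name__})
            raise
    raise Escalate("max attempts reached")
```

Injecting `clock`, `sleep`, and `log` keeps the loop testable without real waiting, which is one way to turn the invariants into assertions.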

## 8) What to log per attempt

Minimum fields:

- correlationId
- dependency
- attempt
- delayMs
- timeoutMs
- elapsedMs
- totalBudgetMs
- statusCode or exceptionType
- decision
- reason
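
As an illustration, the minimum fields could be carried in one structured record per attempt and emitted as JSON. In this sketch `AttemptLog` is a hypothetical type, and the field names are kept in camelCase so that the log keys match the template exactly.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class AttemptLog:
    correlationId: str
    dependency: str
    attempt: int
    delayMs: float
    timeoutMs: float
    elapsedMs: float
    totalBudgetMs: float
    statusCode: Optional[int]       # set when the dependency answered
    exceptionType: Optional[str]    # set when the call raised instead
    decision: str                   # "stop", "retry", or "escalate"
    reason: str

    def to_json(self) -> str:
        """Serialize to one JSON line with stable key order."""
        return json.dumps(asdict(self), sort_keys=True)
```

A record for a retried 503 might be built as `AttemptLog("req-123", "inventory-api", 1, 112.0, 1000, 1004.0, 3000, 503, None, "retry", "transient failure")`, where the identifier and dependency name are made up for the example.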
