Estimate Token Spend When AI Agents Retry on Failure
When an agent step can fail and retry, your real cost is a probability-weighted sum across attempts — not just one call. Enter your error rate, retry limit, and per-attempt tokens to see the expected cost, attempt count, and worst case.
How the retry cost model works
A naive estimate multiplies one call's cost by your task count. That undercounts any pipeline where a failed attempt is retried, because every retry consumes the full input and output tokens again. This calculator computes the expected number of attempts per task as a finite geometric series, which assumes each attempt fails independently with the same probability, then scales tokens and price by it.
p = failure probability per attempt, R = max retries.
Expected attempts per task = 1 + p + p² + … + p^R = (1 − p^(R+1)) / (1 − p).
Expected tokens = attempts × (input + output) per attempt.
Final-failure probability = p^(R+1) (every attempt failed).
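The formulas above can be sketched in a few lines of Python. This is a minimal illustration, not the calculator's actual implementation: the function and parameter names are hypothetical, and the p = 1 branch is an added guard because the closed form divides by zero there, even though the sum itself is simply R + 1.

```python
def expected_attempts(p: float, max_retries: int) -> float:
    """Expected attempts per task: 1 + p + p^2 + ... + p^R."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be between 0 and 1")
    if max_retries < 0:
        raise ValueError("max_retries must be non-negative")
    if p == 1.0:
        # Closed form is undefined at p = 1; every attempt runs, so the sum is R + 1.
        return float(max_retries + 1)
    return (1.0 - p ** (max_retries + 1)) / (1.0 - p)


def final_failure_probability(p: float, max_retries: int) -> float:
    """Probability that all R + 1 attempts fail."""
    return p ** (max_retries + 1)


def expected_tokens(p: float, max_retries: int,
                    input_tokens: int, output_tokens: int) -> float:
    """Expected tokens per task: attempts x (input + output) per attempt."""
    return expected_attempts(p, max_retries) * (input_tokens + output_tokens)
```

Calling `expected_tokens(0.2, 3, 1000, 500)` scales the 1,500 tokens of a single attempt by the expected attempt count for a 20% failure rate and three retries.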
The series captures the diminishing likelihood of each successive retry: the second attempt only runs with probability p, the third with p², and so on. As p approaches 0, the expected attempts approach 1; as p approaches 1, they approach R+1, the hard ceiling. Because attempts are bounded by the retry limit, the geometric sum is finite even at high error rates, and that bound is what makes the worst-case column meaningful for budgeting.
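As a worked illustration, the derivation and one numeric case can be written out as follows. The values p = 0.2 and R = 3 are chosen only for the example and do not come from the calculator's defaults.

```latex
\begin{align*}
\mathbb{E}[\text{attempts}]
  &= \sum_{k=0}^{R} \Pr(\text{attempt } k+1 \text{ runs})
   = \sum_{k=0}^{R} p^{k}
   = \frac{1 - p^{R+1}}{1 - p}, \qquad 0 \le p < 1 \\[4pt]
p = 0.2,\ R = 3:\quad
\mathbb{E}[\text{attempts}]
  &= 1 + 0.2 + 0.04 + 0.008 = 1.248
   = \frac{1 - 0.0016}{0.8} \\[4pt]
\Pr(\text{final failure}) &= p^{R+1} = 0.2^{4} = 0.0016
\end{align*}
```

So at a 20% per-attempt failure rate with three retries, a task costs about 25% more than a single call on average, and roughly 1 task in 625 still fails after every attempt.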
Three figures matter most for an orchestration budget. The expected cost tells you the average bill once retries are amortized over many tasks. The worst-case cost assumes every task exhausts all R+1 attempts; it is the ceiling your concurrency and rate-limit headroom must survive during an incident. The final-failure rate (p^(R+1)) is the share of tasks that still fail after all retries; those need a dead-letter path, human escalation, or a fallback model, and they are pure sunk cost since they consumed every attempt without producing a usable result. Tune the retry limit against this number: raising R shrinks final failures but inflates worst-case spend, so the right setting balances reliability against the tail of your token bill.
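One way to explore that trade-off is to tabulate the budget figures across several retry limits. The sketch below is illustrative only: `RetryBudget`, `retry_budget`, the split into separate input and output prices per million tokens, and the sample numbers are all assumptions rather than the calculator's own inputs.

```python
from dataclasses import dataclass


@dataclass
class RetryBudget:
    max_retries: int
    expected_cost: float       # average spend per task, retries amortized
    worst_case_cost: float     # spend if all R + 1 attempts run
    final_failure_rate: float  # share of tasks failing after all retries


def retry_budget(p: float, max_retries: int,
                 input_tokens: int, output_tokens: int,
                 input_price_per_mtok: float,
                 output_price_per_mtok: float) -> RetryBudget:
    # Cost of one attempt; prices are per million tokens (hypothetical units).
    per_attempt = (input_tokens * input_price_per_mtok
                   + output_tokens * output_price_per_mtok) / 1_000_000
    # Sum the finite series directly so p == 1.0 needs no special case.
    attempts = sum(p ** k for k in range(max_retries + 1))
    return RetryBudget(
        max_retries=max_retries,
        expected_cost=attempts * per_attempt,
        worst_case_cost=(max_retries + 1) * per_attempt,
        final_failure_rate=p ** (max_retries + 1),
    )


# Sweep R from 0 to 4 with made-up inputs to see how each figure moves.
for r in range(5):
    b = retry_budget(0.2, r, 2_000, 500, 3.0, 15.0)
    print(f"R={b.max_retries}  expected=${b.expected_cost:.5f}  "
          f"worst=${b.worst_case_cost:.5f}  "
          f"final_fail={b.final_failure_rate:.4%}")
```

In a sweep like this the worst-case cost grows linearly with R while the final-failure rate shrinks geometrically, so each extra retry buys less reliability than the one before it.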
Related Tools
- Agent Loop Cost Calculator — multi-step reasoning loop spend.
- Agent Cost Per Task Calculator — unit economics of a single agent run.
- Tool-Use Cost Estimator — token overhead of function-calling round trips.