# Mean Time To Recovery (MTTR)
MTTR measures the average time it takes to restore a system to full operation after a failure or incident — from the moment the failure occurs to the moment normal service resumes.
It is one of the four core **DORA metrics** used to assess software delivery and operational performance, alongside:
- **Deployment frequency** — how often code is deployed to production
- **Lead time for changes** — time from commit to production
- **Change failure rate** — % of deployments causing incidents
A lower MTTR indicates a more resilient, better-operated system.
## Formula
```
MTTR = Total downtime / Number of incidents
```
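As a minimal sketch of the formula, the snippet below computes MTTR from a list of incident records. The `Incident` class, its field names, and the sample timestamps are hypothetical and chosen only for illustration; real incident data would typically come from an incident tracker or paging tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Incident:
    """Hypothetical incident record; field names are illustrative."""
    started_at: datetime   # moment the failure began
    resolved_at: datetime  # moment normal service resumed


def mean_time_to_recovery(incidents: list[Incident]) -> timedelta:
    """Return total downtime divided by the number of incidents."""
    if not incidents:
        raise ValueError("MTTR is undefined with zero incidents")
    total_downtime = sum(
        (i.resolved_at - i.started_at for i in incidents),
        timedelta(),  # start value so sum() adds timedeltas, not ints
    )
    return total_downtime / len(incidents)


# Made-up sample data: outages of 30, 90, and 60 minutes.
incidents = [
    Incident(datetime(2024, 1, 3, 9, 0), datetime(2024, 1, 3, 9, 30)),
    Incident(datetime(2024, 1, 10, 14, 0), datetime(2024, 1, 10, 15, 30)),
    Incident(datetime(2024, 1, 21, 22, 0), datetime(2024, 1, 21, 23, 0)),
]
mttr = mean_time_to_recovery(incidents)
print(mttr)  # 1:00:00
```

Because MTTR is a mean, a single long outage can dominate it, so it may be worth looking at the median or the full distribution alongside it.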
## What drives MTTR up
- Poor observability (hard to detect and diagnose failures)
- Unfamiliar or opaque code (no one knows how it works)
- Lack of runbooks or incident playbooks
- Tightly coupled systems (blast radius of failures is wide)
- Manual deployment and rollback processes
- High [[Cognitive load]] on the team during incidents
## What drives MTTR down
- Strong observability (logs, traces, metrics)
- Clear ownership and on-call rotation
- Automated rollbacks and feature flags
- Well-understood codebases
- Blameless post-mortems that improve playbooks
## MTTR and AI-generated code
MTTR for AI-generated code can be significantly higher than for hand-written code, particularly when developers accept code without fully understanding it.
When an incident occurs in a system containing opaque AI-generated code:
- The debugging surface is larger (the developer didn't write it, didn't review it deeply)
- Mental models of the code are weaker or absent
- Root cause analysis takes longer because the code's intent isn't obvious
- Confidence in fixes is lower, slowing decision-making
This directly undercuts a core argument for AI-assisted development: the velocity gains. If you ship faster but recover slower, and the failures that do occur are rarer but more catastrophic, the net productivity story degrades. The effect is strongest in complex, evolving systems, where incidents compound and the codebase drifts further from anyone's mental model over time.
The implication: **velocity metrics alone don't capture the true cost of AI-generated code**. MTTR, change failure rate, and [[Technical debt]] accumulation are the counter-weights that reveal the real trade-off.
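The trade-off can be made concrete with a rough back-of-the-envelope model. The sketch below assumes, as a simplification, that every incident is caused by a deployment and that incidents do not overlap, so expected downtime is deployments × change failure rate × MTTR. The function name and all numbers are hypothetical, not measurements.

```python
def expected_downtime_hours(deploys_per_month: float,
                            change_failure_rate: float,
                            mttr_hours: float) -> float:
    """Expected monthly downtime under the simplifying assumptions above."""
    return deploys_per_month * change_failure_rate * mttr_hours


# Hypothetical scenarios, chosen only to illustrate the arithmetic.
# Baseline: 20 deploys/month, 10% fail, 1 hour to recover.
baseline = expected_downtime_hours(20, 0.10, 1.0)
# Faster shipping: 40 deploys/month, same failure rate, 2.5 hours to recover.
faster = expected_downtime_hours(40, 0.10, 2.5)

print(f"baseline: {baseline:.1f} h/month")  # baseline: 2.0 h/month
print(f"faster:   {faster:.1f} h/month")    # faster:   10.0 h/month
```

In this made-up scenario, doubling deployment frequency while MTTR grows 2.5× yields five times the expected downtime, which a velocity metric alone would not show.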
## References
- DORA metrics: https://dora.dev/guides/dora-metrics-four-keys/
## Related
- [[DevOps]]
- [[Technical debt]]
- [[Cognitive load]]
- [[Large Language Models (LLMs)]]
- [[AI Agents]]