
What Are DORA Metrics? A Plain-English Glossary Guide

· 8 min read
Artur Pan
CTO & Co-Founder at PanDev

DORA metrics are the four numbers that predict how well a software team ships code. Not opinions, not surveys — four hard signals: how often you deploy, how long changes take to reach production, how often deploys break things, and how fast you recover. The 2023 DORA report by Google Cloud, built on 10 years of research and 36,000+ respondents, is the largest dataset ever assembled on software delivery — and it keeps finding the same pattern.

This glossary explains each metric in plain English, with formulas and the benchmarks that separate elite teams from low performers. Read it once, keep it as a reference.

What Are DORA Metrics? (Short Definition)

DORA metrics are four delivery-performance indicators defined by the DevOps Research and Assessment (DORA) team — now part of Google Cloud. They were popularized by Nicole Forsgren, Jez Humble, and Gene Kim in the 2018 book Accelerate, which showed that teams scoring well on these four metrics also outperform on profitability, market share, and employee retention.

The four metrics split cleanly into two pairs:

  • Throughput — Deployment Frequency, Lead Time for Changes
  • Stability — Change Failure Rate, Mean Time to Restore (MTTR)

The headline finding from Accelerate and every subsequent State of DevOps report: speed and stability aren't trade-offs. Elite teams are fast and safe. Low performers are slow and fragile.

The 4 DORA Metrics

1. Deployment Frequency

What it measures: How often your team deploys code to production.

Formula: Number of production deployments ÷ Time window

It sounds simple. The trick: a deploy is not a merge. If your main branch updates 20 times a day but production updates once a week, your deployment frequency is weekly, not 20/day.
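
To make the arithmetic concrete, here's a minimal Python sketch. It assumes you can export a list of production deploy timestamps from your CI/CD tool; the sample data is invented.

```python
from datetime import datetime, timedelta

# Invented production deploy timestamps, one per deploy event.
deploys = [
    datetime(2026, 1, 5, 9, 30),
    datetime(2026, 1, 5, 14, 10),
    datetime(2026, 1, 7, 11, 0),
    datetime(2026, 1, 12, 16, 45),
]

# Count deploys inside a 30-day window ending at the latest deploy.
window = timedelta(days=30)
cutoff = max(deploys) - window
recent = [d for d in deploys if d >= cutoff]

# Deployment Frequency = deployments / time window.
per_day = len(recent) / window.days
print(f"{len(recent)} deploys in {window.days} days -> {per_day:.2f}/day")
```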

| Performance level | Benchmark (DORA 2023) |
|---|---|
| Elite | On-demand (multiple times per day) |
| High | Once per day to once per week |
| Medium | Once per week to once per month |
| Low | Less than once per month |

Why it matters: Smaller deploys carry less risk per release. A team shipping 50 changes a week with daily deploys gets feedback in hours; a team shipping the same 50 changes in a monthly release gets feedback in weeks — by which time the author has forgotten what the change did.

Read more: How Teams Ship 50+ Deploys/Day: Preply, Etsy, Spotify Patterns.

2. Lead Time for Changes

What it measures: Time from first commit to that code running in production.

Formula: Deploy timestamp − First commit timestamp (averaged or median across changes)
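
As a minimal sketch, assuming you can pair each change's first-commit timestamp with the timestamp of the deploy that shipped it (sample data invented):

```python
from datetime import datetime
from statistics import median

# Invented (first_commit, deployed) timestamp pairs, one per change.
changes = [
    (datetime(2026, 1, 5, 9, 0),  datetime(2026, 1, 5, 17, 0)),
    (datetime(2026, 1, 6, 10, 0), datetime(2026, 1, 9, 12, 0)),
    (datetime(2026, 1, 7, 8, 0),  datetime(2026, 1, 7, 11, 30)),
]

lead_times_h = [(deployed - committed).total_seconds() / 3600
                for committed, deployed in changes]

# Median is usually more honest than the mean: one PR stuck for a
# month drags the average for the whole quarter.
print(f"median lead time: {median(lead_times_h):.1f} h")
```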

| Performance level | Benchmark (DORA 2023) |
|---|---|
| Elite | Less than one day |
| High | One day to one week |
| Medium | One week to one month |
| Low | More than one month |

A single lead-time number hides the actual bottleneck. Most teams have one of four problems:

| Stage | Time goes here when | Fix |
|---|---|---|
| Coding | Tasks are too big | Break stories down |
| Pickup (waiting for review) | Reviewers ignore the PR queue | Set a review SLA |
| Review | Too many back-and-forth cycles | Clarify standards upfront |
| Deploy | Manual approvals, slow CI | Automate gates |

PanDev Metrics splits Lead Time into these four stages automatically from Git events, so you see exactly which stage owns your slowness.
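
If you want to approximate that split yourself, here's a minimal sketch. The five lifecycle timestamps are illustrative names, not a specific API; map them to whatever your Git host exposes.

```python
from datetime import datetime

# Illustrative lifecycle timestamps for a single pull request.
pr = {
    "first_commit": datetime(2026, 1, 5, 9, 0),
    "pr_opened":    datetime(2026, 1, 5, 15, 0),
    "first_review": datetime(2026, 1, 7, 10, 0),  # pickup ends here
    "merged":       datetime(2026, 1, 7, 16, 0),
    "deployed":     datetime(2026, 1, 8, 9, 0),
}

stages = {
    "coding": pr["pr_opened"] - pr["first_commit"],
    "pickup": pr["first_review"] - pr["pr_opened"],
    "review": pr["merged"] - pr["first_review"],
    "deploy": pr["deployed"] - pr["merged"],
}

# The biggest number is the bottleneck worth attacking first.
for name, dt in stages.items():
    print(f"{name:>6}: {dt.total_seconds() / 3600:5.1f} h")
```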

3. Mean Time to Restore (MTTR)

What it measures: How long it takes to recover from a production failure.

Formula: Sum of incident durations ÷ Number of incidents
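
In code, assuming your incident tracker can export (detected, restored) timestamp pairs (sample data invented):

```python
from datetime import datetime

# Invented (detected, restored) pairs, one per production incident.
incidents = [
    (datetime(2026, 1, 3, 14, 0), datetime(2026, 1, 3, 14, 40)),
    (datetime(2026, 1, 9, 2, 15), datetime(2026, 1, 9, 5, 0)),
]

durations_h = [(restored - detected).total_seconds() / 3600
               for detected, restored in incidents]

# MTTR = sum of incident durations / number of incidents.
mttr = sum(durations_h) / len(incidents)
print(f"MTTR: {mttr:.2f} h over {len(incidents)} incidents")
```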

| Performance level | Benchmark (DORA 2023) |
|---|---|
| Elite | Less than one hour |
| High | Less than one day |
| Medium | One day to one week |
| Low | More than one week |

MTTR is not about preventing failures. Failures happen. MTTR measures the muscle of recovery — feature flags, rollback automation, on-call paging, observability. Teams with sub-hour MTTR usually don't have fewer outages; they have rollback paths built in from day one.

See: MTTR Targets 2026: Realistic Speed of Recovery Benchmarks.

4. Change Failure Rate

What it measures: Percentage of deployments that cause a production incident, hotfix, or rollback.

Formula: Failed deployments ÷ Total deployments × 100%
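
The math is one division; the only judgment call is what counts as "failed." A minimal sketch with invented counts:

```python
# Invented monthly counts from a deploy log and incident tracker.
total_deploys = 120
failed_deploys = 6  # caused an incident, hotfix, or rollback

cfr = failed_deploys / total_deploys * 100
print(f"Change Failure Rate: {cfr:.1f}%")  # 5.0%, elite territory

# Per the discussion below: zero failures usually means broken
# detection, not a perfect team.
if failed_deploys == 0:
    print("0% CFR: check your failure detection before celebrating")
```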

| Performance level | Benchmark (DORA 2023) |
|---|---|
| Elite | 0–5% |
| High | 5–10% |
| Medium | 10–15% |
| Low | More than 15% |

Here's the contrarian claim most teams miss: a Change Failure Rate of 0% is a red flag, not an achievement. It almost always means one of three things — failures aren't being detected, deploys are so rare they get tested for weeks, or the team is hiding rollbacks to look good. A healthy elite team sits around 5%. Zero means broken instrumentation.

Detailed treatment: Change Failure Rate: Why 15% Is Normal and 0% Is a Red Flag.

DORA vs SPACE vs DevEx

DORA isn't the only productivity framework. The three you'll see most often:

| Framework | Year | Measures | Best for |
|---|---|---|---|
| DORA | 2014–present | Delivery throughput + stability | Pipeline health, executive reporting |
| SPACE | 2021 (Forsgren et al.) | Satisfaction, Performance, Activity, Collaboration, Efficiency | Human signals, team health |
| DevEx | 2023 (DX, Inc.) | Flow, feedback loops, cognitive load | Friction and developer happiness |

DORA tells you whether your delivery system works. SPACE and DevEx tell you whether your developers are okay. Most mature engineering orgs use a blend — DORA for the board, SPACE/DevEx for retros.

Full comparison: DORA vs SPACE vs DevEx 2026: Which Framework Wins.

How to Start Measuring DORA in Your Team

You don't need a 90-day program. A practical first pass:

  1. Pick a single product or team. Org-wide DORA is a year-long project. One team is a week.
  2. Define what "production" means. Pick one environment. Tag deploys to it. Everything else is staging.
  3. Hook up Git and CI/CD. Deploy events from GitHub Actions, GitLab CI, or Jenkins give you Deployment Frequency and Lead Time for free.
  4. Pick a failure signal. Either incident tickets in Jira/PagerDuty, or rollback commits in Git. Pick one definition and stick to it for Change Failure Rate and MTTR (a rollback-counting sketch follows this list).
  5. Read the numbers, don't argue with them. First-month DORA data is almost always embarrassing. That's the point — a baseline you'd be happy with is a baseline that's lying.
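
For step 4, here's a minimal sketch of the rollback-commit signal: count revert/rollback commits on your production branch with plain git. The branch name (main) and the message keywords are assumptions; adapt them to your repo's conventions.

```python
import subprocess

# List commit hashes + subjects on main from the last 30 days.
log = subprocess.run(
    ["git", "log", "main", "--since=30.days",
     "--pretty=format:%H %s"],
    capture_output=True, text=True, check=True,
).stdout.splitlines()

# Keyword match is crude but consistent; consistency is what matters.
rollbacks = [line for line in log
             if any(k in line.lower() for k in ("revert", "rollback"))]

print(f"{len(rollbacks)} rollback commits out of {len(log)} "
      f"commits in the last 30 days")
```

That count is your failure numerator; pair it with your deploy count for Change Failure Rate.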

PanDev Metrics collects all four DORA metrics automatically from Git events, CI/CD webhooks, and Jira incident links — no manual spreadsheets, no surveys. The platform pulls the events, computes the math, and shows the four numbers with the right elite/high/medium/low coloring per the 2023 DORA report.

For a deeper implementation walkthrough, see the DORA Metrics Complete Guide 2026.

What DORA Doesn't Measure (Honest Limit)

DORA tells you the delivery system is healthy. It does not tell you the product is good. A team can hit elite DORA scores while shipping features no customer wants. DORA also doesn't measure code quality, technical debt, or developer happiness — those need SPACE, DevEx, or direct conversation.

The other limit: DORA is built for teams that own a deployable service. If your team is research-heavy, building a desktop installer, or doing client work where releases ship via email — the DORA model fits awkwardly. Use it as a signal, not a verdict.

FAQ

What are DORA metrics in plain English?

Four numbers that show whether your engineering team ships software fast and reliably: how often you deploy, how long it takes for code to reach production, how often deploys cause problems, and how fast you recover when they do.

How many DORA metrics are there?

Four. Deployment Frequency, Lead Time for Changes, Change Failure Rate, and Mean Time to Restore (MTTR). The 2021 State of DevOps report added a fifth — Reliability — but most practitioners still use the original four.

What is an elite performer in DORA terms?

A team that deploys multiple times per day, has Lead Time under one day, recovers from failures in under an hour, and has a Change Failure Rate under 5%. Per the 2023 DORA report, roughly 18% of surveyed teams hit elite status.

Is DORA the same as DevOps?

No. DORA is a measurement framework for DevOps outcomes. DevOps is the practice; DORA is how you tell if the practice is working. You can do DevOps badly and have terrible DORA numbers; you can also hit good DORA numbers without calling it DevOps.

Why do DORA metrics matter?

Because they're the only widely accepted, peer-reviewed framework for measuring software delivery performance — and Google Cloud's research consistently links them to business outcomes like profit and retention. They turn engineering performance from opinion into something you can defend in a budget meeting.

How are DORA and SPACE different?

DORA measures the pipeline — what comes out and how reliably. SPACE measures the people — satisfaction, collaboration, focus. DORA answers "is the system fast and safe?" SPACE answers "are the humans okay?" Most teams need both.


Sources: Accelerate State of DevOps Report (Google Cloud, 2023); Forsgren, Humble, Kim — Accelerate (IT Revolution Press, 2018); Forsgren et al., "The SPACE of Developer Productivity" (ACM Queue, 2021).

Ready to see your team's real metrics?

30-minute personalized demo. We'll show how PanDev Metrics solves your team's specific challenges.

Book a Demo