
Terraform Adoption: Metrics for Infrastructure Teams

· 8 min read
Artur Pan
CTO & Co-Founder at PanDev

Your team adopted Terraform 18 months ago. Deploys are slower than the old click-ops setup, reviews take longer, and three of your best engineers now spend a full day per week on Terraform plan output. Senior leadership asks whether the migration was worth it, and nobody has a clean answer. The honest one is: you never defined what "worth it" looks like in metrics. HashiCorp's 2024 State of Cloud Strategy reported that 76% of enterprises adopted IaC, but only 31% measured its outcomes against pre-adoption baselines. The CNCF's 2023 Annual Survey found a similar gap for infrastructure-as-code tooling generally.

This article is a measurement framework for infrastructure teams already using Terraform, OpenTofu, or Pulumi. It doesn't debate whether IaC is worthwhile — that ship sailed. It defines six metrics that show whether your adoption is healthy or decaying, plus the benchmark ranges from 37 companies in our dataset that run Terraform in production.

{/* truncate */}

The problem

Most IaC "success" claims are based on end states (we use modules now!) rather than trajectories. Trajectory matters more. A team with 200 modules and a 15% reuse rate is worse off than a team with 40 modules and an 80% reuse rate, even though the first looks more "adopted" on a slide.

The failure mode we see most: Terraform sprawl without consolidation. After 18-24 months, teams accumulate hundreds of .tf files, dozens of loosely-shared modules, and an apply-time that creeps past 20 minutes. At that point the IaC is slowing the team instead of speeding it. Thoughtworks' Technology Radar has flagged "Terraform sprawl" as an anti-pattern since 2022.

The 6 metrics to track

Metric 1 — Module reuse rate

Percent of resources created inside a shared module vs. inline in a root config. Low reuse = every team is reinventing aws_s3_bucket with slightly different tags and lifecycles.

Target: 65-80% on infrastructure older than 12 months. Red flag: below 40% after 18 months of adoption.

How to measure: count resource blocks and check whether each sits inside a module path or at the repo root. Scriptable in roughly 30 lines with an HCL parser.
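A minimal sketch of that script, using a regex instead of a real HCL parser (a proper parser such as python-hcl2 is more robust; the assumption that shared modules live under a top-level `modules/` path is ours, adjust to your repo layout):

```python
import re
from pathlib import Path

# Rough heuristic: a line starting with `resource "` opens a resource block.
RESOURCE_RE = re.compile(r'^\s*resource\s+"', re.MULTILINE)

def module_reuse_rate(repo_root: str) -> float:
    """Share of resource blocks that live under a modules/ directory."""
    in_module = at_root = 0
    for tf_file in Path(repo_root).rglob("*.tf"):
        count = len(RESOURCE_RE.findall(tf_file.read_text()))
        # Assumption: shared modules live under a "modules" path segment.
        if "modules" in tf_file.parts:
            in_module += count
        else:
            at_root += count
    total = in_module + at_root
    return in_module / total if total else 0.0
```

Run it monthly and chart the trend; the trajectory matters more than any single reading.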

Metric 2 — Plan-to-apply ratio

Ratio of terraform plan runs to terraform apply runs in a given week. Runs that never apply = reviewer fatigue, over-cautious processes, or plan-timeout issues.

Healthy range: 3:1 to 5:1 (you plan more than you apply, that's fine). Red flag: above 10:1 (plans are noise, not signal) or below 1.5:1 (you're applying too much without review).
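Computing the ratio from CI run records is a few lines; the `{"command": ...}` record shape here is illustrative, adapt it to whatever your CI system exports:

```python
def plan_to_apply_ratio(runs: list[dict]) -> float:
    """runs: CI records like {"command": "plan"} or {"command": "apply"}."""
    plans = sum(1 for r in runs if r["command"] == "plan")
    applies = sum(1 for r in runs if r["command"] == "apply")
    return plans / applies if applies else float("inf")

def ratio_health(ratio: float) -> str:
    # Thresholds from the text: 3:1-5:1 healthy, >10:1 or <1.5:1 red flags.
    if ratio > 10:
        return "red flag: plans are noise"
    if ratio < 1.5:
        return "red flag: applying without review"
    return "healthy" if 3 <= ratio <= 5 else "acceptable"
```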

Metric 3 — Apply duration (p50 + p95)

Median and tail latency of a production apply. This is your infrastructure deploy frequency — DORA's deployment frequency metric applied to IaC. An apply that takes 45 minutes turns into a team policy to batch changes, which is how you end up with risky Friday afternoon changes.

| Team size | Healthy p50 | Healthy p95 | Red flag p95 |
| --- | --- | --- | --- |
| <10 engineers on infra | 2-5 min | 10 min | 30+ min |
| 10-30 engineers | 5-10 min | 20 min | 45+ min |
| 30+ engineers | 5-15 min | 25 min | 60+ min |
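Given a list of apply durations from CI, the two percentiles fall out of the standard library directly, no observability stack required:

```python
import statistics

def apply_latency(durations_min: list[float]) -> tuple[float, float]:
    """Return (p50, p95) of apply durations, in minutes."""
    p50 = statistics.median(durations_min)
    # quantiles(n=20) yields 19 cut points; index 18 is the 95th percentile.
    p95 = statistics.quantiles(durations_min, n=20)[18]
    return p50, p95
```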

Metric 4 — Plan-failure rate

Share of plans that fail to run (state lock, provider errors, drift). This is IaC's equivalent of change failure rate. A small constant rate (2-5%) is healthy. A climbing trend signals infrastructure that has drifted from code.

Our data across 37 Terraform-using companies:

| Adoption stage | Median plan failure rate | 90th percentile |
| --- | --- | --- |
| First 6 months | 8.2% | 18% |
| 6-18 months | 4.1% | 11% |
| 18+ months (mature) | 2.7% | 7% |
| 18+ months (sprawl) | 9.4% | 22% |

The "sprawl" segment is the unpleasant one: teams at 18+ months who got worse, not better. Cause is almost always the same — modules multiplied faster than they were consolidated.
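A sketch of how you might flag your own rate against these benchmarks, using the 90th-percentile column above as the cutoff (the function names and the stage boundaries are ours):

```python
def plan_failure_rate(failed: int, total: int) -> float:
    return failed / total if total else 0.0

def failure_rate_flag(rate: float, months_adopted: int) -> str:
    # Cutoffs are the 90th-percentile values from the benchmark table.
    if months_adopted >= 18:
        return "sprawl risk" if rate > 0.07 else "mature"
    if months_adopted >= 6:
        return "watch" if rate > 0.11 else "on track"
    return "watch" if rate > 0.18 else "on track"
```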

[Figure: bar chart of module reuse rates across five teams in our dataset.] Module reuse rate separates healthy Terraform adoption from sprawl. A team at 12% is not benefiting from IaC — they're writing verbose boilerplate in HCL instead of a real language.

Metric 5 — Drift detection cadence

How often you scan production for state drift (terraform plan with no intended changes, detecting unexpected diffs). Teams with no drift detection discover infra reality during an incident.

Healthy: daily drift scan, with unexpected diffs alerted to the owning team. Acceptable: weekly. Red flag: ad-hoc / on-demand only.

Drift scans are the feedback loop that keeps Terraform honest. Without them, someone made a console change on Tuesday and the code lies about prod until Friday's release breaks.
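Terraform makes drift machine-readable: `terraform plan -detailed-exitcode` exits 0 for no changes, 1 for an error, and 2 when a diff exists. A minimal scheduled-scan wrapper might look like this (the alerting hook is left out; wire the result into whatever pager you use):

```python
import subprocess

def classify_plan_exit(code: int) -> str:
    # terraform plan -detailed-exitcode: 0 = no changes, 1 = error, 2 = diff.
    return {0: "clean", 2: "drift"}.get(code, "plan-failed")

def check_drift(workdir: str) -> str:
    """Run a no-op plan in workdir and classify the outcome."""
    result = subprocess.run(
        ["terraform", "plan", "-detailed-exitcode", "-input=false"],
        cwd=workdir, capture_output=True,
    )
    return classify_plan_exit(result.returncode)
```

Schedule it daily per root module; a "drift" result should page the owning team, not sit in a dashboard.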

Metric 6 — Engineer time per infra task

This is where IDE telemetry becomes useful. Track time spent inside .tf, .hcl, .tfvars files — our language-distribution reporting breaks this out automatically. If IaC work grew from 8% of infrastructure team time (pre-adoption) to 45% (year 1) to 22% (year 3), you have a healthy S-curve. If it stayed at 45% or climbed past it, you're in sprawl.

| Phase | Expected infra-team time on IaC |
| --- | --- |
| Pre-adoption (click ops) | <5% |
| Adoption year 1 | 30-50% |
| Adoption year 2 | 20-30% |
| Steady state (mature) | 10-20% |

The steady-state number matters. IaC adoption that never gets back below 30% is not paying off. Either the codebase is too complex for the team's scale, or modules never consolidated.
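The phase table reduces to a small classifier; the band boundaries below are lifted from the table, and the labels are ours:

```python
def iac_time_health(months_since_adoption: int, iac_time_share: float) -> str:
    """iac_time_share: fraction of infra-team time in .tf/.hcl/.tfvars files."""
    if months_since_adoption <= 12:
        # Year 1: 30-50% is the expected ramp.
        return "healthy ramp" if iac_time_share <= 0.50 else "over-invested"
    if months_since_adoption <= 24:
        # Year 2: should be falling back toward 20-30%.
        return "consolidating" if iac_time_share <= 0.30 else "sprawl risk"
    # Steady state: 10-20%; anything persistently above is sprawl.
    return "steady state" if iac_time_share <= 0.20 else "sprawl risk"
```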

Common adoption mistakes

| Mistake | Why it hurts | Fix |
| --- | --- | --- |
| One giant state file | Apply time explodes, one change blocks all | Split by environment + service, use remote state + data sources |
| Copy-paste modules instead of sharing | Reuse rate drops, security hardening never propagates | Mandate shared-module registry, block new resource blocks outside modules in CI |
| No drift detection | Prod reality diverges from code, next deploy breaks | Daily drift scan, alert on diffs |
| terraform apply from local machines | No audit trail, credentials sprawl | Mandate CI/CD apply with OIDC, not long-lived keys |
| Versionless modules | Breaking changes land silently in downstream consumers | Require a `version` pin in all module calls |

How to measure success with PanDev Metrics

Infrastructure engineers show up differently in our IDE-heartbeat data than feature engineers. Their language distribution is HCL-heavy, their active coding time is lower (plans take time), and their context-switching is often triggered by alerts rather than task assignments. PanDev Metrics' language-distribution view separates HCL / YAML / Starlark time from application-language time in Python, Go, and the rest, so you can see the actual IaC share of engineer effort.

Three views that matter:

  • HCL time as % of infrastructure team time — the Metric 6 signal above.
  • Context switching per infra engineer — high values correlate with drift-driven firefighting.
  • Cost per infrastructure change — using our cost-per-feature view with infrastructure projects to get dollar values on IaC engineer time.

The checklist

  • Module reuse rate measured monthly, target ≥65% by month 18
  • Plan-to-apply ratio visible to team leads, alert on >10:1
  • Apply p50 and p95 tracked per environment
  • Plan failure rate trended; investigate any climb after month 12
  • Drift detection runs daily, unexpected diffs are incidents
  • Infra-team HCL time is bounded; climbing values trigger consolidation sprint
  • New modules require version pin, README, and an owner
  • Apply runs from CI only, no local creds

When Terraform is the wrong tool

This framework assumes Terraform (or OpenTofu, same shape) is right for the team. It isn't always. Cases where you should reconsider:

  • Single-cloud, mostly serverless, small team (<5 engineers): AWS CDK or Pulumi in a real language often wins. Terraform's value scales with multi-cloud and module reuse across teams.
  • Rapidly-evolving prototype phase: click ops isn't a sin at <3 months. Premature IaC burns weeks on abstractions that get deleted.
  • Kubernetes-only estate: ArgoCD + Helm + Kustomize often covers 90% of what a team would use Terraform for. Adding Terraform on top adds a second source of truth.

The contrarian point: Terraform adoption is often sold as a monotonic win. It's not. Teams in our dataset that adopted Terraform and stayed at low module-reuse for 2+ years produced more infrastructure bugs than the teams that never adopted. The tool without the module discipline is a worse outcome than the click-ops baseline.

Try it yourself — free

Connect your IDE plugin in 2 minutes and see your real metrics. No credit card, no commitment.

Try Free