Design Docs: When to Write, When to Skip (Decision Framework)

Artur Pan
CTO & Co-Founder at PanDev

A mid-size team I advised last year had a standing rule: every ticket above 3 story points needed a design doc. Eight engineers, roughly four docs per week, each doc eating a half-day to write and another half-day in review cycles. That's 32 engineering hours per week — four full working days a week, spent on documents that most people scanned once and never reopened. The CTO thought they were a high-discipline shop. The data said they were documentation-heavy and velocity-poor.

The opposite extreme is worse. A 2019 report from Stack Overflow's Developer Survey listed "poor documentation of internal systems" as the #2 productivity blocker after technical debt itself. Skipping design docs entirely means every refactor six months later is an archaeology dig.

This is the framework I use to decide which changes deserve a doc, which deserve a 3-sentence RFC comment, and which deserve nothing at all.

The problem: docs written for the wrong reason

Design docs fail in two ways, both avoidable.

Failure 1: Written as a ritual. The doc exists because the process says it should exist. Nobody opened it after approval. The writer spent half a day; the reviewers spent an hour of calendar time; the artifact added zero decision value.

Failure 2: Not written when they should have been. Something gets built, breaks in a non-obvious way six months later, and the post-mortem asks "why wasn't this decision documented?" The senior who made the call has left. The reasoning is lost. The rebuild costs 3x the original effort.

Google's engineering-practices handbook puts the purpose bluntly: a design doc exists to surface disagreement before code exists, because the cost of disagreement at design time is minutes, and the cost at code-review time is days. If your doc isn't surfacing disagreement, you either wrote it too late, or you didn't need it.

The academic case is supportive but limited. Ernst et al. (2015), Measure It? Manage It? Ignore It? (an IEEE Software paper on design documentation practices) found that teams producing targeted design docs shipped with 20-30% fewer post-release defects than teams relying on ad-hoc documentation — but only when the doc captured tradeoffs, not when it re-described the code. A doc that restates what the PR shows is worse than no doc: it adds calendar friction without adding reasoning.

The 3-question decision framework

[Figure: flow diagram of three yes/no questions leading to either a full design doc, a quick RFC comment, or skipping documentation entirely.]

Three questions, three possible outcomes. Most changes land at "skip" or "RFC comment". Only a minority need a full doc.

Question 1: Is the scope more than 1 engineer-week?

If a single engineer can finish the change in a week, they should usually just build it and let the PR description carry the context. A PR description is a lightweight design doc with 100% read-through because reviewers can't merge without opening it.

The week threshold is not arbitrary. Work that takes longer than a week almost always crosses module boundaries, and once module boundaries are crossed, multiple people have opinions about the interface. That's when time spent writing pays for itself in shorter review.

Question 2: Does the change affect more than one team?

Cross-team changes need a doc regardless of scope. A half-day schema migration that touches two services needs a 1-page design doc, because the two teams won't meet until the PR, and a PR is the wrong venue for interface disagreement.

Single-team changes — even large ones — can often be handled in a shared /rfcs/ channel or pull-request thread. The reviewer set is small and already context-sharing, so a long written artifact has diminishing returns.

Question 3: Is the change reversible within one business day?

This is the "blast radius" test borrowed from site-reliability practice. If rolling the change back takes one engineer one day, write it up after, don't design-doc it before. If rollback would take a week, or is physically impossible (a schema migration, a public API change, a data-format change), the decision is worth the hour of writing.

"Reversible in a day" is a useful shortcut because it maps cleanly to MTTR data. In teams we've measured, an average deploy recovers in under 4 hours when the change has a clear rollback path. Changes without rollback paths are the ones that dominate MTTR numbers — and those are exactly the changes worth writing up.

The decision matrix

| Scope > 1 week | Cross-team | Reversible in a day | What to do |
| --- | --- | --- | --- |
| No | No | Yes | Skip: ship with a good PR description |
| No | No | No | Short RFC comment (200-400 words) |
| No | Yes | Either | 1-page design doc |
| Yes | No | Yes | RFC thread in shared channel |
| Yes | Yes | Either | Full design doc, required |
| Yes | No | No | Full design doc, required |

Six rows cover all eight combinations, and each lands on one of three kinds of outcome: skip, an RFC-sized note, or a design doc. The matrix prevents the "every ticket gets a doc" regression, and it prevents the "no doc, ever" regression.
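The matrix is small enough to encode directly, which makes it easy to drop into a ticket template or a PR bot. The following is a minimal sketch, not a prescribed implementation: the `Outcome` enum and the `decide` function are hypothetical names introduced here, and the three booleans correspond to the three questions above.

```python
from enum import Enum


class Outcome(Enum):
    SKIP = "Skip: ship with a good PR description"
    RFC_COMMENT = "Short RFC comment (200-400 words)"
    ONE_PAGER = "1-page design doc"
    RFC_THREAD = "RFC thread in shared channel"
    FULL_DOC = "Full design doc, required"


def decide(over_one_week: bool, cross_team: bool, reversible_in_a_day: bool) -> Outcome:
    """Map the three yes/no answers onto the rows of the decision matrix."""
    if over_one_week:
        # Large scope: anything cross-team or hard to roll back gets the full doc.
        if cross_team or not reversible_in_a_day:
            return Outcome.FULL_DOC
        return Outcome.RFC_THREAD
    if cross_team:
        # Small but cross-team: reversibility does not change the answer.
        return Outcome.ONE_PAGER
    return Outcome.SKIP if reversible_in_a_day else Outcome.RFC_COMMENT


# The half-day schema migration that touches two services:
print(decide(over_one_week=False, cross_team=True, reversible_in_a_day=False).value)
```

Writing it as nested conditions rather than a lookup table keeps the "Either" rows explicit: the reversibility answer is simply never consulted on those branches.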

The anatomy of a design doc that gets used

A doc that gets reopened six months later has these five sections and no others. Sections beyond the five are where LLM-authored slop hides.

1. Context (3-5 sentences)

What triggered the work. The ticket, the incident, the strategic goal. One paragraph. If it's longer, the ticket title is doing the work for you.

2. Decision (1-3 bullet points)

What we're going to build, stated flat. Not "options we're considering." The decision. Options live in section 3.

3. Why this and not the alternatives (the heart of the doc)

Two or three alternatives considered, each with a one-paragraph explanation of why it was rejected. This is the only section that justifies the doc's existence. If you can't name alternatives, you didn't explore the space and the doc is premature.

4. Risks and rollback plan

What could go wrong. How we'd undo this. If rollback is "revert the commit," say that in 5 words. If rollback is "migrate data back with a 12-step runbook," write the runbook.

5. Open questions

The questions you couldn't answer alone. This is the section reviewers actually use. An open question is an invitation; an answered question is a signal you should have asked earlier.

That's it. Five sections. If your design doc template has a "success criteria" section, a "user impact" section, and a "future work" section, they are performing ritual rather than decision capture.
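One way to keep templates from growing extra sections is to generate the skeleton instead of copying an old doc. Here is a minimal sketch under stated assumptions: `SECTIONS` and `render_skeleton` are hypothetical names, the hints paraphrase the section descriptions above, and the output is plain text that you would adapt to whatever format your repo uses.

```python
# The five sections, with a one-line hint each. Nothing else goes in the template.
SECTIONS = [
    ("Context", "3-5 sentences: what triggered the work."),
    ("Decision", "1-3 bullet points: what we're going to build, stated flat."),
    ("Why this and not the alternatives", "2-3 alternatives, one paragraph each on why it was rejected."),
    ("Risks and rollback plan", "What could go wrong and how we'd undo it."),
    ("Open questions", "The questions you couldn't answer alone."),
]


def render_skeleton(title: str) -> str:
    """Return an empty design doc with exactly the five numbered sections."""
    lines = [title, ""]
    for number, (heading, hint) in enumerate(SECTIONS, start=1):
        lines += [f"{number}. {heading}", "", f"<{hint}>", ""]
    return "\n".join(lines)


print(render_skeleton("Design doc: move session storage to Redis"))
```

The title in the usage line is an invented example, not a change discussed in this article.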

Common mistakes to avoid

| Mistake | Why it hurts | Fix |
| --- | --- | --- |
| Doc written after code is half-built | Review becomes rubber-stamp: the decision is already made | Write at the point of genuine uncertainty, not after |
| Doc restates the PR without new reasoning | Adds calendar time, zero decision value | If it's describing code, make it a PR description instead |
| Doc with 20 "options considered" | Analysis theater: no real contender ranking | Cap at 3 alternatives, brutal one-para rejection each |
| Doc in a Google Doc nobody can find | Knowledge evaporates in 6 months | Store in repo under /docs/design/ or /rfcs/ (searchable, grep-able, PR-reviewable) |
| Everyone approves, nobody reviews | Social approval without decision pressure | Require at least one named "reviewer" who must leave a written comment |
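Three of these fixes are mechanical enough to check automatically before a doc is approved. The sketch below is illustrative only: `lint_doc` is a hypothetical helper, and it assumes you can already supply the doc's repo-relative path, a count of alternatives, and a mapping of named reviewers to the number of written comments each left. How you extract those depends on your tooling.

```python
from pathlib import PurePosixPath

ALLOWED_ROOTS = ("docs/design", "rfcs")  # the repo locations named in the table
MAX_ALTERNATIVES = 3


def lint_doc(repo_path: str, alternatives: int, reviewer_comments: dict[str, int]) -> list[str]:
    """Return a list of problems; an empty list means the doc passes these checks."""
    problems = []
    path = PurePosixPath(repo_path)
    # is_relative_to requires Python 3.9 or later.
    if not any(path.is_relative_to(root) for root in ALLOWED_ROOTS):
        problems.append(f"{repo_path}: store under /docs/design/ or /rfcs/")
    if alternatives > MAX_ALTERNATIVES:
        problems.append(f"{alternatives} alternatives listed; cap at {MAX_ALTERNATIVES}")
    if not any(count > 0 for count in reviewer_comments.values()):
        problems.append("no named reviewer left a written comment")
    return problems


print(lint_doc("notes/cache-rewrite.md", alternatives=20, reviewer_comments={}))
```

The other two mistakes (writing too late, restating the PR) are judgment calls and are deliberately left out of the check.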

How to measure if this is working

The metric is not "number of design docs written." That's a vanity count. Track two things instead:

  • Fraction of docs that got reopened at least once after approval — if under 30%, the docs aren't being used; they're being performed. The template is too heavy or the scope filter is too loose.
  • Post-decision rework rate — how often a decision made in a design doc gets revisited or reversed within 90 days. Under 15% means the doc is catching design errors; over 30% means the doc is rubber-stamping decisions that should have stayed in RFC form.
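Both numbers are simple ratios over the set of approved docs. As a rough sketch, assuming you can tag each approved doc with two booleans (the `DocRecord` fields and function names here are hypothetical, and the thresholds are the ones stated above):

```python
from dataclasses import dataclass


@dataclass
class DocRecord:
    reopened_after_approval: bool   # opened at least once after approval
    reversed_within_90_days: bool   # decision revisited or reversed within 90 days


def reopen_fraction(docs: list[DocRecord]) -> float:
    if not docs:
        raise ValueError("need at least one approved doc")
    return sum(d.reopened_after_approval for d in docs) / len(docs)


def rework_rate(docs: list[DocRecord]) -> float:
    if not docs:
        raise ValueError("need at least one approved doc")
    return sum(d.reversed_within_90_days for d in docs) / len(docs)


def read_signals(docs: list[DocRecord]) -> list[str]:
    """Apply the 30% reopen threshold and the 15% / 30% rework thresholds."""
    notes = []
    if reopen_fraction(docs) < 0.30:
        notes.append("docs are performed, not used: template too heavy or scope filter too loose")
    rate = rework_rate(docs)
    if rate < 0.15:
        notes.append("docs are catching design errors")
    elif rate > 0.30:
        notes.append("docs are rubber-stamping decisions that should have stayed RFCs")
    return notes
```

A rework rate between 15% and 30% produces no note, since the thresholds above do not assign a reading to that band.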

At PanDev Metrics we track document-linked ticket activity — when an engineer switches IDE focus to a file referenced in a design doc, that's a signal the doc is doing what it's supposed to do. If a doc is never visited after merge, it's a candidate for deletion on the next housekeeping pass. This is data we get from IDE heartbeat telemetry cross-referenced with Git and tracker signals.

The honest limit of our data: we see file-open and IDE-focus events, but we don't see whether an engineer read and understood the doc versus scrolled through it for 30 seconds. A doc that gets scanned isn't much better than a doc that's never opened. The qualitative signal still matters.

The contrarian position

Most engineering blogs recommend writing more design docs. I recommend writing fewer, shorter, and earlier.

Fewer, because ritual documentation is worse than none — it trains engineers that docs are overhead, which makes them skip the ones that genuinely matter. Shorter, because a 5-page doc gets 3 readers; a 1-page doc gets 10. Earlier, because a doc written mid-implementation is political cover, not decision capture.

If your team is in the "document everything" failure mode, loosen the scope rules until at least one change per sprint ships without a doc. If your team is in the "document nothing" mode, start with changes that are both cross-team and irreversible. Both extremes respond to the same 3-question framework; the difference is which direction you're tightening.

When this framework doesn't fit

Regulated environments (medtech, avionics, defense) often have mandatory documentation that the framework's scope filter won't override. A 1-day reversible change in an FDA-regulated codebase still needs a traceable design record. The framework is optimized for B2B SaaS engineering; if you're in a world where an auditor will read the doc, write the doc regardless of the scope matrix.

Research organizations also operate under different rules. A 2-week exploratory prototype doesn't need a design doc; a 2-week production-bound prototype does. If the output is a paper, the paper is the doc.
