Chapter 1 · The Ratchet & the Practice Loop · Lesson 1.4

Spec Before Code

Spend five minutes pinning down the decisions before the agent writes anything - it's the cheapest quality lever there is.

Sprint Zero taught the discipline; this is the daily habit

Back in Lesson 0.3 you met spec-driven development as an idea, plus the tools that package it (Spec Kit, OpenSpec, agent-skills). This lesson is smaller and more personal: the hands-on habit you run yourself on every non-trivial task, no matter which tool - or no tool - you happen to have open. Before you prompt, you write a tight little spec. That's it.

Why a vague ask is expensive

When you type a wish and hope, the agent has to guess at everything you didn't say - and a guess in step three becomes a wrong assumption that step ten is built on. Small guesses compound into wrong code. The fix is to resolve those decisions before any code exists, and the payoff is large:

Resolving roughly 15 of 20 decisions before writing any code is worth about a 33× quality improvement over one-shot prompting - from planning alone.
Source: My Experiments With AI, Agentic Engineering: Why the Harness

Thirty-three times, and it costs you five minutes of typing. That is the trade this whole lesson is about.
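To see why unresolved guesses compound, here is a toy model. It is only a sketch: it assumes every guess is independent and equally likely to be right, and the 80% accuracy figure is made up for illustration. It is not how the source arrived at its 33× number, and it is not meant to reproduce it.

```python
def chance_all_right(per_guess_accuracy: float, open_decisions: int) -> float:
    """Probability that every unresolved decision is guessed correctly.

    Assumes guesses are independent, which real tasks rarely are.
    """
    return per_guess_accuracy ** open_decisions


# Illustrative numbers only: 20 decisions, 80% accuracy per guess (assumed).
one_shot = chance_all_right(0.8, 20)   # nothing resolved before prompting
with_spec = chance_all_right(0.8, 5)   # 15 of the 20 resolved in the spec

print(f"one-shot:  {one_shot:.3f}")
print(f"with spec: {with_spec:.3f}")
```

The exact figures matter less than the shape: every decision you settle yourself removes one factor from the product.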

Specify states, not activities

The single biggest upgrade to how you write a spec: describe the end state you want, not the steps you want the agent to take. Don't say "loop over the list, sort it, then render each row." Say what done looks like: "the list shows newest first; when it's empty, it shows a friendly message instead of a blank box." The agent is good at figuring out the steps - what it can't read from your mind is the destination. Naming states over activities is the same guidance the Substack write-up pushes, and it's baked into the glossary's spec-before-code entry.
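One way to picture the difference: an end state can be written down as checks on the result, with no mention of how the result was produced. The sketch below assumes a hypothetical render_list function and a made-up empty-list message. The two assertions at the bottom are the spec; the function body is just one of many implementations an agent might choose.

```python
from datetime import date


def render_list(items: list[tuple[str, date]]) -> list[str]:
    """Hypothetical renderer: returns the rows the user would see."""
    if not items:
        # Placeholder wording for the "friendly message" (assumed).
        return ["Nothing here yet."]
    newest_first = sorted(items, key=lambda item: item[1], reverse=True)
    return [title for title, _ in newest_first]


# The end state, stated as checks rather than steps:
rows = render_list([("old", date(2024, 1, 1)), ("new", date(2024, 6, 1))])
assert rows == ["new", "old"]                      # newest first
assert render_list([]) == ["Nothing here yet."]    # empty shows a message
```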

Turn vague wishes into testable conditions

A vague goal is one nobody can check, so the agent has no way to know when it's finished. The move is to rewrite it as a success criterion you could actually test. From the agent-skills spec-driven-development skill:

"make the dashboard faster"
   ↓  becomes
"dashboard loads in under 2.5s on a slow
 connection; no layout shift while it loads"

The first is a mood. The second is a target the agent can loop toward and stop at. Rule of thumb: a goal you can test is a goal the agent can hit.
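If it helps to see what "testable" means here, the rewritten dashboard goal can be expressed as a single yes/no check. This is a sketch under assumptions: LoadMeasurement and meets_done_when are hypothetical names, and how you actually collect the load time and layout-shift numbers (browser tooling, a throttled test run) is left open.

```python
from dataclasses import dataclass


@dataclass
class LoadMeasurement:
    """Hypothetical measurement of one dashboard load on a slow connection."""
    load_seconds: float
    layout_shift: float  # 0.0 means nothing moved while the page loaded


def meets_done_when(m: LoadMeasurement, max_seconds: float = 2.5) -> bool:
    """True when the load is under the time budget and nothing shifted."""
    return m.load_seconds < max_seconds and m.layout_shift == 0.0


print(meets_done_when(LoadMeasurement(load_seconds=1.9, layout_shift=0.0)))
print(meets_done_when(LoadMeasurement(load_seconds=3.1, layout_shift=0.0)))
```

"Make the dashboard faster" has no equivalent function: there is nothing to return True for.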

The per-task spec: six short fields

You don't need the full ceremony from Lesson 0.3 for a routine task. This is the lightweight version - six lines you can write in a scratch note before you prompt:

The spec template
Objective:     what you're building + why (one line)
In scope:      what to build
Out of scope:  what NOT to touch (boundaries)
Architecture:  key decisions already made
Test plan:     how it gets checked
Done-when:     a testable condition

The Out of scope line does more work than it looks - it's how you stop the agent wandering into files it had no business editing. And Done-when is where you apply the reframe above: make it something you could verify, not a vibe.
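If you would rather keep the template somewhere more durable than a scratch note, the six fields fit in a small data structure. The sketch below is one possible shape, not a format any tool requires: TaskSpec, missing_fields, and to_prompt are names invented here, and the example values are made up.

```python
from dataclasses import dataclass, fields

# Display labels matching the six-field template above.
LABELS = {
    "objective": "Objective",
    "in_scope": "In scope",
    "out_of_scope": "Out of scope",
    "architecture": "Architecture",
    "test_plan": "Test plan",
    "done_when": "Done-when",
}


@dataclass
class TaskSpec:
    objective: str
    in_scope: str
    out_of_scope: str
    architecture: str
    test_plan: str
    done_when: str

    def missing_fields(self) -> list[str]:
        """Names of fields left blank, so gaps show up before you prompt."""
        return [f.name for f in fields(self) if not getattr(self, f.name).strip()]

    def to_prompt(self) -> str:
        """Render the spec as six labelled lines to paste above your request."""
        return "\n".join(f"{LABELS[f.name]}: {getattr(self, f.name)}" for f in fields(self))


# Made-up example values, for illustration only.
spec = TaskSpec(
    objective="Show recent orders so users can find their latest purchase",
    in_scope="Orders list component",
    out_of_scope="Checkout flow, payment code",
    architecture="Reuse the existing list component",
    test_plan="Unit tests for ordering and the empty case",
    done_when="List shows newest first; empty list shows a friendly message",
)
print(spec.to_prompt())
```

A blank Done-when shows up in missing_fields(), which is a cheap nudge to write the testable condition before prompting.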

Let the agent plan first, then review it

Most modern harnesses have a plan mode: the agent proposes what it intends to do, you read and correct it, and only then does it touch code. Use it. It's spec-review before implementation - you catch the wrong assumption while it's still a sentence, not after it's three files of code.

Have the agent draft a plan, review and refine that plan, and only then let it implement.
Source: Simon Willison, Agentic Engineering Patterns

Your six-field spec is what you check that proposed plan against. Spec first, then plan, then code - each gate cheaper than the mistake it prevents.
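Part of checking a plan against the spec can be mechanical. As a sketch, suppose you can read off the list of files the plan intends to touch (an assumption: harnesses present plans differently, and many give you prose rather than a file list). Comparing that list with your Out of scope line is then a few lines. The function name and the paths below are hypothetical.

```python
from fnmatch import fnmatchcase


def out_of_scope_touches(planned_files: list[str], out_of_scope: list[str]) -> list[str]:
    """Return the planned files that match any out-of-scope glob pattern."""
    return [
        path
        for path in planned_files
        if any(fnmatchcase(path, pattern) for pattern in out_of_scope)
    ]


# Hypothetical plan and boundaries, for illustration only.
planned = ["src/orders/list.py", "src/checkout/cart.py", "tests/test_orders.py"]
boundaries = ["src/checkout/*", "src/payments/*"]

print(out_of_scope_touches(planned, boundaries))
```

An empty result doesn't mean the plan is right, only that it stays inside the fence. The rest of the review is still reading the plan against Objective and Done-when.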

One more payoff, further down the loop: a spec you reviewed up front gives your later review pass (Lesson 1.6) something concrete to hold the finished code against. "Does this match what we agreed?" only works if you wrote down what you agreed.

Check yourself

A vague request forces the model to ___

Answer: guess. Every gap you leave, the model fills with a guess, and an early wrong guess becomes the foundation for later steps. Resolving the decisions up front is worth roughly a 33× improvement over one-shot prompting.

A good spec should describe the ___

Answer: end state. Specify states, not activities: name what "done" looks like and let the agent work out the steps. It can find a path; it can't read your intended destination.

A vague goal becomes useful once ___

Answer: you rewrite it as a testable condition. "Make it faster" is a mood; "loads in under 2.5s, no layout shift" is a target the agent can loop toward and know when it's hit. A goal you can test is a goal the agent can reach.

Do this now (5 min)

Take one small, real task you'd hand an agent. Before you prompt:

  1. Write the six-field spec (Objective, In scope, Out of scope, Architecture, Test plan, Done-when).
  2. Find the one vague requirement in it and rewrite it as a testable success criterion - something you could actually check.

Then prompt. Notice how much guessing you just took off the agent's plate - and bring the spec back if you want it pressure-tested.

I'm your teacher - ask freely. Got a requirement that's still fuzzy? Paste it and I'll help you turn it into a testable success criterion you can drop straight into the Done-when line. That reframe is the whole skill.

Go deeper

Primary source (read this): My Experiments With AI - Agentic Engineering: Why the Harness, the P2 section on resolving decisions before coding.

Secondary: Simon Willison - Agentic Engineering Patterns, for plan-then-implement and other everyday habits.

Wisdom (test it on people): r/ChatGPTCoding - live debate on how much spec is enough before you code.