Chapter 6 · Capstone: Build Your Harness · Lesson 6.1

Your Starter Harness

The win: the minimum viable harness you can stand up today - five small pieces, each borrowed from an earlier lesson.

You already have all the parts

The earlier chapters handed you the pieces of a working harness - the scaffolding you wrap around a coding agent. This lesson does the assembly. The goal is not the biggest harness you can imagine; it is the smallest one worth having, stood up on a real project today.

The rule that governs the whole build is Lesson 1.1's: the ratchet - a mechanism that only ever tightens. You start with almost nothing and add a piece only after a real failure earns it. So do not build it all up front. Stand up the five slots below mostly empty, then let real runs fill them.

The five-piece starter kit

Each piece is deliberately tiny. Each one traces back to a lesson you have already done:

The starter kit

A rules file with 3-5 earned lines, or empty to start - the file the agent reads every turn (Lesson 1.2). Empty is fine; unearned lines just spend attention.
One hook - a small script that runs your tests automatically after each edit, so a break is caught every time rather than the roughly 70-90% of the time you get from an instruction alone (hooks, Lesson 1.3).
A clean git branch and a safe place to run - a fresh branch plus a sandbox, an isolated spot to run code without risking your real machine (Lesson 3.3).
A tiny eval set of 3 golden tasks - a fixed eval set of real tasks with known-good outcomes, so you can tell if a harness change actually helped (Lesson 4.2).
A review habit - one fresh-session second pass over the work, a cross-agent review that catches what the first pass missed (Lesson 1.6).
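
The hook in the second item can be very small. Here is one way to sketch it in Python. Treat it as an illustration under assumptions, not the course's official script: the pytest command is a guess about your project, the name run_tests is made up for this example, and how you register the script so it fires after each edit depends on your agent tool and is not shown.

```python
import subprocess
import sys
from typing import Sequence

# Assumption: your tests run with pytest. Swap in your project's own command.
DEFAULT_TEST_COMMAND = (sys.executable, "-m", "pytest", "-q")


def run_tests(command: Sequence[str] = DEFAULT_TEST_COMMAND) -> int:
    """Run the test command and return its exit code (0 means the tests passed)."""
    result = subprocess.run(list(command), capture_output=True, text=True)
    if result.returncode != 0:
        # Surface the failure output so whoever triggered the hook sees what broke.
        sys.stderr.write(result.stdout)
        sys.stderr.write(result.stderr)
    return result.returncode
```

To use it as a standalone script, end the file with sys.exit(run_tests()). The non-zero exit code is what lets the caller treat a broken test as a hard stop instead of a suggestion.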

Why start this small

The temptation is to front-load: fifty rules, ten hooks, a rich config, all before the agent has done a single task. That harness rots. Half those rules were never earned, and unearned scaffolding just competes for the model's attention and goes stale on the next model release.

"Start with almost nothing, and let real failures fill it." (after Addy Osmani, Agent Harness Engineering)

An empty-but-real harness that you ratchet beats a big speculative one every time. Osmani's lean-and-earned discipline is the same idea pointed at a fresh project: five slots, each nearly empty, each ready to tighten the moment a run fails.

Check yourself

The right way to start is:

Stand up the five slots mostly empty, then let real failures earn each addition. A big speculative harness rots; an empty-but-real one that ratchets keeps improving.

The one starter hook should:

Run your tests after every edit. A hook fires deterministically every time, so a broken test is caught 100% of the time rather than the ~70-90% you get from an instruction alone.

Your tiny eval set holds:

Three golden tasks. A small fixed set of real tasks with known-good outcomes lets you measure whether a harness change actually helped - your own benchmark, run after every change.
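
A golden-task set can be as plain as a list plus a pass rate. The sketch below is one hypothetical shape for it: the GoldenTask fields, the three example tasks, and the score function are all invented for illustration, and it assumes you record pass or fail for each task yourself after a run.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class GoldenTask:
    name: str
    prompt: str      # what you ask the agent to do
    done_when: str   # the known-good outcome, written so a human can check it


# Hypothetical examples; replace them with three real tasks from your project.
GOLDEN_TASKS: List[GoldenTask] = [
    GoldenTask("add-endpoint", "Add a health-check endpoint", "endpoint test passes"),
    GoldenTask("fix-bug", "Fix the off-by-one in pagination", "regression test passes"),
    GoldenTask("rename", "Rename a helper across the codebase", "full test suite passes"),
]


def score(tasks: List[GoldenTask], results: Dict[str, bool]) -> float:
    """Return the fraction of golden tasks that passed in one run."""
    if not tasks:
        raise ValueError("the eval set is empty")
    missing = [t.name for t in tasks if t.name not in results]
    if missing:
        raise ValueError(f"no result recorded for: {missing}")
    return sum(1 for t in tasks if results[t.name]) / len(tasks)


def change_helped(before: float, after: float) -> bool:
    """A harness change counts as helping only if the pass rate went up."""
    return after > before
```

Run the same three tasks before and after a harness change, score both runs, and compare. With only three tasks the number is coarse, so read it as a signal, not a measurement.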

Do this now (15 min)

Pick one real project and stand up the five-piece kit:

Create the rules file (CLAUDE.md / AGENTS.md) - even near-empty.
Add one hook that runs your tests after an edit.
Make a clean git branch to work on, and decide where the agent's code will run in isolation.
Write down 3 golden tasks with clear done-when outcomes.
Commit to doing one fresh-session review pass.
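
If you would rather script the first and fourth steps than do them by hand, something like the following could work. It is a minimal sketch under stated assumptions: the function name scaffold_starter_kit and the file name golden_tasks.json are hypothetical, AGENTS.md is just one of the two rules-file names mentioned above, and the hook, the branch, and the review habit are left to you.

```python
import json
from pathlib import Path
from typing import List


def scaffold_starter_kit(project_dir: Path, rules_filename: str = "AGENTS.md") -> List[Path]:
    """Create an empty rules file and a three-slot golden-task stub.

    Existing files are never overwritten. Returns the paths that were created.
    """
    created: List[Path] = []

    rules = project_dir / rules_filename
    if not rules.exists():
        # Empty on purpose: lines get added only after a real failure earns them.
        rules.write_text("", encoding="utf-8")
        created.append(rules)

    tasks = project_dir / "golden_tasks.json"  # hypothetical file name
    if not tasks.exists():
        stub = [{"task": "", "done_when": ""} for _ in range(3)]
        tasks.write_text(json.dumps(stub, indent=2) + "\n", encoding="utf-8")
        created.append(tasks)

    return created
```

Calling scaffold_starter_kit(Path(".")) from the project root creates whatever is missing and leaves the rest alone, so it is safe to run again later.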

That is it. You now have a real harness - small, honest, and ready to ratchet as the project teaches you what to add.


Go deeper

Primary source (read this): Addy Osmani - Agent Harness Engineering. The source of the lean, earned discipline this kit is built on.

Wisdom (test it on people): the HumanLayer community - a good place to have your starter harness pressure-tested by people running the same loop.
