Chapter 0 · Sprint Zero · Lesson 0.2
What a Harness Is (and How It Evolved)
The model is the engine. The harness is the rest of the car - and it's what turns a clever text predictor into something that ships code.
The one equation
The whole field rests on a single formula, coined by Viv Trivedy at LangChain:
Agent = Model + Harness
A raw model just predicts text. It can't open a file, run a test, or remember what it did five minutes ago. The harness is everything that isn't the model: the prompts, the tools it can call, the memory, the safety checks, and the loop that runs them. A tool like Claude Code, Cursor, or Aider is a harness with a model plugged into it.
The model is a car engine - powerful, but on its own it just sits there revving. The harness is the steering wheel, brakes, dashboard, seatbelts, and GPS. Same engine, different car, wildly different drive. Swap a great engine into a car with no brakes and you crash; a modest engine in a well-built car gets you there.
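One way to picture the split is as two plain data structures. The sketch below is only an illustration: the field names, `echo_model`, and the fake `read_file` tool are assumptions made up for this example, not the layout of any real tool. It shows the same "engine" plugged into two different "cars".

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Harness:
    """Everything that is not the model (field names are illustrative)."""
    system_prompt: str
    tools: dict[str, Callable[[str], str]] = field(default_factory=dict)
    memory: list[str] = field(default_factory=list)
    safety_checks: list[Callable[[str], bool]] = field(default_factory=list)


@dataclass
class Agent:
    """An agent is a model plus the harness wrapped around it."""
    model: Callable[[str], str]
    harness: Harness


def echo_model(prompt: str) -> str:
    """Stand-in for a real model: text in, text out, nothing else."""
    return f"prediction for: {prompt}"


# Same engine, two different cars.
bare = Agent(model=echo_model, harness=Harness(system_prompt=""))
equipped = Agent(
    model=echo_model,
    harness=Harness(
        system_prompt="You are a careful coding assistant.",
        tools={"read_file": lambda path: f"contents of {path}"},  # fake tool
        memory=["Project uses pytest."],
    ),
)
```

Both agents share one model object; everything that differs between them lives in the harness.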
How the model becomes an agent: the loop
Under the hood, every coding agent runs the same simple cycle - reason, act, observe, repeat:
while (model asks to use a tool) {
    run the tool                // e.g. read a file, run the tests
    capture the result          // success? error message?
    give it back to the model   // now it reasons again
}
That's it. In his Agentic Engineering Patterns guide, Simon Willison describes a coding agent as something that "runs tools in a loop to achieve a goal". The harness decides which tools exist, what the model is told, and what happens after each step.
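The cycle above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular tool is implemented: `ToolCall`, `scripted_model`, the two fake tools, and the `max_steps` limit are all hypothetical, and a real harness would call a model API where the scripted function sits.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ToolCall:
    """A request from the model to run one tool (hypothetical shape)."""
    name: str
    argument: str


def scripted_model(history: list[str]) -> Optional[ToolCall]:
    """Stand-in for a real model: asks for two tools in order, then stops."""
    results_seen = sum(1 for entry in history if entry.startswith("result:"))
    if results_seen == 0:
        return ToolCall("read_file", "app.py")
    if results_seen == 1:
        return ToolCall("run_tests", "")
    return None  # no further tool request: the model is done


# The harness decides which tools exist. These two are fakes for illustration.
TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda path: f"contents of {path}",
    "run_tests": lambda _: "2 passed, 0 failed",
}


def run_agent(
    goal: str,
    model: Callable[[list[str]], Optional[ToolCall]],
    max_steps: int = 10,
) -> list[str]:
    """Reason, act, observe, repeat, with a step limit as a safety net."""
    history = [f"goal: {goal}"]
    for _ in range(max_steps):
        call = model(history)            # the model reasons
        if call is None:                 # it no longer asks for a tool
            break
        tool = TOOLS.get(call.name)
        if tool is None:
            result = f"error: unknown tool {call.name}"
        else:
            result = tool(call.argument)     # run the tool
        history.append(f"result: {result}")  # give the result back to the model
    return history


transcript = run_agent("make the tests pass", scripted_model)
```

The step limit is a design choice of this sketch: without it, a model that never stops asking for tools would keep the loop running indefinitely.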
Why the harness exists
Every part of a harness is there to cover something the model can't do alone. Anthropic states the principle cleanly: "every component in a harness encodes an assumption about what the model can't do on its own." Can't remember? Add a memory file. Might delete something? Add a check that blocks it. Can't see if its code works? Give it a way to run the tests. You work backwards from a behaviour the model lacks.
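As one example of working backwards from a missing behaviour, here is a sketch of the "might delete something" check. The deny-list, `guarded_run`, and `fake_shell` are assumptions invented for this illustration; a real harness would likely use a richer policy than matching the first word of a command.

```python
import shlex
from typing import Callable

# Hypothetical deny-list of programs the model may not run unattended.
BLOCKED_COMMANDS = {"rm", "rmdir", "shred"}


def is_allowed(command: str) -> bool:
    """Return False for empty commands or programs on the deny-list."""
    parts = shlex.split(command)
    if not parts:
        return False
    return parts[0] not in BLOCKED_COMMANDS


def guarded_run(command: str, run: Callable[[str], str]) -> str:
    """Run `command` through `run` only if the check passes."""
    if not is_allowed(command):
        # The refusal is returned as text so the model can observe it.
        return f"blocked: '{command}' needs human approval"
    return run(command)


def fake_shell(command: str) -> str:
    """Stand-in for real command execution; nothing is actually run."""
    return f"ran: {command}"
```

With these definitions, `guarded_run("pytest -q", fake_shell)` goes through, while `guarded_run("rm -rf build", fake_shell)` comes back as a "blocked" message instead of touching the disk.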
How harnesses evolved
Harnesses are not static - they have been shrinking as models have improved:
2023: models were weak, so harnesses were heavy - lots of hand-written scaffolding to prop them up.
2026: models are strong, so much of that scaffolding is dead code.
A concrete example from Addy Osmani: older models used to stop work early because they thought they were running out of room, so people wrote "don't panic" scaffolding. A newer model "largely killed" that failure, making the scaffolding pointless.
The lesson: a good 2026 harness is light and disposable. You add structure where today's model is weak, and rip it out when tomorrow's model no longer needs it.
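One way to keep scaffolding disposable is to record, next to each component, the weakness it assumes. The sketch below is an illustration only: the component names, the weakness labels, and the two "models" are invented for this example and are not measured properties of any real model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Component:
    name: str
    assumes_model_cannot: str  # the weakness this component covers


# Illustrative components; the weakness labels are made up for this sketch.
COMPONENTS = [
    Component("memory_file", "remember_across_sessions"),
    Component("delete_guard", "avoid_destructive_commands"),
    Component("dont_panic_prompt", "pace_itself_when_room_runs_low"),
]


def active_components(model_weaknesses: set[str]) -> list[str]:
    """Keep only the components whose assumption still holds for this model."""
    return [
        c.name for c in COMPONENTS if c.assumes_model_cannot in model_weaknesses
    ]


# Hypothetical weakness profiles for an older and a newer model.
older_model = {
    "remember_across_sessions",
    "avoid_destructive_commands",
    "pace_itself_when_room_runs_low",
}
newer_model = {"remember_across_sessions", "avoid_destructive_commands"}
```

Here `active_components(older_model)` keeps all three components, while `active_components(newer_model)` drops `dont_panic_prompt`, because the assumption behind it no longer holds.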
Why it matters - the proof
Same model, different harness, dramatically different results:
- LangChain kept the model fixed and rebuilt only the harness: their coding agent jumped from 52.8% to 66.5% on a benchmark - outside the top 30 to rank 5, per LangChain.
- An independent test found the same Claude Opus scored 77% in one harness and 93% in another on identical tasks - a 16-point swing from the wrapper alone.
- The same model in two harnesses can cost up to 32× more for near-identical code, per the Artificial Analysis index.
That's the thesis of the whole course: you can't buy your way to good agentic coding with a bigger model. You engineer the harness.
Check yourself
In "Agent = Model + Harness", the harness is -
The harness is all the code, tools, memory, prompts, and checks around the model - plus the loop that runs them. The model alone just predicts text; the harness makes it act.
A coding agent turns a model into an agent by -
The core mechanism is a loop: the model asks to use a tool, the harness runs it, feeds the result back, and the model reasons again - reason, act, observe, repeat.
As models get stronger, a good harness should -
Each component compensates for a model weakness. When the model outgrows that weakness, the component becomes dead code - the best 2026 harnesses stay lean and disposable.
Do this now (3 min)
Think about any AI coding tool you've used. Write down three things it does that are NOT the model - for example: reads your files, runs your tests, remembers project rules, undoes a bad edit, asks before deleting. Congratulations: you've just listed part of its harness. Next lesson (0.3) is about one deliberate way to shape that harness - spec-driven development.
Go deeper
Primary source: Simon Willison - "How coding agents work". The clearest plain-English walkthrough of the model + prompt + tools loop.
Wisdom: the HumanLayer community - where much of this "harness" vocabulary was coined and is still argued over.