Chapter 4 · Measuring & Evolving the Harness · Lesson 4.4
The Self-Improving Loop
The win: the ratchet you turn by hand can be automated - the agent edits its own harness from what it learns each run.
The ratchet, by hand
You already know the ratchet: the agent fails at something, you add a durable fix - a rule, a hook, a reviewer - and that fix pays off on every run after it, not just the one that broke. Because each fix stays in place, the wins stack up instead of resetting: that stacking is the compound engineering loop - "the model does not get smarter, the harness does." So far every turn of that loop has needed a human hand on the lever. This lesson is about taking your hand off it.
Now let the agent turn it
Agentic Harness Engineering (AHE), also called the self-improving loop, is exactly that: a loop where the agent edits its own scaffolding - the system prompt, the tools, the memory, the middleware - straight from what it sees in its own runs. No human has to hand-write the fix each time. The point is to keep the harness in step with each new model release automatically, instead of someone re-editing the rules file every time a better model ships. Addy Osmani frames this as observability-driven evolution: the harness watches its real runs and tightens itself from the failures it sees (Agent Harness Engineering), and My Experiments With AI makes the same case - the ratchet, but automated.
Proof it works
This is not just a nice idea. Stanford's IRIS Lab paired a model with an automated harness-evolution system they call a "Meta-Harness" - one that rewrites its own scaffolding between runs - and measured it on a public coding benchmark.
The same model that people were wrapping in hand-tuned harnesses did better when the harness tuned itself. That is the loop paying off at the frontier, not just in theory.
Why the labs care: "the harness is the dataset"
There's a deeper reason this matters beyond your own project. Every run your harness records is a trajectory - a play-by-play of how a real task went. Whoever captures the best trajectories has the best raw material to train the next model on. Philipp Schmid puts it bluntly: the harness is the dataset, so the team that runs the strongest harness builds the stronger data flywheel (The Agent Harness). The self-improving loop is that flywheel turning on its own. Its productized form is Harness-as-a-Service - you build on someone's ready-made runtime, and they get to learn from every run that flows through it.
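To make "trajectory" concrete, here is a minimal sketch of how a harness might record one run as a single JSON line. The `Step` and `Trajectory` classes and their field names are assumptions made for illustration, not a format any of the cited sources prescribes.

```python
import json
from dataclasses import dataclass, field, asdict


@dataclass
class Step:
    action: str       # what the agent did, e.g. a tool call (hypothetical field)
    observation: str  # what came back from that action (hypothetical field)


@dataclass
class Trajectory:
    task: str
    steps: list[Step] = field(default_factory=list)
    passed: bool = False

    def record(self, action: str, observation: str) -> None:
        """Append one action/observation pair to the play-by-play."""
        self.steps.append(Step(action, observation))

    def to_jsonl_line(self) -> str:
        """Serialize the whole run as one JSON line, ready to append to a log."""
        return json.dumps(asdict(self), sort_keys=True)


# Example: a two-step run that ended in a pass.
run = Trajectory(task="fix failing test")
run.record("run pytest", "1 failed")
run.record("edit utils.py", "ok")
run.passed = True
line = run.to_jsonl_line()
```

One line per run keeps the log append-only and easy to filter later, for example to pull out only the failed runs when looking for the next harness fix.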
What it doesn't do
Be honest about the limits. Automation still needs a yardstick - your eval set from Lesson 4.2 - or it has no way to tell a good edit from a bad one. And it needs a guardrail, because a loop that changes its own harness can just as easily make things worse: an edit that fixes one task might quietly break three others. That guardrail is exactly what Lesson 4.5 is about. So the self-improving loop does not replace your judgement - it removes the repetitive part of turning the crank, and leaves the judgement calls to you.
run → observe failure → propose a harness edit → test against the eval set → keep it if it helped.
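A minimal sketch of that loop in Python, assuming a harness can be modelled as a plain dictionary and each eval task as a function that returns pass or fail. Every name here (`run_evals`, `improve_once`, the toy tasks and the two proposers) is hypothetical; in a real setup the proposer would be the agent itself and the tasks would come from your eval set. The acceptance rule shown, more passes overall and no regressions, is one possible guardrail, not a prescribed one.

```python
from typing import Callable

# All names below are hypothetical, for illustration only.
Harness = dict[str, str]               # e.g. {"rules": "..."}
EvalTask = Callable[[Harness], bool]   # returns True if the task passes


def run_evals(harness: Harness, tasks: dict[str, EvalTask]) -> dict[str, bool]:
    """Run every eval task against the harness; record pass/fail per task."""
    return {name: task(harness) for name, task in tasks.items()}


def improve_once(harness, tasks, propose_edit):
    """One turn of the loop. Returns (harness, kept_the_edit)."""
    before = run_evals(harness, tasks)                      # run
    failures = [n for n, ok in before.items() if not ok]    # observe failure
    if not failures:
        return harness, False
    candidate = propose_edit(harness, failures)             # propose an edit
    after = run_evals(candidate, tasks)                     # test against evals
    regressed = [n for n in tasks if before[n] and not after[n]]
    helped = sum(after.values()) > sum(before.values())
    if helped and not regressed:                            # keep it if it helped
        return candidate, True
    return harness, False


# Toy eval set: one task wants a tabs rule, one wants the rules kept short.
TASKS = {
    "tabs": lambda h: "use tabs" in h["rules"],
    "short": lambda h: len(h["rules"]) <= 30,
}


def verbose_proposer(harness, failures):
    # Stand-in for an agent edit that fixes "tabs" but breaks "short".
    extra = "; always, without exception, use tabs"
    return {**harness, "rules": harness["rules"] + extra}


def terse_proposer(harness, failures):
    # Stand-in for an agent edit that fixes "tabs" without side effects.
    return {**harness, "rules": harness["rules"] + "; use tabs"}


start = {"rules": "be concise"}
rejected, kept_verbose = improve_once(start, TASKS, verbose_proposer)
accepted, kept_terse = improve_once(start, TASKS, terse_proposer)
```

Starting from `{"rules": "be concise"}`, the verbose proposal fixes the tabs task but breaks the length task, so the loop discards it and returns the original harness; the terse proposal passes both tasks and is kept. That is the "fixes one task, quietly breaks others" case being caught by the eval set rather than by you.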
Check yourself
The self-improving loop automates -
The agent edits its own scaffolding from execution feedback - the manual ratchet, run automatically. It does not retrain the model's weights, and it does not remove your judgement.
"The harness is the dataset" means -
Whoever captures the best execution trajectories builds the stronger data flywheel (Philipp Schmid). The self-improving loop is that flywheel, and Harness-as-a-Service is its productized form.
Automated harness evolution still needs -
It needs a yardstick (your eval set) and a guardrail so the loop can't make things worse - which is why it pairs with Lesson 4.5. It removes the repetitive part, not your judgement.
Do this now (5 min)
Name one fix you keep making by hand - the same correction you type over and over into your agent. Then sketch, in a couple of lines, how a hook plus your eval set could catch that failure and close it automatically the next time it happens - so you never type that correction again.
Go deeper
Primary source (read this): Addy Osmani - Agent Harness Engineering, on treating the harness as something that evolves from its own observed runs.
Wisdom (test it on people): the HumanLayer community - a good place to sanity-check whether an automated loop is really earning its keep, or just adding churn.