Chapter 1 · The Ratchet & the Practice Loop · Lesson 1.1

The Ratchet

The one habit that turns a coding-agent user into a harness engineer: every mistake becomes a permanent fix.

0.1-0.3 · Sprint Zero (models, harness, spec-driven dev)
1.1 · The ratchet - fix the harness, not the prompt
1.2 · CLAUDE.md as a pilot's checklist
1.3 · Hooks, not instructions
1.4 · Spec before code
1.5 · Context firewalls & subagents
1.6 · Cross-agent review

The shift

Sprint Zero showed you that the harness - everything wrapped around the model - is what decides your results. This lesson is the single most important habit for improving that harness. It's this: when the agent gets something wrong, you don't just re-prompt it - you change the harness so it can never get that thing wrong again.

"A decent model with a great harness beats a great model with a bad harness." Addy Osmani, Agent Harness Engineering

That is the ratchet: a mechanism that only ever tightens. A rule goes in after a real failure, and comes out only when a better model makes it unnecessary. You never add speculative rules "just in case", and you never loosen it by hand. Osmani's test for whether a rule has earned its place:

"Every line in a good AGENTS.md should be traceable back to a specific thing that went wrong." Addy Osmani

(An AGENTS.md - or CLAUDE.md - is just a rules file the agent reads on every turn. More on it in Lesson 1.2.) The payoff is that fixes compound: re-prompting fixes one run, but a ratcheted fix fixes every run after it, and survives restarts, new sessions, and teammates. As one guide puts it: "The model does not get smarter. The harness does. That is the flywheel."

Where a fix can live

A mistake doesn't always become a rule. Pick the cheapest durable home for it:

The three homes for a ratcheted fix

A rule in the agent's rules file (CLAUDE.md / AGENTS.md) - when the agent just needs to know something (a convention, a gotcha).
A hook - a small script that runs automatically - when knowing isn't enough and it must be enforced every time (run the tests after an edit; block a dangerous command). Instructions are followed ~70-90% of the time; a hook fires 100%. That's Lesson 1.3.
A reviewer subagent - a second agent that checks the work - when the check needs judgement a script can't make. That's Lesson 1.6.

Worked example

Say your agent keeps writing code that reads from the database but forgets to handle the "no results found" case, so the app crashes on an empty list. You could paste the crash back and let it patch that one spot. That's re-prompting - it fixes today's crash and nothing else.

The ratchet move, cheapest option first:

# A RULE (the agent just needs to KNOW):
- Every DB read must handle the empty-result case before using the data.

# or a HOOK (it must be ENFORCED every time):
#   after each edit, run the test suite; the "empty list" test fails
#   loudly if the case is missing, and the agent must fix it to proceed.

# or a REVIEWER subagent (needs judgement):
#   a second agent reads new data-access code and asks
#   "is every empty/error path handled?" before you approve.

Notice each option is worked backwards from a behaviour the model failed to deliver (from Lesson 0.2). That "name the behaviour" test is also how you stop the harness from bloating: if you can't name the failure a rule prevents, it hasn't earned its place.

Check yourself

The agent repeats a mistake. The ratchet says you should -

The ratchet only ever tightens: a real failure earns a durable fix (rule, hook, or reviewer) so it can't recur. Retrying fixes one run; swapping models is not the lever here.

Every line in a good rules file should be -

Rules are earned, not brainstormed. An unearned line just competes for the model's attention and dilutes the rules that matter (Lesson 1.2 goes deep on this).

A fix that must be enforced every single time belongs in -

Instructions are followed ~70-90% of the time; a hook runs 100%. Knowledge → rule, enforcement → hook, judgement → reviewer subagent.

Do this now (5 min)

Think of the coding agent you use most. Two quick recalls:

Name one mistake it has made more than once (forgot a case, wrong style, deleted something it shouldn't).
Write the single line that would prevent it - then decide its home: a rule (it just needs to know), a hook (must be enforced), or a reviewer (needs judgement)?

Bring it back and we'll pressure-test whether it's really earned - and whether you picked the right home.

I'm your teacher - ask freely. Not sure whether your example is a rule, a hook, or a reviewer job? Paste it and I'll help you place it. That back-and-forth is the lesson.

Go deeper

Primary source (read this): Addy Osmani - Agent Harness Engineering. The clearest single essay on the ratchet and working backwards from behaviour.

Wisdom (test it on people): the HumanLayer community argues hard for lean, earned rules files - a good place to get your ratcheted rules critiqued.

← 0.3 Spec-driven dev Course map Glossary Next: 1.2 · CLAUDE.md as a pilot's checklist