Chapter 1 · The Ratchet & the Practice Loop · Lesson 1.2

CLAUDE.md as a Pilot's Checklist

Keep the rules file lean and earned: every line earns its place, or it dilutes the ones that matter.

What a rules file is

Most coding agents read a plain rules file at the root of your project - CLAUDE.md for Claude Code, AGENTS.md for the vendor-neutral version. It's just markdown. The important part is when it's read: its contents get pasted into the model's system prompt on every single turn. So every line you put there is spent, again and again, on every request the agent handles. That's why it's precious real estate, not a junk drawer.

Map, not manual

The instinct is to treat this file like a style guide and pour in everything you'd tell a new hire. Resist it. Addy Osmani's framing is that a good rules file should read like a pilot's checklist - the short list of things that must be true before takeoff - not a manual you could read cover to cover. A pilot doesn't re-read how the engine works every flight; they run the terse checklist of things easy to forget and expensive to miss. Your rules file is the same: a map of the few things that trip the agent up, not a manual of everything you know. Osmani's test for whether a line belongs is strict:

"Every line in a good AGENTS.md should be traceable back to a specific thing that went wrong." - Addy Osmani, Agent Harness Engineering

Why short matters

This isn't tidiness for its own sake. Every rule you add competes for the model's attention with every other rule. As a rough working estimate, a model reliably follows only around 150-200 instructions at once, and the agent's own built-in system prompt already uses roughly 50 of those before you've written a word - so your real budget is closer to 100-150. Pad the file and the rule that actually matters gets crowded out by the ten that don't. One useful principle, the first rule from My Experiments With AI: a line earns its place only if "the agent would otherwise waste a cycle discovering it." If the model would already get it right, or figure it out cheaply on its own, the line is pure cost.
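The budget arithmetic can be written out as a tiny sketch. The numbers are the rough estimates quoted in this lesson, not measured values, and the names `RELIABLE_INSTRUCTIONS`, `BUILT_IN_INSTRUCTIONS` and `remaining_budget` are made up for illustration:

```python
# Rough instruction budget, using the lesson's estimates (assumptions, not measurements).
RELIABLE_INSTRUCTIONS = (150, 200)  # approximate range a model follows reliably
BUILT_IN_INSTRUCTIONS = 50          # approximate cost of the agent's own system prompt


def remaining_budget(reliable=RELIABLE_INSTRUCTIONS, built_in=BUILT_IN_INSTRUCTIONS):
    """Return the (low, high) number of instructions left for your own rules."""
    low, high = reliable
    return (low - built_in, high - built_in)


print(remaining_budget())  # (100, 150)
```

Whatever the true figures are for your agent, the shape of the sum is the same: the built-in prompt is subtracted before your first line is counted.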

Some teams put a number on it. Osmani notes that HumanLayer keeps its rules file under about 60 lines; the My Experiments With AI Substack suggests a ceiling of around 200 lines for a CLAUDE.md. Treat those as guardrails, not laws - the point isn't the exact number, it's that a ceiling forces you to spend the space on what's earned.
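If you want to check a file against a ceiling like these, a few lines of Python are enough. This is a minimal sketch: the helper names are hypothetical, and it assumes blank lines should not count toward the ceiling, which is a choice you may want to change.

```python
from typing import Tuple


def count_rule_lines(text: str) -> int:
    """Count non-blank lines (assumption: blank lines don't count toward the ceiling)."""
    return sum(1 for line in text.splitlines() if line.strip())


def check_ceiling(text: str, ceiling: int = 200) -> Tuple[int, bool]:
    """Return (line_count, within_ceiling) for the contents of a rules file."""
    count = count_rule_lines(text)
    return count, count <= ceiling


# A stand-in for the contents of a real rules file.
sample = """# Rules

- After editing any file, run the test suite before saying you are done.
"""

count, within = check_ceiling(sample, ceiling=60)
print(count, within)  # 2 True
```

To run it on a real file, read the text first, for example with `pathlib.Path("CLAUDE.md").read_text()`, and pass the result to `check_ceiling`.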

Tie it back to the ratchet

This is the same discipline as Lesson 1.1, pointed at one file. The ratchet says a rule goes in only after a real failure. So every line in your rules file should trace back to a specific thing the agent got wrong. Speculative "best practice" lines - added because they sound responsible, not because anything broke - are the enemy here: they take up attention budget and crowd out the lines you actually earned.

A quick contrast makes the difference obvious:

# BLOATED (vague, unearned) - the model already tries to do this:
- Write clean, readable, well-documented, idiomatic code
  following best practices.

# EARNED (specific, traceable to a failure):
- After editing any file, run the test suite before saying
  you are done. You have skipped this twice.

The first line names no failure, so the model can't act on it differently than it already does - it just spends attention. The second names an exact behaviour the agent failed to deliver, so it changes what happens on the next run. That's the whole test.
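A crude way to surface candidates like the bloated line is to search for phrases that tend to signal unearned advice. The sketch below is only a heuristic: the phrase list is an assumption chosen to match the example above, and a match means "look at this line", not "delete it". The real test is still whether you can name the failure.

```python
# Hypothetical phrase list; tune it to the filler that shows up in your own file.
VAGUE_PHRASES = ("best practices", "clean", "readable", "well-documented", "idiomatic")


def looks_vague(rule: str) -> bool:
    """Return True if the rule contains a phrase that often marks unearned advice."""
    lowered = rule.lower()
    return any(phrase in lowered for phrase in VAGUE_PHRASES)


bloated = "Write clean, readable, well-documented, idiomatic code following best practices."
earned = ("After editing any file, run the test suite before saying you are done. "
          "You have skipped this twice.")

print(looks_vague(bloated))  # True
print(looks_vague(earned))   # False
```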

How to keep it lean

Check yourself

A rules file like CLAUDE.md is -

The file's contents go into the model's system prompt on every turn, so every line is spent again and again. That's exactly why space in it is precious and each line must be earned.

Each extra line in the file -

Models reliably follow only ~150-200 instructions, and the agent's own system prompt already uses ~50. Padding the file crowds out the rules that matter. Note that a line in the rules file is never run as a script; that only happens if it's a hook (Lesson 1.3).

A line earns its place by -

Every line should be traceable to a specific thing that went wrong. Best-practice-sounding lines the model already follows just spend attention budget without changing behaviour.

Do this now (5 min)

Open the rules file for the agent you use most (CLAUDE.md or AGENTS.md). Then:

  1. Count the lines. Note the number - are you near a sane ceiling, or well over?
  2. Find one vague, unearned rule to cut - a line you can't tie to anything that actually broke.
  3. Find one earned rule to keep - a line that traces to a real failure.

Decide each one by the same test: name the failure it prevents. If you can't name it, it goes.
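The "name the failure" test can also be sketched as a small data structure: pair each rule with the incident it prevents, and cut anything where that field is empty. The `Rule` class and `audit` function are hypothetical names for illustration, and the failure text is paraphrased from the earned example earlier in this lesson.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Rule:
    text: str
    failure: Optional[str]  # the specific thing that went wrong, or None if you can't name one


def audit(rules: List[Rule]) -> Tuple[List[Rule], List[Rule]]:
    """Split rules into (keep, cut): a rule is kept only if it names a failure."""
    keep = [r for r in rules if r.failure and r.failure.strip()]
    cut = [r for r in rules if not (r.failure and r.failure.strip())]
    return keep, cut


rules = [
    Rule("Write clean, readable, idiomatic code following best practices.", None),
    Rule("After editing any file, run the test suite before saying you are done.",
         "Agent skipped the test suite twice"),
]

keep, cut = audit(rules)
print(len(keep), len(cut))  # 1 1
```

Filling in the `failure` field by hand is the point of the exercise; the code only makes it obvious which lines have nothing to put there.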

Go deeper

Primary source (read this): Addy Osmani - Agent Harness Engineering. The source of the pilot's-checklist framing and the traceable-line test.

Wisdom (test it on people): the HumanLayer community argues hardest for lean, earned rules files - a good place to have yours pressure-tested.