Chapter 5 · Multi-Agent & Team Harnesses · Lesson 5.2

Orchestration Patterns

The win: once you have a real reason to run multiple agents (Lesson 5.1), a few named shapes cover almost every case.

What orchestration is

Orchestration is coordinating several agent runs deterministically: you, not the model, decide the shape of the work. Lesson 5.1 was about whether to reach for a second agent at all. This lesson assumes you already have a valid reason and answers the next question: what shape does the work take?

The good news for a newcomer: you don't have to invent the coordination each time. Almost every multi-agent task fits one of four named shapes. Pick the shape that matches the job, and the wiring falls out of it.

Four shapes

Read the task first, then match it to one of these:

Fan-out / parallel - split the work into independent subtasks, run them all at once, then gather the results. Good for breadth: cover many files or many topics in one pass.
Pipeline - each item flows through a series of stages (draft → test → review), and there is no barrier between items - one can be in "review" while the next is still in "draft".
Judge panel - several independent evaluators each look at one output and vote; the majority verdict survives. Good for the verify step.
Loop-until-done - keep going until you hit a target, or until K rounds in a row produce nothing new. The stop condition is a number you set, not a vibe.
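
To make the first two shapes concrete, here is a minimal sketch using Python's asyncio. It is an illustration under assumptions, not a reference implementation: stub_agent is a hypothetical stand-in for a real agent call, and the role and stage names are made up for the example.

```python
import asyncio


async def stub_agent(role: str, task: str) -> str:
    """Hypothetical stand-in for one agent run; a real harness would call a model here."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"{role}({task})"


async def fan_out(subtasks: list[str]) -> list[str]:
    """Fan-out: run independent subtasks at once, then gather results in input order."""
    return await asyncio.gather(*(stub_agent("worker", t) for t in subtasks))


async def pipeline(items: list[str], stages: list[str]) -> list[str]:
    """Pipeline: each item passes through every stage; items never wait for each other."""

    async def run_item(item: str) -> str:
        for stage in stages:
            item = await stub_agent(stage, item)
        return item

    # One task per item, so one item can be in "review" while another is in "draft".
    return await asyncio.gather(*(run_item(i) for i in items))


if __name__ == "__main__":
    print(asyncio.run(fan_out(["auth.py", "db.py"])))
    print(asyncio.run(pipeline(["a", "b"], ["draft", "test", "review"])))
```

Note that the shape lives in fan_out and pipeline, which are plain code. The agents only fill in what happens inside each call.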

Separate the maker from the judge

The moment you scale up, one rule matters more than the rest: keep the thing that makes apart from the thing that judges. A model grading its own work is the failure you are trying to design out.

"Keep the agent that generates separate from the agent that evaluates - a generator / evaluator split, or 'GANs for prose'." (Addy Osmani, Agent Harness Engineering)

The judge panel above is this rule made concrete: the panel that scores an output is never the agent that wrote it.
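
A judge panel can be sketched in a few lines of Python. Everything named here is hypothetical: generate stands in for the maker, and the three judges are toy deterministic checks standing in for evaluator agents. Treating a tie as a fail is an assumption of this sketch, not something the lesson prescribes.

```python
from collections import Counter
from typing import Callable

Judge = Callable[[str], bool]  # True means "pass"


def panel_verdict(output: str, judges: list[Judge]) -> bool:
    """Return True only if a strict majority of judges pass the output."""
    if not judges:
        raise ValueError("a panel needs at least one judge")
    votes = Counter(judge(output) for judge in judges)
    return votes[True] > len(judges) / 2  # a tie counts as a fail (assumption)


def generate(task: str) -> str:
    """Hypothetical maker; it never appears in the list of judges."""
    return f"draft for {task}"


# Toy judges: deterministic checks standing in for separate evaluator agents.
def not_empty(output: str) -> bool:
    return bool(output.strip())


def mentions_task(output: str) -> bool:
    return "login" in output


def short_enough(output: str) -> bool:
    return len(output) < 10


if __name__ == "__main__":
    draft = generate("login")
    print(panel_verdict(draft, [not_empty, mentions_task, short_enough]))
```

The maker and the panel only meet through the output string, which is the separation the rule asks for.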

Diversity beats redundancy

When you build that verify step, do not just clone one judge three times - three copies of the same judge share the same blind spots and miss the same bugs. A panel of different lenses catches far more: one judge asks "is it correct?", another "is it secure?", a third "does it actually run?". Different questions, different failures caught.
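
The difference between clones and lenses shows up even in a toy example. In this sketch each lens is a hypothetical, deliberately simple check (real lenses would be evaluator agents), and the candidate is a made-up snippet that parses and defines the right function but uses eval.

```python
from typing import Callable

Lens = Callable[[str], list[str]]  # returns findings; an empty list means "nothing found"


def lens_runs(code: str) -> list[str]:
    """Toy 'does it run?' lens: only checks that the code compiles."""
    try:
        compile(code, "<candidate>", "exec")
    except SyntaxError as exc:
        return [f"does not parse: {exc.msg}"]
    return []


def lens_secure(code: str) -> list[str]:
    """Toy 'is it secure?' lens: flags one risky call."""
    return ["uses eval()"] if "eval(" in code else []


def lens_correct(code: str) -> list[str]:
    """Toy 'is it correct?' lens: checks the expected function is defined."""
    return [] if "def add(" in code else ["add() is missing"]


def findings(code: str, lenses: list[Lens]) -> set[str]:
    """Union of everything the panel found."""
    return {finding for lens in lenses for finding in lens(code)}


CANDIDATE = 'def add(a, b):\n    return eval(f"{a}+{b}")\n'

if __name__ == "__main__":
    print(findings(CANDIDATE, [lens_runs, lens_runs, lens_runs]))  # three clones
    print(findings(CANDIDATE, [lens_runs, lens_secure, lens_correct]))  # three lenses
```

The cloned panel reports nothing, because all three copies ask the same question. The mixed panel surfaces the eval call that the clones could never see.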

Each worker and each judge is a subagent - a child with a narrow job, its own clean window, and a short, condensed handoff back to whoever is coordinating. That is what keeps a panel of five from flooding the main thread with noise.
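
One way to picture the condensed handoff is as a small, fixed-size record. This is a sketch under assumptions: the Handoff fields, the 200-character cap, and the choice to keep only the last non-empty line of the transcript are all invented for illustration. A real subagent would usually write its own summary.

```python
from dataclasses import dataclass

MAX_SUMMARY_CHARS = 200  # assumed cap; choose what suits your coordinator


@dataclass(frozen=True)
class Handoff:
    """The only thing the coordinator sees from a subagent."""
    agent: str
    verdict: str
    summary: str


def condense(agent: str, verdict: str, transcript: str) -> Handoff:
    """Drop the working transcript; keep the last non-empty line, capped in length."""
    lines = [line.strip() for line in transcript.splitlines() if line.strip()]
    last = lines[-1] if lines else ""
    return Handoff(agent, verdict, last[:MAX_SUMMARY_CHARS])


if __name__ == "__main__":
    transcript = "read file\n\nchecked inputs\nFound eval() on user input in add()\n"
    print(condense("security-judge", "fail", transcript))
```

The subagent's whole working window stays in the child. Only the three fields travel back to the main thread.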

Keep the control flow deterministic

Here is the part newcomers get backwards. The control flow - the loops, the fan-out, the counting of votes, the stop condition - lives in your orchestration, written down as plain logic. It is not something you ask a model to improvise mid-run. Let the model decide "should I fan out to three agents now, or loop again?" and you have traded a predictable machine for a slot machine. The agents do the thinking inside each box; the boxes and the arrows between them are yours to set in advance.
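
Loop-until-done is the clearest case of control flow written as plain logic. In this minimal sketch the stop condition is two numbers, a target count and K stale rounds, and run_round is a hypothetical callback standing in for whatever the agents do in one round. The max_rounds hard cap is an extra safety assumption added here, not part of the lesson.

```python
from typing import Callable, Iterable


def loop_until_done(
    run_round: Callable[[int], Iterable[str]],
    target: int,
    stale_limit: int,
    max_rounds: int = 50,  # assumed hard cap so the loop always ends
) -> set[str]:
    """Repeat rounds until `target` results exist or `stale_limit` rounds add nothing new."""
    found: set[str] = set()
    stale = 0
    for round_no in range(max_rounds):
        new = set(run_round(round_no)) - found
        found |= new
        stale = 0 if new else stale + 1  # count consecutive empty rounds
        if len(found) >= target or stale >= stale_limit:
            break
    return found


# Scripted rounds standing in for agent output, so the run is repeatable.
SCRIPT = [["a", "b"], ["b"], ["c"], [], [], ["d"]]


def scripted_round(round_no: int) -> list[str]:
    return SCRIPT[round_no] if round_no < len(SCRIPT) else []


if __name__ == "__main__":
    print(sorted(loop_until_done(scripted_round, target=10, stale_limit=2)))
```

With stale_limit=2 the loop stops after the two empty rounds and never reaches "d". The decision to stop is made by the counter, not by asking a model whether it feels finished.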

Check yourself

Which best defines orchestration?

Orchestration is coordinating several runs deterministically - you decide the shape (fan-out, pipeline, judge panel, loop-until-done). It is not one model improvising and not just piling on agents.

Splitting independent subtasks to run at once is -

Splitting work into independent parts that run simultaneously and are gathered afterwards is fan-out. A pipeline is for staged flow; a judge panel is for verifying one output; loop-until-done is for repeating toward a target.

A judge panel works best using -

Diversity beats redundancy: judges with different lenses (correct, secure, does-it-run) catch more than clones of one judge, which share the same blind spots. And the judge is never the agent that made the work.

Do this now (5 min)

Take one real task you'd hand to more than one agent and pick its shape. Ask:

Does it split into independent parts? That's fan-out.
Does each item pass through stages? That's a pipeline.
Does the output need verifying? That's a judge panel.
Does it repeat until you hit a target or stop finding anything new? That's loop-until-done.

Then sketch the agents on paper: what each one does, and the one short thing each returns. If you can't say what each returns, the shape isn't clear yet.

Go deeper

Primary source (read this): Addy Osmani - Agent Harness Engineering. Where the generator / evaluator split ("GANs for prose") and deterministic coordination come from.

Wisdom (test it on people): the HumanLayer community - practitioners trading real orchestration shapes that held up in production.
