Chapter 0 · Sprint Zero · Lesson 0.1

The Model Landscape

Where the AI models stand in 2026 - and why their sameness is the whole reason this course exists.

Two words first

Every AI coding tool is powered by a large language model (LLM) - the "brain" that predicts text and code. There are two kinds, split by how you get them:

The only two categories you need

Closed / frontier models - the "best available", locked behind a company's API. You send text, you get text back; you never hold the model. Examples in 2026: Claude Opus 4.x, GPT-5.x, Gemini 3.x. Frontier just means "at the leading edge of capability."
Open-weight models - the model's actual weights are published, so anyone can download and run them. Examples in 2026: DeepSeek, Qwen, Kimi, GLM, Mistral.

The gap between them is nearly gone

For years, closed frontier models were clearly ahead. In 2026 that lead shrank to almost nothing. The single clearest data point: DeepSeek V4-Pro scores 80.6% on SWE-bench Verified - within 0.2 points of Claude Opus 4.6 - and it is released under the MIT licence, so it is free to download and use, per MindStudio. (SWE-bench Verified is a standard test of fixing real GitHub bugs.)

Open-weight models now lag the best closed models by roughly 3 months - the smallest gap ever measured - and it's not just one model: DeepSeek, Qwen, Kimi, GLM and Mistral all reached near-frontier quality at about the same time. (Source: Epoch AI data, via MindStudio.)

The top of the leaderboard has compressed too: on coding, the gap between the #1 and #10 model fell from 11.9% to 5.4% in a single year, per Kilo. When the tenth-best model is within a few points of the best, "which model" stops being the interesting question.
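
To make the compression concrete, here is a small worked calculation using only the two Kilo figures quoted above. It treats both numbers as percentage-point spreads, which is an assumption about how the source measures them; the variable names are purely illustrative.

```python
# Figures quoted in this lesson (per Kilo); nothing here is newly measured.
# Assumption: both values are percentage-point spreads on the same coding benchmark.
gap_last_year = 11.9  # spread between the #1 and #10 model, one year earlier
gap_this_year = 5.4   # the same spread now

absolute_drop = gap_last_year - gap_this_year   # how many points the spread lost
relative_drop = absolute_drop / gap_last_year   # that loss as a share of the old spread

print(f"Spread shrank by {absolute_drop:.1f} points ({relative_drop:.0%} smaller)")
# prints: Spread shrank by 6.5 points (55% smaller)
```

In other words, under that reading the distance between first and tenth place more than halved in a year.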

Narrowed, not closed. Be honest about the caveat: closed models still lead on the hardest agentic tasks (long, multi-step autonomous work) and broad general knowledge, and the frontier keeps moving - Anthropic and OpenAI shipped new models in the middle of the 2026 open-weight wave, per Can Demir. The gap compressed; it didn't vanish.

Why this is the hook for everything else

Here is the payoff, and the reason this course opens here. When many models are all roughly as capable, the model stops being what sets your results apart. The differentiator moves to everything wrapped around the model - the prompts, tools, memory, and checks. That wrapper has a name: the harness. That's Lesson 0.2, and the rest of the course.
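
A minimal sketch of that idea, ahead of Lesson 0.2: the model is reduced to "a function from prompt to reply", and everything else (prompt, memory, a check, a retry loop) lives in the wrapper. Every name here (`Harness`, `closed_model_stub`, `open_model_stub`, `defines_add`) is hypothetical and invented for illustration; the two "models" are offline stand-ins, not real API clients, and tools are left out to keep it short.

```python
import ast
from dataclasses import dataclass, field
from typing import Callable, List, Optional

# A "model" is any function from prompt text to reply text.
# These two stubs are hypothetical stand-ins so the sketch runs offline.
Model = Callable[[str], str]

def closed_model_stub(prompt: str) -> str:
    return "def add(a, b): return a + b"

def open_model_stub(prompt: str) -> str:
    return "def add(a, b):\n    return a + b"

def defines_add(code: str) -> bool:
    """Check: the reply parses as Python and defines a top-level function `add`."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    return any(isinstance(n, ast.FunctionDef) and n.name == "add" for n in tree.body)

@dataclass
class Harness:
    """The wrapper: prompt, memory, and a check, independent of which model runs."""
    system_prompt: str
    check: Callable[[str], bool]
    memory: List[str] = field(default_factory=list)
    max_attempts: int = 2

    def run(self, model: Model, task: str) -> Optional[str]:
        for _ in range(self.max_attempts):
            prompt = "\n".join([self.system_prompt, *self.memory, task])
            reply = model(prompt)
            if self.check(reply):
                return reply  # accepted: the check passed
            # Remember the failure so the next attempt sees it.
            self.memory.append(f"Previous attempt failed the check: {reply!r}")
        return None  # no attempt passed the check

harness = Harness(system_prompt="Write Python only.", check=defines_add)
for name, model in [("closed", closed_model_stub), ("open", open_model_stub)]:
    result = harness.run(model, "Write add(a, b).")
    print(name, "passed" if result is not None else "failed")
```

Both stubs pass the same check through the same wrapper, which is the point: swapping the model changes one argument, while the prompt, memory, and check are where the design effort goes.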

Check yourself

How does an "open-weight" model differ from a "closed" model?

Open-weight means the model's weights are published, so you can download and run it yourself. Closed models live behind an API. The difference is about access, not raw intelligence.

In 2026, what has happened to the open-vs-closed capability gap?

It shrank to roughly 3 months and a few benchmark points on coding and math, but closed models still lead on the hardest agentic and generalist tasks. Narrowed, not closed.

Because models have converged, what do your results now depend on most?

When many models are roughly equal, the wrapper around the model - prompts, tools, memory, checks - becomes the real differentiator. That wrapper is the harness (Lesson 0.2).

Do this now (3 min)

Open any AI coding tool's model picker (Claude Code, Cursor, Aider, ChatGPT - whatever you use). Notice you can usually swap between a closed model and an open one for the same task. Ask yourself: for a normal task, would the choice actually change the outcome much? Hold that thought - Lesson 0.2 shows what would change it.

Go deeper

Primary source: "The Open-Source Models Closing the Frontier Gap" - a plain-English tour of the 2026 convergence with the key numbers.

Wisdom (a place to see it debated): r/LocalLLaMA - the community that lives and breathes open-weight models.
