Thinking in code

There is a story about software development that has become dominant in the last two years. You decide what you want, you write a plan, and a system implements it. The plan is the thinking; the code is the artifact. For a class of problems this is true and useful. For another class — the one I want to talk about — the arrow points the other way. The code is the thinking. The plan is the residue.

What the spec leaves out

A spec can stay internally consistent while being externally incoherent because prose tolerates contradictions that types do not. You can write "the system processes inputs and routes them to the right handler" and the sentence reads correctly. It is also empty. It does not say what an input is, which handlers exist, what "right" means, or how the routing is decided. The sentence is well-formed; the design is not.

Code does not allow that. A function signature is the cheapest form of commitment a software design can make — it has to name a thing, list its parts, and say what comes back. A class is a claim about which fields belong together and which behaviors live in the same room. A folder layout is a claim about which concepts are siblings. None of these are visible in prose, and none of them survive being skipped.

The first time I usually notice this is when I sit down to write a function I thought I understood and find I cannot name its arguments. I do not know whether the second parameter is a string or a record. I do not know whether the response includes the original request or only the new fields. The spec did not tell me, because the spec never had to. Prose let me keep both interpretations open at once. Code does not.

That is what specs are quietly leaving out: the commitments that only exist once a syntax forces them. You can spend a week refining a spec without making any of them. The first hour of typing usually makes a dozen.

Why typing is doing work

Typing the code surfaces decisions the prose was skipping because constrained syntax forces a choice the unconstrained syntax did not. Names are the most obvious version of this. When I name a function, I am claiming what it does. When I name a parameter, I am claiming what kind of thing flows through it. When I name a type, I am claiming where the boundaries of the concept are. Every name is a small commitment, and the commitments compose into a model.

I spent an afternoon last year writing what I thought was a clean spec for a workflow before I had touched the data. Three pages of prose. The audience, the inputs, the outputs, the steps. It read well. It also passed three reviewers. When I started typing it, two things happened in the first hour. Three of the entities I had described separately turned out to be the same thing under different names — the spec had introduced them in different paragraphs and never noticed. One of them, which I had described as a single concept, turned out to be two — a request and an event, identical-looking at the document level, but with different lifecycles that the types refused to share.

None of that was a typing mistake. The contradictions were already in the spec. They survived because prose does not check for them, and the reviewers were reading for clarity, not for closure. The compiler is not just a syntax checker. It is the first reader I have ever met that refuses to fill in gaps.

The ambiguity asymmetry

Spec-first workflows feel less ambiguous than code-first workflows because prose hides ambiguity inside well-formed sentences while code exposes ambiguity as a missing branch or a name that does not resolve. The feeling is backwards from the substance. The spec feels finished because it has no errors; it has no errors because it has no compiler. The code feels rough because everything not yet decided shows up as something that does not compile. The rough version is the honest one.

This holds at the human end. It holds harder at the model end. A model reading a spec fills the ambiguity in with the most plausible reading and hands back something that looks correct, because the most plausible reading is exactly what its training optimizes for. A model writing code against a typed contract cannot fill in the same way — the contract refuses some readings outright, and the wrong output is visible in a way the spec version was not. The constraint travels into the inference. I have written about a related shape of this in prompts as pipelines: constraints decay with proximity, but they decay much less when the constraint is mechanical instead of textual. A type signature is not a request that the model behave a certain way. It is a shape the output has to fit into, or it does not fit at all.

That is the asymmetry the spec-first framing misses. It treats prose and code as two notations for the same thinking and picks the one that feels lighter. They are not two notations. They are two pressures, and only one of them pushes back.

Where the orchestrator model breaks

The orchestrator framing breaks on ambiguous problems because it assumes the thinking has already happened upstream of the typing. The implementation step is then a transcription job that can be handed off — to a junior, to an agent, to anything that can produce code from instructions. The framing works when the assumption holds. For a known input mapped to a known output by a routine I have written ten times before, the spec really does close, and the agent really is doing transcription. I delegate those happily, and I will keep delegating them.

The framing breaks when the typing was the medium of the thinking. The pattern I watch for is small and specific. A feature seems clear in conversation. It gets vaguer when I write it down in a planning doc. It only becomes specific when I sit down to write the function and find I have to name the second argument. The naming is what made it specific. If I had handed the feature off at the planning-doc stage, the second argument would have been named anyway — plausibly, smoothly, and wrong, because the right name was not in the document. It was downstream of work that had not happened yet, and the only place that work could happen was in the typing.

The honest move is to notice when typing is doing the thinking and stay in the medium that is doing the work. I covered the inverse case in judgement-shaped problems — for inputs an integration can close around, the spec is enough and the orchestrator framing is fine. This post names the other side. The two shapes need different stacks. Sending one through the other's pipeline wastes both.

Not every problem is like this. Some inputs really do close at the spec; for those, delegating implementation is a clean win and I take it. The mistake is using one framing for both. Before I let the typing be done somewhere else, I want to know which medium the thinking is in. If it is in the code, the code stays with me.