Prompts as pipelines
There is a common misconception that prompt engineering is the craft of writing better prompts. Most of the advice online — the tactics, the lists of techniques, the templates — takes the single prompt as the unit of work. After two years of writing, rewriting, and living with the prompts I actually use, I have come to disagree. The unit of work is not the prompt. It is the pipeline.
The single-prompt trap
The first prompt I ever kept in a file was a thousand words long. It told a model how to turn a rough topic into a finished article: define the audience, pick the angle, draft a headline, write the piece, check the facts, tighten the prose. It was exhaustive. It was also unreliable. The model would ignore the fact-check step when the draft got interesting. It would forget the audience by the time it reached the conclusion. It would quietly invent statistics it had been explicitly told not to invent.
This happens because instructions in a long prompt do not keep their weight. Each rule you add dilutes the ones before it. The model is not reading your prompt as a checklist; it is reading it as evidence of what kind of reply you want. The twelfth constraint lands softer than the first, and the first lands softer than it did before the twelfth arrived. You can keep adding rules, but past a certain length, you are no longer teaching — you are wishing.
What a pipeline gives you
My current setup is five separate prompts. Research. Strategy. Writing. Audit. Distribution. Each one has its own inputs, its own output, and its own narrow job. Research runs first and feeds verified facts forward so the later phases never have to invent what they should already know. The strategy prompt never sees the body text; the writing prompt never sees the distribution plan. This is not an aesthetic choice. It is a reliability choice.
A pipeline works because the context of each phase only contains what that phase needs. Strategy thinks about the reader and the angle. Writing thinks about the outline and the voice. Audit thinks about weak sentences and unverified claims. When a phase fails, it fails on its own terms, in a bounded way, and I can see exactly what went wrong — because the only thing in the room is the phase that broke.
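The phase split above can be sketched as plain functions whose signatures are their contexts: a phase cannot see what it is not passed. Everything here is illustrative, not the author's actual files; `run_prompt` is a hypothetical stand-in for whatever model call you use.

```python
# A minimal sketch of the five-phase pipeline. Each phase is a function,
# and its parameter list *is* its context: writing never sees the
# distribution plan because it is never passed one.

def run_prompt(prompt_name: str, **inputs) -> str:
    """Hypothetical placeholder for a model call; returns the phase's output."""
    return f"[{prompt_name} output for {sorted(inputs)}]"

def research(topic: str) -> str:
    return run_prompt("research", topic=topic)               # verified facts only

def strategy(topic: str, facts: str) -> str:
    return run_prompt("strategy", topic=topic, facts=facts)  # reader and angle

def writing(angle: str, facts: str) -> str:
    return run_prompt("writing", angle=angle, facts=facts)   # never sees the plan

def audit(draft: str, facts: str) -> str:
    return run_prompt("audit", draft=draft, facts=facts)     # claims vs. facts

def distribution(final: str) -> str:
    return run_prompt("distribution", final=final)

facts = research("prompt pipelines")
angle = strategy("prompt pipelines", facts)
draft = writing(angle, facts)
final = audit(draft, facts)
plan  = distribution(final)
```

The point of the sketch is what each function *cannot* reach, not what it does: a failure in `audit` can only involve the draft and the facts, because those are the only things in the room.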
The checkpoint is the artifact
Between each phase I stop and read what came out. I approve it, rerun it, or rewrite it by hand. Nothing advances until I say so. This is the part most agent frameworks want to remove. Auto-chain the phases, they say. Let the model decide when it is done. I have tried it. It produces output faster and worse.
Errors in phase N compound into phase N+1 because each phase trusts its inputs. If the strategy phase picks a weak angle and nothing stops it, the writing phase faithfully drafts a thousand words in service of a bad idea. The audit that follows then reads those thousand words against criteria that assume the angle was chosen well, and misses the root problem entirely. The checkpoint is not a courtesy. It is the only thing between a fixable drift and a finished piece that has to be thrown away.
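The checkpoint itself can be made mechanical: nothing advances without an explicit verdict. A sketch under assumed names — `phase` re-runs the model call, and `decide` stands in for the human, returning approve, rerun, or a hand-edited replacement; neither is a real framework API.

```python
# A human checkpoint between phases. Nothing advances until the
# verdict is "approve"; "rerun" retries the same phase with the same
# inputs, and "edit" substitutes the human's hand-edited text.

from typing import Callable, Optional, Tuple

Verdict = Tuple[str, Optional[str]]  # ("approve" | "rerun" | "edit", replacement)

def checkpoint(phase: Callable[[], str],
               decide: Callable[[str], Verdict]) -> str:
    """Run a phase, then block on a human decision before returning."""
    output = phase()
    while True:
        verdict, replacement = decide(output)
        if verdict == "approve":
            return output
        output = phase() if verdict == "rerun" else replacement

# Simulated session: the first attempt is rejected, the second is
# hand-tightened, and only the tightened version is approved.
attempts = iter(["draft v1", "draft v2"])
verdicts = iter([("rerun", None), ("edit", "draft v2, tightened"), ("approve", None)])
result = checkpoint(lambda: next(attempts), lambda _: next(verdicts))
```

Auto-chaining is exactly this loop with `decide` hard-coded to return approve on the first pass, which is why it is faster and worse.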
The prompts in my pipeline are valuable. The checkpoints between them are what make the prompts valuable. If someone copied the five prompt files without the four stops between them, they would have something that looks like my system and behaves like a worse one.
Constraints that travel
Here is something I did not expect. The rule "Do not fabricate data" sits at the top of the writing prompt, in its own section, at the start of a short file. It used to sit at the top of my mega-prompt: same position, same words. Different behavior. In the pipeline, the model does not fabricate. In the mega-prompt, it did.
I think this is because constraints decay with distance. A rule stated once, at the start of a long document, competes with everything that follows it for the model's attention. The further the model reads, the more the document as a whole becomes the signal, and the opening constraint becomes one of many voices in a noisy room. In a short, focused prompt there is no noisy room. The constraint is still there when the model stops reading, because the model never left its neighborhood.
This is the part I did not know until I had run it both ways. You can write the same instruction in the same words, and have it enforced in one version and ignored in another, purely because of where it sits. Short prompts are not just easier to write. They are the substrate that makes constraints hold.
What this changes on Monday
If you have been rewriting the same prompt for the fourth time this week, stop. You have probably confused two problems. One is that the prompt is unclear; the other is that the prompt is trying to do too much. The fix for the first is more words. The fix for the second is fewer words, and more prompts — because a single prompt cannot hold two responsibilities without leaking one into the other. Try splitting it at the first natural seam, often between deciding and doing, and see whether each half behaves better on its own. My guess is that it will.
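Splitting at the deciding/doing seam can be as literal as two prompt strings instead of one, with the decision's output passed explicitly to the doing step. The prompt texts below are illustrative sketches, not the author's actual files, and `model` is a hypothetical stand-in for a model call.

```python
# One overloaded prompt, split at its natural seam: decide, then do.
# The deciding prompt outputs only the decision; the doing prompt
# receives that decision as input and nothing about how it was made.

DECIDE_PROMPT = (
    "You are choosing an angle for an article.\n"
    "Topic: {topic}\n"
    "Output one sentence naming the angle and the intended reader."
)

DO_PROMPT = (
    "You are drafting an article.\n"
    "Angle (already decided, do not revisit it): {angle}\n"
    "Write the draft."
)

def split_pipeline(topic: str, model) -> str:
    angle = model(DECIDE_PROMPT.format(topic=topic))
    return model(DO_PROMPT.format(angle=angle))

# A fake model that echoes the second line of whatever prompt it gets,
# just to show the decision flowing into the doing step.
echo_model = lambda prompt: prompt.splitlines()[1]
draft = split_pipeline("widget pricing", echo_model)
```

Note what the split buys you: the doing prompt can now say "do not revisit the decision," a constraint that is unenforceable when deciding and doing share one context.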
The right question to ask about a prompt is not how good it is. It is what it is responsible for, and what happens between it and the next one. Prompts compose. They do not concatenate.