The Lexicon Method: Containing Code Drift in Agent-Built Software

Anyone building production software with an AI coding agent hits the same quiet failure. The agent produces something almost correct, the gap is small enough to ship, and over many sessions those gaps compound until the product no longer matches its own plan. That's code drift, and more context won't fix it. Structure will.

Anyone building production software with an AI coding agent runs into the same quiet failure. You start a fresh session, the agent reads the codebase, and it produces something that is almost correct. It doesn't error. It simply isn't the thing that was decided. The gap is small enough to ship, and over many sessions those small gaps compound until the product no longer matches its own plan. That divergence is code drift, and it's the central tax on building this way.

The common response is to give the agent more context, on the theory that it forgot something. Wrong diagnosis. Drift isn't a memory failure. It's a truth failure: the agent can't reliably distinguish what was decided from what merely exists in the codebase, and adding context doesn't resolve that ambiguity. Structure does. But structure alone isn't enough either, and the honest version of this method says so up front.

A note on the word "containing." An earlier draft of this method claimed to eliminate drift. That claim doesn't survive contact with a real long-running system. Structure removes the ambiguity drift feeds on; it doesn't make the agent's instruction-following deterministic, it doesn't make the source of truth self-verifying, and it doesn't survive context compaction on its own. What structure does, paired with the enforcement and verification machinery below, is contain drift: catch it early, make it visible, and stop it from silently rewriting the record. That's the achievable goal, and it's worth a lot.

What follows is a complete method built from five parts: three documents, one reconciliation rule, an enforcement layer, a verification loop, and a lightweight naming lexicon. The same structure that contains drift also reduces token cost, because the agent stops loading knowledge it doesn't need.

Why agents drift

An AI agent has no inherent notion of authority among the text it reads. A stale code comment, an abandoned draft, the current implementation, and the original specification all arrive as equally valid input. When they conflict, and in any evolving project they eventually will, the agent has to pick a winner. It tends to favor whatever it saw most recently, because recency is the only ranking signal available when nothing else marks one source as authoritative.

Every drift problem traces back to that blind spot, but it expresses itself through four distinct channels, and a method that closes only the first will leak through the other three:

Authority ambiguity. The agent can't tell decided from existing. The document structure closes this one.
Instruction decay. Even a perfectly clear rulebook is followed probabilistically. Instructions stated once at session start fade under context pressure, and an agent deep in a long task will skip the "consult the docs first" step exactly when it matters most. The enforcement layer closes this one.
Truth-source rot. The source of truth is itself a document, maintained at least partly by the agent, and it accumulates its own divergence from the codebase it describes. A wiki nobody reconciles against reality becomes one more stale input. The verification loop closes this one.
Mid-session amnesia. Long sessions compact their own context. An agent that read the relevant decision document at hour one may, after compaction, be working from a lossy summary of it at hour three. Re-grounding closes this one.

The remedy for the first channel is structural: make the distinction between decided and existing impossible to overlook. The remedies for the other three are mechanical, and they're what separates a method that works in an essay from one that works in a repository.

The three-document structure

Project knowledge gets divided into three layers, each with a distinct owner, purpose, and rule governing when it may change. The method depends on never letting those layers blur together.

The wiki: what was decided

The wiki is the source of truth. It holds the product requirements and the architecture decisions, the record of what was decided and why, in the spirit of Architectural Decision Records. Its governing rule: decision content is append-only. When a decision changes, the existing document's content is never rewritten. A new document gets added that supersedes it.

This rule exists to defeat circularity, the most damaging failure mode in agent-maintained documentation. If the agent is allowed to rewrite the specification to match what it built, the specification stops being an independent check and becomes a mirror, dutifully recording the agent's mistakes as though they were decisions. Append-only forecloses that path, because frozen intent can't be silently revised. The agent may only add, and additions are dated and reviewable in a way that in-place edits are not. A nice secondary benefit: the wiki becomes a complete, chronological decision history.

One mechanical refinement, because the naive version of this rule contradicts itself. If supersession means renaming the old file to something like _superseded-whatever.md, then superseding requires mutating an existing file. Renames break inbound links, churn git history, and depend on the agent reliably performing a step with no forcing function. The cleaner implementation keeps filenames immutable and records status in exactly one of two places:

Frontmatter. Each wiki document carries a status: field (current, superseded-by: , archived, deprecated, draft). Supersession means adding the new file and flipping one metadata line in the old one. A change to status, never to content. The distinction is enforceable in review: a supersession commit touches one new file plus one frontmatter line, and anything beyond that is a violation.
Or a maintained index. INDEX.md lists every document with its status and supersession chain, and is the single mutable file in the wiki.

Either way, "newest non-archived wins" stops depending on the agent inferring recency from filenames. The current document is the one the metadata says is current, and the supersession chain is explicit and walkable.

The status document: what's true right now

The status document is present reality. Structured around three tenses, where the work is going, where it is, and where it just was, it reports current state and is expected to change continuously. It serves two functions: it's where divergence from the wiki first becomes visible, and it's the cheapest re-entry point for a fresh session, since those three tenses orient a cold agent in seconds.

Be clear-eyed about what the status document is: it's agent-maintained, which means it's the least trustworthy layer, not a second source of truth. It inherits every reliability problem the wiki was protected from. The method's trust ordering is explicit: human-authored decisions first, then the codebase itself, then the status document. The status document's job is to be a fast, cheap, fallible orientation aid whose claims are checkable against the other two. A status document that says a feature is done is a pointer to go look, not a fact.

The agent file: the rulebook

The central agent file is the rulebook. The agent reads it at the start of every session, but, critically, it does not automatically read the wiki. If all knowledge lives in the wiki and nothing directs the agent to it, the agent benefits only by chance. The agent file is therefore deliberately thin, functioning as an index and a set of rules: it states that the wiki exists, specifies which portions to consult before particular work, and defines how to behave when sources conflict. This matches the guidance in Anthropic's own agentic coding best practices: the agent file is for durable, high-leverage instructions, not for knowledge dumps.

What the agent file cannot do, and an earlier draft of this method quietly assumed it could, is guarantee its own application. That's the enforcement layer's job.

The reconciliation rule

When the status document and the wiki disagree, the wiki prevails, without exception. If reality has drifted from the specification, that's a condition to be flagged and corrected, not a justification for rewriting the specification. The failure mode to guard against is an agent that resolves the discrepancy by updating the wiki to match what was built. This looks orderly but it's drift in disguise, silently redirecting the source of truth from what was decided to what the agent last did.

The rule is enforced through precise wording in the agent file. An instruction to align reality to the wiki and flag divergences produces fundamentally different behavior than an instruction to keep the wiki reflecting the latest changes. The method permits exactly one exception: when the plan genuinely changes, the change is introduced as a new, deliberately authored document that supersedes the prior one. A human-authored decision supersedes the wiki; the agent's own reasoning never qualifies.

The provenance problem

That last sentence hides a hole the earlier draft never addressed: how does anyone, the agent included, know a document is human-authored? Files carry no authorship. An agent maintaining the repository can write a file that looks exactly like a human decision, and the rule as stated is an honor-system guard against the one party the method says can't be trusted with the record.

The fix is to anchor provenance in something the agent can't forge:

Git authorship as the floor. Decision documents are committed by the human, from the human's identity, ideally in commits containing nothing else. git log --format='%an %ae' -- path/to/adr.md is then a checkable fact, not a claim in the file. If agent commits carry a distinct author or a co-author trailer (and they should), the distinction is mechanical.
A human-only directory as the stronger form. Decisions live under a path the agent's tooling is forbidden, by hook, not by instruction, from writing to. The agent can draft a proposed ADR anywhere else and surface it; only the human moves it into the decisions directory. Authorship becomes structural: nothing in that directory can have been self-authorized.
An approval marker as the lightweight middle. Each decision document carries an explicit approved-by line the human adds, and the agent file instructs the agent to treat unapproved decision documents as drafts. Weaker than the directory gate, but combined with git authorship it makes forgery visible in review.

Pick the strongest form your tooling supports. The principle is constant: "human-authored" must be a property the system can verify, not a property a document asserts about itself.

The enforcement layer

Here's the uncomfortable fact the document structure can't fix: instructions are probabilistic. The agent file can say "consult the wiki before architectural work" in the clearest prose ever written, and a model deep in a long, tool-heavy session will sometimes proceed without doing it. Not out of defiance. Attention over a crowded context is lossy, and the rulebook read four hundred tool calls ago has faded exactly like every other early input. A method whose integrity depends on the agent reliably remembering to apply the method is a method that drifts.

The systems that survive long-running agent work all converge on the same answer: move the load-bearing rules out of the agent's memory and into the harness, deterministic machinery that fires regardless of what the model is attending to. In Claude Code these are hooks; every serious agent harness has an equivalent.

Session-start injection. A hook that fires when a session opens and injects the wiki pointer, the reconciliation rule, and the status document location directly into context. Every session, unconditionally. The agent doesn't need to remember the wiki exists if the harness reminds it before the first user message lands.
Per-prompt routing. A hook on each user prompt that prepends routing guidance: read the wiki index first, identify the affected modules, open only the files the relevant wiki pages point to. This converts "the agent should scope through the wiki" from a hope into a per-turn nudge that arrives fresh, after any compaction, at full strength.
Write gates. Pre-commit or pre-tool-use hooks that enforce the rules instructions can only request: block content edits to non-draft wiki documents (allowing the one frontmatter status line), block agent writes to the human-only decisions directory entirely, and require that commits touching architecture also touch, or explicitly defer, the corresponding wiki page.
Divergence prompts. A gate that, when implementation files governed by a current ADR are modified, asks the one question that matters: does this change conform to the decision, or does it need a superseding document first?

The division of labor is clean. Instructions handle judgment: how to reconcile, what counts as divergence, when history is worth reading. Hooks handle invariants: the things that must happen every time and must never happen at all. Anything in the agent file that, if skipped once, silently corrupts the record is in the wrong layer. Promote it to a hook. The agent file proposes; the harness disposes.

The verification loop

The earlier draft treated the wiki as a fixed point that only reality could diverge from. Experience with real agent-maintained documentation says otherwise: the wiki itself rots. Incremental updates, each individually reasonable, each nudged by a hook at the end of some session, accumulate into a document set that no longer matches the filesystem it describes. Files get moved and the wiki keeps the old paths. Conventions evolve and the wiki records the previous one. A decision gets implemented at 90% and the wiki, updated optimistically, says done. None of these are the circularity failure; the append-only rule was followed. The truth source just quietly stopped being true.

The lesson generalizes: a source-of-truth document must be reconciled against live state, not merely maintained by incremental edits. Each edit is made against the editor's belief about the system. Beliefs drift. Therefore documents maintained only incrementally drift, with perfect inevitability.

So the method requires a scheduled audit, weekly, or after any large autonomous run:

Walk every current wiki document and check its claims against the codebase. Do the file paths exist? Do the entry points still have those names? Is the described data flow the actual data flow? This is mechanical work, ideal for a delegated agent whose only job is comparison. It makes no fixes; it produces a discrepancy report.
Classify each discrepancy. Either the code drifted from a decision (the reconciliation rule applies: flag it, align code to wiki or escalate for a superseding decision) or the wiki's descriptive content is stale while the decision still holds (update the description; describing reality accurately is not the forbidden circularity, rewriting intent to match reality is).
Audit the status document against both. Anything marked done gets spot-checked. Anything in progress gets confirmed as actually in progress.

That distinction in step two deserves a sentence, because it's the line between maintenance and corruption. A wiki page has two kinds of content: normative (what we decided, append-only, supersession only) and descriptive (where things live, what they're called, updatable freely, because its job is accuracy). The audit updates descriptions; it never touches decisions. When the two are tangled in one page, the audit is the moment to untangle them.

The verification loop is what makes the wiki's authority earned rather than declared. The reconciliation rule says the wiki wins conflicts; the audit is why it deserves to.

Mid-session re-grounding

Everything above guards the boundaries between sessions. Drift also happens inside one. Long sessions compact their own context: the harness summarizes earlier conversation to make room, and summaries are lossy precisely where decisions are dense. An agent that carefully read the relevant ADR early in a session may, hours and hundreds of tool calls later, be working from a paraphrase of a summary of it. And a paraphrase of a constraint is how constraints get violated politely.

The countermeasure is cheap and specific: re-read the deciding document at the point of consequence, not just at the point of orientation. The agent file states it as a standing rule (before implementing against a decision, re-open the decision) and the per-prompt hook reinforces it, because hook-injected context arrives after any compaction and is therefore always at full fidelity. The decision document the agent read at session start is a memory; the one it re-reads before writing the code is a fact. Prefer facts at the moment of action.

This is also why the status document's three-tense structure earns its keep twice. It's the re-entry point for a fresh session, and it's the re-entry point for the same session after compaction, when the agent is functionally fresh whether it knows it or not.

The naming lexicon

The token savings of this method come from a single observation: cost is driven less by what you type than by what the agent re-reads every session. The most expensive routine action an agent takes is reading an entire knowledge base to understand the platform, loading large amounts of context irrelevant to the immediate task.

The lexicon addresses this by making a document's status and relevance legible before its contents are loaded. With frontmatter-based status, the index is the filter: the agent reads INDEX.md (or the frontmatter lines alone, which cost a few tokens per file), determines what's current and task-relevant, and loads only that. Filename prefixes can serve as a redundant visual layer for the humans browsing the directory, but the machine-readable status lives in metadata, where changing it doesn't break links.

This is the only form of shorthand worth adopting. Compressing prompts into terse symbolic notation degrades output and generates correction cycles that erase any savings. Effective shorthand compresses what the agent re-reads, not what the operator types. A status line and an index entry yield savings on every session for the life of the project.

One honesty note: an earlier draft called session starts "nearly free." Treat the savings as a directional claim until measured on your own project. The right comparison is tokens-to-orientation with and without the index, which is a half-hour experiment. The mechanism is sound (skip-before-read beats read-then-discard by construction); the magnitude is yours to measure.

Result

With the full method in place, beginning a fresh session is cheap and, more importantly, grounded. The harness injects the rulebook and pointers unconditionally. The agent reads the status document and orients in seconds, scans the index, skips everything superseded or archived, and loads only the current files relevant to the task. Before implementing against a decision, it re-opens the decision. When sources conflict, the trust ordering is explicit and the forbidden resolution is mechanically blocked, not merely discouraged. And on schedule, the source of truth itself gets audited against the code it claims to describe, so its authority stays earned.

Drift is not eliminated. Nothing eliminates it while instruction-following is probabilistic and documents are written by fallible parties on both sides of the keyboard. But each of its four channels now has a dedicated countermeasure: structure closes authority ambiguity, hooks close instruction decay, the audit closes truth-source rot, and re-grounding closes mid-session amnesia. What remains is drift that gets caught, flagged at the divergence gate, surfaced in the audit, visible in the supersession chain, instead of drift that compounds silently for months.

The broader principle stands, sharpened: durable advantage lives in the system rather than the model, and the system is more than documents. The model will keep improving on its own. What it will never do on its own is understand a specific codebase, the decisions behind it, the difference between the two, and apply that understanding reliably on the four-hundredth tool call of an autonomous run. The documents encode the understanding. The machinery makes it reliable. Both remain the operator's to build.

Reference: a starting lexicon

Status lives in frontmatter (machine-readable, link-stable); the index mirrors it for one-glance scanning. Filename prefixes are an optional human-facing redundancy.

|--------|-------|---------|----------------|

Governing rules

The wiki is the permanent source of truth. Decision content is append-only: never rewrite a decision; supersede it with a new document and flip one status line (or one index entry). Descriptive content (paths, names, locations) may be corrected freely; accuracy is its job.
When the status document or the code conflicts with the wiki, align reality to the wiki. Never update a decision to match what was built.
If implementation diverges from a current ADR, stop and flag it. Do not resolve it unilaterally.
Only a verifiably human-authored decision document supersedes the wiki, verified by git authorship, a human-only decisions directory, or an approval marker, never by the document's own assertion. The agent's reasoning never qualifies; the agent may draft proposals, the human promotes them.
On session start: read the agent file, the status document, then only the current wiki files relevant to the task. Do not load the full wiki to understand the platform.
Before implementing against a decision, re-read the decision document at that moment. A constraint recalled from earlier context (or from a compaction summary) is not a constraint read.
Trust ordering when sources disagree: human-authored decisions, then the codebase, then the status document. The status document is an orientation aid, not a source of truth.
Any rule whose silent omission corrupts the record belongs in the harness, not the rulebook: enforce session-start injection, wiki-write gates, and the decisions-directory boundary with hooks, and keep judgment calls in the agent file.
On a schedule (and after any large autonomous run), audit the wiki against the live codebase. Classify each discrepancy as code-drifted-from-decision (apply rule 2) or stale description (correct it). The wiki's authority over conflicts is earned by this audit, not declared.