That post is useful. The part that stuck with me was not just how to configure Claude Code well. It was what happens after the configuration starts working.
Agent configuration is not documentation in the normal sense. A CLAUDE.md, AGENTS.md, skill file, hook, MCP config, repo map, permission rule, or local workflow note is not just context for future humans.
It changes behavior.
A small instruction like this is not a comment. It changes what work the agent considers complete.
## Validation
Before marking work complete:
- run bun run typecheck
- run bun run --filter @aicrier/web build
- include any failing command and error text in the final note
Do not say the change is done until validation has run or the blocker is explicit.
That tells the agent where to look, what to ignore, what commands prove work, what files are dangerous, what patterns are expected, and what kind of answer is acceptable.
Agent Config Is Technical Debt — AICrier
At that point, agent config is closer to production infrastructure than README prose.
And infrastructure rots.
The Failure Mode Is Subtle
Bad agent config usually does not fail like a broken build.
It shows up as drift in behavior.
The agent gets cautious around a file that is no longer risky.
It keeps running an old test command because that was once the approved path.
It treats a temporary workaround as a permanent architecture rule.
## Temporary migration note
Until the auth migration is finished, do not edit src/auth/session.ts.
Use src/auth/session-v2.ts for all new work.
Added: 2026-02-12
Owner: unassigned
Remove after: migration complete
That might be reasonable for a week. Six months later, it is just unowned behavior.
The harness loads every task with global context because specialized knowledge was never moved closer to the code it governs. It avoids a better approach because an older model needed a guardrail the current model no longer needs.
None of this looks catastrophic in one run. It just makes the system slightly slower, more hesitant, and more likely to preserve old assumptions.
That is what makes it easy to miss.
Humans skim stale docs. Agents obey stale instructions.
Better Models Can Make This Worse
The easy assumption is that stronger models make the harness matter less.
I think the opposite is more likely.
A weak model ignores a lot of nuance because it cannot use it. A stronger model can follow the system you built for it. If the system is clean, that is useful. If the system is full of old assumptions, dead rituals, and cargo-culted rules, the better model may execute that debt more reliably.
The config that helped in March can be the constraint that hurts in June.
## Model workaround
For the previous coding model:
Always write a full implementation plan before editing files.
Never make a one-line fix without first explaining the whole subsystem.
Added for: prior model behavior
Review when: default coding model changes
That kind of rule may have been useful for one model, one team, or one failure pattern. It should not automatically survive a model upgrade.
A rule that said "never touch this module" may have made sense before a refactor. A warning about brittle tests may be wrong after the test harness was fixed. A long global instruction about API boundaries may belong in one package, not every agent run across the repo.
The agent does not know which rules are current unless the system gives it a way to tell.
Who Owns The Harness?
Most teams are going to under-own this.
They will add an agent instruction file, wire up a few tools, create some prompt snippets, and then treat the whole thing as setup work. Done once. Occasionally appended to. Rarely pruned.
That is how technical debt forms.
The ownership question is practical, not philosophical.
Who removes stale instructions? Who retires temporary workarounds? Who moves narrow rules out of global context? Who checks whether a model still needs a guardrail? Who reviews behavior-changing config with the same care as any other dependency?
If the answer is nobody, the harness becomes an unmanaged layer of behavior. The agent will still use it, even when the team no longer understands why it is there.
Can AI Help Maintain It?
Yes, but this is exactly where the boundary matters.
AI can help find harness debt. It can scan instruction files for duplicated rules, stale references, broken commands, old package paths, obsolete warnings, and conflicting instructions. It can compare the current repo shape against the assumptions in the config. It can notice that a command no longer exists, that a workflow points at a deleted directory, or that three different files are giving the agent slightly different versions of the same rule.
That is useful work.
I would not let the same system silently rewrite the rules that govern its own behavior.
The safer pattern is closer to dependency maintenance. Let the agent propose harness changes as reviewable diffs. Let it explain why a rule looks stale, what evidence it found, what behavior might change, and how to validate the change. Then a human reviews the patch like any other behavior-affecting change.
There is also a useful middle ground: use agents to generate harness audits, not harness edits.
## Harness audit task
Review CLAUDE.md, AGENTS.md, .agents/skills, hooks, and MCP config.
Find:
- stale file paths
- commands that no longer exist
- duplicated rules
- rules that only apply to one package
- temporary workarounds with no owner or removal condition
- model-specific guidance that predates the current model
Return findings and proposed diffs.
Do not edit files.
A good audit can answer questions like:
Which global rules are actually local?
Which instructions mention files, commands, or services that no longer exist?
Which rules were added as temporary workarounds and never removed?
Which instructions are repeated across multiple control files?
Which parts of the harness changed after a model upgrade?
Which validation commands are still trusted but no longer prove much?
That turns the problem from vague maintenance into a concrete review queue.
So yes, AI can help with the problem. It probably should. But the agent should not be the unchecked owner of the instructions that define its own authority. It can inspect, propose, summarize evidence, and make the boring cleanup cheaper.
Ownership still belongs to the engineering team.
Format Is Part Of The Interface
There is no universal agent-file format. That does not make format unimportant. It means format is part of the harness contract.
A human-maintained instruction file should usually be Markdown because engineers can read it, review it, diff it, and keep it close to the repo.
## Repo rules
- Do not push without explicit user instruction.
- Prefer existing helper APIs over new abstractions.
- When changing shared behavior, add or update focused tests.
When a prompt mixes instructions, context, examples, and variable input, more explicit boundaries can help. Anthropic's Claude guidance recommends XML-style tags for that reason. Tags like instructions, context, and input tell the model which text is rule, which text is background, and which text is the thing being worked on.
<task>
Audit the payment package for stale agent instructions.
</task>
<context>
The billing migration finished in April.
Any instruction that routes new work through legacy-billing is suspect.
</context>
<output>
Return findings with file paths, evidence, and proposed changes.
</output>
For settings a tool has to consume, use real structured data. Do not make the model infer a command table from prose if TOML, JSON, or YAML would make the contract explicit.
[tools.typecheck]
command = "bun run typecheck"
owner = "platform"
[tools.web_build]
command = "bun run --filter @aicrier/web build"
owner = "web"
The practical rule is simple: choose the least structured format that is still reviewable, explicit, and supported by the tool that consumes it.
Human guidance should stay readable. Machine config should be parseable. Examples should be clearly fenced. Temporary migration notes should be marked as temporary. Anything that affects runtime behavior should be treated as an interface, not just prose.
Making the config tiny is not the goal. Large repos need context. Agents need maps. They need conventions, commands, boundaries, and examples.
The useful standard is to treat the harness as living infrastructure.
A few rules I would start with:
Put global rules on a diet.Global instructions should be rare, durable, and genuinely cross-cutting. If a rule only matters for one package, one workflow, or one kind of task, move it closer to that place.
## Bad global rule
Before editing any UI, inspect the notes editor drag-and-drop behavior.
That is not a repo rule. It is a notes editor rule. Put it near the notes editor workflow instead.
Give temporary rules an expiration date.If the instruction exists because of a current bug, migration, broken test suite, or model weakness, mark it as temporary. Otherwise temporary becomes architecture.
Separate policy from procedure.Keep policy global and keep procedure local. A rule like "do not push without permission" belongs in the top-level harness because it should apply everywhere.
## Global policy
Never push to git without explicit user instruction.
"commit" means commit only.
"push" or "commit and push" means pushing is allowed.
A workflow like "run this command before editing the notes page" belongs next to that workflow, with an owner and a reason it exists.
## Notes editor procedure
Applies to: apps/web/src/app/admin/notes
Before changing image upload behavior:
- verify pasted images
- verify drag-and-drop images
- verify preview and write mode render the same image size
If a procedure changes often, breaks often, or only applies to one part of the repo, do not bury it in global agent instructions.
Delete cleverness first.The more clever a workaround is, the more likely it becomes invisible debt. Hooks, aliases, prompt tricks, and special-case skills need extra suspicion.
Review the harness when the model changes.A model upgrade is not just a capability upgrade. It changes what instructions are useful. Some guardrails become unnecessary. Some vague instructions become newly dangerous because the model can now act on them more effectively.
Keep proof paths explicit.The agent should know how work gets proven: tests, builds, screenshots, logs, review surfaces. But those proof paths need maintenance too. A stale validation command is worse than no command because it creates fake confidence.
The Cost Moves To Coherence
The hard part is not getting agents to write more code. That part is getting cheaper.
The hard part is keeping the system coherent while more code, prompts, hooks, rules, and workflows are being produced.
That is less glamorous than another demo of an agent generating a feature from a sentence. But it is probably where a lot of the real leverage is.
Once agents can make changes quickly, the expensive part is not generation.
The expensive part is keeping the system coherent after generation gets cheap.
Agent config is one of the places that coherence either survives or quietly decays.