Text rules don't compile into agent behavior. The evidence is in the wild. We named the pattern and built the fix.
What your agent knows. System prompts, CLAUDE.md, .cursorrules, memory stores.
What your agent actually does. Compliance, error avoidance, learned corrections.
.cursorrules says "never use var." Agent uses var. Corrected. Tomorrow, var again.CLAUDE.md says "always use the v2 API." Agent calls v1. Added in bold. v1 again.These aren't prompting failures. They're compilation failures.
OpenAI has begun calling this pattern harness engineering. Calx builds the behavioral governance layer inside that pattern, cross-runtime, for everyone else.
Not our test. A practitioner on the Cursor forums ran this across three context lengths, nine formatting rules, three runs each. Same rules. Same agent. The only variable was the delivery mechanism.
We did not run this. We found it. It is exactly the pattern our own longitudinal data keeps producing, in a tighter setup than we could have built from the outside. We cite it. We did not author it.
Separate from the controlled test above. A 43-day longitudinal observation of real human-AI collaboration across a working codebase. First-party data. Ours.
Recency-priority encoding (RPE) means instructions at the top of context lose weight as the conversation grows. Rules written first are forgotten first.
At 500 instructions, accuracy drops to 68%. The more rules you add, the less likely any single rule is followed. Instruction density works against you.
Compliance does not degrade linearly. It holds until around the 40-50% context window threshold, then collapses. A cliff, not a slope.
237 rules transferred from one agent to another. The receiving agent made 44 new mistakes, 13 in categories the rules explicitly addressed.
237 rules transferred from one agent to another. The receiving agent made 44 new mistakes, 13 in categories the rules explicitly addressed.
Without human friction in the correction loop, agents accept instructions but fail to modify behavior. Compliance is performed, not enacted.
Names the pattern behind the practitioner test. Text instructions do not compile into agent behavior; structural enforcement does. The variable is the delivery mechanism, not the rule content.
Search "rules ignored" in the Cursor forums. Read the Claude Code GitHub issues about .cursorrules drift. The Compiler Gap is already documented by practitioners. We named it.