Every team has its own CLAUDE.md. Every agent has its own rules. You cannot tell which corrections keep firing. That is a behavioral-control problem, not a prompt problem.
You own the rollout. The rules live in three places. Some teams wrote their own wrappers. Compliance is whatever anyone remembers at the time.
When the agent does the wrong thing, someone corrects it. Maybe they write a new rule. Maybe they ping you. Maybe they just fix it silently and move on. The correction evaporates.
Next week, the agent makes the same mistake for a different engineer. The correction fires again. The rule is the same rule. Text rules do not compile into behavior.
What you are missing is not a better prompt file. It is a correction compiler. A place where recurring corrections become structural enforcement that survives across teams and sessions.
"Instructions are more like guidelines than actual rules. LLMs aren’t deterministic."
The people running the rollouts already know. Our Paper 3 evidence repo catalogs 70+ practitioner reports saying the same thing. Text rules do not compile into behavior.
Pick one team’s agent rollout. We instrument the correction surface for two weeks, cluster the recurrences, and hand you a behavioral control report with an enforcement plan.
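To make "cluster the recurrences" concrete, here is a minimal sketch in Python. It assumes corrections have already been logged as plain records. The `Correction` type, its fields, the `recurring` helper, and the sample log entries are all hypothetical illustrations, not the actual instrumentation. Grouping by normalized text is a deliberately crude stand-in for real clustering.

```python
import re
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """One logged correction (hypothetical schema for illustration)."""
    team: str
    session: str
    text: str


def normalize(text: str) -> str:
    """Lowercase, replace punctuation with spaces, collapse whitespace."""
    cleaned = re.sub(r"[^a-z0-9\s]", " ", text.lower())
    return " ".join(cleaned.split())


def recurring(corrections, min_count=2):
    """Group corrections by normalized text and keep those seen min_count+ times."""
    groups = defaultdict(list)
    for c in corrections:
        groups[normalize(c.text)].append(c)

    report = []
    for key, items in groups.items():
        if len(items) >= min_count:
            report.append({
                "correction": key,
                "count": len(items),
                "teams": sorted({c.team for c in items}),
            })
    # Most frequent first; ties broken alphabetically for stable output.
    report.sort(key=lambda r: (-r["count"], r["correction"]))
    return report


# Made-up sample log: the same correction fires in two teams.
log = [
    Correction("payments", "s1", "Don't edit generated files."),
    Correction("search", "s7", "don't edit generated files"),
    Correction("payments", "s9", "Run the linter before committing."),
]
report = recurring(log)
```

A correction that recurs across teams, like the first entry here, is the kind of candidate that would move from a text rule to structural enforcement.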