The research

Three papers. One thesis.

Text rules do not compile into agent behavior. We tested it. The data is public.

The Behavioral Plane: Correction Engineering in Human-AI Collaboration

0/9 text rules followed. 9/9 compiled rules enforced. A 43-day longitudinal study across 8 concurrent AI agents proving the compiler gap is real.

0/9 vs 9/9

Read on Zenodo ↗

Behavioral Stickiness: How Corrections Compound Over Time

Session-over-session persistence data show that compiled rules do not decay, while text-based rules degrade to zero within sessions.

151 corrections

Read on Zenodo ↗

The Compiler Gap: Why Text-Based Rules Fail in Agentic Systems

A synthesis paper drawing on 100+ sources to prove the compiler gap is a universal phenomenon across all AI agent systems, not an artifact of a single deployment.

100+ sources

Read on Zenodo ↗

The data

151 corrections. 43 days. 8 agents.

Not a demo. Not a benchmark. A longitudinal study of real human-AI collaboration.

Activity timeline: Week 1: Discovery · Week 2: Patterns · Week 3: Compilation · Week 4: Convergence

Legend: Human correction · System enforcement · No activity

0/9 text rules followed

9/9 compiled rules enforced

The variable was not the rule content. It was the delivery mechanism.

151 corrections captured

43 days of real usage

8 concurrent agents

3 published papers

The landscape

Four approaches to the problem.

Only one compiles.

Memory Systems

Store what agents know. Retrieval does not equal behavior change.

Prompt Engineering

Improve what agents read. Better prompts still get ignored.

Agent Frameworks

Orchestrate what agents do. Workflows without behavioral enforcement.

Behavioral Infrastructure

Compile corrections into structural rules. What you say becomes what agents do.

Calx lives here
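To make the distinction concrete, here is a minimal sketch of the difference between a rule delivered as text and a rule delivered as a structural check. It is an illustration only, not Calx's implementation and not code from the papers: the names `Action`, `CompiledRule`, `enforce`, and `execute`, and the force-push rule itself, are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Action:
    """A tool call an agent proposes (hypothetical shape)."""
    tool: str
    argument: str


@dataclass(frozen=True)
class CompiledRule:
    """A correction expressed as a predicate instead of a sentence."""
    name: str
    violates: Callable[[Action], bool]


# Text delivery: the rule is only words in a prompt. Nothing checks it.
TEXT_RULE = "Never force-push to main."


def _is_force_push_to_main(action: Action) -> bool:
    # Deliberately naive string matching, for illustration only.
    arg = action.argument
    return (
        action.tool == "git"
        and "push" in arg
        and "--force" in arg
        and "main" in arg
    )


# Compiled delivery: the same rule as a check that runs before execution.
RULES: List[CompiledRule] = [
    CompiledRule(name="no-force-push-to-main", violates=_is_force_push_to_main),
]


def enforce(action: Action, rules: List[CompiledRule] = RULES) -> List[str]:
    """Return the names of the rules this action violates."""
    return [rule.name for rule in rules if rule.violates(action)]


def execute(action: Action) -> str:
    """Run the action only if no compiled rule blocks it."""
    violations = enforce(action)
    if violations:
        return "blocked: " + ", ".join(violations)
    return f"ran: {action.tool} {action.argument}"
```

In this sketch the rule content is identical in both forms. Only the delivery differs: `TEXT_RULE` depends on the agent choosing to comply, while the entry in `RULES` is evaluated on every proposed action regardless of what the agent read.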

Put the research to work.

Book a Correction Audit

Read the consensus