Where supervisors work with agents and corrections get captured. Where recurrence becomes visible. Where compiled rules are promoted.
Bench is opinionated. Every loop exists to make correction visible, fast, and compilable.
When an agent produces the wrong thing, you fix it inline. Bench captures the delta between agent output and your edit. No re-prompting. No extra modal. No “add a rule.”
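To make the capture step concrete, here is a minimal sketch of computing a delta between an agent's output and a supervisor's inline edit. It uses Python's standard `difflib`. The function name `capture_delta` and the sample strings are hypothetical, and this does not describe how Bench stores corrections internally.

```python
import difflib


def capture_delta(agent_output: str, supervisor_edit: str) -> list[str]:
    """Return unified-diff lines between the agent output and the edit.

    Hypothetical helper for illustration; not Bench's actual API.
    """
    return list(
        difflib.unified_diff(
            agent_output.splitlines(),
            supervisor_edit.splitlines(),
            fromfile="agent_output",
            tofile="supervisor_edit",
            lineterm="",  # no trailing newlines on the header lines
        )
    )


# Illustrative data only.
delta = capture_delta(
    "Refund issued in 30 days.",
    "Refund issued in 5 business days.",
)
for line in delta:
    print(line)
```

The point of the sketch is that the edit itself is the signal: nothing beyond the before and after text is needed to record what changed.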
The correction is tagged with the operator identity, the workstream, and the class of output. That metadata is what makes recurrence detection work.
Three recurrences of the same corrected class meet the compile threshold. A promote card then surfaces with the cluster, the recurrence count, and a one-click action: compile this into runtime enforcement.
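The tagging and threshold logic above can be sketched roughly as follows. The `Correction` fields mirror the metadata named in the text (operator, workstream, class of output). Everything else is an assumption made for illustration: the field names, the choice to cluster by workstream and output class, and reading "three recurrences" as three captured corrections in the same cluster.

```python
from collections import defaultdict
from dataclasses import dataclass

COMPILE_THRESHOLD = 3  # from the text; the exact counting semantics are assumed


@dataclass(frozen=True)
class Correction:
    operator: str      # who made the correction
    workstream: str    # where it happened
    output_class: str  # what kind of output was corrected


def promote_candidates(corrections, threshold=COMPILE_THRESHOLD):
    """Group corrections into clusters and return those at or over threshold.

    Clustering key (workstream, output_class) is an assumption.
    Returns a dict mapping cluster key to recurrence count.
    """
    clusters = defaultdict(list)
    for c in corrections:
        clusters[(c.workstream, c.output_class)].append(c)
    return {
        key: len(items)
        for key, items in clusters.items()
        if len(items) >= threshold
    }


# Illustrative data only.
log = [
    Correction("ana", "support", "refund_window"),
    Correction("ben", "support", "refund_window"),
    Correction("ana", "support", "refund_window"),
    Correction("ana", "support", "tone"),
]
cards = promote_candidates(log)
print(cards)
```

In this sketch, each entry in `cards` would correspond to one promote card: a cluster plus its recurrence count. Clusters below the threshold surface nothing.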
Compilation is not automatic. You decide. Calx never silently rewrites what your agents can do. The rule is yours to promote, defer, or dismiss.
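As a rough sketch of that decision gate: a candidate rule only enters the enforced set on an explicit promote. The three outcomes come from the text. The `Decision` enum, `apply_decision`, and the rule identifiers are hypothetical names, not part of any published Calx interface.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    PROMOTE = "promote"
    DEFER = "defer"
    DISMISS = "dismiss"


def apply_decision(active_rules: set, rule_id: str,
                   decision: Optional[Decision]) -> set:
    """Return the enforced rule set after a supervisor decision.

    Only an explicit PROMOTE adds the rule. DEFER, DISMISS, or no
    decision at all (None) leave enforcement unchanged.
    """
    if decision is Decision.PROMOTE:
        return active_rules | {rule_id}
    return set(active_rules)


# Illustrative data only.
rules = {"no_pii_in_replies"}
after_promote = apply_decision(rules, "refund_window_5_days", Decision.PROMOTE)
after_silence = apply_decision(rules, "refund_window_5_days", None)
```

The default path does nothing, which is the behavior the text describes: no rule reaches your agents without a human choosing to promote it.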
Public availability follows proof. We do not ship software that has not earned its scope.
The audit gives us a baseline for your workflow. The design partnership that follows is where Bench and Tether ship to your team, with Spencer hands-on through the pilot.
Book a Correction Audit →