The Product · Bench

Bench.
The control surface.

Where supervisors work with agents and corrections get captured. Where recurrence becomes visible. Where compiled rules are promoted.

Not a chat app

Not a workspace

Not a better Cursor

The place where correction becomes fast and enforceable.

Mock · Bench · main screen · flow mode

Four loops, one surface

What Bench does that a workspace does not.

Bench is opinionated. Every loop exists to make correction visible, fast, and compilable.

Loop 01

Inline correction capture

Correct the agent's output the way you would edit prose. The correction is captured at the edit site, scoped to the operator.

Loop 02

Recurrence detection

Similar corrections cluster across sessions. When a cluster crosses the compilation threshold, a promote card surfaces.

Loop 03

Compilation moment

You promote the cluster into a Tether hook. The rule stops being text. It becomes middleware enforced in the harness.

Loop 04

Evidence ledger

Every violation the compiled rule prevents gets logged. The weekly report shows what would have shipped without Calx.
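As a rough illustration of Loops 03 and 04, a compiled rule can be pictured as a wrapper around an agent step that blocks a violating output and records the prevention. This is a minimal sketch, not Tether's actual hook interface: `enforce`, `Ledger`, `RuleViolation`, and the example rule are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


class RuleViolation(Exception):
    """Raised when an agent output breaks a compiled rule (hypothetical)."""


@dataclass
class Ledger:
    """Hypothetical evidence ledger: one entry per prevented violation."""
    entries: List[Dict[str, str]] = field(default_factory=list)

    def record(self, rule_id: str, output: str) -> None:
        self.entries.append({"rule": rule_id, "blocked_output": output})


def enforce(rule_id: str, violates: Callable[[str], bool], ledger: Ledger):
    """Wrap an agent step so the rule is checked on every output."""
    def middleware(agent_step: Callable[[str], str]) -> Callable[[str], str]:
        def wrapped(prompt: str) -> str:
            output = agent_step(prompt)
            if violates(output):
                # Log the prevented violation before blocking the output.
                ledger.record(rule_id, output)
                raise RuleViolation(rule_id)
            return output
        return wrapped
    return middleware
```

A weekly report in this picture would be a summary over `ledger.entries`: each entry is an output that would have shipped without the rule.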

Inline capture

Correction where it happens, not in a side panel.

When an agent produces the wrong thing, you fix it inline. Bench captures the delta between agent output and your edit. No re-prompting. No extra modal. No “add a rule.”

The correction is tagged with the operator identity, the workstream, and the class of output. That metadata is what makes recurrence detection work.

Captures at the edit site, not in a separate annotation UI
Classifies text, code, data, and structural edits differently
Never sends prompts or completions off-device without scoped consent
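To make the capture step concrete, here is a minimal sketch of recording the delta between an agent's output and the operator's edit, tagged with the metadata described above. It assumes a plain line diff from Python's standard `difflib`. The `Correction` record and `capture_correction` function are hypothetical names, not Bench's real data model.

```python
import difflib
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Correction:
    """Hypothetical record of one inline correction."""
    operator: str            # who made the edit
    workstream: str          # where the edit happened
    output_class: str        # e.g. "text", "code", "data", "structural"
    delta: Tuple[str, ...]   # unified diff lines, agent output vs. edit


def capture_correction(agent_output: str, edited: str, operator: str,
                       workstream: str, output_class: str) -> Correction:
    """Compute the diff at the edit site and attach the metadata."""
    diff = difflib.unified_diff(
        agent_output.splitlines(),
        edited.splitlines(),
        fromfile="agent",
        tofile="operator",
        lineterm="",
    )
    return Correction(operator, workstream, output_class, tuple(diff))
```

In this sketch nothing leaves the process: the diff is computed locally and only the resulting record is kept.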

Mock · Bench · inline edit capture

Mock · Bench · compilation moment

Compilation moment

The promote card is the product.

The compile threshold is three recurrences of the same correction class. When a cluster reaches it, a promote card surfaces with the cluster, the recurrence count, and a one-click action: compile this into runtime enforcement.

Compilation is not automatic. You decide. Calx never silently rewrites what your agents can do. The rule is yours to promote, defer, or dismiss.

User-controlled promotion, never silent compilation
The compiled rule is version-controlled and scoped to the operator or team
You can roll back to text-rule status at any time
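The threshold and the user-controlled decision can be sketched in a few lines. This is an illustration under simplifying assumptions: corrections are grouped by an exact (operator, correction class) key, whereas real similarity clustering would be fuzzier, and every name here (`PromoteCard`, `promote_cards`, `should_compile`) is hypothetical.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Tuple

COMPILE_THRESHOLD = 3  # three recurrences, as stated in the text above


class Decision(Enum):
    PROMOTE = "promote"
    DEFER = "defer"
    DISMISS = "dismiss"


@dataclass(frozen=True)
class PromoteCard:
    cluster_key: Tuple[str, str]  # (operator, correction class)
    recurrence_count: int


def promote_cards(corrections: Iterable[Tuple[str, str]]) -> List[PromoteCard]:
    """Surface a card for every cluster that reached the threshold."""
    counts = Counter(corrections)
    return [PromoteCard(key, n) for key, n in counts.items()
            if n >= COMPILE_THRESHOLD]


def should_compile(card: PromoteCard, decision: Decision) -> bool:
    """Compile only on an explicit promote; defer and dismiss do nothing."""
    return decision is Decision.PROMOTE
```

Note that reaching the threshold only produces a card. Nothing is compiled until `should_compile` sees an explicit `Decision.PROMOTE`.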

Access

Bench is shipping to design partners.

Public availability follows proof. We do not ship software that has not earned its scope.

The shortest path to Bench is a Correction Audit.

The audit gives us a baseline for your workflow. The design partnership that follows is where Bench and Tether ship to your team, with Spencer hands-on through the pilot.

Book a Correction Audit →

Current status: Design partners · inbound only

Running with: Engineering firm · recruiting agency · + 2

Next step: Audit → scoped pilot → partnership