The Product · Bench

Bench.
The control surface.

Where supervisors work with agents and corrections get captured. Where recurrence becomes visible. Where recurring corrections are promoted into compiled rules.

Not a chat app
Not a workspace
Not a better Cursor
The place where correction becomes fast and enforceable.

What Bench does that a workspace does not.

Bench is opinionated. Every loop exists to make correction visible, fast, and compilable.

Loop 01
Inline correction capture
Correct the agent's output the way you would edit prose. The correction is captured at the edit site, scoped to the operator.
Loop 02
Recurrence detection
Similar corrections cluster across sessions. When a cluster crosses the compilation threshold, a promote card surfaces.
Loop 03
Compilation moment
You promote the cluster into a Tether hook. The rule stops being text. It becomes middleware enforced in the harness.
Loop 04
Evidence ledger
Every violation the compiled rule prevents gets logged. The weekly report shows what would have shipped without Calx.
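
To make Loops 03 and 04 concrete, here is a minimal sketch of a rule acting as middleware that logs each violation it blocks. It is an illustration only, not the Tether hook API: compile_rule, EvidenceLedger, and the violates predicate are hypothetical names, and the real harness may work quite differently.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Callable, Optional


@dataclass
class EvidenceLedger:
    """Hypothetical ledger: one entry per violation a rule prevented."""
    entries: list = field(default_factory=list)

    def log(self, rule: str, blocked_output: str) -> None:
        self.entries.append({
            "rule": rule,
            "blocked_output": blocked_output,
            "at": datetime.now(timezone.utc).isoformat(),
        })


def compile_rule(name: str,
                 violates: Callable[[str], bool],
                 ledger: EvidenceLedger) -> Callable[[Callable[[], str]], Optional[str]]:
    """Turn a predicate into a hook that wraps an agent step (assumed shape)."""
    def hook(produce: Callable[[], str]) -> Optional[str]:
        output = produce()
        if violates(output):
            # The violation is blocked and recorded instead of shipping.
            ledger.log(name, output)
            return None
        return output
    return hook


# Usage: a made-up rule that blocks output containing leftover TODO markers.
ledger = EvidenceLedger()
no_todo = compile_rule("no-todo", lambda text: "TODO" in text, ledger)
clean = no_todo(lambda: "Invoice total: 42.00")
blocked = no_todo(lambda: "TODO: compute total")
```

In this sketch the ledger entries are what a weekly report would be built from: each one records which rule fired and what it stopped.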

Correction where it happens, not in a side panel.

When an agent produces the wrong thing, you fix it inline. Bench captures the delta between agent output and your edit. No re-prompting. No extra modal. No “add a rule.”

The correction is tagged with the operator identity, the workstream, and the class of output. That metadata is what makes recurrence detection work.

  • Captures at the edit site, not in a separate annotation UI
  • Classifies text, code, data, and structural edits differently
  • Never sends prompts or completions off-device without scoped consent
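
One way to picture the capture step is the sketch below. It assumes a line-level diff and uses Python's standard difflib; the Correction record and capture_correction function are hypothetical names for illustration, not Bench's actual data model.

```python
import difflib
from dataclasses import dataclass


@dataclass(frozen=True)
class Correction:
    """Hypothetical record: the delta plus the metadata named above."""
    operator: str
    workstream: str
    output_class: str  # assumed values: "text", "code", "data", "structural"
    delta: tuple


def capture_correction(agent_output: str, edited: str, operator: str,
                       workstream: str, output_class: str) -> Correction:
    """Keep only the lines that changed between agent output and the edit."""
    before = agent_output.splitlines()
    after = edited.splitlines()
    matcher = difflib.SequenceMatcher(a=before, b=after, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        delta.extend("-" + line for line in before[i1:i2])  # removed by the operator
        delta.extend("+" + line for line in after[j1:j2])   # written by the operator
    return Correction(operator, workstream, output_class, tuple(delta))


# Usage: the operator fixes one line inline; nothing else is recorded.
correction = capture_correction(
    agent_output="total = price * qty\nprint(total)",
    edited="total = round(price * qty, 2)\nprint(total)",
    operator="op-1",
    workstream="billing",
    output_class="code",
)
```

The unchanged line never enters the record, so what gets stored is the correction itself plus the tags that recurrence detection relies on.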

The promote card is the product.

The compile threshold is three recurrences of the same correction class. At that point a promote card surfaces with the cluster, the recurrence count, and a one-click action: compile this into runtime enforcement.

Compilation is not automatic. You decide. Calx never silently rewrites what your agents can do. The rule is yours to promote, defer, or dismiss.

  • User-controlled promotion, never silent compilation
  • The compiled rule is version-controlled and scoped to the operator or team
  • You can roll back to text-rule status at any time
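
The threshold and the promote, defer, or dismiss choice can be sketched as a small state tracker. This is a simplified illustration under stated assumptions: each correction is taken to arrive already labeled with its cluster, dismissing is assumed to reset the count, and RecurrenceTracker, PromoteCard, and Decision are hypothetical names rather than Bench internals.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum
from typing import Optional

COMPILE_THRESHOLD = 3  # three recurrences, as stated in the text


class Decision(Enum):
    PROMOTE = "promote"
    DEFER = "defer"
    DISMISS = "dismiss"


@dataclass(frozen=True)
class PromoteCard:
    cluster: str
    recurrence_count: int


class RecurrenceTracker:
    def __init__(self, threshold: int = COMPILE_THRESHOLD) -> None:
        self.threshold = threshold
        self.counts = Counter()
        self.compiled = set()

    def record(self, cluster: str) -> Optional[PromoteCard]:
        """Count a correction; surface a card at the threshold, never compile."""
        self.counts[cluster] += 1
        count = self.counts[cluster]
        if count >= self.threshold and cluster not in self.compiled:
            return PromoteCard(cluster, count)
        return None

    def decide(self, card: PromoteCard, decision: Decision) -> None:
        """Only an explicit user decision changes what is enforced."""
        if decision is Decision.PROMOTE:
            self.compiled.add(card.cluster)
        elif decision is Decision.DISMISS:
            self.counts[card.cluster] = 0  # assumption: dismissing resets the count
        # DEFER leaves everything as-is; the card resurfaces on the next recurrence.

    def roll_back(self, cluster: str) -> None:
        """Return a compiled rule to text-rule status."""
        self.compiled.discard(cluster)


# Usage: the card appears on the third recurrence and waits for a decision.
tracker = RecurrenceTracker()
cards = [tracker.record("currency-rounding") for _ in range(3)]
```

Note that record only ever returns a card. Nothing is added to the compiled set until decide is called with a promotion, which mirrors the "never silent compilation" guarantee.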

Bench is shipping to design partners.

Public availability follows proof. We do not ship software that has not earned its scope.

The shortest path to Bench is a Correction Audit.

The audit gives us a baseline for your workflow. The design partnership that follows is where Bench and Tether ship to your team, with Spencer hands-on through the pilot.

Book a Correction Audit →
Current status: Design partners · inbound only
Running with: Engineering firm · recruiting agency · + 2
Next step: Audit → scoped pilot → partnership
Platform: macOS desktop · Linux in beta