Address external review: fix release-blocking bin bug + close permission-drift blind spots#49
Merged
Merged
Conversation
Two trust-critical fixes flagged in external review. 1. npm installs the `scopetrail` bin as a symlink, so the entrypoint's `import.meta.url === process.argv[1]` main-module check was false when launched via `npx`/global install: main() never ran and the CLI exited 0 with no output -- a CI gate that silently passed every PR. Resolve both sides through realpathSync. Add a test that runs the built bin through a symlink (the existing CLI tests only ran `node dist/index.js` directly). 2. Codex sandbox/approval drift ranked a missing base value at -1, so a brand-new .codex/config.toml that merely set the narrowest posture (read-only sandbox, untrusted approval) was reported as a high-severity widening/weakening -- a false positive on the safest possible config. Anchor the missing baseline at Codex's safe default so only settings genuinely wider than that default surface; a brand-new danger-full-access sandbox or `never` approval still fires. Add unit tests for both Codex directions and a benign benchmark fixture (codex-baseline-narrowest); regenerate RESULTS.md (29 cases, precision/recall unchanged at 100%/0% FP) and sync the README corpus counts. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Implements the rest of the external review (the trust-critical bin/Codex pair landed in the previous commit). Detector blind spots: - #4 Merge all recognized MCP server maps instead of first-map-wins, so an empty `mcpServers: {}` can no longer shadow a populated `servers: {}` in Cursor/VS Code configs. - #3 Diff sensitive MCP fields (env, headers, cwd) on existing servers, which serverCommand()-based comparison ignored. A server keeping the same command but gaining a secret env var or auth header now surfaces; secret-bearing key names escalate to high. Applies to .mcp.json and Codex [mcp_servers.NAME]. Removals stay silent (narrowing, not widening). New kinds: scope_trail.mcp_server_sensitive_field_changed and scope_trail.codex_mcp_sensitive_field_changed. - #5 Model Claude hook entries by (matcher, type, command) rather than command alone, so rebinding a guard from one tool to another (same command) is caught. NOTE: this reverses the prior "matcher change is noise" decision and its test — a PreToolUse guard moved from Bash to Read stops guarding Bash, which is a real enforcement-surface change. - #7 Expand the critical removed-deny pattern set beyond .env/secret/ credential/.pem to SSH keys, *.key, cloud credentials, registry tokens, and kube configs. Docs & supply-chain: - #2/#8 Qualify the README "local-only" claim: the scanner uploads nothing, but the Action's setup step runs `npm ci` from the registry. - #9 Pin agent-gov-core to exact 1.3.0 (lockfile kept in sync) so npm consumers do not silently resolve a newer parser/schema. - #10 Note that the benchmark figures bound regressions, not real-world field accuracy. Each change ships unit tests and a labeled benchmark fixture. Corpus grows to 35 cases (27 rogue, 8 benign) across 21 detector kinds; precision/recall hold at 100%/0% false-positive rate, RESULTS.md regenerated and README counts synced. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
runDiff materialized the base snapshot, then the head snapshot, and only assigned `cleanup` after both succeeded. If head materialization threw — an unresolvable head ref (a shallow checkout missing it), or a max-buffer error on an oversized config — the base snapshot's temp dir was already on disk and leaked, since cleanup was never wired up. Clean the base snapshot explicitly before the error propagates. Found during a fresh correctness pass over the codebase (not part of the external review). The existing unresolvable-ref test fails on the *base* ref, before any dir exists, so it never exercised this path. Adds a regression test that asserts no scopetrail-snapshot temp dir survives a head-side failure. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Implements the full external review of ScopeTrail across two commits.
Trust-critical pair (first commit)
import.meta.url === process.argv[1]was false when thescopetrailbin was launched through an npm symlink (npx/global install), somain()never ran and the CLI exited 0 with no output — a CI gate silently passing every PR. Now resolves both sides throughrealpathSync. Added a test that runs the built bin through a symlink (existing CLI tests only rannode dist/index.jsdirectly).-1, so a brand-new.codex/config.tomlat the narrowest posture (read-onlysandbox,untrustedapproval) was reported as a high-severity "widening." The baseline now anchors at Codex's safe default; genuinely-wide new configs (danger-full-access,never) still fire.Detector blind spots (second commit)
mcpServers: {}no longer shadows a populatedservers: {}.env,headers,cwd) on existing servers. A server keeping its command but gaining a secret env var or auth header now surfaces; secret-bearing key names escalate to high. Covers.mcp.jsonand Codex[mcp_servers.NAME]. Removals stay silent (narrowing). New kinds:mcp_server_sensitive_field_changed,codex_mcp_sensitive_field_changed.(matcher, type, command), so rebinding a guard from one tool to another (same command) is caught.PreToolUseguard moved fromBashtoReadstops guardingBash— a real enforcement-surface change. Flagging if you'd rather keep the old behavior..env/secret/credential/.pemto SSH keys,*.key, cloud credentials, registry tokens, and kube configs.Docs & supply-chain
npm cifrom the registry.agent-gov-coreto exact1.3.0(lockfile kept in sync) so npm consumers don't silently resolve a newer parser/schema.Verification
--test); each finding ships a unit test.RESULTS.mdregenerated and README counts synced.dist/rebuilt (CI gates on it).🤖 Generated with Claude Code