fix(sandbox): Landlock bash sandbox never engaged; resolve bashMode lazily #61
Merged
…SDK path

The Landlock sandbox shipped in #60 never actually engaged at runtime. bash ran fully unsandboxed in prod (confirmed: the deployed agent could `ls /`, read sibling projects, and list /root/).

Root cause: bashMode was resolved in a `pi.on("session_start", …)` handler. session_start is only emitted from pi's `bindExtensions()`, which is called by the interactive/rpc/print *modes*. The server drives Pi via the bare SDK (createAgentSession + session.prompt) and never calls bindExtensions, so session_start never fires: the handler never ran and bashMode stayed at its initial "none" (unsandboxed). The extension's registerTool calls (scoped read/write, and the bash env injection) DO take effect because they run in the factory body, which is why per-turn proxy env worked while containment silently didn't. The old SandboxManager path had the same gate.

Fix: resolve bashMode lazily and memoized on first bash use, inside the bash tool's execute handler (which provably runs in prod, since proxy env injection depends on it). session_start still calls ensureBashMode() so interactive/rpc modes resolve eagerly and get a status line.

Also log the resolved mode (info/warn). ctx.ui is a no-op in the headless SDK path, so the sandbox state was invisible in prod; this makes it greppable and would have caught the regression.
The bug
The Landlock sandbox from #60 never actually engaged at runtime. On the deployed agent, bash ran fully unsandboxed: it could `ls /`, read sibling projects under the projects root, and list `/root/`. (Caught by a live sandbox-probe test against the deployed agent.)

Root cause
`bashMode` was resolved in a `pi.on("session_start", …)` handler. But `session_start` is only emitted from pi's `bindExtensions()`, which is called by the interactive/rpc/print modes. This server drives Pi via the bare SDK (`createAgentSession` + `session.prompt`) and never calls `bindExtensions`, so `session_start` never fires, the handler never runs, and `bashMode` stays at its initial `"none"` (unsandboxed).

What did work (the scoped `read`/`write` tools and the per-turn proxy-env injection) runs in the extension factory body, not in `session_start`. That's why proxy env worked in prod while containment silently didn't, and why it looked fine in local/interactive testing (where modes do call `bindExtensions`). The old `SandboxManager` path had the same gate, so bash was likely never sandboxed in the deployed server either.

Fix
- Resolve `bashMode` lazily and memoized on first bash use, inside the bash tool's `execute` handler, which provably runs in prod (the proxy-env injection everyone relies on depends on it). No lifecycle event required.
- `session_start` still calls `ensureBashMode()` so interactive/rpc modes resolve eagerly and keep the status line.
- Log the resolved mode (`info`/`warn`). `ctx.ui` is a no-op in the headless SDK path, which is exactly why this was invisible in prod; now it's greppable (`bash sandbox mode resolved … mode=landlock`) and would have caught this regression.

Verification plan
After deploy: grep prod logs for `bash sandbox mode resolved mode=landlock`, and re-run the live probe (`ls /` should now be `Permission denied`).

🤖 Generated with Claude Code
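For readers who want the shape of the fix without opening the diff, here is a minimal sketch of lazy, memoized mode resolution called from the tool's execute path. It is not the code in this PR. Only `bashMode` and `ensureBashMode` are names taken from the description; `createBashModeResolver`, `probe`, `Logger`, `executeBash`, and the exact log line format are assumptions made for illustration, and the real resolution may well be asynchronous.

```typescript
// Illustrative sketch only; not the PR's implementation.
type BashMode = "none" | "landlock";

// Hypothetical logger interface standing in for the server's real logger.
interface Logger {
  info(msg: string): void;
  warn(msg: string): void;
}

// Returns a resolver that runs `probe` at most once and caches its result.
// `probe` is a hypothetical stand-in for whatever detects Landlock support.
function createBashModeResolver(probe: () => BashMode, log: Logger): () => BashMode {
  let resolved: BashMode | undefined; // undefined means "not resolved yet"

  return function ensureBashMode(): BashMode {
    if (resolved === undefined) {
      resolved = probe();
      // Assumed log format, modeled on the string the verification plan greps for.
      const line = `bash sandbox mode resolved mode=${resolved}`;
      if (resolved === "none") {
        log.warn(line); // unsandboxed is the case worth alerting on
      } else {
        log.info(line);
      }
    }
    return resolved;
  };
}

// Demo wiring with an in-memory logger and a probe that counts its calls.
const lines: string[] = [];
let probeCalls = 0;
const ensureBashMode = createBashModeResolver(
  () => {
    probeCalls += 1;
    return "landlock";
  },
  {
    info: (m) => lines.push(`info ${m}`),
    warn: (m) => lines.push(`warn ${m}`),
  },
);

// Stand-in for the bash tool's execute handler: resolution happens here,
// on first use, so no lifecycle event has to fire beforehand.
function executeBash(command: string): string {
  const mode = ensureBashMode();
  return `[${mode}] ${command}`;
}

const first = executeBash("ls /");
const second = executeBash("pwd");
```

Because the resolver is invoked from the execute path itself, it works the same whether or not `session_start` ever fires. A `session_start` handler can still call the same `ensureBashMode()` to resolve eagerly; memoization keeps the probe and the log line from running twice.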