fix(codex): make agent home dirs vscode-owned so Codex can create state_*.sqlite #117
…te_*.sqlite

We stopped seeding Codex's state_*.sqlite index (commit 2eaf2b4), so Codex now creates it at startup instead of receiving it pre-uploaded. The create failed with a permission error because the directory wasn't owned by vscode (the user the agent runs as).

Two distinct ownership defects:

1. The agent home dirs (~/.codex, ~/.claude, ~/.local/share/opencode) were not reliably vscode-owned in cloud templates (E2B's base image ships a `node` user; the root `npm install -g @openai/codex` bake step left ~/.codex as node:node). This breaks even a plain `agentbox codex` start. Fixed with a cheap, idempotent create-time chown (ensureAgentHomeDirsOwned); no re-bake.
2. The upload primitives only chowned the final landed path, not the parent directory chain they mkdir -p'd as root. Session-teleport lands a rollout at ~/.codex/sessions/YYYY/MM/DD/, leaving that chain root-owned so Codex can't write a new rollout / its sqlite index. Mirror the carry.ts parent-walk fix in both upload primitives (cloud-cp.ts + docker box-cp.ts), gated on the dest being under /home/vscode.

Chowns are name/id-derived (vscode / id -un), not hardcoded 1000, since the vscode uid varies per provider (docker/hetzner=1000, vercel=1001, e2b=1002).

Claude-Session: https://claude.ai/code/session_0152GmbNW3e7QpXNkQFd3MB2
Cursor Bugbot has reviewed your changes using default effort and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 090f9f6.
`while [ "$parent" != ${quoteShellArg(BOX_HOME)} ] && [ "$parent" != "/" ]; do ` +
`$SUDO chown "$(id -un):$(id -gn)" "$parent" || true; ` +
`parent=$(dirname "$parent"); ` +
`done`
Parent walk chowns /home
Medium Severity
When an upload’s resolved finalPath is exactly /home/vscode, the new parent-chain chown treats that as under home, sets parent to /home, and the loop condition only excludes /home/vscode, so /home itself can be reassigned to the agent user. The existing carry path avoids this by requiring destinations under BOX_HOME/ with a trailing segment, not equality with BOX_HOME.
Bugbot: when an upload's resolved finalPath was exactly /home/vscode, the `=== BOX_HOME` branch of the gate let the parent walk run with dirname=/home, reassigning /home itself to the agent user. Gate strictly on `startsWith(BOX_HOME + '/')` (a trailing segment), matching carry.ts. Applies to both cloud-cp.ts and docker box-cp.ts; adds a regression test. Claude-Session: https://claude.ai/code/session_0152GmbNW3e7QpXNkQFd3MB2
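A minimal sketch of the strict gate and the parent walk it guards, written as pure TypeScript rather than the shell script the PR emits. `isUnderBoxHome`, `parentsToChown`, and the local `dirname` are hypothetical names used only for illustration; `BOX_HOME` and the `startsWith(BOX_HOME + '/')` check come from the commit message above. The sketch assumes the path is already resolved (absolute, no `..` segments).

```typescript
// Assumption: BOX_HOME is /home/vscode, as described in the PR.
const BOX_HOME = "/home/vscode";

// Hypothetical stand-in for the shell `dirname` on absolute POSIX paths.
function dirname(p: string): string {
  const i = p.lastIndexOf("/");
  return i <= 0 ? "/" : p.slice(0, i);
}

// Strict gate: true only for paths with at least one segment below BOX_HOME.
// "/home/vscode" itself does not qualify, so /home is never walked.
function isUnderBoxHome(finalPath: string): boolean {
  return finalPath.startsWith(BOX_HOME + "/");
}

// Lists the directories the shell loop would chown: every parent of
// finalPath, stopping before BOX_HOME (exclusive) and before "/".
function parentsToChown(finalPath: string): string[] {
  if (!isUnderBoxHome(finalPath)) return [];
  const parents: string[] = [];
  let parent = dirname(finalPath);
  while (parent !== BOX_HOME && parent !== "/") {
    parents.push(parent);
    parent = dirname(parent);
  }
  return parents;
}

// Example: a teleported rollout under ~/.codex/sessions/YYYY/MM/DD/.
console.log(
  parentsToChown("/home/vscode/.codex/sessions/2025/01/02/rollout.jsonl"),
);
```

With the earlier `=== BOX_HOME` branch, a `finalPath` of `/home/vscode` would have started the walk at `/home`; under the strict gate it yields no parents at all.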


Problem
We stopped seeding Codex's `state_*.sqlite` index (commit `2eaf2b428`, "fix(codex): don't seed Codex's session-state DBs into boxes"). Codex now creates that index at startup instead of receiving it pre-uploaded, and the create failed with a permission error because the target directory wasn't owned by `vscode` (the user the agent runs as). A box agent diagnosed it live: "`/home/vscode/.codex` was owned by `node:node`, and the uploaded session directory was `root:root`, so the vscode user couldn't create `state_5.sqlite`."

Two distinct ownership defects, each blocking a different path:

1. `agentbox codex` start: Codex writes `~/.codex/state_*.sqlite` at the top level and can't if `~/.codex` itself is `node:node` (E2B's base image ships a `node` user; the root `npm install -g @openai/codex` bake step left it that way).
2. The upload primitives only chowned the final landed path, not the parent directory chain they `mkdir -p`'d as root. Session-teleport lands a rollout at `~/.codex/sessions/YYYY/MM/DD/`, leaving that chain `root:root` so Codex can't write a new rollout / its sqlite index. This is the exact bug already fixed for `carry:` in `carry.ts:144-156`, never ported to the upload path.

Fix
- `agent-credentials.ts`: new `ensureAgentHomeDirsOwned()`, a cheap, idempotent create-time `chown -R vscode:vscode` over `~/.codex`, `~/.claude`, `~/.local/share/opencode`. Fixes existing prepared templates without a re-bake (preferred over a Dockerfile change). `chown -R` doesn't deref symlinks, so the baked credential symlinks are untouched.
- `cloud-provider.ts`: calls it unconditionally after `seedAgentVolumesIfFresh`, before teleport/agent launch.
- `cloud-cp.ts` (`uploadToCloudBox`): after the final-path chown, walks the parent chain up to `/home/vscode` (exclusive), gated on dest under home. The `bash -c` wrapping protects `$(...)`/`while` from Vercel's outer `sudo -u vscode -H bash -lc` exec nesting (no per-backend carve-out needed).
- `box-cp.ts` (docker `uploadToBox`): equivalent parent-walk via `docker exec --user root` (fix across all providers).
- `cloud-cp.test.ts`: asserts the parent-walk is present under `/home/vscode/` and absent for `/etc/*` and `/workspace/*`.

Both parts are needed: Part 1 runs at create (before teleport); Part 2 fixes the post-create teleport upload. Plain codex starts never hit the upload path, so they rely on Part 1.
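The create-time step might be sketched roughly as follows. This only builds the command string; it is not the PR's `ensureAgentHomeDirsOwned()` implementation. `buildEnsureOwnedCommand` and `AGENT_HOME_SUBDIRS` are hypothetical names, the `quoteShellArg` here is a stand-in for the helper of the same name in the diff (whose real implementation is not shown), and the command is assumed to run as root or behind `sudo`.

```typescript
// Agent data directories named in the PR, relative to the agent's home.
const AGENT_HOME_SUBDIRS = [".codex", ".claude", ".local/share/opencode"];

// Stand-in single-quote shell quoting; an assumption, not the repo's helper.
function quoteShellArg(arg: string): string {
  return "'" + arg.replace(/'/g, "'\\''") + "'";
}

// Builds a best-effort recursive chown by user NAME rather than numeric uid,
// since the PR notes the vscode uid differs per provider.
// `|| true` keeps box creation going if a directory is missing or the
// filesystem rejects the chown; re-running the command is harmless.
function buildEnsureOwnedCommand(
  user = "vscode",
  home = "/home/vscode",
): string {
  const dirs = AGENT_HOME_SUBDIRS
    .map((sub) => quoteShellArg(`${home}/${sub}`))
    .join(" ");
  const owner = quoteShellArg(`${user}:${user}`);
  return `chown -R ${owner} ${dirs} 2>/dev/null || true`;
}

console.log(buildEnsureOwnedCommand());
```

Deriving the owner from a name keeps the same command valid on every provider, at the cost of requiring that the `vscode` user already exists in the image.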
Chowns are name/`id`-derived (`vscode` / `id -un`), not hardcoded `1000`, because the vscode uid varies per provider (see results).

Verification

`pnpm build`, `pnpm lint`, full `pnpm test` (916+ tests across 25 packages) all green, including the new `cloud-cp.test.ts`.

Live, end-to-end on every available provider (created a box from a test repo, then checked ownership + write-probes; `agentbox cp` exercises the same upload primitive session-teleport uses):

- `~/.codex` writable (created `state_probe.sqlite`)
- `sessions/` ancestors vscode-owned + sibling writable
- `bash -c` wrapping survives Vercel's exec nesting.
- `|| true` tolerance already in place.

Notable finding: the in-box `vscode` uid differs per provider (docker/hetzner=1000, vercel=1001, e2b=1002) because each base image reserves 1000 differently, which is exactly why the fix chowns by name, not a hardcoded 1000.

All four verification boxes destroyed; no orphan sandboxes/servers left behind.
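A write-probe of the kind used in the live checks could look something like the sketch below. `canCreateFileIn` is a hypothetical helper, not part of the PR; the probe file name mirrors the `state_probe.sqlite` mentioned above, and the sketch assumes it runs inside the box as the agent user.

```typescript
import { mkdtempSync, rmSync, unlinkSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Returns true if the current user can create (and then remove) a file in
// `dir`. Any error counts as "not writable": EACCES on a wrongly owned
// directory, ENOENT on a missing one, or EEXIST if the probe file is
// already there (the "wx" flag refuses to overwrite).
function canCreateFileIn(dir: string, name = "state_probe.sqlite"): boolean {
  const probe = join(dir, name);
  try {
    writeFileSync(probe, "", { flag: "wx" });
    unlinkSync(probe);
    return true;
  } catch {
    return false;
  }
}

// Usage against a scratch directory; in a box the argument would be the
// agent home dir or a sessions/ ancestor instead.
const scratch = mkdtempSync(join(tmpdir(), "probe-"));
console.log(canCreateFileIn(scratch));
rmSync(scratch, { recursive: true, force: true });
```

Creating a real file exercises the same permission check Codex hits at startup, which a plain ownership listing would only imply.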
https://claude.ai/code/session_0152GmbNW3e7QpXNkQFd3MB2
Note
Medium Risk
Changes ownership normalization on every cloud create and on host→box uploads for paths under the agent home; failures are tolerated but mis-chown could still leave edge cases on FUSE/read-only mounts.
Overview
Fixes EACCES when Codex creates `state_*.sqlite` at startup (no longer pre-seeded): agent homes and upload-created paths must be writable by `vscode`.

Create-time: Adds `ensureAgentHomeDirsOwned`, a best-effort `chown -R vscode:vscode` on `~/.codex`, `~/.claude`, and OpenCode's data dir, and runs it on every cloud box create after credential seeding so wrong template owners (e.g. `node:node` on E2B) don't block Codex.

Upload-time: `uploadToCloudBox` and docker `uploadToBox` now chown the landed path and walk parent directories up to `/home/vscode` when the destination is under home, so root-owned `mkdir -p` chains from session teleport / `agentbox cp` don't block sibling writes (e.g. sqlite under `~/.codex/sessions/...`). Paths under `/etc` or `/workspace` skip the parent walk.

Exports the new helper from `sandbox-cloud`; `cloud-cp.test.ts` asserts the parent-walk script for home vs non-home destinations.