fix(codex): drop heavy host-only artifacts from codex config staging#125

Merged
madarco merged 1 commit into nightly from fix/codex-static-trim on Jun 27, 2026

madarco (Owner) commented Jun 27, 2026

Problem

Host ~/.codex is ~1.1 GB and was synced into boxes almost whole — ~800 MB of it is macOS-host artifacts useless in a Linux box:

| Dir | Size | What |
| --- | --- | --- |
| packages/standalone | 485M | macOS aarch64 Codex release binaries (in-box codex is npm-installed) |
| plugins/.plugin-appserver | 238M | platform-specific plugin app-server runtime |
| computer-use | 57M | Codex Computer Use.app macOS bundle |
| .tmp | 213M | plugin marketplace cache (regenerable) |
| archived_sessions, tmp, cache, vendor_imports, sqlite, models_cache.json | | host history / regenerable caches |

This made the scp codex static step during cloud prepare (and the docker volume sync on every create) crawl.

Fix

Exclude the heavy host-only dirs from both codex staging paths:

  • Cloud bake (CODEX_RSYNC_EXCLUDES, host-stage.ts) — used by hetzner/vercel/e2b/daytona stageCodexStaticForUpload.
  • Docker volume sync (codex.ts, the agentbox-codex-config rsync) — plus the matching rm -rf purge so existing shared volumes get trimmed on the next sync.

Config / auth / skills / prompts / rules / memories / plugins are still synced, so codex keeps working.
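
For illustration, the exclude list described above could be turned into rsync arguments roughly as follows. This is a minimal sketch, not the code in host-stage.ts: the array contents are assembled from the paths named in this PR, and `buildCodexRsyncArgs` is a hypothetical helper. The leading `/` on each pattern anchors it to the root of the transfer, so `plugins/.plugin-appserver` is skipped while the rest of `plugins/` still syncs.

```typescript
// Assumption: entries are taken from the paths named in this PR; the real
// CODEX_RSYNC_EXCLUDES in host-stage.ts may contain more or different entries.
const CODEX_RSYNC_EXCLUDES: readonly string[] = [
  "packages",
  "plugins/.plugin-appserver",
  "computer-use",
  "archived_sessions",
  ".tmp",
  "tmp",
  "cache",
  "vendor_imports",
  "sqlite",
  "models_cache.json",
];

// Hypothetical helper: builds the argv for `rsync` without running it.
function buildCodexRsyncArgs(
  src: string,
  dest: string,
  excludes: readonly string[] = CODEX_RSYNC_EXCLUDES,
): string[] {
  // A trailing slash on the source copies the directory's contents,
  // not the directory itself.
  const srcWithSlash = src.endsWith("/") ? src : `${src}/`;
  // A leading "/" anchors each pattern at the root of the transfer.
  const excludeFlags = excludes.map((p) => `--exclude=/${p}`);
  return ["-a", ...excludeFlags, srcWithSlash, dest];
}
```

Returning an argv array rather than a shell string keeps paths with spaces (such as `Codex Computer Use.app`) from needing any quoting when the command is spawned.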

Verification

  • Cloud (dry-run): staged codex tarball 820 MB → 482 KB (121 files; config.toml/skills/prompts kept).
  • Docker (live e2e): fresh box from ../agentbox-test-repo-gh; ~/.codex went 1.5 GB → 59 MB, all junk dirs were purged, the kept config was intact, and codex exec returned a real PONG. All 297 sandbox-docker tests pass.
  • Cloud live-validation on Hetzner was blocked by an unrelated, sustained Cloudflare 403 on the datacenter IP, which prevented a fresh bake (the claude-installer issue from #123, "fix(hetzner): retry Claude native installer with backoff + clear error"). The change's correctness is covered by the dry-run plus the docker live test, which exercises a superset of the same trimmed file set.
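
As a rough cross-check of the dry-run numbers, the effect of the excludes on a directory listing can be estimated with a small pure function. This is a sketch under stated assumptions: `isExcluded` and `stagedSizeMB` are hypothetical helpers, the matching is a simplified root-anchored prefix match rather than rsync's full pattern language, and the sizes are the ones from the table in the Problem section.

```typescript
interface Entry {
  path: string; // path relative to ~/.codex
  sizeMB: number;
}

// Assumption: a subset of the excluded paths named in this PR.
const EXAMPLE_EXCLUDES: readonly string[] = [
  "packages",
  "plugins/.plugin-appserver",
  "computer-use",
  ".tmp",
];

// Simplified matching: an entry is excluded if it equals an exclude
// or lives underneath it. This is not rsync's full filter semantics.
function isExcluded(relPath: string, excludes: readonly string[]): boolean {
  return excludes.some((ex) => relPath === ex || relPath.startsWith(`${ex}/`));
}

// Sums the sizes of everything that would still be staged.
function stagedSizeMB(entries: readonly Entry[], excludes: readonly string[]): number {
  return entries
    .filter((e) => !isExcluded(e.path, excludes))
    .reduce((sum, e) => sum + e.sizeMB, 0);
}

// Sizes as reported in the PR description.
const tableEntries: Entry[] = [
  { path: "packages/standalone", sizeMB: 485 },
  { path: "plugins/.plugin-appserver", sizeMB: 238 },
  { path: "computer-use", sizeMB: 57 },
  { path: ".tmp", sizeMB: 213 },
];
```

With these inputs every listed entry is filtered out, while a sibling such as `plugins/<some-plugin>` would still count toward the staged size, which matches the statement that plugins are still synced.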

https://claude.ai/code/session_019m5WHxP4vmsoXaHUhQdY9e


Note

Low Risk
Narrow rsync/staging filter changes with no auth or config semantics changes; only drops host artifacts that are useless in-box.

Overview
Stops copying macOS-only and regenerable bulk from the host ~/.codex into Linux sandboxes, which had been inflating cloud prepare tarballs to about 820 MB and Docker agentbox-codex-config syncs to about 1.5 GB.

Cloud prepare (CODEX_RSYNC_EXCLUDES in host-stage.ts) now skips packages, plugins/.plugin-appserver, computer-use, and archived_sessions, alongside existing cache/runtime excludes.

Docker volume sync (ensureCodexVolume in codex.ts) applies the same rsync excludes and extends the post-sync rm -rf purge so volumes that already absorbed those paths get trimmed on the next create. Config, auth, skills, prompts, and other real settings still sync.
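
The purge is needed because rsync excludes only stop new copies; files that an earlier sync already placed in the shared volume stay there until something deletes them. A minimal sketch of how such a purge command could be assembled is shown below. `buildPurgeCommand` and `shellQuote` are hypothetical helpers, not the actual code in `ensureCodexVolume`, and the guard conditions are an assumption about what a cautious implementation might check.

```typescript
// Wraps a string in single quotes for POSIX shells, escaping embedded quotes.
function shellQuote(s: string): string {
  return "'" + s.replace(/'/g, "'\\''") + "'";
}

// Hypothetical helper: builds an `rm -rf` command for stale paths inside
// the volume. It only builds the string; it does not execute anything.
function buildPurgeCommand(volumeRoot: string, stalePaths: readonly string[]): string {
  const root = volumeRoot.replace(/\/+$/, "");
  if (root === "") {
    throw new Error("refusing to purge relative to the filesystem root");
  }
  const targets = stalePaths.map((p) => {
    // Reject absolute paths and parent traversal so the purge
    // cannot escape the volume root.
    if (p.startsWith("/") || p.split("/").includes("..")) {
      throw new Error(`unsafe purge path: ${p}`);
    }
    return shellQuote(`${root}/${p}`);
  });
  // "--" ends option parsing, so a path starting with "-" is not read as a flag.
  return `rm -rf -- ${targets.join(" ")}`;
}
```

Driving the purge list from the same array as the rsync excludes would keep the two in step, so a path added to one cannot be forgotten in the other.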

Reviewed by Cursor Bugbot for commit e3f7193.

The host `~/.codex` is ~1.1 GB and was being synced into boxes almost whole:
~485 MB of macOS aarch64 standalone release binaries (`packages/`), a ~238 MB
plugin app-server runtime (`plugins/.plugin-appserver`), the macOS
`Codex Computer Use.app` bundle (`computer-use/`), host session archives, and
regenerable caches (`.tmp` ~213 MB, `tmp`, `cache`, `vendor_imports`, `sqlite`,
`models_cache.json`). None of it is usable in a Linux box — the in-box codex is
npm-installed and rebuilds these caches on demand.

Exclude all of it from both codex staging paths:
- `CODEX_RSYNC_EXCLUDES` (host-stage.ts) — the cloud bake path (hetzner/vercel/
  e2b/daytona `stageCodexStaticForUpload`). Dry-run: staged tarball 820 MB -> 482 KB.
- the docker `agentbox-codex-config` volume rsync (codex.ts), plus its `rm -rf`
  purge so existing shared volumes get cleaned on the next sync. Verified live:
  a fresh docker box's `~/.codex` dropped 1.5 GB -> 59 MB and `codex exec` still
  returns a real turn.

Config / auth / skills / prompts / rules / memories / plugins are still synced,
so codex keeps working — just without the host-only ballast.

Claude-Session: https://claude.ai/code/session_019m5WHxP4vmsoXaHUhQdY9e
madarco merged commit 968aac8 into nightly on Jun 27, 2026
4 checks passed