docs: troubleshooting section for bwrap sandbox failures on restricted hosts #318

Open
AZERIA-IT wants to merge 1 commit into openai:main from AZERIA-IT:docs/troubleshooting-bwrap-sandbox

@AZERIA-IT

Summary

  • Adds a ## Troubleshooting section to the root README.md covering bubblewrap (bwrap) sandbox initialization failures on capability-restricted hosts (VPS, restricted LXC, kernels with kernel.unprivileged_userns_clone=0, runtimes that drop CAP_NET_ADMIN/CAP_SYS_ADMIN).
  • Documents the failure mode (tasks finish with status completed, but apply_patch and shell tool calls fail, and the Codex worker logs show bwrap: loopback: Failed RTM_NEWADDR: Operation not permitted), gives a one-line diagnostic, and describes two workarounds.
  • Cross-links the relevant in-flight code PRs and tracking issues so users hitting this in the field can self-diagnose and pick an interim path.

Why docs-only, not code

There are already four open PRs proposing different code fixes for the same underlying behavior: #147, #226, #241, and #260.

Issues #240 and #304 track the underlying problem. Adding a fifth code variant would be noise. This PR only adds user-facing documentation so people hitting the bug today can diagnose it and apply an interim workaround while maintainers decide which fix to merge.

Repro environment

  • OVH VPS (Ubuntu 25.04), running the plugin against a remote Codex worker.
  • Plugin version 1.0.4 (current main, commit 807e03a).
  • bwrap --bind / / --dev /dev --proc /proc --unshare-net true fails on the host with the loopback RTM_NEWADDR error.
  • After applying both workarounds from this section (sysctl + danger-full-access config + running outside the companion), the same task completed in ~43s and wrote the expected file.
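The diagnostic from the repro notes can be wrapped in a small script. This is a minimal sketch, not part of the plugin: the function names and the labels it prints are hypothetical, and the only strings taken from this PR are the bwrap command line and the RTM_NEWADDR error text.

```shell
#!/bin/sh
# Classify the stderr text of a failed bwrap probe.
# The matched string is the error quoted in this PR; the labels printed
# here are illustrative and are not produced by Codex or bwrap.
classify_bwrap_stderr() {
  case "$1" in
    *"Failed RTM_NEWADDR"*) echo "loopback-setup-denied" ;;
    *) echo "other-failure" ;;
  esac
}

# Run the probe from the repro notes and report the outcome.
probe_bwrap() {
  if ! command -v bwrap >/dev/null 2>&1; then
    echo "bwrap-not-installed"
    return 0
  fi
  # Capture stderr only; stdout of the probe is discarded.
  if err=$(bwrap --bind / / --dev /dev --proc /proc --unshare-net true 2>&1 >/dev/null); then
    echo "ok"
  else
    classify_bwrap_stderr "$err"
  fi
}
```

Calling `probe_bwrap` on an affected host should print `loopback-setup-denied` if the failure matches the one described here; any other bwrap error falls through to `other-failure` and needs separate investigation.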

Test plan

  • Reproduced the bug on an OVH VPS (plugin v1.0.4, default settings)
  • Confirmed Workaround A (sysctl kernel.unprivileged_userns_clone=1) resolves it locally
  • Confirmed Workaround B (danger-full-access + running outside the companion) resolves it locally
  • Verified README renders correctly after merge
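Before applying Workaround A, it may help to confirm that the sysctl is actually the limiting factor. The following is a rough sketch under the assumption that the kernel exposes the setting at /proc/sys/kernel/unprivileged_userns_clone (kernels without that knob report it as absent); the helper name `userns_clone_state` is hypothetical.

```shell
#!/bin/sh
# Report the state of the unprivileged user-namespace sysctl.
# The optional first argument overrides the path, which keeps the helper testable.
userns_clone_state() {
  knob="${1:-/proc/sys/kernel/unprivileged_userns_clone}"
  if [ ! -r "$knob" ]; then
    # Kernels without this knob do not expose the file at all.
    echo "absent"
    return 0
  fi
  case "$(cat "$knob")" in
    1) echo "enabled" ;;
    0) echo "disabled" ;;
    *) echo "unknown" ;;
  esac
}

# Workaround A, run manually as root once the state reads "disabled":
#   sysctl -w kernel.unprivileged_userns_clone=1
```

If the helper prints `enabled` or `absent` and bwrap still fails, the restriction most likely comes from somewhere else, such as a container runtime dropping CAP_NET_ADMIN or CAP_SYS_ADMIN, and the sysctl change alone is unlikely to help.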

Context

This was first reported downstream; see AZERIA-IT/claude-code-codex-task#1 for additional repro details and logs.

If this repo requires a CLA, please tag the PR and I'll sign it. The commit already carries a DCO Signed-off-by: trailer.

…d hosts

Adds a Troubleshooting section to the README covering bubblewrap (bwrap)
sandbox initialization failures on capability-restricted hosts (VPS,
restricted LXC, kernels with kernel.unprivileged_userns_clone=0,
runtimes dropping CAP_NET_ADMIN or CAP_SYS_ADMIN).

Documents the symptom (tasks complete with apply_patch and shell tool
failures, bwrap loopback RTM_NEWADDR error in worker logs), a one-line
diagnostic, and two workarounds:

- Host-side: enable unprivileged user namespaces via sysctl.
- Reduced sandboxing: set danger-full-access in ~/.codex/config.toml,
  with the caveat that codex-companion.mjs currently overrides this
  per-turn until one of the in-flight code PRs lands (openai#147, openai#226, openai#241,
  openai#260). Cross-links issues openai#240 and openai#304 for context.

Docs-only change. No code, manifest, or schema files touched.

Signed-off-by: Ubuntu <ubuntu@vps-bba8e540.vps.ovh.net>
@AZERIA-IT AZERIA-IT requested a review from a team May 12, 2026 13:06