Warning: This project is still in early-stage development. You can expect breaking changes between releases.
Run AI coding agents in a locked-down local sandbox with:
- Minimal filesystem access (read/write access to only the repository directory)
- Configurable egress policy enforced by sidecar proxy (mitmproxy sidecar blocks non-allowed domains)
- Iptables firewall preventing direct outbound (all traffic must go through the proxy)
- Reproducible environments (Debian container with pinned dependencies)
- Persistent volume for agent state - auth and config preserved across container restarts
- Ability to easily switch between agents without losing state
- Support for CLI and devcontainers (including VS Code and JetBrains IDEs)
Target platform: Colima + Docker Engine on Apple Silicon. Should work with any Docker-compatible runtime.
CLI (preferred) - run the agent in a terminal session managed by agentbox exec.
Devcontainer - open the project in VS Code or JetBrains and let the IDE manage the container lifecycle.
| Agent | CLI | VS Code | JetBrains |
|---|---|---|---|
| Claude Code | 🟢 | 🟢 | 🟢 |
| Codex CLI | 🟢 | 🔵 | 🔵 |
| Copilot CLI | 🟢 | 🔵 | 🔴 |
- 🟢 Full Support - stable, heavily used.
- 🔵 Preview - functional but not heavily tested.
- 🔴 Not Supported - known blockers.
  - Copilot's IntelliJ plugin cannot complete auth in a devcontainer.
You need docker and docker-compose installed. So far we've tested with Colima + Docker Engine, but this should work with Docker Desktop for Mac or Podman as well. Instructions that follow are for Colima.
# colima for VM, docker packages for running containers in VM, yq is a dependency for agentbox cli
brew install colima docker docker-compose docker-buildx yq
colima start --cpu 4 --memory 8 --disk 60
The CLI is a helper script and thin wrapper around docker-compose that simplifies the process of initializing and starting the sandbox.
# Clone the repo
git clone https://github.com/mattolson/agent-sandbox.git
# Add the agentbox bin directory to your path (add this to your .bashrc or .zshrc)
export PATH="$PWD/agent-sandbox/cli/bin:$PATH"
yq is required for certain CLI functionality. Install with brew install yq.
You can also run the agentbox CLI through a published docker image if you don't want to install anything locally:
# Pull the image to local docker
docker pull ghcr.io/mattolson/agent-sandbox-cli
# Add to your .bashrc or .zshrc
alias agentbox='docker run --rm -it -v "/var/run/docker.sock:/var/run/docker.sock" -v"$PWD:$PWD" -w"$PWD" -e TERM -e HOME --network none ghcr.io/mattolson/agent-sandbox-cli'
agentbox init
This prompts interactively for the project name, agent, mode, and IDE when needed, then generates the docker compose and network policy files for the sandbox.
For scripted use, you can pass flags to skip the selection prompts. Use --batch to disable all prompts:
agentbox init --batch --agent claude --mode cli
See the CLI README for the full list of flags and environment variables.
To inspect the configuration after init, use agentbox policy config to output the effective network policy and
agentbox compose config for the fully combined docker compose stack.
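
For scripted setups, the batch invocation above can be wrapped in a small helper. This is a minimal sketch, assuming agentbox is on your PATH; the function names `build_init_command` and `init_sandbox` are hypothetical, and only flags shown in this README (--batch, --agent, --mode) are used.

```python
import shutil
import subprocess


def build_init_command(agent: str, mode: str) -> list[str]:
    """Build the non-interactive init command from flags shown in this README."""
    return ["agentbox", "init", "--batch", "--agent", agent, "--mode", mode]


def init_sandbox(agent: str, mode: str) -> int:
    """Run agentbox init in batch mode.

    Returns the process exit code, or 127 if agentbox is not on PATH.
    """
    if shutil.which("agentbox") is None:
        return 127
    return subprocess.run(build_init_command(agent, mode), check=False).returncode


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch is safe to try.
    print(" ".join(build_init_command("claude", "cli")))
```

Keeping command construction separate from execution makes the helper easy to check in CI without a Docker runtime.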
CLI:
# Open a shell in the agent container
agentbox exec
# Start your agent cli (e.g. claude). Because you're in a sandbox, you can even try yolo mode!
claude --dangerously-skip-permissions
Devcontainer (VS Code / JetBrains):
VS Code:
- Install the Dev Containers extension
- Command Palette > "Dev Containers: Reopen in Container"
JetBrains (IntelliJ, PyCharm, WebStorm, etc.):
- Open your project
- From the Remote Development menu, select "Dev Containers"
- Select the devcontainer configuration
Follow the setup instructions specific to the agent image you are using.
Switch to a different agent without reinitializing the project:
agentbox switch --agent codex
switch preserves user-owned override files and per-agent state volumes (credentials, history). In devcontainer projects it regenerates .devcontainer/devcontainer.json for the selected agent.
Network enforcement has two layers:
- Proxy (mitmproxy sidecar) - Enforces a domain allowlist at the HTTP/HTTPS level. Blocks requests to non-allowed domains with 403.
- Firewall (iptables) - Blocks all direct outbound from the agent container. Only the Docker host network is reachable, which is where the proxy sidecar runs. This prevents applications from bypassing the proxy.
The proxy image ships with a default policy that blocks all traffic. You must mount a policy file to allow any outbound requests. agentbox init will set this up for you.
The agent container has HTTP_PROXY/HTTPS_PROXY set to point at the proxy sidecar. The proxy runs a mitmproxy addon (enforcer.py) that checks every HTTP request and HTTPS CONNECT tunnel against the domain allowlist. Non-matching requests get a 403 response.
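
The allowlist check described above can be pictured roughly as follows. This is an illustrative sketch, not the actual enforcer.py: the function name `is_allowed` is hypothetical, and treating subdomains of an allowed domain as allowed is an assumption that the real addon may not share.

```python
def is_allowed(host: str, allowlist: set[str]) -> bool:
    """Return True if host is an allowed domain or a subdomain of one.

    ASSUMPTION: subdomain matching is illustrative only; the real addon
    may match domains differently.
    """
    # Normalize case and drop a trailing dot from fully qualified names.
    host = host.lower().rstrip(".")
    return any(host == d or host.endswith("." + d) for d in allowlist)


allowlist = {"registry.npmjs.org", "pypi.org"}
for target in ("pypi.org", "files.pypi.org", "evil-pypi.org", "example.com"):
    # A proxy built this way would forward allowed hosts and answer 403 otherwise.
    print(target, "forward" if is_allowed(target, allowlist) else "403")
```

With an empty allowlist every host is rejected, which mirrors the default-deny policy the proxy image ships with.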
The agent's iptables firewall (init-firewall.sh) blocks all direct outbound except to the Docker bridge network. This means even if an application ignores the proxy env vars, it cannot reach the internet directly.
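
To make the firewall rule concrete, the decision it encodes can be modeled in a few lines. This sketch is not init-firewall.sh; the function name `direct_egress_allowed` is hypothetical, and the subnet 172.18.0.0/16 is a placeholder, since the real bridge subnet depends on your Docker network.

```python
import ipaddress


def direct_egress_allowed(dest_ip: str, bridge_cidr: str) -> bool:
    """Model the rule: direct traffic is allowed only inside the bridge subnet."""
    return ipaddress.ip_address(dest_ip) in ipaddress.ip_network(bridge_cidr)


# ASSUMPTION: placeholder subnet; inspect your Docker network for the real one.
BRIDGE = "172.18.0.0/16"

# An address on the bridge (for example the proxy sidecar) is reachable.
print(direct_egress_allowed("172.18.0.2", BRIDGE))
# A public internet address is not reachable directly.
print(direct_egress_allowed("93.184.216.34", BRIDGE))
```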
The proxy's CA certificate is shared via a Docker volume and automatically installed into the agent's system trust store at startup.
The network policy lives in your project in the .agent-sandbox/policy/ directory. Devcontainer projects also get a
managed .agent-sandbox/policy/policy.devcontainer.yaml layer for IDE-related allowlists, but user-owned policy edits
stay in the shared .agent-sandbox/policy/ files. These files can be checked into version control and shared with
your team.
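
One way to think about the layered policy files is as a merge into a single effective policy. The sketch below assumes the layers are combined by taking the union of their services and domains; that merge rule, the function name `merge_policies`, and the domain ide.example.com are assumptions for illustration, not documented behavior. Use agentbox policy config to see the policy that is actually applied.

```python
def merge_policies(*layers: dict) -> dict:
    """Union the services and domains of each layer, keeping first-seen order.

    ASSUMPTION: union semantics are illustrative; the real merge may differ.
    """
    merged = {"services": [], "domains": []}
    for layer in layers:
        for key in merged:
            for item in layer.get(key, []):
                if item not in merged[key]:
                    merged[key].append(item)
    return merged


# Shared, user-owned policy (mirrors the example policy in this README).
shared = {"services": ["claude"], "domains": ["registry.npmjs.org", "pypi.org"]}
# Hypothetical managed devcontainer layer with an IDE-related domain.
devcontainer = {"domains": ["pypi.org", "ide.example.com"]}

print(merge_policies(shared, devcontainer))
```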
To edit the policy file:
agentbox edit policy
This opens the network policy file in your editor. If you save changes, the proxy service will automatically restart to apply the new policy.
Example policy:
services:
- claude
domains:
# Add your own
- registry.npmjs.org
- pypi.org
The .agent-sandbox directory, and in devcontainer workflows the .devcontainer/ directory, are mounted read-only
inside the agent container. The proxy reads the policy at startup, so changes require a restart from the host.
See docs/policy/schema.md for the full policy format reference. If you still have legacy single-file sandbox files, use the upgrade guide before editing policy or compose.
- Git inside the container - Credential setup and SSH-to-HTTPS rewriting
- Dotfiles and shell customization - Mount dotfiles and shell.d scripts
- Language stacks - Extend the base image with Python, Node, Go, Rust
- Image versioning - Pin and bump image digests
- Troubleshooting - Common issues and fixes
This project reduces risk but does not eliminate it. Local dev is inherently best-effort sandboxing.
Key principles:
- Minimal mounts: only the repo workspace + project-scoped agent state
- Network egress is tightly controlled through sidecar proxy with default deny policy
- Firewall verification runs at every container start
If you store git credentials inside the container (via git credential-store or any other method), the token grants access to whatever repositories it was scoped to. A classic personal access token or OAuth token grants access to all repositories your GitHub account can access, not just the current project. The network allowlist limits where data can be sent, but an agent with a broad token could read or modify any of your repos on github.com.
To limit exposure:
- Run git from the host - No credentials in the container at all
- Use a fine-grained PAT - Scope the token to specific repositories
- Use a separate GitHub account - Isolate sandboxed work entirely
Operating as a devcontainer (VS Code or JetBrains) opens a management channel between the IDE and the container. This channel is separate from the agent's normal network data plane.
What this means in practice:
- The proxy and iptables firewall still constrain ordinary outbound traffic from processes in the container
- IDE-managed features such as port forwarding, localhost callbacks, opening browser URLs on the host, and extension RPC are part of a separate control plane
- Blocking the container's bridge-network traffic does not fully remove that IDE control plane
- Installing IDE extensions can introduce additional risk
Treat the IDE and its extensions as trusted host-side code. If you want the tightest boundary, use CLI mode instead of devcontainer mode.
For Codex specifically, prefer device code OAuth in sandboxed environments. It avoids localhost callback flows entirely.
See SECURITY.md for the reporting process. Do not post full reproduction details for sandbox escapes, proxy or firewall bypasses, credential exposure, or similar security issues in a public issue.
See docs/roadmap.md for planned features and milestones.
Running into problems? Check the troubleshooting guide.
See CONTRIBUTING.md for contribution paths, issue labels, planning requirements, and PR expectations.