feat: add e2b-haystack integration for E2B cloud sandbox tools#3195
bogdankostic merged 18 commits into `main`
Conversation
Introduces `e2b-haystack`, a new integration that provides E2B cloud sandbox tools for Haystack agents. Migrated from deepset-ai/haystack-experimental#448.

Exposes four tools sharing a single `E2BSandbox` instance:

- `RunBashCommandTool` - execute bash commands
- `ReadFileTool` - read sandbox filesystem files
- `WriteFileTool` - write sandbox filesystem files
- `ListDirectoryTool` - list directory contents

Plus `E2BToolset` as a convenience Toolset bundling all four tools. Includes 38 unit tests, two usage examples, and full serialisation round-trip support.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add mypy override to ignore missing stubs for the `e2b` package (which doesn't ship a `py.typed` marker or type stubs)
- Quote `$GITHUB_OUTPUT` in the workflow to fix actionlint/shellcheck SC2086

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Gate the "Run integration tests" CI step on the E2B_API_KEY env var being present, matching the pattern used by other integrations (e.g. cohere). Without this the step exits with code 5 (no tests collected) because there are no integration-marked tests and no API key is configured yet. Also exposes E2B_API_KEY from secrets at the workflow env level so it will be available once a maintainer adds the secret to the repo. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds real end-to-end integration tests (marked `@pytest.mark.integration`) that exercise all four tools against a live E2B sandbox:

- `RunBashCommandTool`: echo, non-zero exit code, stderr capture
- `WriteFileTool` + `ReadFileTool`: round-trip, nested directory creation
- `ListDirectoryTool`: list /tmp, list after write
- `E2BToolset`: warm_up/close lifecycle, shared sandbox state across tools

Also suppresses S108 (/tmp path warning) in the test per-file-ignores — /tmp is correct and intentional inside a sandboxed environment.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
The e2b SDK raises CommandExitException (with exit_code/stdout/stderr attributes) instead of returning a result for non-zero exit codes. Detect this via duck-typing and return the formatted result string so the LLM can see and react to the exit status, rather than propagating a ToolInvocationError. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
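The duck-typed handling described above can be sketched as follows. This is an illustrative reconstruction, not the PR's code: the attribute names (`exit_code`, `stdout`, `stderr`) come from the commit message, while the helper name and result formatting are assumptions.

```python
# Illustrative sketch: fold non-zero exit codes into the tool's result string
# instead of letting an exception propagate as a ToolInvocationError.

def run_with_exit_capture(run_command, command: str) -> str:
    """Run a sandbox command; report non-zero exits in the result string."""
    try:
        result = run_command(command)
        return f"exit_code=0\nstdout:\n{result.stdout}\nstderr:\n{result.stderr}"
    except Exception as exc:
        # Duck-type: any exception carrying exit_code/stdout/stderr is treated
        # as a command result, so the LLM can see and react to the failure.
        if all(hasattr(exc, a) for a in ("exit_code", "stdout", "stderr")):
            return (
                f"exit_code={exc.exit_code}\n"
                f"stdout:\n{exc.stdout}\nstderr:\n{exc.stderr}"
            )
        raise  # anything else remains a genuine tool invocation error
```

The key point is that the exception type itself is never imported; only the presence of the three attributes matters, which keeps the tool decoupled from the SDK's exception hierarchy.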
Toolset was introduced in haystack-ai 2.19.0. The previous lower bound of 2.12.0 caused an ImportError on the "lowest direct dependencies" CI run. This matches the floor already used by the mcp integration. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
e2b 1.x does not have the Sandbox.create() classmethod — it was introduced in 2.0.0. The lowest-direct-dependency CI job resolves e2b>=1.0.0 to 1.0.0, causing AttributeError when mock.patch tries to patch Sandbox.create. Bumping the floor to >=2.0.0 fixes the lowest- direct run while keeping Python 3.9+ compatibility (e2b 2.x requires >=3.9). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
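The failure mode described here is easy to reproduce in isolation: `mock.patch` raises `AttributeError` when the target attribute does not exist on the class. The stand-in classes below are hypothetical, just to illustrate the mechanism the lowest-deps CI run hit.

```python
# Minimal illustration (not the PR's test code): patching a classmethod that
# does not exist fails, which is what happened when e2b resolved to 1.0.0
# and Sandbox had no create() classmethod.
from unittest import mock

class SandboxV1:  # stands in for e2b 1.x: no create() classmethod
    pass

class SandboxV2:  # stands in for e2b 2.x
    @classmethod
    def create(cls):
        return cls()

try:
    with mock.patch.object(SandboxV1, "create"):
        pass
except AttributeError:
    print("patching SandboxV1.create fails")  # what CI hit on e2b 1.0.0

with mock.patch.object(SandboxV2, "create", return_value="mocked"):
    print(SandboxV2.create())  # prints "mocked"
```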
bogdankostic left a comment
Looking forward to having this integration soon in Haystack!
I've left a few minor comments, mostly focused on polishing the docstrings. The only major blocker we should address IMO is the deserialization logic for individual tools to ensure the sandbox environments remain consistent within one pipeline.
```python
@classmethod
def from_dict(cls, data: dict[str, Any]) -> "RunBashCommandTool":
```
If a user builds an Agent with the four tools passed individually (sharing one E2BSandbox) rather than via E2BToolset, and then serializes/deserializes the pipeline, each tool's from_dict constructs its own independent E2BSandbox, breaking the shared state of the tools.
I tried this out by adapting `build_pipeline` in `e2b_pipeline_example`:
```python
def build_pipeline() -> Pipeline:
    sandbox = E2BSandbox(sandbox_template="base", timeout=120)
    tools = [
        WriteFileTool(sandbox=sandbox),
        RunBashCommandTool(sandbox=sandbox),
    ]
    agent = Agent(
        chat_generator=OpenAIChatGenerator(model="gpt-4o-mini"),
        tools=tools,
        system_prompt=(
            "You are a helpful coding assistant with access to a live Linux sandbox. "
            "Use the available tools freely to write files and run commands."
        ),
        max_agent_steps=10,
    )
    pipeline = Pipeline()
    pipeline.add_component("agent", agent)
    return pipeline
```

Executing the script with this version of `build_pipeline`, we get the following error:

```
e2b_pipeline_example.py", line 88, in verify_roundtrip
    assert len(sandbox_ids) == 1, "Tools should share a single sandbox after round-trip"
           ^^^^^^^^^^^^^^^^^^^^^
AssertionError: Tools should share a single sandbox after round-trip
```
@bogdankostic good point - do we have a similar pattern somewhere else in haystack that we can follow here for serializing a shared object?
I don't think there's a straightforward way to preserve a shared E2BSandbox across individually-serialized tools. The round-trip would need some form of cross-tool deduplication.
A quick solution would be to log a warning in each tool's to_dict pointing users at E2BToolset for the serialize/deserialize path.
Something like:
```python
def to_dict(self) -> dict[str, Any]:
    logger.warning(
        "Serializing %s standalone will not preserve a shared E2BSandbox "
        "across tools after deserialization. If you need to serialize an "
        "agentic pipeline (e.g. to YAML) with multiple E2B tools sharing "
        "one sandbox, use E2BToolset instead.",
        type(self).__name__,
    )
    return {
        "type": generate_qualified_class_name(type(self)),
        "data": {"sandbox": self._e2b_sandbox.to_dict()},
    }
```

WDYT?
Hm ok ... I guess the above warning would be the simplest approach.
I explored one alternative to make the round-trip work (see latest commit):
- Identity-based dedup inside `E2BSandbox.from_dict` — give each `E2BSandbox` a UUID at construction, serialize it, and have `from_dict` consult a class-level `weakref.WeakValueDictionary` to return the already-restored instance for the same UUID. Transparent to all callers (tools, toolset, ad-hoc usage).
Pro: Enables production / platform use cases, where you regularly have to serialize / deserialize pipelines. Showing a warning there is not really useful; for such cases we should rather disable the standalone tools and always push users directly to `E2BToolset`.
Con: Makes the code more complex.
What do you think? @sjrl @bogdankostic
I just tried it out, works like a charm 🙌🏼
Co-authored-by: bogdankostic <bogdankostic@web.de>
…ox.py Co-authored-by: bogdankostic <bogdankostic@web.de>
- drop nested py.typed (parent tools/py.typed already covers the namespace) - drop e2b.md API reference (auto-generated post-merge) - remove duplicate e2b>=2.0.0 from test env (already a project dep) Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Each E2BSandbox now carries a stable instance_id. E2BSandbox.from_dict consults a process-wide WeakValueDictionary keyed on that id, so multiple tools that shared one sandbox before serialization keep sharing it after round-trip — addresses the case where users pass tools individually (WriteFileTool, RunBashCommandTool, ...) instead of via E2BToolset.

A cache hit is only honored when the full serialized config (api_key, template, timeout, environment_vars) matches the cached entry. A crafted YAML reusing another tenant's id but with a different api_key falls through to a fresh instance and never observes the cached one — closes the cross-tenant escalation path that a naive id-only cache would open. On config mismatch the cache entry is preserved (no DoS).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
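The mechanism in this commit can be sketched roughly as below. This is an illustrative stand-in, not the PR's code: the class name `DedupSandbox` and the exact config fields are assumptions, but the registry, the `instance_id` key, and the config-match check mirror the commit message.

```python
# Sketch of identity-based dedup via a weak-reference registry, assuming a
# simplified config (template/timeout) in place of the real E2BSandbox fields.
import uuid
import weakref
from typing import Any, Dict, Optional


class DedupSandbox:
    # Process-wide registry of live instances, keyed by instance_id.
    _instances: "weakref.WeakValueDictionary" = weakref.WeakValueDictionary()

    def __init__(self, template: str = "base", timeout: int = 120,
                 instance_id: Optional[str] = None) -> None:
        self.template = template
        self.timeout = timeout
        self.instance_id = instance_id or str(uuid.uuid4())
        # setdefault preserves an existing cache entry, so a config-mismatched
        # reconstruction never evicts the original instance (no DoS).
        DedupSandbox._instances.setdefault(self.instance_id, self)

    def to_dict(self) -> Dict[str, Any]:
        return {"template": self.template, "timeout": self.timeout,
                "instance_id": self.instance_id}

    @classmethod
    def from_dict(cls, data: Dict[str, Any]) -> "DedupSandbox":
        cached = cls._instances.get(data["instance_id"])
        # Honor the cache only when the full serialized config matches; a
        # payload reusing an id with a different config gets a fresh instance.
        if cached is not None and cached.to_dict() == data:
            return cached
        return cls(**data)
```

Because the registry holds only weak references, it never keeps a sandbox alive past its last strong reference, so the cache cannot leak instances across unrelated pipelines.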
E2BToolset accepts api_key: Secret | None but E2BSandbox.__init__ requires
Secret. Fall back to Secret.from_env_var("E2B_API_KEY") when None.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
… tests _require_sandbox auto-calls warm_up, so the previous tests expecting "E2B sandbox is not running" were silently hitting the live E2B API with a fake key and failing on the 401 response. Mock Sandbox.create to fail and assert each tool wraps the warm_up RuntimeError as ToolInvocationError instead. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
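The test pattern described in this commit looks roughly like the sketch below. Everything here is a hypothetical stand-in (`FakeTool`, the local `ToolInvocationError`, the failing `create` callable); only the shape, mock creation to fail and assert the wrap, comes from the commit message.

```python
# Illustrative sketch: the sandbox-creation call is made to fail, and the
# tool is expected to wrap the warm_up RuntimeError in its invocation-error
# type instead of ever reaching the live API with a fake key.

class ToolInvocationError(Exception):  # stand-in for Haystack's error type
    pass


class FakeTool:
    def __init__(self, create):
        self._create = create
        self._sandbox = None

    def warm_up(self):
        try:
            self._sandbox = self._create()
        except Exception as exc:
            raise RuntimeError("failed to start E2B sandbox") from exc

    def invoke(self):
        if self._sandbox is None:
            try:
                self.warm_up()  # auto warm-up, as in _require_sandbox
            except RuntimeError as exc:
                raise ToolInvocationError(str(exc)) from exc
        return "ok"


def broken_create():
    raise ConnectionError("401 Unauthorized")  # what a fake key would produce
```

With `FakeTool(broken_create)`, calling `invoke()` raises `ToolInvocationError` rather than surfacing the raw 401, which is the behavior the updated tests assert.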
Use the same `Secret = Secret.from_env_var("E2B_API_KEY", strict=True)`
default as E2BSandbox.__init__ for consistency. Drops the now-unreachable
`or` fallback in the E2BSandbox call.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
bogdankostic left a comment
Nice, looking good now, just found one minor improvement related to changing the default value of api_key that I will directly change.
partially addresses #3227
Summary
Migrates the E2B sandbox toolset prototype from deepset-ai/haystack-experimental#448 into a proper `e2b-haystack` integration package.

- `E2BSandbox` for managing the lifecycle of an E2B cloud sandbox (lazy `warm_up`, `close`, full serialisation round-trip)
- Four `Tool` subclasses that share a single sandbox instance: `RunBashCommandTool`, `ReadFileTool`, `WriteFileTool`, `ListDirectoryTool`
- `E2BToolset` (a `Toolset` subclass) that bundles all four tools as a convenience — just pass `E2BToolset()` to any Haystack `Agent`
- Follows the `haystack_integrations/tools/` module path convention (same as `mcp-haystack` and `github-haystack`)

Test plan

- Unit tests (`hatch run test:unit`) — all sandbox calls are mocked, no API key needed
- Lint / format checks (`hatch run fmt-check`)
- Checked the scripts in `integrations/e2b/examples/` for correctness
- Integration tests need a live sandbox (`E2B_API_KEY`)
- TODO: add the `E2B_API_KEY` secret to the GitHub repo for CI integration tests

🤖 Generated with Claude Code