Correlate scratchpad completion with `run_id` by manzt · Pull Request #9350 · marimo-team/marimo

manzt · 2026-04-23T21:55:15Z

Closes #9302
Fixes #9255 and flips the 4 xfail integration tests from #9342 to passing.

The ScratchCellListener used to fire its done sentinel on the scratch cell's idle status (+ 50ms grace for flushing stderr/stdout). Anything broadcast after that grace was silently dropped from the SSE stream.

These changes add an optional run_id to ExecuteScratchpadCommand and CompletedRunNotification. The api endpoint and MCP mint a UUID and pass it to both the command and the ScratchCellListener; the listener now fires its sentinel only on the matching CompletedRunNotification.

Note: We couldn't simply observe CompletedRunNotification directly (#9302) because the CompletedRunNotification from session.instantiate, which belongs to a separate run, trips the listener early.
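
A minimal sketch of the correlation idea described above, not the actual marimo implementation. `CompletedRun` and `RunListener` are hypothetical stand-ins for `CompletedRunNotification` and `ScratchCellListener`; only the "fire the sentinel on a matching `run_id`" behavior is taken from this PR.

```python
import asyncio
import uuid
from dataclasses import dataclass
from typing import Optional


@dataclass
class CompletedRun:
    """Hypothetical stand-in for a completion notification."""

    run_id: Optional[str] = None


class RunListener:
    """Hypothetical listener: signals done only for its own run_id."""

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id
        self.done = asyncio.Event()

    def on_notification(self, note: CompletedRun) -> None:
        # Completions from other runs (for example an earlier instantiate)
        # carry a different or missing run_id and are ignored.
        if note.run_id == self.run_id:
            self.done.set()


async def demo() -> bool:
    run_id = str(uuid.uuid4())  # minted once per execute call
    listener = RunListener(run_id)

    listener.on_notification(CompletedRun(run_id=None))  # unrelated run
    fired_early = listener.done.is_set()

    listener.on_notification(CompletedRun(run_id=run_id))  # our run
    await asyncio.wait_for(listener.done.wait(), timeout=1)
    return (not fired_early) and listener.done.is_set()
```

Calling `asyncio.run(demo())` exercises both cases: the unrelated completion leaves the sentinel unset, and the matching one sets it.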

Summary by cubic

Correlate each scratchpad run with a run_id and wait for the matching CompletedRunNotification before finishing. This prevents early success in /api/kernel/execute, streams compile-time errors, and simplifies the done SSE.

Bug Fixes
- Add optional run_id to ExecuteScratchpadCommand and CompletedRunNotification; /api/kernel/execute and the MCP code server mint a UUID and ScratchCellListener waits for the matching completion, ignoring others.
- Always emit CompletedRunNotification in a finally so listeners don’t hang if run_scratchpad raises.
- Stream compile-time errors to stderr and keep console output, so SyntaxError diagnostics are visible before done.
- Add integration tests for ctx.create_cell validation and the skip_validation=True path to cover early RuntimeError reporting and graph-state failures.
Migration
- The done SSE event now returns { success, output } only; the error field was removed. On failure, output is { mimetype: "text/plain", data: "" }.
- OpenAPI and generated TS types updated: CompletedRunNotification.run_id?, ExecuteScratchpadCommand.runId?.
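
For callers adapting to the new shape, a client-side sketch might look like the following. It assumes the stream arrives as the `event:` / `data:` line pairs shown in the integration snapshots in this PR; `parse_sse` and `summarize` are hypothetical helper names, not part of marimo.

```python
import json
from typing import Dict, Iterable, Iterator, Optional, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, Dict]]:
    """Yield (event_name, payload) pairs from SSE text lines."""
    event: Optional[str] = None
    for line in lines:
        if line.startswith("event: "):
            event = line[len("event: "):]
        elif line.startswith("data: ") and event is not None:
            yield event, json.loads(line[len("data: "):])
        elif line == "":
            event = None  # a blank line terminates the current event


def summarize(lines: Iterable[str]) -> Tuple[Optional[Dict], str]:
    """Return the done payload and the stderr text streamed before it."""
    stderr_chunks = []
    done: Optional[Dict] = None
    for event, payload in parse_sse(lines):
        if event == "stderr":
            stderr_chunks.append(payload["data"])
        elif event == "done":
            done = payload
    return done, "".join(stderr_chunks)


example_stream = [
    "event: stderr",
    'data: {"data": "RuntimeError: boom\\n"}',
    "",
    "event: done",
    'data: {"success": false, "output": {"mimetype": "text/plain", "data": ""}}',
    "",
]
done, err = summarize(example_stream)
```

Because the error field is gone, the sketch reads failure detail from the accumulated `stderr` events and uses `done` only for the success bit and the output.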

Written for commit a01e152. Summary will update on new commits.

vercel · 2026-04-23T21:55:20Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
marimo-docs	Ready	Preview, Comment	Apr 23, 2026 10:42pm

cubic-dev-ai

No issues found across 11 files

Copilot

Pull request overview

This PR improves scratchpad execution streaming by correlating “completion” with a specific scratchpad run via a run_id, so the SSE listener doesn’t terminate early and drop downstream reactive errors (fixing #9255 and enabling previously-xfail integration scenarios to pass).

Changes:

- Add optional run_id correlation to ExecuteScratchpadCommand and CompletedRunNotification, and plumb it through the HTTP execute endpoint + MCP execute_code.
- Update ScratchCellListener to treat a matching CompletedRunNotification(run_id=...) as the completion sentinel (instead of scratch-cell idle), and ensure completion is always broadcast via finally.
- Standardize the terminal SSE done payload to {success, output} and update unit/integration tests + OpenAPI schema accordingly.
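
The "always broadcast via finally" point can be illustrated with a small sketch. The function and parameter names below are hypothetical and the real handler's signature is not shown in this PR page; the only behavior taken from the source is that the completion is sent even when the scratchpad run raises.

```python
from typing import Callable, List, Optional


def handle_scratchpad(
    run_scratchpad: Callable[[], None],
    broadcast_completed: Callable[[Optional[str]], None],
    run_id: Optional[str],
) -> None:
    """Hypothetical handler: run the scratchpad, then always report completion."""
    try:
        run_scratchpad()
    finally:
        # Runs on success and on failure, so a listener waiting for this
        # run_id is never left blocked.
        broadcast_completed(run_id)


def demo() -> List[Optional[str]]:
    sent: List[Optional[str]] = []

    def failing_run() -> None:
        raise RuntimeError("boom")

    try:
        handle_scratchpad(failing_run, sent.append, "run-123")
    except RuntimeError:
        pass  # the error still propagates; the completion was sent first
    return sent
```

The exception is not swallowed here; the `finally` block only guarantees that the completion for `run-123` is recorded before it propagates.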

Reviewed changes

Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.

File	Description
`marimo/_server/scratchpad.py`	Switch listener completion sentinel to `CompletedRunNotification` filtered by `run_id`; change `done` event shape to `{success, output}`.
`marimo/_runtime/runtime.py`	Always broadcast `CompletedRunNotification(run_id=...)` for scratchpad via `try/finally` to prevent listeners from blocking forever.
`marimo/_runtime/commands.py`	Add `run_id: str
`marimo/_messaging/notification.py`	Add optional `run_id` field to `CompletedRunNotification`.
`marimo/_server/api/endpoints/execution.py`	Mint a UUID `run_id` per `/api/kernel/execute` call and pass it to both the command and listener.
`marimo/_mcp/code_server/main.py`	Mint/pass `run_id` for MCP `execute_code` so it waits for the correct completion event.
`tests/_server/test_scratchpad.py`	Update listener construction (requires `run_id`) and update expectations for the new `done` payload shape + completion sentinel.
`tests/_server/test_scratchpad_integration.py`	Flip previously-xfail scenarios to passing and update SSE snapshots to the new behavior.
`packages/openapi/api.yaml`	Document/add `run_id` on `CompletedRunNotification` and `runId` on `ExecuteScratchpad*` schemas.
`packages/openapi/src/api.ts`	Regenerated TS types to include the new `run_id` / `runId` fields.
`marimo/_schemas/generated/notifications.yaml`	Regenerated notification schema reflecting `CompletedRunNotification.run_id` (and related schema updates).

Copilot · 2026-04-23T22:01:48Z

+    ``success`` is false when the scratch cell itself errored OR any
+    downstream cell captured by the listener errored. The actual error
+    detail was already streamed via ``stderr`` events earlier in the
+    response — ``done`` carries only the success bit plus the scratch
+    cell's rendered output on success (empty on failure).


build_done_event no longer includes any error payload for failures and assumes the traceback/detail was already streamed via preceding stderr events. That assumption doesn't hold for scratchpad compile errors (e.g. MarimoSyntaxError from _try_compiling_cell), which are broadcast via CellNotificationUtils.broadcast_error as output=MARIMO_ERROR without emitting any stderr console output. With the current SSE shape, /api/kernel/execute callers may only see {success: false, output: {…}} with no error detail. Consider emitting a synthetic stderr SSE event when the scratch cell output channel is MARIMO_ERROR (or reintroducing a minimal error/errors field in the done payload) so failures always include actionable diagnostics.
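
To illustrate the general concern in plain Python (this does not reproduce marimo's `_try_compiling_cell` or its broadcast path): a compile-time failure is raised before any user code runs, so nothing is written to stderr unless the caller reports the exception itself. `run_source` is a hypothetical helper.

```python
import contextlib
import io
from typing import Optional, Tuple


def run_source(source: str) -> Tuple[Optional[str], str]:
    """Compile and run source; return (compile error, captured stderr)."""
    captured = io.StringIO()
    with contextlib.redirect_stderr(captured):
        try:
            code = compile(source, "<scratch>", "exec")
        except SyntaxError as exc:
            # The failure surfaces as an exception object only; reporting
            # it on stderr would be the caller's job.
            return f"SyntaxError: {exc.msg}", captured.getvalue()
        exec(code, {})
    return None, captured.getvalue()


error, stderr_text = run_source("x =")
```

Here `error` is populated while `stderr_text` stays empty, which is the situation a consumer that relies only on `stderr` events would need the server to handle explicitly.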

Copilot · 2026-04-23T22:01:48Z

+``{success: false, output: {mimetype: "text/plain", data: ""}}``;
+error detail arrives earlier in the stream as ``stderr`` SSE events.


The updated module docstring states that failure cases always deliver error detail earlier via stderr SSE events and that the terminal done event is uniformly {success, output}. There isn't an integration test covering a scratchpad compile-time failure (e.g. a SyntaxError / MarimoSyntaxError from _try_compiling_cell), which historically may not emit console stderr. Adding a scenario like session.execute("x =") would validate the new contract and guard against silent failures.

Suggested change

-``{success: false, output: {mimetype: "text/plain", data: ""}}``;
-error detail arrives earlier in the stream as ``stderr`` SSE events.
+``{success: false, output: {mimetype: "text/plain", data: ""}}``.
+When the kernel emits error detail before ``done``, these snapshots
+assert it through earlier SSE events such as ``stderr``.

Fixes #9255.

The ``ScratchCellListener`` used to fire its done sentinel on the scratch cell's ``idle`` status, relying on a 50ms grace for trailing output. Anything broadcast after that grace (most commonly an ``mo.state`` setter whose reactive descendants are flushed *after* the scratch runner returns, per ``Kernel.run_scratchpad``) was silently dropped from the SSE stream: ``/api/kernel/execute`` would return ``success: true`` while a downstream cell was in an exception state.

``ExecuteScratchpadCommand`` and ``CompletedRunNotification`` gain an optional ``run_id``. ``/api/kernel/execute`` and the MCP code server mint a UUID and pass it to both the command and the ``ScratchCellListener``; the listener now fires its sentinel only on the matching ``CompletedRunNotification``. Unrelated completions (from the ``session.instantiate`` call the endpoint makes first, or from concurrent browser activity) are ignored instead of tripping the listener early. ``handle_execute_scratchpad`` broadcasts its completion in a ``try/finally`` so a raising ``run_scratchpad`` can't leave the listener blocked indefinitely.

The ``done`` SSE event is reshaped to a single ``{success, output}`` form. The ``error`` field is removed: the traceback is already in preceding ``stderr`` events, so duplicating it on ``done`` was redundant. On failure, ``output`` is ``{mimetype: "text/plain", data: ""}``. ``execute-code.sh`` is unaffected: it reads ``.output.data // empty`` for the success path and ``.success`` for the exit code.

The four xfail integration tests from #9342 flip to passing.
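
As a rough illustration of the reshaped terminal event, the sketch below builds the ``done`` lines in the form the integration snapshots use. ``sketch_done_event`` and its parameters are hypothetical and are not the PR's ``build_done_event``; the success rule (false if the scratch cell or any captured downstream cell errored) follows the docstring quoted in the review.

```python
import json
from typing import Dict, List, Optional

# Output reported on failure, per the PR description.
EMPTY_OUTPUT: Dict[str, str] = {"mimetype": "text/plain", "data": ""}


def sketch_done_event(
    scratch_errored: bool,
    downstream_errored: bool,
    output: Optional[Dict[str, str]],
) -> List[str]:
    """Return the SSE lines for a terminal done event (hypothetical helper)."""
    success = not (scratch_errored or downstream_errored)
    payload = {
        "success": success,
        # The rendered output is carried only on success.
        "output": output if (success and output) else EMPTY_OUTPUT,
    }
    return ["event: done", "data: " + json.dumps(payload), ""]


failed = sketch_done_event(
    scratch_errored=False,
    downstream_errored=True,
    output={"mimetype": "text/html", "data": "<b>1</b>"},
)
```

A consumer in the style of ``execute-code.sh`` would then need only ``.success`` for its exit code and ``.output.data`` for the success path.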

Adds two integration tests that cover the `ctx.create_cell` validation surface: the default dry-run compile (raises early on multiply-defined names with the `skip_validation` hint) and the `skip_validation=True` bypass.

manzt · 2026-04-23T22:41:47Z

+
+
+def test_ctx_create_cell_multiply_defined(session: _Session) -> None:
+    """``ctx.create_cell`` introducing a duplicate definition errors early.
+
+    The notebook already has ``x = 10`` in ``cell_a``. code_mode's
+    dry-run compile detects the new cell would multiply-define ``x``
+    and raises ``RuntimeError`` before any real mutation — the new
+    cell is never registered and ``x`` stays ``10``.
+    """
+    session.setup_cells(["cell_a"], ["x = 10"])
+
+    lines = session.execute(
+        "import marimo._code_mode as cm\n"
+        "async with cm.get_context() as ctx:\n"
+        '    ctx.create_cell("x = 20")',
+    )
+
+    assert lines == snapshot(
+        [
+            "event: stderr",
+            'data: {"data": "Traceback (most recent call last):\\n  File \\"<marimo>/marimo/_runtime/executor.py\\", line N, in execute_cell_async\\n    await eval(cell.body, glbls)\\n  File \\"<tmp>\\", line 2, in <module>\\n    async with cm.get_context() as ctx:\\n  File \\"<marimo>/marimo/_code_mode/_context.py\\", line N, in __aexit__\\n    self._dry_run_compile(ops)\\n  File \\"<marimo>/marimo/_code_mode/_context.py\\", line N, in _dry_run_compile\\n    raise RuntimeError(\\n    )\\nRuntimeError: Multiply-defined names:\\n  - \'x\' is already defined in cell \'cell_a\' (cell_a)\\n\\nTo skip validation, use: async with cm.get_context(skip_validation=True) as ctx\\n"}',
+            "",
+            "event: done",
+            'data: {"success": false, "output": {"mimetype": "text/plain", "data": ""}}',
+            "",
+        ]
+    )
+
+
+def test_ctx_create_cell_skip_validation(session: _Session) -> None:
+    """``skip_validation=True`` bypasses the dry-run compile check.
+
+    Same setup as above, but the caller opts out of validation — no
+    ``RuntimeError`` is raised and the new cell is registered, as
+    evidenced by the ``created cell`` stdout summary. The kernel's own
+    graph-validity pass still flags the resulting multiply-defined
+    state (visible to the listener as a child error, hence
+    ``success: false``), but no traceback reaches stderr because the
+    new cell never runs — the error is a pure graph-state marker.
+    """
+    session.setup_cells(["cell_a"], ["x = 10"])
+
+    lines = session.execute(
+        "import marimo._code_mode as cm\n"
+        "async with cm.get_context(skip_validation=True) as ctx:\n"
+        '    ctx.create_cell("x = 20")',
+    )
+
+    assert lines == snapshot(
+        [
+            "event: stdout",
+            'data: {"data": "created cell \'<cid>\'\\n"}',
+            "",
+            "event: done",
+            'data: {"success": false, "output": {"mimetype": "text/plain", "data": ""}}',
+            "",
+        ]
+    )
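
The dry-run idea exercised by the first test can be sketched in isolation. This is an assumption-laden illustration using the standard `ast` module, not code_mode's `_dry_run_compile`: `defined_names` and `dry_run_check` are hypothetical, and only simple top-level assignments are considered.

```python
import ast
from typing import Dict, Set


def defined_names(source: str) -> Set[str]:
    """Collect names bound by simple top-level assignments."""
    names: Set[str] = set()
    for node in ast.parse(source).body:
        if isinstance(node, ast.Assign):
            for target in node.targets:
                if isinstance(target, ast.Name):
                    names.add(target.id)
    return names


def dry_run_check(existing: Dict[str, Set[str]], new_source: str) -> None:
    """Raise before any mutation if the new cell would redefine a name."""
    new_names = defined_names(new_source)
    clashes = sorted(
        (name, cell)
        for cell, defs in existing.items()
        for name in new_names & defs
    )
    if clashes:
        detail = ", ".join(f"{name!r} in cell {cell!r}" for name, cell in clashes)
        raise RuntimeError(f"Multiply-defined names: {detail}")


existing_defs = {"cell_a": {"x"}}
dry_run_check(existing_defs, "y = 1")  # no clash, returns quietly
try:
    dry_run_check(existing_defs, "x = 20")
    clash_message = None
except RuntimeError as exc:
    clash_message = str(exc)
```

Because the check runs before anything is registered, the existing definitions are untouched when it raises, which mirrors the "`x` stays `10`" expectation in the test docstring.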


@mscolnick some drive-by, additional code mode tests.

codecov · 2026-04-24T02:49:31Z

Bundle Report

Changes will increase total bundle size by 25.11MB (100.0%) ⬆️⚠️, exceeding the configured threshold of 5%.

Bundle name	Size	Change
marimo-esm	25.11MB	25.11MB (100%) ⬆️⚠️

Copilot AI review requested due to automatic review settings April 23, 2026 21:55

github-actions Bot added the bash-focus Area to focus on during release bug bash label Apr 23, 2026

Copilot started reviewing on behalf of manzt April 23, 2026 21:55 View session

manzt added the enhancement New feature or request label Apr 23, 2026

cubic-dev-ai Bot reviewed Apr 23, 2026

manzt requested a review from mscolnick April 23, 2026 21:58

manzt force-pushed the fix/scratchpad-run-id-correlation branch from d5ebe35 to 85be69a Compare April 23, 2026 21:59

vercel Bot deployed to Preview April 23, 2026 22:00 View deployment

Copilot AI reviewed Apr 23, 2026

manzt mentioned this pull request Apr 23, 2026

fix(scratchpad): wait for CompletedRunNotification to capture downstream reactive errors #9302

Closed

manzt force-pushed the fix/scratchpad-run-id-correlation branch from 85be69a to d5cf740 Compare April 23, 2026 22:08

vercel Bot deployed to Preview April 23, 2026 22:09 View deployment

manzt force-pushed the fix/scratchpad-run-id-correlation branch from d5cf740 to 690916b Compare April 23, 2026 22:19

vercel Bot deployed to Preview April 23, 2026 22:20 View deployment

Test code_mode create_cell validation paths

a01e152

manzt force-pushed the fix/scratchpad-run-id-correlation branch from 6165fd9 to a01e152 Compare April 23, 2026 22:40

manzt commented Apr 23, 2026

vercel Bot deployed to Preview April 23, 2026 22:42 View deployment

mscolnick approved these changes Apr 27, 2026

manzt wants to merge 2 commits into main from fix/scratchpad-run-id-correlation

3 participants
