Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
d5ebe35 to
85be69a
Compare
There was a problem hiding this comment.
Pull request overview
This PR improves scratchpad execution streaming by correlating “completion” with a specific scratchpad run via a run_id, so the SSE listener doesn’t terminate early and drop downstream reactive errors (fixing #9255 and enabling previously-xfail integration scenarios to pass).
Changes:
- Add optional
run_idcorrelation toExecuteScratchpadCommandandCompletedRunNotification, and plumb it through the HTTP execute endpoint + MCPexecute_code. - Update
ScratchCellListenerto treat a matchingCompletedRunNotification(run_id=...)as the completion sentinel (instead of scratch-cellidle), and ensure completion is always broadcast viafinally. - Standardize the terminal SSE
donepayload to{success, output}and update unit/integration tests + OpenAPI schema accordingly.
Reviewed changes
Copilot reviewed 10 out of 11 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
marimo/_server/scratchpad.py |
Switch listener completion sentinel to CompletedRunNotification filtered by run_id; change done event shape to {success, output}. |
marimo/_runtime/runtime.py |
Always broadcast CompletedRunNotification(run_id=...) for scratchpad via try/finally to prevent listeners from blocking forever. |
marimo/_runtime/commands.py |
Add `run_id: str |
marimo/_messaging/notification.py |
Add optional run_id field to CompletedRunNotification. |
marimo/_server/api/endpoints/execution.py |
Mint a UUID run_id per /api/kernel/execute call and pass it to both the command and listener. |
marimo/_mcp/code_server/main.py |
Mint/pass run_id for MCP execute_code so it waits for the correct completion event. |
tests/_server/test_scratchpad.py |
Update listener construction (requires run_id) and update expectations for the new done payload shape + completion sentinel. |
tests/_server/test_scratchpad_integration.py |
Flip previously-xfail scenarios to passing and update SSE snapshots to the new behavior. |
packages/openapi/api.yaml |
Document/add run_id on CompletedRunNotification and runId on ExecuteScratchpad* schemas. |
packages/openapi/src/api.ts |
Regenerated TS types to include the new run_id / runId fields. |
marimo/_schemas/generated/notifications.yaml |
Regenerated notification schema reflecting CompletedRunNotification.run_id (and related schema updates). |
| ``success`` is false when the scratch cell itself errored OR any | ||
| downstream cell captured by the listener errored. The actual error | ||
| detail was already streamed via ``stderr`` events earlier in the | ||
| response — ``done`` carries only the success bit plus the scratch | ||
| cell's rendered output on success (empty on failure). |
There was a problem hiding this comment.
build_done_event no longer includes any error payload for failures and assumes the traceback/detail was already streamed via preceding stderr events. That assumption doesn't hold for scratchpad compile errors (e.g. MarimoSyntaxError from _try_compiling_cell), which are broadcast via CellNotificationUtils.broadcast_error as output=MARIMO_ERROR without emitting any stderr console output. With the current SSE shape, /api/kernel/execute callers may only see {success: false, output: {…}} with no error detail. Consider emitting a synthetic stderr SSE event when the scratch cell output channel is MARIMO_ERROR (or reintroducing a minimal error/errors field in the done payload) so failures always include actionable diagnostics.
| ``{success: false, output: {mimetype: "text/plain", data: ""}}``; | ||
| error detail arrives earlier in the stream as ``stderr`` SSE events. |
There was a problem hiding this comment.
The updated module docstring states that failure cases always deliver error detail earlier via stderr SSE events and that the terminal done event is uniformly {success, output}. There isn't an integration test covering a scratchpad compile-time failure (e.g. a SyntaxError / MarimoSyntaxError from _try_compiling_cell), which historically may not emit console stderr. Adding a scenario like session.execute("x =") would validate the new contract and guard against silent failures.
| ``{success: false, output: {mimetype: "text/plain", data: ""}}``; | |
| error detail arrives earlier in the stream as ``stderr`` SSE events. | |
| ``{success: false, output: {mimetype: "text/plain", data: ""}}``. | |
| When the kernel emits error detail before ``done``, these snapshots | |
| assert it through earlier SSE events such as ``stderr``. |
85be69a to
d5cf740
Compare
Fixes #9255. The ``ScratchCellListener`` used to fire its done sentinel on the scratch cell's ``idle`` status, relying on a 50ms grace for trailing output. Anything broadcast after that grace (most commonly an ``mo.state`` setter whose reactive descendants are flushed *after* the scratch runner returns, per ``Kernel.run_scratchpad``) was silently dropped from the SSE stream — ``/api/kernel/execute`` would return ``success: true`` while a downstream cell was in an exception state. ``ExecuteScratchpadCommand`` and ``CompletedRunNotification`` gain an optional ``run_id``. ``/api/kernel/execute`` and the MCP code server mint a UUID and pass it to both the command and the ``ScratchCellListener``; the listener now fires its sentinel only on the matching ``CompletedRunNotification``. Unrelated completions (from the ``session.instantiate`` call the endpoint makes first, or from concurrent browser activity) are ignored instead of tripping the listener early. ``handle_execute_scratchpad`` broadcasts its completion in a ``try/finally`` so a raising ``run_scratchpad`` can't leave the listener blocked indefinitely. The ``done`` SSE event is reshaped to a single ``{success, output}`` form. The ``error`` field is removed — the traceback is already in preceding ``stderr`` events, so duplicating it on ``done`` was redundant. On failure, ``output`` is ``{mimetype: "text/plain", data: ""}``. ``execute-code.sh`` is unaffected: it reads ``.output.data // empty`` for the success path and ``.success`` for the exit code. The four xfail integration tests from #9342 flip to passing.
d5cf740 to
690916b
Compare
Adds two integration tests that cover the `ctx.create_cell` validation surface: the default dry-run compile (raises early on multiply-defined names with the `skip_validation` hint) and the `skip_validation=True` bypass.
6165fd9 to
a01e152
Compare
|
|
||
|
|
||
| def test_ctx_create_cell_multiply_defined(session: _Session) -> None: | ||
| """``ctx.create_cell`` introducing a duplicate definition errors early. | ||
|
|
||
| The notebook already has ``x = 10`` in ``cell_a``. code_mode's | ||
| dry-run compile detects the new cell would multiply-define ``x`` | ||
| and raises ``RuntimeError`` before any real mutation — the new | ||
| cell is never registered and ``x`` stays ``10``. | ||
| """ | ||
| session.setup_cells(["cell_a"], ["x = 10"]) | ||
|
|
||
| lines = session.execute( | ||
| "import marimo._code_mode as cm\n" | ||
| "async with cm.get_context() as ctx:\n" | ||
| ' ctx.create_cell("x = 20")', | ||
| ) | ||
|
|
||
| assert lines == snapshot( | ||
| [ | ||
| "event: stderr", | ||
| 'data: {"data": "Traceback (most recent call last):\\n File \\"<marimo>/marimo/_runtime/executor.py\\", line N, in execute_cell_async\\n await eval(cell.body, glbls)\\n File \\"<tmp>\\", line 2, in <module>\\n async with cm.get_context() as ctx:\\n File \\"<marimo>/marimo/_code_mode/_context.py\\", line N, in __aexit__\\n self._dry_run_compile(ops)\\n File \\"<marimo>/marimo/_code_mode/_context.py\\", line N, in _dry_run_compile\\n raise RuntimeError(\\n )\\nRuntimeError: Multiply-defined names:\\n - \'x\' is already defined in cell \'cell_a\' (cell_a)\\n\\nTo skip validation, use: async with cm.get_context(skip_validation=True) as ctx\\n"}', | ||
| "", | ||
| "event: done", | ||
| 'data: {"success": false, "output": {"mimetype": "text/plain", "data": ""}}', | ||
| "", | ||
| ] | ||
| ) | ||
|
|
||
|
|
||
| def test_ctx_create_cell_skip_validation(session: _Session) -> None: | ||
| """``skip_validation=True`` bypasses the dry-run compile check. | ||
|
|
||
| Same setup as above, but the caller opts out of validation — no | ||
| ``RuntimeError`` is raised and the new cell is registered, as | ||
| evidenced by the ``created cell`` stdout summary. The kernel's own | ||
| graph-validity pass still flags the resulting multiply-defined | ||
| state (visible to the listener as a child error, hence | ||
| ``success: false``), but no traceback reaches stderr because the | ||
| new cell never runs — the error is a pure graph-state marker. | ||
| """ | ||
| session.setup_cells(["cell_a"], ["x = 10"]) | ||
|
|
||
| lines = session.execute( | ||
| "import marimo._code_mode as cm\n" | ||
| "async with cm.get_context(skip_validation=True) as ctx:\n" | ||
| ' ctx.create_cell("x = 20")', | ||
| ) | ||
|
|
||
| assert lines == snapshot( | ||
| [ | ||
| "event: stdout", | ||
| 'data: {"data": "created cell \'<cid>\'\\n"}', | ||
| "", | ||
| "event: done", | ||
| 'data: {"success": false, "output": {"mimetype": "text/plain", "data": ""}}', | ||
| "", | ||
| ] | ||
| ) |
There was a problem hiding this comment.
@mscolnick some drive-by, additional code mode tests.
Bundle ReportChanges will increase total bundle size by 25.11MB (100.0%) ⬆️
|
Closes #9302
Fixes #9255, and flips the 4 xfail integration tests from #9342 flip to passing.
The
ScratchCellListenerused to fire its done sentinel on the scratch cell'sidlestatus (+ 50ms grace for flushing stderr/stdout). Anything broadcast after that grace was silently dropped from the SSE stream.These changes add an optional
run_idtoExecuteScratchpadCommandandCompletedRunNotification. The api endpoint and MCP mint a UUID and pass it to both the command and theScratchCellListener; the listener now fires its sentinel only on the matchingCompletedRunNotification.Note: The reason we couldn't just observe
CompletedRunNotificationdirectly (#9302) is because theCompletedRunNotificationnfromsession.instantiatetrips up the listener early (separate run).Summary by cubic
Correlate each scratchpad run with a
run_idand wait for the matchingCompletedRunNotificationbefore finishing. This prevents early success in/api/kernel/execute, streams compile-time errors, and simplifies thedoneSSE.Bug Fixes
run_idtoExecuteScratchpadCommandandCompletedRunNotification;/api/kernel/executeand the MCP code server mint a UUID andScratchCellListenerwaits for the matching completion, ignoring others.CompletedRunNotificationin afinallyso listeners don’t hang ifrun_scratchpadraises.stderrand keep console output, soSyntaxErrordiagnostics are visible beforedone.ctx.create_cellvalidation and theskip_validation=Truepath to cover earlyRuntimeErrorreporting and graph-state failures.Migration
doneSSE now returns{ success, output }only; theerrorfield was removed. On failureoutputis{ mimetype: "text/plain", data: "" }.CompletedRunNotification.run_id?,ExecuteScratchpadCommand.runId?.Written for commit a01e152. Summary will update on new commits.