fix: externalize finalize cursor so HTTP continuation pages correctly by rustyconover · Pull Request #1 · Query-farm/vgi-causal

rustyconover · 2026-06-24T03:11:09Z

The bug

The buffering finalize streams (ate, propensity_scores, att) carried a position-less DrainState{done: bool}: emit the whole result in one out.emit(), then set done. Over the stateless HTTP transport the framework wire-serializes finalize state between ticks and emits at most one data batch per response, resuming from the continuation token. An emit-all+done cursor restarts from row 0 on every HTTP resume and loops forever once the output exceeds one response — which propensity_scores (one row per input subject, genuinely unbounded) routinely does. subprocess/unix keep live state in-process and hide the bug; only http exposes it.

The fix

Replace DrainState{done} with an offset cursor: started flag, integer offset, and the already-computed result batch as Arrow IPC bytes (result_ipc) — all wire-serializable. The first finalize tick computes the estimate once into the cursor; every tick emits at most ROWS_PER_TICK (64) rows from offset, advances offset, and out.finish()es when drained. Because offset survives the round-trip, a resumed tick emits the next slice — never re-runs the estimator, never restarts from row 0. ate (3 rows) and att (1 row) are bounded but use the identical cursor for uniformity. Results are byte-identical to before. SinkBuffer.drain_result holds the shared cursor loop; each finalize is now a one-liner.

Validation (fails on old code, passes on new)

tests/harness.py run_buffering(..., serialize_state=True) — re-serializes finalize state between every tick (deserialize_from_bytes(serialize_to_bytes())) and caps each tick at one response worth of rows, rejecting an over-cap emit and resuming from the pre-tick token, with a 10 000-tick overrun guard. Models the HTTP continuation faithfully: the old emit-all cursor overruns the guard; the offset cursor pages and terminates.
TestCursorSurvivesContinuation (tests/test_tables.py) — drives propensity_scores over an 800-row cohort (>> ROWS_PER_TICK) normal vs serialize_state=True, asserting identical rows/order, no dupes, termination; a second case monkeypatches ROWS_PER_TICK=2 so ate/att also page.
test/sql/causal.test — adds a 200-row generate_series cohort and asserts propensity_scores returns count(*) = 200, count(distinct id) = 200, all valid probabilities, and an ordered head — so the http leg genuinely pages.

Confirmed locally: pytest -q green, ruff/mypy/pydoclint clean, and the SQL E2E suite passes over all three transports (subprocess, http, unix). Verified fail-old (overrun guard fires) / pass-new (800 rows) directly.

🤖 Generated with Claude Code

The buffering finalize streams (ate, propensity_scores, att) used a position-less DrainState{done}: emit the whole result in one out.emit(), then set done. Over the stateless HTTP transport the framework wire- serializes finalize state between ticks and emits at most one batch per response, resuming from the continuation token. An emit-all+done cursor re-runs from row 0 on every resume and loops forever once the output exceeds one response -- which propensity_scores (one row per input subject, unbounded) routinely does. subprocess/unix keep live state in process and hide the bug; only http exposes it. Replace DrainState with an offset cursor: started flag, integer offset, and the computed result batch as IPC bytes (result_ipc) -- all wire- serializable. The first finalize tick computes the estimate once into the cursor; every tick emits at most ROWS_PER_TICK (64) rows from offset, advances offset, and finishes when drained. Because offset survives the round-trip, a resumed tick emits the next slice -- never re-runs the estimator, never restarts from row 0. ate (3 rows) and att (1 row) are bounded but use the identical cursor for uniformity. Results are byte- identical to before. SinkBuffer.drain_result holds the shared cursor loop; each finalize is now a one-line call. Validation: - tests/harness.py run_buffering gains serialize_state=True, which re- serializes finalize state between every tick and caps each tick at one response worth of rows (rejecting an over-cap emit and resuming from the pre-tick token), with a 10000-tick overrun guard. This models the HTTP continuation faithfully: the old emit-all cursor overruns the guard; the offset cursor pages and terminates. - TestCursorSurvivesContinuation (test_tables.py) drives propensity_scores over an 800-row cohort (>> ROWS_PER_TICK) normal vs serialize_state=True and asserts identical rows/order, no dupes, termination; a second case monkeypatches ROWS_PER_TICK=2 so ate/att also page. - causal.test adds a 200-row generate_series cohort and asserts propensity_scores returns count 200, distinct 200, all valid probabilities, and an ordered head -- so the http leg genuinely pages. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

rustyconover merged commit 63ffe2e into main Jun 24, 2026
10 checks passed

rustyconover deleted the fix/http-continuation-cursor branch June 24, 2026 03:12

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix: externalize finalize cursor so HTTP continuation pages correctly#1

fix: externalize finalize cursor so HTTP continuation pages correctly#1
rustyconover merged 1 commit into
mainfrom
fix/http-continuation-cursor

rustyconover commented Jun 24, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Uh oh!

Conversation

rustyconover commented Jun 24, 2026

The bug

The fix

Validation (fails on old code, passes on new)

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant