Skip to content

exp 136: stream completion-side wall audit#111

Closed
danReynolds wants to merge 5 commits into
mainfrom
exp-136-stream-completion-counter
Closed

exp 136: stream completion-side wall audit#111
danReynolds wants to merge 5 commits into
mainfrom
exp-136-stream-completion-counter

Conversation

@danReynolds
Copy link
Copy Markdown
Owner

Hypothesis

Exp 135 found the writer-isolate handler is only ~22-25% of A11c overlap burst wall — the remaining ~75% lives on the main isolate. The completion-side microtask scheduling cost counter entry in signals.json#stream-rerun-dispatch.blockedOnMeasurement is what gates the next dispatch implementation experiment.

This measurement experiment ships that counter. The hypothesis: of the ~75% main-isolate share, the synchronous post-await body of StreamEngine._requery (bookkeeping, hash-changed shortcut, entry.emit, trailing _flushQueue re-entry) plus the StreamEntry.emit fanout for-loop account for a measurable, structurally bounded share that future dispatch experiments must clear.

Approach

  • Four new profile-mode counters: streamCompleteUs, streamCompleteCount, streamEmitUs, streamEmitCount. All gated behind kProfileMode so release builds tree-shake the recording.
  • Instrument StreamEngine._requery: start the stopwatch right after the selectIfChanged await resolves; record in the existing finally after _flushQueue re-enters. The cross-isolate round-trip never inflates the wall.
  • Instrument StreamEntry.emit: wrap the subscriber for-loop only. streamEmitUs is a strict subset of streamCompleteUs.
  • New audit harness benchmark/profile/stream_completion_audit.dart reuses the audit_workloads.dart A11c / keyed-PK scenarios exp 119 / 121 / 135 share. Reports raw counters and derived fractions including accounted / wall = (invalidate_us + complete_us) / wall_us.
  • Existing test/profile_counters_test.dart extended to round-trip the new keys through snapshot / reset.

Full writeup: experiments/136-stream-completion-counter.md.

Results

Four repeated passes (a/b/c/d) — d is the saved aggregate at benchmark/profile/results/exp-136-stream-completion-aggregate.md:

workload wall_ms (a/b/c/d) complete / wall invalidate / wall accounted / wall us per complete
A11c baseline (0 streams x 500) 37.4 / 32.8 / 32.8 / 32.4 0 / 0 / 0 / 0 % 0 / 0 / 0 / 0 % 0 / 0 / 0 / 0 % n/a
A11c disjoint (50 streams x 500) 42.3 / 38.8 / 37.6 / 37.3 0 / 0 / 0 / 0 % 18.9 / 21.7 / 21.0 / 21.1 % 18.9 / 21.7 / 21.0 / 21.1 % n/a
A11c overlap (50 streams x 500) 89.5 / 87.4 / 86.3 / 91.7 29.8 / 29.5 / 29.8 / 31.7 % 15.1 / 14.7 / 15.0 / 14.9 % 44.9 / 44.2 / 44.8 / 46.7 % 7.1 / 6.9 / 6.8 / 7.4 µs
keyed PK subs (50 streams x 200) 20.1 / 19.4 / 19.4 / 23.7 21.6 / 20.8 / 22.2 / 19.5 % 15.3 / 15.5 / 15.2 / 13.9 % 36.9 / 36.3 / 37.3 / 33.5 % 4.1 / 3.8 / 4.1 / 4.1 µs

Sanity: dispatcher_parked_total = 0, dispatcher_wake_retry_total = 0, dispatcher_max_parked_concurrent = 0 on every workload — exp 120 / 122 still hold.

Headline reading:

  • A11c overlap: complete_us ~30% of wall (very tight band across 4 passes), emit_us <0.3% (hash-suppression keeps emissions to ~20-35 per 500-write burst). Combined with exp 121's invalidate_us (~15%), synchronous main-isolate stream-engine code accounts for ~45% of wall. The remaining ~55% is framework microtask scheduling, async/await chaining, reader-pool internals, and subscriber callback dispatch the counters cannot reach.
  • A11c disjoint: complete_us = 0 (column elision filters every re-query at invalidation time, so _requery never fires).
  • Keyed-PK: ~20-22% complete + ~14-15% invalidate, per-handler ~4 µs across ~1,050 handlers per burst.

Outcome

Accept for review — measurement. Closes the last blockedOnMeasurement entry in stream-rerun-dispatch direction. The next dispatch implementation experiment must clear a per-handler complete_us target on A11c overlap (~7 µs of ~3,750 handlers per burst) reproducibly across a 5-run release suite. Three speculative follow-ups are now evaluable: _flushQueue re-entry collapse, hash-suppressed completion fast-path, per-completion batching — none committed.

Test plan

  • dart analyze --fatal-infos lib/ benchmark/profile/ test/profile_counters_test.dart
  • dart test test/profile_counters_test.dart test/stream_test.dart test/stream_invalidation_coalescing_test.dart test/stream_dependency_shapes_test.dart test/stream_cache_hit_reliability_test.dart test/stream_overflow_fallback_test.dart test/stream_trigger_cascade_test.dart — all pass
  • dart run benchmark/generate_history.dart — parsed 115 experiments, 118 benchmark runs
  • 4 repeated audit passes (dart run -DRESQLITE_PROFILE=true benchmark/profile/stream_completion_audit.dart --markdown) with stable bands per the table above

🤖 Generated with Claude Code

danReynolds and others added 3 commits May 10, 2026 07:31
Closes the writer-isolate wall vs SQLite wall split candidate in
signals.json#stream-rerun-dispatch.blockedOnMeasurement.

Adds three profile-mode counters in the writer isolate
(writerHandlerUs, writerSqliteUs, writerHandlerCount), exposes them
cross-isolate via Database.snapshotWriterProfileCounters() backed by a
new snapshot/reset RPC pair on the writer protocol, and ships a focused
audit harness that reports the breakdown on the existing A11c and
keyed-PK scenarios.

Headline reading: on A11c overlap the writer-isolate handler is only
~22-25% of writer-side burst wall (rest is main-isolate microtask
scheduling); within the writer, SQLite step is ~55-58% and Dart
dispatch is ~42-45% of writer handler. Structural ceiling for removing
all writer-side Dart dispatch on overlap is ~9-11% of total wall — same
per-benchmark decision threshold edge as exp 121's invalidation
traversal ceiling. Future stream-rerun-dispatch work needs the
remaining completion-side scheduling counter before another
implementation pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
- ProfileCounters writerHandlerUs doc references the actual public
  accessor (Database.snapshotWriterProfileCounters → Writer.snapshotWriterCounters);
  the original prose pointed at a name that doesn't exist on Writer.
- Database.snapshotWriterProfileCounters / resetWriterProfileCounters
  now run under the writer mutex (writer.locked) so they serialize with
  execute() / transaction() the same way other public Database write
  methods do. Without it a snapshot could land between the BEGIN and
  COMMIT messages of an in-flight transaction and sample partial state.
- writer_step_wall_audit.dart clamps dispatch_us at zero. handler_us
  and sqlite_us come from independent stopwatches, so a marginal handler
  could in theory round to sqlite_us > handler_us and produce a negative
  fraction in the rendered markdown. Clamping makes the report robust
  against that measurement artifact.
- test/database_test.dart guards the new writer-counter API surface:
  one test asserts the snapshot returns the EXP-135 keys, another that
  reset+snapshot round-trips without throwing.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Closes the completion-side microtask scheduling cost counter entry in
signals.json#stream-rerun-dispatch.blockedOnMeasurement.

Adds four profile-mode counters on the main isolate (streamCompleteUs,
streamCompleteCount, streamEmitUs, streamEmitCount) for the
synchronous post-await body of StreamEngine._requery (bookkeeping,
hash-changed shortcut, entry.emit, trailing _flushQueue re-entry) and
the StreamEntry.emit subscriber for-loop. Both gated behind
kProfileMode so release builds tree-shake the recording.

Adds benchmark/profile/stream_completion_audit.dart, a focused
harness that reuses the audit_workloads.dart A11c / keyed-PK
scenarios and reports complete_us / wall, invalidate_us / wall, and
the accounted-share fraction alongside the existing exp 135 / 121
counters.

Headline reading across 4 passes: on A11c overlap, complete_us is a
stable ~29.5-31.7% of writer-burst wall and entry.emit is <0.3%
(hash-suppression keeps emissions to ~20-35 per 500-write burst).
Combined with exp 121's invalidate_us (~15%), synchronous
main-isolate stream-engine code accounts for ~45% of wall; the
remaining ~55% lives in framework microtask scheduling and async
boundaries the counters cannot reach.

A11c disjoint stays at 0% complete (column elision filters at
invalidation time before _requery fires). Keyed-PK is ~20-22%
complete plus ~14-15% invalidate. Per-handler completion is stable
at ~7 µs (A11c overlap) and ~4 µs (keyed-PK).

The direction's blockedOnMeasurement list is now empty. Next
dispatch implementation experiment must clear a per-handler
complete_us target on A11c overlap (~7 µs of ~3,750 handlers per
burst) reproducibly across a 5-run release suite.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 11, 2026 11:16
CI's `check_experiment_signals` enforces `keyPriors.length <= 6`. With
exp 136 added, `stream-rerun-dispatch.keyPriors` and
`measurement-system.keyPriors` both hit 7. Curate by moving the
oldest entry in each direction (exp 106 for stream-rerun-dispatch;
exp 099 for measurement-system) into `archive`.

- exp 106's column-level dependency tracking is now context that
  exps 120 / 121 / 122 build on; the current admission story is
  better captured by 120 + 122 in keyPriors.
- exp 099's "is the path running" lesson is captured in JOURNAL.md
  and the long-text workload it asked for now lives in exp 110.

No behavioral change. Same data; smaller keyPriors slice that the
schema requires.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds profile-mode instrumentation and audit harnesses to quantify stream “completion-side” main-isolate wall time (and writer-side handler vs SQLite time) so future stream dispatch experiments can be gated by concrete counters rather than wall-time noise.

Changes:

  • Introduces new ProfileCounters keys for writer handler/SQLite wall and stream completion/emit wall, plus snapshot/reset plumbing for writer-isolate counters.
  • Instruments StreamEngine._requery and StreamEntry.emit to record completion-side counters (gated behind kProfileMode).
  • Adds/extends benchmark audit harnesses, experiment writeups, signals metadata, and tests to validate counter snapshot/reset wiring.

Reviewed changes

Copilot reviewed 18 out of 18 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
test/profile_counters_test.dart Extends snapshot/diff/reset tests to include new writer + stream completion counters.
test/database_test.dart Adds tests ensuring writer counter snapshot/reset APIs round-trip keys without throwing.
lib/src/writer/writer.dart Adds writer-isolate counter snapshot/reset RPC helpers on the main isolate.
lib/src/writer/write_worker.dart Implements writer-isolate snapshot/reset requests and instruments handler + SQLite wall timers.
lib/src/stream_engine.dart Instruments _requery completion path and emit fanout loop for new counters.
lib/src/profile_counters.dart Adds new counter fields/keys and wires them into snapshot/reset.
lib/src/database.dart Exposes writer-isolate counter snapshot/reset APIs at Database level.
benchmark/profile/audit_workloads.dart Resets and merges writer-isolate counters into shared audit scenario results.
benchmark/profile/writer_step_wall_audit.dart New exp-135 harness reporting writer handler vs SQLite wall fractions.
benchmark/profile/stream_completion_audit.dart New exp-136 harness reporting completion/emit wall fractions and derived metrics.
benchmark/profile/results/exp-135-writer-step-wall-aggregate.md Saved aggregate output from exp-135 harness.
benchmark/profile/results/exp-136-stream-completion-aggregate.md Saved aggregate output from exp-136 harness.
experiments/135-writer-step-wall-audit.md New experiment writeup documenting approach/results for exp 135.
experiments/136-stream-completion-counter.md New experiment writeup documenting approach/results for exp 136.
experiments/signals.json Updates direction narrative, clears blockedOnMeasurement, and adds entries for 135/136.
experiments/README.md Adds experiment index entries for 135 and 136.
experiments/JOURNAL.md Adds guidance on cross-isolate counter export via snapshot RPC.
docs/experiments/history.json Regenerates published experiment history to include 135/136.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread lib/src/stream_engine.dart Outdated
Comment thread lib/src/database.dart
Comment thread lib/src/database.dart
- Fix typo in stream_engine.dart `_requery`: "ocurred" → "occurred"
  in the post-`await` dirty-re-check comment block.

- Guard `Database.snapshotWriterProfileCounters` /
  `resetWriterProfileCounters` against in-`transaction` calls. The
  underlying `writer.locked` mutex is non-reentrant, so awaiting it
  from inside an active transaction body would deadlock the writer
  pipeline. Both methods now throw `StateError` at the call site if
  `Transaction.current != null`, matching the explicit surface the
  Copilot reviewer flagged. Audit harnesses bracket workloads from
  outside transactions, so existing callers are unaffected.

- Add two regression tests confirming the StateError contract.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@danReynolds
Copy link
Copy Markdown
Owner Author

Closing this older exp 136 branch as superseded. PR #110 now carries the exp 135 writer-dispatch measurement lane, and PR #113 carries the cleaner current exp 136 completion-side scheduling measurement without the older stacked writer-counter changes. Keeping this open would duplicate the experiment queue rather than add new signal.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants