exp 135: writer-isolate dispatch wall vs SQLite step wall audit#110

Open
danReynolds wants to merge 2 commits into main from exp-135-writer-dispatch-wall

@danReynolds
Owner

Hypothesis

After exp 120 / exp 121, the writer-isolate handler wall on stream
workloads is mostly SQLite step time. Dart-side dispatch (param
allocation, dirty-table marshalling, message build/send) takes a small
share that is sensitive to optimization but stays below the
per-benchmark decision threshold. If true, writer-side dispatch should
not be the next implementation target on currently measured workloads,
and the structural ceiling for removing all writer-side Dart dispatch
should be close to exp 121's invalidation-traversal ceiling (10–15% of
A11c overlap wall).

This experiment ships the writer dispatch wall counter listed in
signals.json#stream-rerun-dispatch.blockedOnMeasurement (one of
two named gates blocking the next dispatch-area implementation
experiment).

Approach

Three small additions, all gated behind kProfileMode:

  • Profile counters. New writerHandlerUs, writerSqliteUs,
    writerHandlerCount fields on ProfileCounters. The existing
    snapshot/diff/reset plumbing carries the new keys.
  • Snapshot RPC. New WriterCountersSnapshotRequest /
    WriterCountersResetRequest request types on the writer protocol;
    Database.snapshotWriterProfileCounters() exposes the round-trip
    to audit harnesses. Snapshot/reset bookkeeping is excluded from
    the per-handler stopwatch so it does not contaminate the measured
    wall.
  • Handler instrumentation. Each writer handler is wrapped with
    a per-handler stopwatch (handler_us); FFI calls that drive SQLite
    (resqliteExecute, resqliteRunBatch, resqliteRunBatchNested,
    the cached transaction-control stmts, resqliteExec for SAVEPOINT
    / RELEASE / ROLLBACK TO, prepare+step inside _handleTxQuery) go
    through a _measureSqlite helper that accumulates sqlite_us.
    dispatch_us = handler_us − sqlite_us.
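
The nested-stopwatch pattern described above can be sketched roughly as follows. This is a minimal illustration, not the code in write_worker.dart: the `WriterCounters` class, the `profileEnabled` flag (a stand-in for the kProfileMode gate), and the `measureHandler` / `measureSqlite` functions are hypothetical names, and the "SQLite" call is a plain Dart closure rather than an FFI call.

```dart
/// Hypothetical stand-in for ProfileCounters, holding only the three
/// writer-side fields named in this PR.
class WriterCounters {
  int writerHandlerUs = 0;
  int writerSqliteUs = 0;
  int writerHandlerCount = 0;

  /// Dart-side dispatch time: handler wall minus SQLite wall.
  int get dispatchUs => writerHandlerUs - writerSqliteUs;
}

final counters = WriterCounters();

/// Assumption: stands in for the kProfileMode gate.
bool profileEnabled = true;

/// Runs [body] (the SQLite-driving part of a handler) and adds its
/// elapsed time to writerSqliteUs.
T measureSqlite<T>(T Function() body) {
  if (!profileEnabled) return body();
  final sw = Stopwatch()..start();
  try {
    return body();
  } finally {
    counters.writerSqliteUs += sw.elapsedMicroseconds;
  }
}

/// Runs a whole handler and adds its elapsed time to writerHandlerUs.
T measureHandler<T>(T Function() handler) {
  if (!profileEnabled) return handler();
  final sw = Stopwatch()..start();
  try {
    return handler();
  } finally {
    counters.writerHandlerUs += sw.elapsedMicroseconds;
    counters.writerHandlerCount++;
  }
}

void main() {
  final rows = measureHandler(() {
    // Dispatch-side work: building parameters before the SQLite call.
    final params = List<int>.generate(8, (i) => i);
    // Stand-in for an FFI call that steps a statement.
    return measureSqlite(() => params.length);
  });
  print('rows=$rows handlers=${counters.writerHandlerCount} '
      'dispatch_us=${counters.dispatchUs}');
}
```

Everything timed by the outer stopwatch but not by the inner one lands in dispatch_us, which is why snapshot/reset bookkeeping has to stay outside the handler stopwatch.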

A new harness benchmark/profile/writer_step_wall_audit.dart
formats the breakdown on the existing A11c (baseline / disjoint /
overlap) and keyed-PK scenarios from audit_workloads.dart so the
fractions align structurally with exp 119 / exp 121.

Full implementation detail in experiments/135-writer-step-wall-audit.md.

Results

Three repeated passes (a/b/c) of the A11c scenarios:

| workload | wall_ms (a/b/c) | sqlite_us / handler_us | dispatch_us / wall_us | dispatch_us per handler |
| --- | --- | --- | --- | --- |
| A11c baseline (0 streams x 500) | 34.4 / 33.0 / 33.7 | 69 / 66 / 68 % | 16.1 / 17.1 / 16.7 % | 11.1 / 11.3 / 11.2 µs |
| A11c disjoint (50 streams x 500) | 37.2 / 37.7 / 37.6 | 62 / 63 / 63 % | 13.2 / 12.7 / 13.0 % | 9.8 / 9.6 / 9.8 µs |
| A11c overlap (50 streams x 500) | 84.6 / 90.9 / 86.9 | 55 / 56 / 58 % | 11.0 / 10.7 / 9.2 % | 18.7 / 19.5 / 16.1 µs |
| keyed PK subs (50 streams x 200) | 18.4 / 19.8 / 45.5 | 70 / 69 / 93 % | 9.4 / 4.7 / 4.7 % | 8.7 / 10.8 / 10.8 µs |

Sanity: dispatcher_parked_total = 0, dispatcher_wake_retry_total = 0,
and dispatcher_max_parked_concurrent = 0 on every workload —
exp 120 / exp 122 still hold post-instrumentation.

Headline reading: on A11c overlap the writer-isolate handler is only
~22-25% of writer-side burst wall (the rest is main-isolate microtask
scheduling); within the writer, SQLite step is ~55-58% and Dart
dispatch is ~42-45% of handler time. The structural ceiling for
removing all writer-side Dart dispatch on overlap is ~9-11% of total
wall, the same per-benchmark decision threshold edge as exp 121's
invalidation traversal ceiling (10-15%). Combined, even fully
eliminating both saves ~20-25% of overlap wall, with the remaining
~75% sitting on the main isolate.
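
The ceiling follows from multiplying the two shares quoted above: the handler's share of wall times the dispatch share of the handler. A small sketch of that arithmetic, assuming the endpoints of the quoted ranges pair up as the low and high cases (the `dispatchCeiling` function is a hypothetical helper, not part of the harness):

```dart
/// Fraction of total wall that would disappear if all Dart-side
/// dispatch inside the writer handler were removed.
double dispatchCeiling(
        double handlerShareOfWall, double sqliteShareOfHandler) =>
    handlerShareOfWall * (1 - sqliteShareOfHandler);

void main() {
  // Low case: handler ~22% of wall, SQLite ~58% of handler.
  final low = dispatchCeiling(0.22, 0.58);
  // High case: handler ~25% of wall, SQLite ~55% of handler.
  final high = dispatchCeiling(0.25, 0.55);
  print('ceiling: ${(low * 100).toStringAsFixed(1)}% '
      'to ${(high * 100).toStringAsFixed(1)}% of wall');
}
```

Both ends fall inside the ~9-11% band, which matches the dispatch_us / wall_us column for A11c overlap in the table.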

Aggregate file: benchmark/profile/results/exp-135-writer-step-wall-aggregate.md.

Outcome

In Review — measurement.

Closes the writer-isolate wall vs SQLite wall split candidate in
signals.json#stream-rerun-dispatch.blockedOnMeasurement. Only the
completion-side microtask scheduling cost counter remains. Future
writer-side dispatch experiments need to clear the ~9-11% overlap
wall ceiling reproducibly on a 5-run release suite.

Reopen the writer-side dispatch direction if a new workload pushes
the per-handler dispatch share above ~50% of writer handler, or if a
specific bounded change targets the overlap-only getDirtyTableDependencies
delta (~17 µs/handler vs ~10 µs on disjoint).

Test plan

  • dart analyze clean
  • full dart test suite passes (232 tests)
  • release-mode dart run benchmark/run_profile.dart reports
    unchanged dispatch floors (~5 µs reader, ~9 µs writer)
  • benchmark/profile/writer_step_wall_audit.dart ran 3× under
    -DRESQLITE_PROFILE=true with stable per-handler dispatch numbers
  • benchmark/profile/dispatch_pressure_audit.dart (exp 119) and
    benchmark/profile/invalidation_traversal_audit.dart (exp 121)
    still produce consistent reports against the modified
    audit_workloads.dart
  • dart run benchmark/check_experiment_signals.dart passes
  • dart run benchmark/check_generated_data.dart passes after
    regenerating docs/experiments/history.json

🤖 Generated with Claude Code

Closes the writer-isolate wall vs SQLite wall split candidate in
signals.json#stream-rerun-dispatch.blockedOnMeasurement.

Adds three profile-mode counters in the writer isolate
(writerHandlerUs, writerSqliteUs, writerHandlerCount), exposes them
cross-isolate via Database.snapshotWriterProfileCounters() backed by a
new snapshot/reset RPC pair on the writer protocol, and ships a focused
audit harness that reports the breakdown on the existing A11c and
keyed-PK scenarios.

Headline reading: on A11c overlap the writer-isolate handler is only
~22-25% of writer-side burst wall (rest is main-isolate microtask
scheduling); within the writer, SQLite step is ~55-58% and Dart
dispatch is ~42-45% of writer handler. Structural ceiling for removing
all writer-side Dart dispatch on overlap is ~9-11% of total wall — same
per-benchmark decision threshold edge as exp 121's invalidation
traversal ceiling. Future stream-rerun-dispatch work needs the
remaining completion-side scheduling counter before another
implementation pass.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 10, 2026 11:32
Contributor

Copilot AI left a comment

Pull request overview

Adds EXP-135 measurement infrastructure to split writer-isolate “handler wall” into (a) time spent inside SQLite-driving FFI calls and (b) remaining Dart-side dispatch time, and wires this into the existing audit workload suite to quantify optimization headroom.

Changes:

  • Extend ProfileCounters with writer-isolate counters (writer_handler_us, writer_sqlite_us, writer_handler_count) and update snapshot/reset + tests.
  • Add writer-protocol snapshot/reset RPCs and Database APIs to retrieve/reset writer-isolate counters cross-isolate.
  • Add a new profile harness (writer_step_wall_audit.dart) + generated aggregate markdown, and update experiment docs/signals/history to record EXP-135.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| test/profile_counters_test.dart | Updates expectations for new writer_* keys and adds a basic round-trip test for snapshot/reset. |
| lib/src/writer/writer.dart | Adds Writer helper methods to snapshot/reset writer-isolate counters via writer protocol. |
| lib/src/writer/write_worker.dart | Implements new writer request/response types and instruments handler/SQLite wall counters in the writer isolate. |
| lib/src/profile_counters.dart | Adds writer-side counter fields and includes them in snapshot/reset plumbing. |
| lib/src/database.dart | Exposes new public Database APIs to snapshot/reset writer-isolate counters. |
| benchmark/profile/audit_workloads.dart | Resets writer counters before workloads and merges writer snapshots into the main counter map. |
| benchmark/profile/writer_step_wall_audit.dart | New audit harness to format writer handler vs SQLite vs dispatch breakdown. |
| benchmark/profile/results/exp-135-writer-step-wall-aggregate.md | Checked-in aggregate output from the new audit harness. |
| experiments/135-writer-step-wall-audit.md | New experiment writeup documenting hypothesis, approach, and results. |
| experiments/signals.json | Records EXP-135 outcomes and updates measurement gates for dispatch-related work. |
| experiments/README.md | Adds EXP-135 to the experiment index. |
| experiments/JOURNAL.md | Adds a note about cross-isolate counter export pattern via snapshot RPC. |
| docs/experiments/history.json | Regenerates experiment history to include EXP-135. |

Comment thread lib/src/profile_counters.dart Outdated
Comment thread lib/src/database.dart
Comment thread lib/src/database.dart Outdated
Comment thread benchmark/profile/writer_step_wall_audit.dart Outdated
Comment thread lib/src/database.dart
- ProfileCounters writerHandlerUs doc references the actual public
  accessor (Database.snapshotWriterProfileCounters → Writer.snapshotWriterCounters);
  the original prose pointed at a name that doesn't exist on Writer.
- Database.snapshotWriterProfileCounters / resetWriterProfileCounters
  now run under the writer mutex (writer.locked) so they serialize with
  execute() / transaction() the same way other public Database write
  methods do. Without it a snapshot could land between the BEGIN and
  COMMIT messages of an in-flight transaction and sample partial state.
- writer_step_wall_audit.dart clamps dispatch_us at zero. handler_us
  and sqlite_us come from independent stopwatches, so a marginal handler
  could in theory round to sqlite_us > handler_us and produce a negative
  fraction in the rendered markdown. Clamping makes the report robust
  against that measurement artifact.
- test/database_test.dart guards the new writer-counter API surface:
  one test asserts the snapshot returns the EXP-135 keys, another that
  reset+snapshot round-trips without throwing.
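
The zero clamp on dispatch_us described in the list above might look like this minimal sketch. The function names are hypothetical and the real harness may structure the computation differently:

```dart
import 'dart:math' as math;

/// dispatch_us with the zero clamp: never negative, even when the two
/// independent stopwatches report sqlite_us slightly above handler_us.
int clampedDispatchUs(int handlerUs, int sqliteUs) =>
    math.max(0, handlerUs - sqliteUs);

/// Dispatch share of handler time, guarded against an empty sample.
double dispatchFraction(int handlerUs, int sqliteUs) => handlerUs <= 0
    ? 0.0
    : clampedDispatchUs(handlerUs, sqliteUs) / handlerUs;

void main() {
  print(clampedDispatchUs(120, 70)); // prints 50
  print(clampedDispatchUs(40, 41)); // prints 0 instead of -1
  print(dispatchFraction(0, 0)); // prints 0.0, no division by zero
}
```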

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>