Exp 138: blob-heavy batch parameter shape audit#114
Conversation
Closes the blob-encoding openQuestion in signals.json#parameter-encoding-and-binding by direct measurement. Extends `batch_param_flatten.dart` with `--cell-mode=blob --blob-size=N` so the focused harness can sweep BLOB-only wide batches in addition to the historical 25/25/25/25 TEXT/INT/REAL/BLOB mix. A 4 / 64 / 256 / 1024 / 4096 byte sweep across 100/1000/10000 rows x 2/8/20 params at 30 iterations per shape (15 for the 4096 B / 10k x 20 case) lands all non-overhead-dominated rows in a 137-298 MB/s per-byte throughput band, identical to exp 125 / 126's wide-text shapes. The signature is SQLite stepping, not removable Dart-side allocation: `_allocatePackedBatchParams` already writes the caller-supplied `Uint8List` directly into the inline parameter buffer via `view.setRange(...)`. There is no analogue of exp 125 / 126's `utf8.encode` temporary buffer to remove on the blob path. The harness mode is the durable contribution. Future blob-shape hypotheses can be evaluated directly against this band; do not retry blob-encoder allocation removal without a workload that surfaces a blob-specific cost (page pinning, mass copy for JOINs, page-cache invalidation) absent from text shapes. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Pull request overview
This is a measurement-only experiment (EXP-138) closing the parameter-encoding-and-binding direction's open question about whether wide blob-heavy batches have removable Dart-side encoding cost. The benchmark harness batch_param_flatten.dart gains --cell-mode=blob and --blob-size=N knobs, and a sweep across 4 B–4096 B blobs shows per-byte throughput converging to the same SQLite-stepping band as exp 125/126's wide-text shapes — confirming _allocatePackedBatchParams already writes the caller's Uint8List directly via view.setRange(...) with nothing left to remove.
Changes:
- Add
--cell-mode=mixed|bloband--blob-size=Noptions tobatch_param_flatten.dart, with a deterministic non-collapsible blob generator; mixed-mode default preserves prior calibration. - Add experiment writeup, aggregate results table, README entry, and
signals.jsonupdates (close the blob open question, add exp 138 to history; move 116 to archive). - Regenerate
docs/experiments/history.jsonwith the new entry.
Reviewed changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| benchmark/experiments/batch_param_flatten.dart | Adds blob cell-mode and configurable blob size; mixed mode unchanged. |
| experiments/138-blob-param-shape-audit.md | New experiment writeup with hypothesis, sweep results, and decision. |
| benchmark/profile/results/exp-138-blob-param-shape-aggregate.md | Aggregate p50/throughput tables for the blob sweep. |
| experiments/signals.json | Closes blob-encoding openQuestion; updates currentRead; adds 138 entry; archives 116. |
| experiments/README.md | Adds the EXP-138 row to the in-review list. |
| docs/experiments/history.json | Regenerated to include EXP-138. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Hypothesis
After exp 125 and exp 126 extracted per-string
utf8.encodeand the temporaryUint8Listit allocates from the wide-batch parameter encoder,signals.json#parameter-encoding-and-bindingstill listed:Static reading of
_allocatePackedBatchParamssays no —Uint8Listvalues are written directly to the inline parameter buffer withview.setRange(...)— but the focused harness shipped only a fixed 4-byte blob mix, which masks any blob-specific cost. This PR closes the question by direct measurement.Approach
benchmark/experiments/batch_param_flatten.dartwith--cell-mode=mixed|bloband--blob-size=N.blobmode switches every column to BLOB and every value to a deterministic per-cellUint8List(content varies by row+col so SQLite cannot collapse identical values).[4, 64, 256, 1024, 4096]bytes across{100, 1000, 10000}rows ×{2, 8, 20}params, 30 iterations per shape (15 for the 4 KB / 10k × 20 case).Mixed-mode behaviour is unchanged, so exp 113 / 125 / 126 baselines stay calibrated.
Results
Full table in
benchmark/profile/results/exp-138-blob-param-shape-aggregate.md.10k × 20 wide shape:
Per-byte throughput on all non-overhead-dominated rows lands in a 137–298 MB/s SQLite-stepping band, identical to exp 125 / 126's wide-text band. Tiny-blob shapes (4 B) sit at 47 MB/s because per-row overhead dominates total wall — the same regime as mixed-mode. There is no allocation-removal signature: the band is flat across 64 B+ blob sizes.
Outcome
In Review — measurement. The blob path has no removable Dart-side encoding cost.
_allocatePackedBatchParamsalready writes the caller'sUint8Listdirectly into the inline parameter buffer; further savings would require changing SQLite-side work (bind_blob, page writes, WAL frames), which is out of scope for this direction.signals.json#parameter-encoding-and-binding.openQuestionsdrops the blob-encoding entry. The harness mode is the durable contribution: future blob-shape hypotheses can be re-evaluated without re-deriving the workload. Reject blob-encoder allocation removal experiments unless a workload surfaces a blob-specific cost (page pinning, mass copy for JOINs, page-cache invalidation) absent from text shapes.Test plan
dart analyze --fatal-infoson edited files (batch_param_flatten.dart,signals.json,README.md, doc + aggregate) — clean.dart test test/database_test.dart— 47/47 pass (no library code touched).dart run benchmark/check_experiment_signals.dart— signal map valid.dart run benchmark/check_generated_data.dart—history.jsonanddevices.jsoncurrent after regenerate.dart run benchmark/experiments/batch_param_flatten.dart --cell-mode=blob --blob-size=Nfor N ∈ {4, 64, 256, 1024, 4096} produced the numbers in the aggregate.🤖 Generated with Claude Code