fix: keep generated IDs free of the store path delimiter (fixes the broadcast "flake")#39
Open
danReynolds wants to merge 1 commit into
Open
fix: keep generated IDs free of the store path delimiter (fixes the broadcast "flake")#39danReynolds wants to merge 1 commit into
danReynolds wants to merge 1 commit into
Conversation
There was a problem hiding this comment.
Pull request overview
This PR fixes an intermittent test/core flake caused by cross-test leakage of Loon’s pending zero-delay broadcast timer after Loon.clearAll() runs in tearDown, which could cause the next test’s writes to piggyback on the prior test’s scheduled broadcast.
Changes:
- Update the core test suite
tearDownto await an extra event-loop turn afterLoon.clearAll()so the scheduled broadcast is drained before the next test begins. - Add an explanatory comment documenting why the extra await is required for test isolation.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
generateId drew from an alphabet that included '_', so a random ID could
contain '__' — the delimiter the store uses to split paths into segments.
Because generated IDs are used as path segments (observer value keys,
auto-generated document IDs), a '__' in an ID split it into extra
segments and misplaced the value deeper in the store than reads and
invalidations expect:
- An observable's cached value was stored below the path the broadcast
manager invalidates (clearAll/writes call delete(path, recursive:
false)), so it was never invalidated and the observer emitted a stale
value. This is the intermittent "broadcast test flake": it fired
whenever a random observer ID happened to contain '__' (~7.5% of core
suite runs at --concurrency=1, more under parallel load). Forcing an
extra segment into the observer ID turned it into 17 deterministic
failures.
- An auto-id document whose ID contained '__' was written below its
collection node and was invisible to the collection's query.
Restrict generated IDs to alphanumeric [0-9A-Za-z] via uniform rejection
sampling, so an ID can never contain the delimiter. Adds a regression
test asserting generateId never produces '__' (100k samples).
Verified: core suite 45/45 green across parallel and --concurrency=1
stress runs that previously flaked ~7.5-20%.
626eb9f to
9fc001e
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
The intermittent
test/corefailure ("105 passed, 1 failed, a different test each run") turned out to be a real library bug, not a test-timing flake.generateIddrew from an alphabet that included_, so a random id could contain__— the delimiter the store uses to split paths into segments. Generated ids are used as path segments (observable value keys, auto-generated document ids), so a__inside an id split it into extra segments and misplaced the value deeper in the store than reads/invalidations look:observerValueStoreat its_observerId("${path}__${generateId()}"). Writes/clears invalidate it viadelete(path, recursive: false). If the random id contained__, the value was stored belowpath, so the non-recursive delete never reached it and the observable emitted a stale (often empty) value. This is the "flake" — it fired whenever a random observer id happened to contain__(~7.5% of core runs at--concurrency=1, higher under parallel load), hence a different test each run.generateId(); a__in it wrote the doc below its collection node, socollection.get()/ queries didn't see it. Same root cause.I confirmed the mechanism by forcing an extra
__segment into the observer id — loon_test.dart went from passing to 17 deterministic failures.Fix
Restrict generated ids to alphanumeric
[0-9A-Za-z]via uniform rejection sampling, so an id can never contain the__delimiter. This fixes it at the source for every path-segment use. Ids stay URL-safe and ~125 bits.Verification
generateIdnever contains__across 100k samples (deterministic), plus alphanumeric/length checks.--concurrency=1(previously ~7.5%).Note
My first attempt on this branch was a tearDown broadcast-drain (cross-test-bleed theory); the data disproved it (3/40 still failed), so it's been reverted in favour of this root-cause fix. Once this lands on
main, the recurring CI re-kicks across the other PRs should stop.