fix(sync): stop a single un-writable record from wedging the sync engine forever#254
Conversation
…ine forever
Symptom: after a voice log, the app would appear to "sync forever" — every
save felt like it never settled and the status dot stayed yellow ("Syncing N
changes…") indefinitely.
Root cause: the push cycle treated every server rejection without
`code: "invalid"` as transient and only bumped its attempts counter — it was
never dropped and never rescheduled on its own. A permanently un-writable row
(a CHECK violation, or a column missing from the deployed Postgres table after
a skipped migration) therefore stayed in the queue forever. Because
`queueDepth > 0` is one of the conditions the indicator reports as "syncing",
the engine looked perpetually busy and the op only ever got re-attempted when
a later manual write happened to flush the same batch.
Fixes:
- Drop a non-"invalid" rejection once it has failed MAX_PUSH_ATTEMPTS (8)
times (its local Dexie row is untouched), so the queue can reach empty and
the indicator can settle. Whole-batch network/HTTP failures are unchanged —
those retry forever so an outage never discards writes.
- Schedule a backoff retry for bumped-but-not-dropped rejections so they keep
trying autonomously instead of stalling until the next manual write.
- Only invalidate React Query caches on a pull that actually applied rows.
Every push chains a pull, so a multi-record voice log previously triggered a
full refetch storm on no-op pulls, making analytics/insights views thrash.
Adds regression tests for the retry-cap drop and the below-cap bump paths.
Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01J2ktncagiX7xm4mXpD2uQ8
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
📝 WalkthroughWalkthroughThe sync engine gains a ChangesSync Engine Improvements
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Coverage Report for apps/web
|
There was a problem hiding this comment.
🧹 Nitpick comments (1)
apps/web/src/__tests__/sync-engine.test.ts (1)
687-723: 💤 Low valueConsider adding coverage for the backoff scheduling behavior.
The test verifies the queue entry is kept and
attemptsis bumped, but doesn't assert thatschedulePushis called with the appropriate backoff delay. This is one of the key behavioral changes noted in the PR objectives ("Schedule backoff retries for bumped-but-not-dropped rejections so they continue attempting autonomously").Adding a spy on
schedulePushand verifying it was called with the expectednextBackoff(1)value would strengthen confidence in this path.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@apps/web/src/__tests__/sync-engine.test.ts` around lines 687 - 723, The test for the push cycle rejection handling verifies that the queue entry is kept and attempts are bumped, but it does not validate that the backoff scheduling occurs as intended. Add a spy on the schedulePush function before the runPushCycle() call, then after runPushCycle() completes, add assertions to verify that schedulePush was called with the expected nextBackoff(1) value. This will ensure the backoff retry scheduling behavior for bumped-but-not-dropped rejections is properly tested.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@apps/web/src/__tests__/sync-engine.test.ts`:
- Around line 687-723: The test for the push cycle rejection handling verifies
that the queue entry is kept and attempts are bumped, but it does not validate
that the backoff scheduling occurs as intended. Add a spy on the schedulePush
function before the runPushCycle() call, then after runPushCycle() completes,
add assertions to verify that schedulePush was called with the expected
nextBackoff(1) value. This will ensure the backoff retry scheduling behavior for
bumped-but-not-dropped rejections is properly tested.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 637db7de-a26c-4119-aed2-e0f6b99bd610
📒 Files selected for processing (2)
apps/web/src/__tests__/sync-engine.test.tsapps/web/src/lib/sync-engine.ts
Symptom: after a voice log, the app would appear to "sync forever" — every
save felt like it never settled and the status dot stayed yellow ("Syncing N
changes…") indefinitely.
Root cause: the push cycle treated every server rejection without
code: "invalid"as transient and only bumped its attempts counter — it wasnever dropped and never rescheduled on its own. A permanently un-writable row
(a CHECK violation, or a column missing from the deployed Postgres table after
a skipped migration) therefore stayed in the queue forever. Because
queueDepth > 0is one of the conditions the indicator reports as "syncing",the engine looked perpetually busy and the op only ever got re-attempted when
a later manual write happened to flush the same batch.
Fixes:
times (its local Dexie row is untouched), so the queue can reach empty and
the indicator can settle. Whole-batch network/HTTP failures are unchanged —
those retry forever so an outage never discards writes.
trying autonomously instead of stalling until the next manual write.
Every push chains a pull, so a multi-record voice log previously triggered a
full refetch storm on no-op pulls, making analytics/insights views thrash.
Adds regression tests for the retry-cap drop and the below-cap bump paths.
Co-Authored-By: Claude Opus 4.8 (1M context) noreply@anthropic.com
Claude-Session: https://claude.ai/code/session_01J2ktncagiX7xm4mXpD2uQ8
Summary by CodeRabbit