Ensure tasks ID seed is always greater than existing storage files#255
Open
Abishekcs wants to merge 1 commit intorage-rb:mainfrom
Open
Ensure tasks ID seed is always greater than existing storage files#255Abishekcs wants to merge 1 commit intorage-rb:mainfrom
Abishekcs wants to merge 1 commit intorage-rb:mainfrom
Conversation
When a server restarts after a crash, the system clock may be behind the timestamps already present in the storage file due to clock skew caused by NTP synchronization, VM resume etc. This could result in new task IDs having lower timestamps than existing file entries, breaking ordering guarantees and risking duplicate task execution. Fix by scanning all acquired storage files (including recovered ones from crashed workers) on boot to find the highest existing timestamp, then applying Lamport's Implementation Rule IR2(b) to ensure the new seed is strictly greater than anything already in the storage file. This guarantees new task IDs are always ordered after existing ones regardless of clock state at boot time. Ref: "Time, Clocks, and the Ordering of Events in a Distributed System" Leslie Lamport, CACM Vol. 21 No. 7, July 1978, Section 2 (IR2b)
Author
|
While trying to understand the codebase, I came across this |
rsamoilov
reviewed
Apr 1, 2026
Member
rsamoilov
left a comment
There was a problem hiding this comment.
Hi @Abishekcs,
Thank you for the contribution - this looks great!
The implementation looks correct and handles the edge case well.
Could you please add some tests? Some suggestions:
- Startup with storage file containing timestamps in the future
- Startup with multiple recovered storage files with varying timestamps
- Startup with empty storage file
- Startup with only rem entries in a storage file
Member
|
Also, any chance you could check how much time this loop takes for a 2 MB WAL? |
Author
Sure no problem will add the tests. Thanks for the review. |
Author
Yup I can do that |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What this PR does
Ensure task IDs remain monotonic across restarts by seeding
task_id_seedfrom the maximum timestamp found in existing storage files.task_id_seed as:Why
Using only
Time.nowcan generate smaller timestamps than those already present in Storage files, leading to ordering issues. This change guarantees monotonic task ID generation.References
Screenrecording:
I played around with a script to reproduce the lower timestamp issue: https://github.com/Abishekcs/my-rage-app/blob/master/reproduce_bug.rb
Not 100% sure how realistic it is, but it helped me observe the behavior locally.
Using this repo: https://github.com/Abishekcs/my-rage-app (Note: rage gem is linked locally so changes reflect right away.)
Before.mp4
After.mp4