Use monotonic clock for total_runtime to prevent negative view_runtime#97
Open
krzysztofjablonski wants to merge 2 commits into
Open
Use monotonic clock for total_runtime to prevent negative view_runtime#97krzysztofjablonski wants to merge 2 commits into
krzysztofjablonski wants to merge 2 commits into
Conversation
`ActiveSupport::Notifications::Event#duration` (used to compute `db_runtime`) is measured with `Process.clock_gettime(CLOCK_MONOTONIC)`, while `total_runtime` was measured with `Time.now`. Wall-clock skew (e.g. NTP corrections) could make the wall-clock delta smaller than the monotonic database time, producing a negative `view_runtime` in the logs. Switching `start_time`/`stop_time` to the monotonic clock keeps both quantities on the same scale and eliminates the negative timings. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The previous spec stubbed `db_runtime` to 0.0 and then asserted `view >= 0`, which collapsed to checking `total >= 0` twice. Replace with two focused specs: - A wall-clock skew scenario that asserts `total_runtime` stays non-negative when `Time.now` jumps backwards. - A controlled-monotonic-clock scenario that stubs `Process.clock_gettime` and asserts that `view_runtime` is `total_runtime - db_runtime` with a non-zero db value. Both fail on the pre-fix `Time.now` implementation. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Problem
In production we are seeing grape_logging entries where
view_runtimeis negative anddb_runtimeexceedstotal_runtime:{ "data": { "db": 14951.37, "view": -624.44, "duration": 14326.93 } }It happens sporadically, but at API scale it consistently pollutes our metrics.
Diagnosis
db_runtimeis collected fromActiveSupport::Notifications::Event#duration, which since Rails 7+ is measured with the monotonic clock (Process.clock_gettime(Process::CLOCK_MONOTONIC)).total_runtimewas measured withTime.now, i.e. the wall clock. Any wall-clock correction (NTP step, VM suspend/resume, drift between host and container) makes theTime.now - Time.nowdelta smaller than the request's actual elapsed time measured monotonically. If SQL queries happened to fire in between,db_runtime(monotonic) can be larger thantotal_runtime(wall-clock), andview = total - dbends up negative.For the record - this is not the classic shared-instance-variable race discussed in #78. That thread was resolved by @samsonjs:
Grape::Middleware::Base#calldoesdup.call!, so@start_time/@stop_time/@envare per-request. The issue here is mixing two different clocks inside a single log entry.Fix
start_timeandstop_timenow useProcess.clock_gettime(Process::CLOCK_MONOTONIC). Both readings are on the same scale asevent.duration, sototal_runtime >= db_runtimealways holds, regardless of wall-clock corrections.The change is minimal and does not alter the public API -
parameters[:time]still exposes the same keys and units (milliseconds).Regression test
The new spec stubs
Time.nowto return values that move backwards in time between the request's start and end. On the old codetotal_runtimeis negative and the test fails. After the fix the monotonic clock no longer depends onTime.now, sototal_runtimeis positive.Test plan
bundle exec rspec- full suite green (36 examples)bundle exec rubocop- only 2 preexisting offenses inlib/grape_logging/multi_io.rb, unrelated to this changemasterand passes after the fix🤖 Generated with Claude Code