Skip to content

Fix test flakiness due to time.time() calls#362

Open
arista-hpandya wants to merge 1 commit into
sonic-net:masterfrom
arista-hpandya:fix-reboot_test-flakiness
Open

Fix test flakiness due to time.time() calls#362
arista-hpandya wants to merge 1 commit into
sonic-net:masterfrom
arista-hpandya:fix-reboot_test-flakiness

Conversation

@arista-hpandya
Copy link
Copy Markdown
Contributor

@arista-hpandya arista-hpandya commented Mar 23, 2026

Fixes: sonic-net/sonic-mgmt#23745

In the reboot_test.py, the assert calls check for the exact timestamp to match. However sometimes there is a delay between when the call is made and when the assert is checked. If this delay is greater than 1 second, the timestamps diverge and the assert fails. This is a flakiness introduced by the test framework and hence we should mock it and make it more robust.

@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

In the reboot_test.py, the assert calls check for the exact timestamp to match. However sometimes there is a delay between when the call is made and when the assert is checked. If this delay is greater than 1 second, the timestamps diverge and the assert fails. This is a flakiness introduced by the test framework and hence we should mock it and make it more robust.

Signed-off-by: arista-hpandya <hpandya@arista.com>
@arista-hpandya arista-hpandya force-pushed the fix-reboot_test-flakiness branch from f5f99d4 to a40bbc8 Compare March 23, 2026 23:18
@mssonicbld
Copy link
Copy Markdown

/azp run

@azure-pipelines
Copy link
Copy Markdown

Azure Pipelines successfully started running 1 pipeline(s).

@rlhui
Copy link
Copy Markdown

rlhui commented Apr 8, 2026

@vvolam , please help review? thanks

Copy link
Copy Markdown
Contributor

@vvolam vvolam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall the time.time() fix looks good. One suggestion below for a related pre-existing issue in test_execute_reboot_fail_halt_timeout.

with (
mock.patch("reboot._run_command") as mock_run_command,
mock.patch("time.sleep") as mock_sleep,
mock.patch("time.time", return_value=TEST_TIMESTAMP),
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggestion (pre-existing issue): time.sleep is mocked here but time.monotonic() is not. The production code uses time.monotonic() for the halt timeout loop (while time.monotonic() - start_time < timeout). Since sleep is a no-op but monotonic is real, this test spins for ~60 real wall-clock seconds waiting for the loop to exit.

Consider also mocking time.monotonic to make this test fast:

Suggested change
mock.patch("time.time", return_value=TEST_TIMESTAMP),
mock.patch("time.time", return_value=TEST_TIMESTAMP),
mock.patch("time.monotonic", side_effect=[0, 61]),

This makes the loop iterate once (0 -> start_time=0) then exit immediately on the second call (61 > 60 timeout). It would cut this test from ~60s to near-instant.

Copy link
Copy Markdown
Contributor

@vvolam vvolam left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arista-hpandya changes LGTM, one small suggestion which you could fix along with the changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

Status: No status

Development

Successfully merging this pull request may close these issues.

Bug: Test flakiness in reboot_test.py due to delay in time.time() calls

5 participants