[smartswitch] Use dpu_halt_services_timeout from platform.json, fallback to HALT_TIMEOUT=60 #376
…ack to HALT_TIMEOUT=60 Signed-off-by: Ramesh Raghupathy <ram@cisco.com>
/azp run
Azure Pipelines successfully started running 1 pipeline(s).
vvolam left a comment
@rameshraghupathy Changes are fine. Could you fix the coverage?
@vvolam Sure, will do.
@rameshraghupathy please fix PR description and use PR template
1. PR description contradicts the code
The description says "increasing the timeout to 120s for now" but the code reads dpu_halt_services_timeout from platform.json with a 60s fallback — it doesn't hardcode 120. Please update the PR description to accurately explain the platform.json lookup mechanism and the fallback behavior. Also please use the PR template.
@vvolam Fixed it
@rameshraghupathy could you address the comments?
Description of PR
Summary:
On SmartSwitch platforms, HALT reboot may require additional time for DPU services and containers to terminate cleanly before the reboot sequence is considered complete.
The current implementation uses a fixed HALT_TIMEOUT=60 seconds, which may not be sufficient on all platforms and hardware combinations.
This change makes the HALT timeout configurable through the "dpu_halt_services_timeout" key in platform.json, while preserving the existing behavior by falling back to HALT_TIMEOUT = 60 when:
- the key is missing,
- the value is null,
- the value is invalid,
- the file cannot be read, or
- the configured value is non-positive.
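The lookup and fallback rules above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `platform_json_path` parameter, the assumption that the key sits at the top level of platform.json, the rejection of boolean values, and the log message wording are all hypothetical. Only the function name, the key name, and the 60-second default come from the description.

```python
import json
import logging

HALT_TIMEOUT = 60  # default fallback in seconds (value from the PR description)

logger = logging.getLogger(__name__)


def _fallback(reason):
    # Hypothetical helper: log why the default is used, then return it.
    logger.info("Using default HALT_TIMEOUT=%ds: %s", HALT_TIMEOUT, reason)
    return HALT_TIMEOUT


def get_dpu_halt_services_timeout(platform_json_path):
    """Return the configured HALT timeout in seconds, or HALT_TIMEOUT.

    `platform_json_path` is an assumed parameter; the real helper may
    locate platform.json differently.
    """
    try:
        with open(platform_json_path) as f:
            data = json.load(f)
    except (OSError, ValueError) as err:
        # Covers an unreadable file and malformed JSON
        # (json.JSONDecodeError is a subclass of ValueError).
        return _fallback("cannot read platform.json (%s)" % err)

    if not isinstance(data, dict):
        return _fallback("platform.json is not a JSON object")

    # Assumption: the key lives at the top level of platform.json.
    value = data.get("dpu_halt_services_timeout")
    if value is None:
        return _fallback("key is missing or null")

    if isinstance(value, bool):
        # int(True) == 1, so booleans are rejected explicitly.
        return _fallback("value is a boolean")

    try:
        timeout = int(value)
    except (TypeError, ValueError, OverflowError):
        return _fallback("value %r is not an integer" % (value,))

    if timeout <= 0:
        return _fallback("value %d is non-positive" % timeout)

    return timeout
```

Every failure path funnels through one logging helper, so a platform that omits the key behaves exactly as before while still leaving a trace of why the default was chosen.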
Type of change
Back port request
How did you do it
- Added helper function get_dpu_halt_services_timeout()
- Read dpu_halt_services_timeout from platform.json
- Added validation to ensure only positive timeout values are accepted
- Added informational logging when fallback timeout is used
- Updated HALT reboot flow to use platform-configured timeout
- Added unit test coverage for:
  - missing key in platform.json
  - stable mocking of timeout helper in HALT reboot tests
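To illustrate how a HALT flow might consume the configured timeout, here is a hedged sketch of a deadline-based polling loop. The function `wait_for_halt`, the `is_halted` callback, and the `poll_interval`, `clock`, and `sleep` parameters are hypothetical names chosen for this example; the actual reboot flow in this PR may be structured differently.

```python
import time


def wait_for_halt(is_halted, timeout, poll_interval=1.0,
                  clock=time.monotonic, sleep=time.sleep):
    """Poll is_halted() until it returns True or `timeout` seconds elapse.

    Returns True if the halt completed in time, False on timeout.
    `clock` and `sleep` are injectable so tests can avoid real delays.
    """
    deadline = clock() + timeout
    while True:
        if is_halted():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        sleep(min(poll_interval, remaining))
```

A caller would pass the value returned by the timeout helper as `timeout`. Injecting `clock` and `sleep` is one way to keep unit tests of the timeout failure path fast and deterministic.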
Default behavior
If dpu_halt_services_timeout is not configured, the existing default behavior remains unchanged:
HALT_TIMEOUT = 60
How did you verify/test it?
- normal HALT reboot flow
- HALT timeout failure flow
- missing platform.json key handling
- invalid/unreadable timeout fallback handling
- existing reboot UTs