[code sync] Merge code from sonic-net/sonic-mgmt:202511 to 202603#1199
Merged
lizhijianrd merged 37 commits intoMay 19, 2026
#### Why I did it

Cherry-pick of #24427 into 202511 (auto cherry-pick had conflicts).

The default GCU (Generic Config Updater) apply-patch timeout of 600 seconds is insufficient for some platforms, causing test failures. This increases the default to 900 seconds.

#### How I did it

Applied the same change from #24427 to the 202511 branch version of `tests/common/gu_utils.py`:

- Added docstring to `get_gcu_timeout()`
- Changed default timeout from 600s to 900s

#### How to verify it

Run any GCU test on a platform not listed in `GCUTIMEOUT_MAP` and verify the timeout is 900s.

#### Conflict resolution

On master, `get_gcu_timeout()` already had a multi-line docstring added by a prior PR. On 202511, the function was a one-liner without a docstring. Resolved by applying both the docstring addition and the timeout value change to the 202511 version.

Signed-off-by: Zhaohui Sun <zhaohuisun@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
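For the GCU timeout change, the lookup-with-default behavior can be sketched as below. This is a minimal illustration, not the code in `tests/common/gu_utils.py`: the map entry, the `DEFAULT_GCU_TIMEOUT` name, and the simplified signature (a platform string rather than a DUT host object) are assumptions.

```python
# Hypothetical map contents; the real GCUTIMEOUT_MAP entries are not shown in this PR text.
GCUTIMEOUT_MAP = {"example_platform": 1200}

# Assumed constant name; the value reflects the change from 600s to 900s.
DEFAULT_GCU_TIMEOUT = 900


def get_gcu_timeout(platform):
    """Return the GCU apply-patch timeout in seconds for the given platform.

    Platforms listed in GCUTIMEOUT_MAP use their own value; all others
    fall back to the default.
    """
    return GCUTIMEOUT_MAP.get(platform, DEFAULT_GCU_TIMEOUT)
```

A platform that is absent from the map gets 900 seconds, which is the case the verification step exercises.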
…e IPv4 addresses (#24519)

### Description of PR

Summary: Fix PTF/DUT subnet mismatch in ARP tests when a VLAN interface has multiple IPv4 addresses.

#### Root Cause

After the community added a second VLAN IP (`192.169.0.1`), two APIs used by the ARP tests diverged in which IPv4 they selected:

| API | Behavior | Selected IP |
|-----|----------|-------------|
| `get_first_vlan_ipv4` | Returns the **first** IPv4 in the VLAN | `192.168.0.1` (DUT) |
| `ip_and_intf_info` (conftest) | Uses the **last** IPv4 network and assigns PTF an IP in that subnet | `192.169.x.x` (PTF) |

This put PTF and DUT in **different subnets**, causing ping failures and MAC learning tests to break.

#### Fix

Rename `get_first_vlan_ipv4` to `get_vlan_last_ipv4` so it returns the **last** IPv4 in the VLAN, matching the behavior of `ip_and_intf_info`. Both APIs now consistently use `192.169.0.1`, keeping PTF and DUT in the same subnet.

Additionally, improve robustness:

- Use `ip_network`/`IPv4Network` type checking for IPv4 detection instead of the `":" in addr` heuristic
- Add `try`/`except` for `ValueError` on malformed addresses

#### Files Changed

- `tests/arp/arp_utils.py`: renamed function, updated import, new selection logic
- `tests/arp/test_arp_update.py`: updated import and call site

### Type of change

- [ ] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
- [ ] Skipped for non-supported platforms
- [ ] Test case improvement

### Back port request

- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [ ] 202511

### Approach

#### What is the motivation for this PR?

#### How did you do it?

#### How did you verify/test it?

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation

Signed-off-by: Shivashankar CR <shivashankar.c.r@gmail.com>
Signed-off-by: mssonicbld <sonicbld@microsoft.com>
Co-authored-by: Shivashankar C R <58802632+cshivashgit@users.noreply.github.com>
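The selection logic described in the ARP fix (take the last IPv4, detect IPv4 by type rather than by looking for `":"`, and tolerate malformed entries) might look roughly like this. The function name `pick_last_ipv4` and the input shape (a list of `address/prefix` strings) are assumptions for illustration; the actual `get_vlan_last_ipv4` in `tests/arp/arp_utils.py` may read its input differently.

```python
from ipaddress import IPv4Network, ip_network


def pick_last_ipv4(addresses):
    """Return the last valid IPv4 address (prefix stripped) or None.

    `addresses` is assumed to be a list of "addr/prefix" strings as they
    might appear under a VLAN interface.
    """
    last = None
    for addr in addresses:
        try:
            # strict=False lets host bits be set, e.g. 192.168.0.1/21
            net = ip_network(addr, strict=False)
        except ValueError:
            # Malformed entry: skip it instead of failing the fixture
            continue
        if isinstance(net, IPv4Network):
            last = addr.split("/")[0]
    return last
```

With two IPv4 addresses on the VLAN, this returns the second one, which is the behavior the PR aligns with `ip_and_intf_info`.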
…g" failure on pytest 9.0.2 (#24515)

### Description of PR

Summary: Add the explicit test path ${SCRIPT_PATH} to the pytest commands that lack it (pretest, posttest, bsl). With pytest 9.0.2, a command without an explicit test path fails to determine rootdir and load conftest.py, which causes an "unrecognized arguments" error. Known issue in pytest repo: pytest-dev/pytest#13913

Fixes #22508

### Type of change

- [ ] Bug fix
- [x] Testbed and Framework(new/improvement)
- [ ] New Test case
- [ ] Skipped for non-supported platforms
- [ ] Test case improvement

### Back port request

- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [x] 202511

### Approach

#### What is the motivation for this PR?

#### How did you do it?

#### How did you verify/test it?

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation

Signed-off-by: markxiao <markxiao@arista.com>
Signed-off-by: mssonicbld <sonicbld@microsoft.com>
Co-authored-by: Mark Xiao <markxiao@arista.com>
…t_until instead of time.sleep(10) (#24523) ### Description of PR Summary: Replace bare `time.sleep(10)` with `wait_until(120, 10, 0, ...)` in `test_pmon_syseepromd_kill_and_start_status` and `test_pmon_pcied_kill_and_start_status`. After a SIGKILL the supervisor restarts the daemon asynchronously, so a fixed 10-second sleep is insufficient on slower platforms and the daemon may still be in `STARTING` state when the status check runs, causing: ``` Failed: syseepromd expected restarted status is RUNNING but is STARTING ``` Also adds the missing `check_expected_daemon_status` helper to `test_pcied.py`, consistent with the pattern already used in `test_syseepromd.py` and `test_psud.py`. ### Type of change - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? `time.sleep(10)` is a fixed delay that is not reliable. On some platforms the daemon takes longer than 10s to restart after SIGKILL, leaving it in `STARTING` state when the assert runs. #### How did you do it? Replace `time.sleep(10)` with `wait_until(120, 10, 0, check_expected_daemon_status, duthost, expected_running_status)` which polls up to 120s until the daemon reaches `RUNNING` state. This is consistent with how `test_psud.py` handles the same scenario. #### How did you verify/test it? Verified on Arista-7060X6-64PE platform — both `test_pmon_syseepromd_kill_and_start_status` and `test_pmon_pcied_kill_and_start_status` pass. #### Any platform specific information? Reproduced on Arista-7060X6-64PE. #### Supported testbed topology if it's a new test case? 
N/A ### Documentation N/A Signed-off-by: Bing Wang <bingwang@microsoft.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: bingwang-ms <66248323+bingwang-ms@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
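To illustrate why polling is more robust than the fixed `time.sleep(10)` in the pmon daemon tests, here is a simplified stand-in for a `wait_until(timeout, interval, delay, condition, *args)` helper. It is a sketch written for this description, not the sonic-mgmt implementation, and the status check in the usage note is hypothetical.

```python
import time


def wait_until(timeout, interval, delay, condition, *args, **kwargs):
    """Poll condition(*args, **kwargs) until it is truthy or `timeout` seconds pass.

    Returns True as soon as the condition holds, False on timeout.
    """
    if delay > 0:
        time.sleep(delay)
    deadline = time.monotonic() + timeout
    while True:
        if condition(*args, **kwargs):
            return True
        # Stop if another full interval would overshoot the deadline
        if time.monotonic() + interval > deadline:
            return False
        time.sleep(interval)
```

A call such as `wait_until(120, 10, 0, check_status, "RUNNING")` returns as soon as the daemon leaves `STARTING`, so fast platforms lose no time and slow platforms get up to 120 seconds.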
…r ERR during config_reload (#24522) ### Description of PR Summary: `test_drop_counters.py::test_ip_pkt_with_expired_ttl` fails because loganalyzer catches a transient `ERR` from syncd during test teardown. **Root cause:** The `configure_copp_drop_for_ttl_error` fixture (`drop_packets.py` line 271) calls `config_reload(duthost, safe_reload=True)` in its teardown after modifying the COPP trap configuration. During reload, orchagent restarts and immediately sends `FLEX_COUNTER_TABLE` SET commands via the ASIC channel carrying the newly allocated port VIDs. At this point syncd may not yet have finished creating all port SAI objects and populating its VID→RID translation map. For each unresolved VID, `Syncd.cpp processFlexCounterEvent` logs: ``` ERR syncd#syncd: :- processFlexCounterEvent: port VID <oid> was not found (probably port was removed/splitted) and will remove from counters now ``` syncd then self-heals by issuing a DEL for the stale counter entry. The race resolves once orchagent finishes re-programming all ports. **Fix:** Add the pattern to the `ignore_expected_loganalyzer_exceptions` autouse fixture so loganalyzer does not fail the test on this benign transient noise. ### Type of change - [x] Bug fix ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? `test_ip_pkt_with_expired_ttl` was failing on Arista-7060X6-64PE-P32O64 due to a loganalyzer false positive. #### How did you do it? Added `FlexCounterPortNotFoundRegex` to the `ignore_expected_loganalyzer_exceptions` autouse fixture in `test_drop_counters.py`. The regex matches exactly the transient syncd message that fires during `config_reload` port re-initialization. #### How did you verify/test it? - Traced the error to `config_reload` in `configure_copp_drop_for_ttl_error` teardown (drop_packets.py line 271). 
- Confirmed in syncd source (`Syncd.cpp processFlexCounterEvent`) that this is a known race: when `fromAsicChannel=true` and VID lookup fails, the code logs ERR and cleans up by issuing DEL for the stale counter. No functional impact. #### Any platform specific information? Observed on Arista-7060X6-64PE-P32O64 (broadcom ASIC). The error is not platform-specific — it can occur on any platform after `config_reload` when port flex counters are enabled. #### Supported testbed topology if it's a new test case? N/A — bug fix only. ### Documentation Signed-off-by: Bing Wang <bingwang@microsoft.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: bingwang-ms <66248323+bingwang-ms@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
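A pattern of the kind added to the loganalyzer ignore list could be sketched as follows. The exact `FlexCounterPortNotFoundRegex` string is not shown in this description, so the regex below is an assumption built from the quoted syncd message; only the message text itself comes from the PR.

```python
import re

# Assumed pattern, derived from the syncd message quoted in this PR description.
FLEX_COUNTER_PORT_NOT_FOUND = re.compile(
    r"ERR syncd#syncd:.*processFlexCounterEvent: port VID .* was not found "
    r"\(probably port was removed/splitted\) and will remove from counters now"
)


def is_expected_noise(log_line):
    """Return True if the log line is the transient flex-counter message."""
    return FLEX_COUNTER_PORT_NOT_FOUND.search(log_line) is not None
```

Keeping the full message tail in the pattern keeps the match narrow, so other `processFlexCounterEvent` errors would still fail the test.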
…454) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR Find all the details here: sonic-net/sonic-mgmt#23129 <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes # (issue) This fixes the race conditions that were observed on Nvidia switches, this should also address sonic-net/sonic-mgmt#20563 The configure_nexthop_groups() function had two problems: Chunk batching bug: ip_batch[1:] was intended to skip only the first IP (2.0.0.1, the base neighbor), but when batching with chunk_size=200, it skipped the first element of EVERY batch, silently losing ~9 neighbors and their routes. Race condition: neighbor and route creation were interleaved in the same for-loop, so a route could reference a nexthop before its neighbor was fully programmed in HW. Fix by separating into two phases and removing the chunk batching mechanism (no longer needed with the two-phase approach): Phase 1: add all neighbors in one shot, then poll CRM ipv4_neighbor counter to confirm they are programmed in HW Phase 2: add all routes in one shot after neighbors are confirmed cherry-pick to 2025 had conflict. so created another PR ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [x] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? 
test_crm_nexthop_group[group_member=False] fails intermittently on msn4600 and msn4700 platforms with: CRM counter did not reach expected value within 60 seconds. Expected: used >= 1891, Actual: used=1807 #### How did you do it? #### How did you verify/test it? #### Any platform specific information? Observed on Mellanox LSN4700 and SN4600C — platforms with large NHG resource pools (~180K+) that cause the test to create ~1800+ nexthop groups, widening the race window. #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: sourabh kumar <kumarsourabh@microsoft.com>
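The chunk batching bug described above can be reproduced in a few lines. This is an illustrative sketch, not the code of `configure_nexthop_groups()`: the helper names are invented, and the count of 1891 addresses is borrowed from the failure message purely as a sample size.

```python
from ipaddress import ip_address


def chunks(items, size):
    """Yield consecutive slices of `items` with at most `size` elements."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def neighbors_buggy(ips, chunk_size=200):
    """Old behavior: ip_batch[1:] drops the first element of EVERY chunk."""
    programmed = []
    for ip_batch in chunks(ips, chunk_size):
        programmed.extend(ip_batch[1:])
    return programmed


def neighbors_fixed(ips):
    """Intended behavior: skip only the base neighbor, keep all the others."""
    return ips[1:]


base = ip_address("2.0.0.1")
ips = [str(base + i) for i in range(1891)]  # sample size, see note above
lost = set(neighbors_fixed(ips)) - set(neighbors_buggy(ips))
```

With 1891 addresses and `chunk_size=200` there are 10 chunks, so the buggy version drops the base neighbor plus 9 others, which is consistent with the "~9 neighbors" noted above.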
…gle asic (#24511) Reapply PR23348 diff. broadcom-dnx check needs to support single asic sonic-net/sonic-mgmt#23851 seems to have reverted the fix from sonic-net/sonic-mgmt#23348 <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes # (issue) ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [ ] 202511 ### Approach #### What is the motivation for this PR? #### How did you do it? #### How did you verify/test it? #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: Saravanan Sellappa <saravanan@nexthop.ai>
[202511] Cherry-pick gnmi/gnxi testsuite & fixtures updates from master

Backport of recent gnmi/gnxi test infrastructure work that has accumulated on master since the 202511 branch cut. All cherry-picks were applied in chronological merge order on top of `202511`.

### Included PRs (in cherry-pick order)

1. #22111: Fix GNXI testsuite topology markers *(already present on 202511 via earlier backport; empty cherry-pick, skipped)*
2. #22412: Remove autouse from gnmi test fixtures
3. #22354: Update gnmi container parameters and add gnxi tests for container upgrade
4. #22553: Consolidate gnxi tests into gnmi directory
5. #22877: Refactor gNMI fixtures to couple server config with clients
6. #23755: Add PtfGnmic client wrapper and gnmic capabilities integration test
7. #23876: Add UDS transport support to gNMI/gNOI test fixtures
8. #24329: [conditional_mark] Skip gnmi/test_gnmic.py

### Conflict resolutions

- **#22412**: `tests/gnmi/test_gnmi_2038.py` was added on master by #15770, which was never backported to 202511. Dropped that file's hunk; the rest of the autouse removal applied cleanly.
- **#22553**: `tests/gnmi/conftest.py`: took master's post-consolidation version (legacy proto-compile / `grpc_channel` fixtures removed). Deleted `tests/gnmi/grpc_utils.py` and `tests/gnmi/test_gnoi_system_grpc.py` consistent with master. The legacy `test_gnoi_system_grpc.py` is superseded by the consolidated `test_gnoi_system.py`, which uses the new TLS-managed client framework.
- **#23755**: Top-level `Makefile` was added on master by #22506 (not backported here). Dropped the Makefile hunk; the actual test code (`tests/common/ptf_gnmic.py`, `tests/gnmi/test_gnmic.py`) and fixture wiring were applied unchanged.
- **#24329**: `tests/common/plugins/conditional_mark/tests_mark_conditions.yaml`: additive merge, kept both new skip blocks.

### PRs deliberately excluded

- #22481 (cSONiC testbed) and #22506 (testbed Makefile): not gnmi/gnxi scope.
- #21529, #22248: already on 202511 via batch backport #23653.

### Verification

- Python syntax check (`py_compile`) on all changed `.py` files: all OK.
- No 202511-only callers remain for the removed `setup_and_cleanup_protos` / `compile_protos` / `grpc_channel` fixtures (verified via `git grep`).
- Pre-commit / Azure pipelines not run locally; relying on the branch CI for full verification.

---------

Signed-off-by: Dawei Huang <daweihuang@microsoft.com>
…d of 8 (#24531) SKU: x86_64-arista_7280r4_32qf_32df This is the sonic-mgmt change that goes along with the sonic-buildimage change: sonic-net/sonic-buildimage#27149 ### Description of PR For SKU `x86_64-arista_7280r4_32qf_32df` increment the QSFP-DD port names based on their number of system lanes instead of number of line lanes. ### Type of change - [ ] Bug fix - [x] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? MSFT requested we increment the port names based on the number of system-side lanes instead of number of line-side lanes. #### How did you do it? Changed the `get_port_alias_to_name_map` function in port_utils.py for this SKU to increment by 4 for all ports on `x86_64-arista_7280r4_32qf_32df` SKU #### How did you verify/test it? Ran sonic-mgmt against this change and saw no related regressions. #### Any platform specific information? No #### Supported testbed topology if it's a new test case? N/A ### Documentation N/A Signed-off-by: Nathan Wolfe <nwolfe@arista.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: arista-nwolfe <94405414+arista-nwolfe@users.noreply.github.com>
…xcvr_api is None (#24528) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: The on-DUT script crashes with AttributeError when get_sfp() or get_xcvr_api() returns None for empty SFP slots or unsupported transceivers (e.g., passive DAC cables). This causes the port_list_with_flat_memory fixture to fail at setup, blocking all dependent tests. Add None guards to skip ports without a valid xcvr API instead of crashing. Ports that cannot be queried are simply not added to the flat_memory list, allowing the actual tests to handle them gracefully. Fixes test_sfp, test_check_sfp_eeprom, and test_xcvr_info_in_db failures on Mellanox SN4700. ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? The get_port_indexes_with_flat_memory() helper (added in PR: sonic-net/sonic-mgmt#22561) assumes every SFP slot has a valid transceiver API. On empty/unpopulated ports, get_xcvr_api() returns None, crashing the fixture and blocking 3 tests (test_sfp, test_check_sfp_eeprom, test_xcvr_info_in_db). #### How did you do it? Added None checks for both get_sfp() and get_xcvr_api() return values before calling is_flat_memory(), gracefully skipping empty/unpopulated SFP slots instead of crashing. 
#### How did you verify/test it? Manual Testing #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: sourabh kumar <kumarsourabh@microsoft.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: Sourabh Kumar <kumarsourabh@microsoft.com>
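The None guards for the flat-memory helper can be sketched as below. The fake classes and the `chassis`/`port_indexes` parameters are hypothetical scaffolding so the example runs off-device; only the `get_sfp()`, `get_xcvr_api()`, and `is_flat_memory()` call names come from this PR description.

```python
class FakeXcvrApi:
    """Hypothetical stand-in for a transceiver API object."""

    def __init__(self, flat):
        self._flat = flat

    def is_flat_memory(self):
        return self._flat


class FakeSfp:
    """Hypothetical stand-in for an SFP object; the API may be None."""

    def __init__(self, api):
        self._api = api

    def get_xcvr_api(self):
        return self._api


class FakeChassis:
    """Hypothetical chassis whose get_sfp() returns None for empty slots."""

    def __init__(self, sfps):
        self._sfps = sfps

    def get_sfp(self, index):
        return self._sfps.get(index)


def get_port_indexes_with_flat_memory(chassis, port_indexes):
    """Collect indexes of ports whose transceiver reports flat memory."""
    flat = []
    for index in port_indexes:
        sfp = chassis.get_sfp(index)
        if sfp is None:
            continue  # empty slot: nothing to query
        api = sfp.get_xcvr_api()
        if api is None:
            continue  # unsupported transceiver: skip instead of crashing
        if api.is_flat_memory():
            flat.append(index)
    return flat
```

Ports that cannot be queried are left out of the result, so the fixture completes and the dependent tests decide how to treat them.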
…P (#24419) (#24536) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Resolve 202511 conflicts in sonic-net/sonic-mgmt#24419 Fixes # (issue) ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? The test_ptf_arp_learns_mac and test_dut_ping_learns_mac tests fail on physical testbeds because: 1. dut_interface_info used get_vlan_last_ipv4() which could return a VLAN IP not in the same subnet as the PTF interface IP 2. ptf_with_ip_config hardcoded /21 prefix which may not match the actual VLAN subnet configuration #### How did you do it? - Adding get_vlan_ipv4_for_subnet() to find the VLAN interface whose subnet contains the PTF IP - Updating dut_interface_info to select the matching VLAN IP and expose the correct prefix_len - Updating ptf_with_ip_config to use the dynamic prefix_len instead of hardcoded /21 #### How did you verify/test it? 
``` --------------------------------------------- generated xml file: /data/sonic-mgmt-int/tests/logs/arp/test_arp_update.xml --------------------------------------------- ----------------------------------------------------------------------- live log sessionfinish ------------------------------------------------------------------------ INFO root:__init__.py:67 Can not get Allure report URL. Please check logs ============================================================= 5 passed, 215 warnings in 377.63s (0:06:17) ============================================================= DEBUG:tests.conftest:[log_custom_msg] item: <Function test_dut_ping_learns_mac[str2-msn4600c-acs-03-None]> ``` #### Any platform specific information? msn4600c #### Supported testbed topology if it's a new test case? t0-64 ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: Janet Cui <janet970527@gmail.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
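The subnet-matching step can be sketched with the standard `ipaddress` module. The signature and return shape below are assumptions for illustration; the real `get_vlan_ipv4_for_subnet()` may take different arguments.

```python
from ipaddress import IPv4Interface, ip_address, ip_interface


def get_vlan_ipv4_for_subnet(vlan_addrs, ptf_ip):
    """Return (vlan_ip, prefix_len) for the VLAN IPv4 whose subnet contains ptf_ip.

    `vlan_addrs` is assumed to be a list of "addr/prefix" strings.
    Returns None when no VLAN subnet contains the PTF address.
    """
    target = ip_address(ptf_ip)
    for addr in vlan_addrs:
        try:
            iface = ip_interface(addr)
        except ValueError:
            continue  # malformed entry
        if isinstance(iface, IPv4Interface) and target in iface.network:
            return str(iface.ip), iface.network.prefixlen
    return None
```

Returning the prefix length alongside the address is what lets the PTF side drop the hardcoded `/21` and configure the same mask as the DUT.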
…l calls (#24541)

### Description of PR

[PR22094](sonic-net/sonic-mgmt#22094) recently introduced a `-n` namespace argument in various FRR CLI shell calls. The test passes this namespace argument as `-n asic0` in various `vtysh` commands, but the FRR CLI shell expects a namespace index, e.g. `vtysh -n 0 ...`. The fix is to pass asic_index instead of the dut_namespace string at these calls.

Summary: Fixes # (issue)

### Type of change

- [x] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
- [ ] Skipped for non-supported platforms
- [x] Test case improvement

### Back port request

- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [x] 202511

### Approach

#### What is the motivation for this PR?

Fix the failing `bgp/test_ipv6_nlri_over_ipv4` on multi-asic systems: the `vtysh` commands issued by the test must take a namespace index, e.g. `vtysh -n 0 ...`, instead of a namespace string.

#### How did you do it?

Pass asic_index instead of the dut_namespace string at these calls. Create a separate setup variable and pass it when executing FRR CLI shell calls.

#### How did you verify/test it?

Test `bgp/test_ipv6_nlri_over_ipv4` passes on both single and multi-asic systems with these changes.

#### Any platform specific information?

#### Supported testbed topology if it's a new test case?

### Documentation

Signed-off-by: setu <setu@arista.com>
Signed-off-by: mssonicbld <sonicbld@microsoft.com>
Co-authored-by: Setu Patel <171176331+arista-setu@users.noreply.github.com>
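The difference between the namespace string and the index can be shown with a small command builder. The helper name `vtysh_cmd` is hypothetical and the test's real call sites are not reproduced here; the only grounded detail is that `vtysh -n` takes an index such as `0` rather than `asic0`.

```python
def vtysh_cmd(command, asic_index=None):
    """Build a vtysh shell command, adding `-n <index>` only on multi-asic.

    `asic_index` is the numeric ASIC index (0, 1, ...), not the
    namespace string ("asic0"). None means single-asic: no -n flag.
    """
    ns_opt = "" if asic_index is None else " -n {}".format(asic_index)
    return 'vtysh{} -c "{}"'.format(ns_opt, command)
```

For example, `vtysh_cmd("show bgp summary", 0)` yields `vtysh -n 0 -c "show bgp summary"`, and the single-asic form omits `-n` entirely.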
…24540) Signed-off-by: Anand Mehra (anamehra) <anamehra@cisco.com> <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes # sonic-net/sonic-buildimage#24802 ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [x] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [x] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? The test case is not a valid scenario for T2. The dynamic threshold for pg lossless is not modified via GCU. #### How did you do it? Skip the test for T2 #### How did you verify/test it? Run test on T2 system #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: Anand Mehra (anamehra) <anamehra@cisco.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: anamehra <54692434+anamehra@users.noreply.github.com>
… for multi-asic (#24542) ### Description of PR In sonic-net/sonic-mgmt#21942 we fixed `test_everflow_fwd_recircle_port_queue_check` to return the total number of packets sent by `send_and_check_mirror_packets` and use that value as the expected number of packets seen on the `Ethernet-Rec` interface. The problem is on multi-asic skus `send_and_check_mirror_packets` might use 2 src_ports, one on ASIC0 and the other on ASIC1. This results in half the packets utilizing `Ethernet-Rec0` and the other half using `Ethernet-Rec1`. This change will update the return value of `send_and_check_mirror_packets` to be a dictionary instead of an integer. The dictionary will be key'd on `(dut, asic)` and the value will be the number of packets sent. With this return value `test_everflow_fwd_recircle_port_queue_check` can just look at how many packets were sent on the `(dut, asic)` belonging to the `Ethernet-Rec` port we're checking. ### Type of change - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? Fix `test_everflow_fwd_recircle_port_queue_check` for multi-asic SKUs #### How did you do it? Track packets per `(dut, asic)` to only assert on the packets sent on our `Ethernet-Rec`'s `(dut, asic)` #### How did you verify/test it? Ran `test_everflow_fwd_recircle_port_queue_check` on the following SKUs: single-asic fixedsystem multi-asic fixedsystem single-asic chassis multi-asic chassis #### Any platform specific information? N/A #### Supported testbed topology if it's a new test case? N/A ### Documentation N/A Signed-off-by: Nathan Wolfe <nwolfe@arista.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: arista-nwolfe <94405414+arista-nwolfe@users.noreply.github.com>
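The per-`(dut, asic)` accounting can be sketched as follows. The function name, the shape of `src_ports`, and the port names in the usage note are assumptions for illustration; the actual `send_and_check_mirror_packets` also sends and verifies traffic, which is omitted here.

```python
from collections import defaultdict


def count_sent_packets(src_ports, packets_per_port):
    """Return {(dut, asic): packets_sent} for the given source ports.

    `src_ports` is assumed to be an iterable of (dut, asic, port_name)
    tuples; each port sends `packets_per_port` packets.
    """
    sent = defaultdict(int)
    for dut, asic, _port in src_ports:
        sent[(dut, asic)] += packets_per_port
    return dict(sent)
```

With one source port on each of two ASICs, e.g. `[("dut1", 0, "Ethernet0"), ("dut1", 1, "Ethernet128")]`, a check on the recirculation port of ASIC 0 compares against `sent[("dut1", 0)]` rather than the combined total.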
…s (#24065) (#24450)

Cherry-pick of: sonic-net/sonic-mgmt#24065

### Description of PR

Summary: The `cpu_shaper` test uses a BCM cint script (`get_shaper.c`) that calls `bcm_cosq_port_bandwidth_get` to read CPU queue shaper PPS values. This is not supported on TH5+ devices, causing the test to fail on platforms like Arista 7060X6.

This change updates the cint script to try the modern gport-based API (`bcm_cosq_gport_bandwidth_get`) first, which works on TH5+ and previous platforms, and falls back to the legacy port-based API for platforms where the gport API may not be available. The success output format is preserved (`cos=N pps_max=M`) so no changes are needed in `test_cpu_shaper.py`.

### Type of change

- [x] Bug fix

### Back port request

- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [x] 202511

### Approach

#### What is the motivation for this PR?

The test `cpu_shaper/test_cpu_shaper.py` fails on TH5+ devices because `bcm_cosq_port_bandwidth_get` is not supported there. The BCM SDK returns rv=-16 (`BCM_E_UNAVAIL`), the cint script prints an error line, the Python regex finds no matches, and the assertion fails with `actual_pps = {}`.

#### How did you do it?

Updated `tests/cpu_shaper/scripts/get_shaper.c` to:

1. Try `bcm_cosq_gport_bandwidth_get` (modern gport-based API) first.
2. If it fails, fall back to `bcm_cosq_port_bandwidth_get` (legacy API) for older platforms.
3. If both fail, print a single error line with both return codes for debugging.
4. Reset output parameters (`flags`) between API calls to prevent stale values from affecting the fallback call.

No changes to `test_cpu_shaper.py`: the success output format (`cos=%d pps_max=%d`) is identical for both API paths and matches the existing regex `r'cos=(\d+) pps_max=(\d+)'`.

#### How did you verify/test it?

- **DNX platform (7060X6 / J2C+):** gport API succeeds on first attempt, test passes.
- **XGS platform:** gport API succeeds (modern XGS), or falls back to legacy API (older XGS). Existing behavior preserved. - Verified that error output format does not false-match the Python regex. #### Any platform specific information? - TH5+ devices, e.g., Arista 7060X6: require the gport-based API. The legacy `bcm_cosq_port_bandwidth_get` returns `BCM_E_UNAVAIL`. #### Supported testbed topology if it's a new test case? N/A — existing test, topology unchanged (t0, t1). ### Documentation N/A — no new features or test cases. --------- <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes # (issue) ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [ ] 202511 ### Approach #### What is the motivation for this PR? #### How did you do it? #### How did you verify/test it? #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: Priyansh Tratiya <ptratiya@microsoft.com>
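On the Python side, the parsing that both API paths must satisfy can be sketched like this. The regex is the one quoted in this description; the `parse_shaper_output` helper and the sample error line are invented for illustration, since the exact error text printed by the cint script is not shown here.

```python
import re

# Regex quoted in the PR description; matches the success output of both API paths.
SHAPER_RE = re.compile(r'cos=(\d+) pps_max=(\d+)')


def parse_shaper_output(output):
    """Return {cos: pps_max} parsed from the cint script output."""
    return {int(cos): int(pps) for cos, pps in SHAPER_RE.findall(output)}


# Hypothetical output: two successful queues and one invented error line.
sample = "cos=0 pps_max=600\ncos=7 pps_max=100\ncos=3 gport_rv=-16 port_rv=-16\n"
actual_pps = parse_shaper_output(sample)
```

An error line only stays out of `actual_pps` if it never contains the `cos=N pps_max=M` sequence, which is why the PR checks that the error format does not false-match.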
…4576) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Add configuration profile symlinks for core/leaf switches related to t2 min topology. ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [x] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? Config templates that will be used to test t2-single-node-min topology. #### How did you do it? #### How did you verify/test it? Validated by deploying t2-single-node-min topology works with these files added. #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: Vinay Kaza <vinay@nexthop.ai> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: vinay-nexthop <vinay@nexthop.ai>
…st_tx_disable_channel (#24579)
### Description of PR
`platform_tests.api.test_sfp.TestSfpApi::test_tx_disable_channel` is
supposed to skip ports whose transceiver doesn't support per-channel TX
disable (e.g. DAC cables). On platforms with SFP+ DAC uplinks (e.g.
Nokia 7215) the skip stopped taking effect and the test fails on every
SFP+ DAC port.
**Root cause:** PR #23972 added `"SFP"` to the plain-string compliance
check list inside `is_xcvr_optical()`:
```diff
- if xcvr_info_dict["type_abbrv_name"] in ["QSFP-DD", "OSFP-8X", "QSFP+C", "BP"]:
+ if xcvr_info_dict["type_abbrv_name"] in ["QSFP-DD", "OSFP-8X", "QSFP+C", "BP", "SFP"]:
```
That change was needed for Cisco-console SFP whose
`specification_compliance` is a plain string, but it broke the existing
SFP DAC detection path: a standard SFP/SFP+ DAC reports
`specification_compliance` as a **dict-formatted string** (e.g.
`{'SFP+CableTechnology': 'Passive Cable', ...}`). Because `"SFP"` is now
matched in the first branch, the dict-string never matches the two
literal copper strings, the function returns `True`, and
`test_tx_disable_channel` runs on the DAC port and fails.
**Fix:** keep the new plain-string fast-path (still handles
QSFP-DD/OSFP-8X/QSFP+C/BP and Cisco-console SFP), but for `SFP` fall
through to the existing `ast.literal_eval()`-based dict parsing when the
spec is not one of the known plain-string copper values.
`ast.literal_eval` is wrapped in `try/except (ValueError, SyntaxError)`
so a non-dict, non-copper spec is treated as optical instead of raising.
Summary:
Fixes the regression introduced by #23972 for SFP DAC transceivers.
### Type of change
- [x] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
- [ ] Skipped for non-supported platforms
- [ ] Test case improvement
### Back port request
- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [x] 202511
### Approach
#### What is the motivation for this PR?
`test_tx_disable_channel` started failing on platforms with SFP+ DAC
uplinks (e.g. Nokia 7215) after PR #23972. The DAC skip in
`is_xcvr_optical()` no longer triggers for `type_abbrv_name == "SFP"`
whose `specification_compliance` is a dict-formatted string.
#### How did you do it?
In `tests/platform_tests/api/test_sfp.py::is_xcvr_optical`:
- Keep the plain-string check for `QSFP-DD`, `OSFP-8X`, `QSFP+C`, `BP`,
and `SFP` (returns False on `"Passive Copper Cable"` /
`"passive_copper_media_interface"`).
- For `SFP`, if the spec didn't match those plain strings, fall through
to `ast.literal_eval(spec)` and the existing `SFP+CableTechnology ==
"Passive Cable"` check.
- For all other types, keep the existing dict-based `10/40G Ethernet
Compliance Code` / `Extended Specification Compliance` "CR" check.
- Wrap `ast.literal_eval` in `try/except (ValueError, SyntaxError)` so a
non-dict, non-copper plain-string spec is treated as optical (preserves
the Cisco-console SFP behavior added by #23972).
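The decision flow described in the list above can be sketched roughly as follows. This is an illustration written from the PR description, not the code in `test_sfp.py`: the function name `is_xcvr_optical_sketch`, the constant names, and the handling of a parsed value that is not a dict are assumptions.

```python
import ast

# Types whose specification_compliance may be a plain string.
PLAIN_STRING_TYPES = {"QSFP-DD", "OSFP-8X", "QSFP+C", "BP", "SFP"}
# Plain-string values that identify a copper (DAC) cable.
PLAIN_COPPER_SPECS = {"Passive Copper Cable", "passive_copper_media_interface"}


def is_xcvr_optical_sketch(xcvr_info_dict):
    """Return False for copper/DAC transceivers, True otherwise."""
    xcvr_type = xcvr_info_dict["type_abbrv_name"]
    spec = xcvr_info_dict["specification_compliance"]

    if xcvr_type in PLAIN_STRING_TYPES:
        if spec in PLAIN_COPPER_SPECS:
            return False
        if xcvr_type != "SFP":
            return True
        # SFP falls through: its spec may be a dict-formatted string.

    try:
        spec_dict = ast.literal_eval(spec)
    except (ValueError, SyntaxError):
        # Non-dict, non-copper plain string: treat as optical.
        return True
    if not isinstance(spec_dict, dict):  # assumption: also treat as optical
        return True

    if xcvr_type == "SFP":
        return spec_dict.get("SFP+CableTechnology") != "Passive Cable"

    # Other types: "CR" in either compliance field indicates copper.
    compliance = spec_dict.get("10/40G Ethernet Compliance Code", "")
    extended = spec_dict.get("Extended Specification Compliance", "")
    return "CR" not in compliance and "CR" not in extended
```

With this shape, an SFP+ DAC whose spec is `"{'SFP+CableTechnology': 'Passive Cable'}"` is reported as non-optical, while an SFP with an unrecognized plain-string spec is still treated as optical.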
#### How did you verify/test it?
Ran the testcase on a Nokia 7215 testbed (m0 topology, 4× SFP+ DAC
uplinks) with the fix:
```
platform_tests/api/test_sfp.py::TestSfpApi::test_tx_disable_channel
WARNING tests.platform_tests.api.test_sfp:test_sfp.py:862 test_tx_disable_channel: Skipping transceiver 49 (not applicable for this transceiver type)
WARNING tests.platform_tests.api.test_sfp:test_sfp.py:862 test_tx_disable_channel: Skipping transceiver 50 (not applicable for this transceiver type)
WARNING tests.platform_tests.api.test_sfp:test_sfp.py:862 test_tx_disable_channel: Skipping transceiver 51 (not applicable for this transceiver type)
WARNING tests.platform_tests.api.test_sfp:test_sfp.py:862 test_tx_disable_channel: Skipping transceiver 52 (not applicable for this transceiver type)
PASSED
================= 1 passed, 154 warnings in 287.91s (0:04:47) ==================
```
All four SFP+ uplinks (49–52) are now correctly identified as DAC and
skipped.
#### Any platform specific information?
First reported on Nokia 7215 (4× SFP+ DAC uplinks), but the regression
affects any platform where an SFP/SFP+ DAC's `specification_compliance`
is reported as a dict-formatted string (which is the standard / default
representation for SFP DAC).
#### Supported testbed topology if it's a new test case?
N/A — fix to an existing test case.
### Documentation
N/A
Signed-off-by: Zhijian Li <zhijianli@microsoft.com>
Signed-off-by: mssonicbld <sonicbld@microsoft.com>
Co-authored-by: Zhijian Li <zhijianli@microsoft.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…p test p… (#24449)
…eer ranges before teardown (#24040)
What: Added explicit deletion of test-specific BGP_PEER_RANGE entries (BGPSLBPassive, BGPSLBPassive2, BGPSLBPassiveV6, BGPSLBPassiveV62) in dual_asn_teardown() before rollback, with multi-ASIC support.
Why: test_bgp_dual_asn_v4 splits the VLAN subnet (192.168.0.0/21) into two /22 halves. At teardown, rollback_or_reload() tries to restore BGPVac (/21) but FRR rejects it with "Listen range overlaps" because the test's /22 ranges are still active, a race condition with async bgpcfgd processing.
How: Delete all test peer ranges with module_ignore_errors=True before the setup_env fixture runs rollback, preventing the overlap conflict.
Testing: Ran test_bgp_dual_asn_v4 on Broadcom 7060x6 (202511). No bgpcfgd overlap errors. BGPVac correctly restored. All CI checks passed.
Signed-off-by: Priyansh Tratiya <ptratiya@microsoft.com>
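The overlap behind the "Listen range overlaps" rejection can be illustrated with Python's `ipaddress` module. This sketch only shows that the /21 range overlaps each /22 half; it does not reproduce FRR's own check, and the helper name `overlapping` is made up for illustration.

```python
import ipaddress

# VLAN subnet restored as BGPVac at teardown.
vlan_range = ipaddress.ip_network("192.168.0.0/21")

# The two /22 halves used as test listen ranges.
test_ranges = list(vlan_range.subnets(new_prefix=22))


def overlapping(candidate, active):
    """Return the active ranges that overlap the candidate range."""
    return [net for net in active if candidate.overlaps(net)]


# While the /22 ranges are still active, the /21 conflicts with both.
print(overlapping(vlan_range, test_ranges))
# Once they are deleted, nothing conflicts.
print(overlapping(vlan_range, []))
```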
…ment configuration (#22006) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes #18077 Modify the gratuitous arp service such that arp or neighbor-discovery packets are sent based on the L3 configuration of the testbed topology. For testbed topologies which do not configure IPv4 addressing, such as isolated-v6 testbeds, we skip the creation of arp packets and do not send them. In this case, only IPv6 neighbor-discovery packets are created and sent. The inverse is true as well when a testbed topology only specifies the use of IPv4 connections. ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [x] 202505 ### Approach #### What is the motivation for this PR? This change was made as a part of the isolated-v6 testbed qualification effort. Running acl tests on isolated-v6 testbeds was previously unsupported, and skipped via conditional mark, as the tests expected to receive IPv4 arp packets when no IPv4 connectivity was established. #### How did you verify/test it? The full suite of acl tests was run against a testbed where IPv4 and IPv6 connectivity was specified in the topology config, and no regressions were seen. 
The same suite of tests was run against an isolated-v6 testbed, and new test passes were observed. Some new failures were also observed for tests which were previously skipped; however, @r12f asked that we submit this change and follow up on the remaining failing tests individually in subsequent changes, as this change improves overall test coverage. #### Any platform specific information? None. Signed-off-by: Will Rideout <wrideout@arista.com> Co-authored-by: wrideout-arista <wrideout@arista.com>
### Description of PR
This is a cherry-pick of sonic-net/sonic-mgmt#22801 and sonic-net/sonic-mgmt#21863. #22801 is a needed fix for aristanetworks/sonic-qual.msft#1266, but it relies on another missing PR (#21863). See the respective PRs for details.
Summary:
Fixes # aristanetworks/sonic-qual.msft#1266
### Type of change
- [x] Bug fix
- [ ] Testbed and Framework(new/improvement)
- [ ] New Test case
- [ ] Skipped for non-supported platforms
- [ ] Test case improvement
### Back port request
- [ ] 202205
- [ ] 202305
- [ ] 202311
- [ ] 202405
- [ ] 202411
- [ ] 202505
- [ ] 202511
Signed-off-by: rajkumar1 <rajkumar1@arista.com>
Signed-off-by: Rustiqly <rustiqly@users.noreply.github.com>
Co-authored-by: gshemesh2 <gshemesh@nvidia.com>
Co-authored-by: Rustiqly (agent of lihuay) <245760149+rustiqly@users.noreply.github.com>
Co-authored-by: Rustiqly <rustiqly@users.noreply.github.com>
…4345) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes # (issue) ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [x] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? We were seeing failures in the `base_packet_trimming.py::test_trimming_counters` testcase related to trim counters being zero. #### How did you do it? We could not reproduce the issue while using a debugger, leading us to believe it was timing related. After adding a sleep before reading the counters, we did not see the issue. We opted to implement logic that uses `wait_for` instead of sleeping. #### How did you verify/test it? Verified the failures were no longer happening with this change. #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: Ryan Garofano <rgarofano@arista.com>
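The idea of polling for the counters instead of sleeping for a fixed time can be sketched as below. This is a generic, self-contained illustration: `wait_for_sketch` and `FakeTrimCounter` are hypothetical names, and the actual `wait_for` helper and counter-reading code used by the test are not shown in this PR description.

```python
import time


def wait_for_sketch(condition, timeout=10.0, interval=0.5):
    """Poll condition() until it is truthy or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


class FakeTrimCounter:
    """Stand-in for a counter that only becomes non-zero after a delay."""

    def __init__(self, ready_after):
        self.reads = 0
        self.ready_after = ready_after

    def read(self):
        self.reads += 1
        return 0 if self.reads < self.ready_after else 42


counter = FakeTrimCounter(ready_after=3)
ok = wait_for_sketch(lambda: counter.read() > 0, timeout=2.0, interval=0.01)
print(ok, counter.reads)
```

Compared with a fixed sleep, polling returns as soon as the counters are populated and still fails with a clear timeout if they never are.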
…serdes/FEC errors (#24611) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: The Mellanox SDK recently changed its log format to include 'client_pid=N, ' before the file path, causing existing ignore patterns to no longer match. Additionally, FEC alignment lock polling generates errors on pre-SPC4 platforms (SN2700, SN4600C, SN4700) and orchagent emits errors for ports without serdes objects. These unignored ERR-level syslogs cause loganalyzer teardown failures across all Mellanox platforms affecting any test using loganalyzer. ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [x] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### How did you do it? New patterns added: - SAI_UTILS FEC_ALIGNMENT_LOCK get_dispatch_attribs_handler (new SDK format) - SAI_UTILS sai_get_attributes failures (new SDK format with client_pid) - SAI_PORT mlnx_port_state_get FEC alignment lock on pre-SPC4 platforms - orchagent clearPortPhySerdesAttrCounterMap for ports without serdes objects #### How did you verify/test it? Manual testing #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? 
--> Signed-off-by: sourabh kumar <kumarsourabh@microsoft.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: Sourabh Kumar <kumarsourabh@microsoft.com>
…ress objects (#24624) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: In the sFlow ptftests, parse the Agent IDs into ipaddress objects for more accurate comparison between expected and actual values. ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? In the sFlow ptftests, the Agent IDs, which are IP addresses, are compared as strings. This causes problems in `--mgmtIpv6Only` setups, since this comparison doesn't recognize normalized and fully-expanded IPv6 addresses as the same. #### How did you do it? This change transforms the strings into `IPv6Address`/`IPv4Address` object from the `ipaddress` library, so that the comparison will be accurate. #### How did you verify/test it? I confirmed in a Pikez M0 cluster with `--mgmtIpv6Only` that this change fixes the test. #### Any platform specific information? N/A #### Supported testbed topology if it's a new test case? N/A ### Documentation N/A <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: Vitor Mendonca <vitor@arista.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: vitor-arista <vitor@arista.com>
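A minimal sketch of why parsing into `ipaddress` objects helps: two spellings of the same IPv6 address differ as strings but compare equal once parsed. The helper name `same_agent_id` and the sample addresses are illustrative only and are not taken from the ptftests.

```python
import ipaddress


def same_agent_id(expected, actual):
    """Compare two agent IDs as IP address objects, not as strings."""
    return ipaddress.ip_address(expected) == ipaddress.ip_address(actual)


compressed = "2001:db8::1"
expanded = "2001:0db8:0000:0000:0000:0000:0000:0001"

print(compressed == expanded)               # string comparison
print(same_agent_id(compressed, expanded))  # address comparison
```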
…capture_and_check_packet_on_dut (#24621)
**Why**
PR #22876 cut the default `tcpdump_buffer_size` in `capture_and_check_packet_on_dut` from 102400 KiB (100 MiB) to 4096 KiB (4 MiB) to address the 1 GiB explicit override in `test_dhcp_counter_stress` causing a ~2 GiB memory spike. The 4 MiB default, however, is too small for normal stress-test workloads with bursty packet rates on lower-perf platforms (720dt, 7215).
**Note: PR #22876 was not actually validated.** The `How did you verify/test it?` section of #22876 only states:
> - Code passes flake8 with max-line-length=120
> - Fix matches the exact unit interpretation from tcpdump documentation

No test execution. The 4 MiB value was a unit-correction choice based on tcpdump docs, not an empirically-tested minimum. As shown by the data below, 4 MiB drops 7.2% of relayed DHCP packets on real hardware, i.e. the new default was never validated against the stress-test workload that motivated the change.
Empirical measurements on `testbed-bjw2-can-720dt-6` (Arista-720DT-G48S4, SONiC.20251110.26) running `test_dhcp_counter_stress[discover]` (25 pps × 48 servers × 120 s):

| Buffer | tcpdump drop rate vs dhcpmon counter | test result |
|---:|---:|---|
| 1 MiB | 13.0% | FAIL |
| **4 MiB (current default, set by #22876)** | **7.2%** | **FAIL** |
| 16 MiB | 1.47% | FAIL |
| 64 MiB | <0.01% | PASS |

**How**
Bump default from `4096` (4 MiB) to `131072` (128 MiB). Gives comfortable headroom for stress tests on slow platforms while remaining 8× smaller than the historical 1 GiB explicit override that caused memory spikes.
**Back port request**
- [x] 202511

Refs: PR #22876 (commit 58a6f0b), PR #20580, PR #24592 (companion 202511-only cleanup).
Signed-off-by: Xichen96 <lukelin0907@gmail.com>
Signed-off-by: mssonicbld <sonicbld@microsoft.com>
Co-authored-by: Xichen96 <lukelin0907@gmail.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
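The unit arithmetic behind these buffer sizes can be checked with a short sketch. The values are the KiB figures quoted in this commit message; the variable and function names are illustrative only.

```python
KIB_PER_MIB = 1024
KIB_PER_GIB = 1024 * 1024


def kib_to_mib(kib):
    """Convert a buffer size given in KiB to MiB."""
    return kib / KIB_PER_MIB


original_default_kib = 102400             # default before #22876
reduced_default_kib = 4096                # default set by #22876
new_default_kib = 131072                  # default proposed here
historical_override_kib = 1 * KIB_PER_GIB  # 1 GiB explicit override

print(kib_to_mib(original_default_kib))   # 100 MiB
print(kib_to_mib(reduced_default_kib))    # 4 MiB
print(kib_to_mib(new_default_kib))        # 128 MiB
print(historical_override_kib // new_default_kib)  # ratio to the 1 GiB override
```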
…n on Cisco-8102 due to timemaster.service (#24614) ### Description of PR Summary: On Cisco-8102 lab DUTs running recent SONiC images, `timemaster.service` fails benignly (no reachable NTP/PTP source in the lab management network). This causes `systemctl is-system-running` to report `degraded`, which trips `config_system_checks_passed` in `test_reload_configuration`. The pre-reload system-state gate at `platform_tests/test_reload_config.py:63` then polls for 360s and asserts: ``` $ systemctl is-system-running degraded (rc=1) $ systemctl list-units --state=failed * timemaster.service loaded failed failed Synchronize system clock to NTP and PTP time sources 1 loaded units listed. ``` The failure is unrelated to the data plane and unrelated to `config reload` itself - the assertion fires before the test ever invokes `config reload`. Reproduced on multiple Cisco-8102 testbeds in the same nightly plan. This PR adds an `xfail` clause scoped strictly to `platform == 'x86_64-8102_64h_o-r0'` while the underlying `timemaster.service` issue is tracked separately on the platform/image side. The test still runs (so XPASS will surface once the platform issue is fixed); only the assertion is allowed to fail without breaking the plan. Fixes # (issue) ### Type of change - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? Unblock the Cisco-8102 nightly pipeline while the platform-side `timemaster.service` issue is tracked separately. The current assertion provides no actionable signal - it only reports that a non-data-plane systemd unit is in a failed state, which is already known. #### How did you do it? YAML-only change in `conditional_mark`. 
Added an `xfail` block next to the existing `skip` block on `platform_tests/test_reload_config.py::test_reload_configuration`, with a single condition matching the 8102 platform string. ```yaml xfail: reason: "timemaster.service fails benignly on Cisco-8102 lab DUTs ..." conditions: - "platform in ['x86_64-8102_64h_o-r0']" ``` #### How did you verify/test it? - Inspected pytest log from a failing nightly run: assertion at `test_reload_config.py:63` with `timemaster.service` as the sole failed unit on str2-8102-01. - Confirmed `conditional_mark` supports concurrent `skip` + `xfail` blocks on the same test - precedent at `platform_tests/test_reboot.py::test_watchdog_reboot` in the same YAML file. - Confirmed the platform string `x86_64-8102_64h_o-r0` is already used in this YAML (see `test_watchdog_reboot`). - Other Cisco-8000 SKUs (8101, 8111, 8800-LC) and all non-Cisco platforms are unaffected because the condition matches the exact 8102 platform string only. #### Any platform specific information? Cisco-8102 only (`x86_64-8102_64h_o-r0`). #### Supported testbed topology if it's a new test case? N/A - existing test, no topology change. ### Documentation N/A - no user-facing behavior or feature changes. Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: ShiyanWangMS <shiyanwang@microsoft.com> Co-authored-by: wsycqyz <wsycqyz@users.noreply.github.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…unter_stress (#24592) Partial back-port of #23939 — only the buffer-size removal part. The other changes in #23939 (adding `relay_agent`/`downlink_vlan_iface_name` ptf_runner params, dropping `.json` from `count_file` path) depend on PR #19198 which was reverted from 202511 by #23714, so they don't apply here. **Why** After PR #22876 was back-ported to 202511, the framework default for `tcpdump_buffer_size` (in `capture_and_check_packet_on_dut`) dropped from 100 MiB to 4 MiB. The explicit `BUFFER_SIZE = 1024` (1 MiB) override in this test is now smaller than the framework default. Removing the override lets the test use the (larger) framework default and matches master. **Note** The 4 MiB default is still too small to fully pass on 720dt-class hardware under this test's stress load. A follow-up PR will raise the default in `capture_and_check_packet_on_dut`. This PR is pure cleanup to bring 202511 in line with master. **Tested** Applied locally on `internal-202511` @ `da7363c`, `testbed-bjw2-can-720dt-6` (Arista-720DT-G48S4, SONiC.20251110.26): - Before (1 MiB): tcpdump drops ~13% of relayed packets, FAIL - After (4 MiB framework default): drops reduced to ~7%, still over the 0.01% margin (follow-up PR will fix) - For reference, 64 MiB passes cleanly. Refs: #23939, #22876, #20580. Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com> Signed-off-by: Xichen96 <lukelin0907@gmail.com> Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…sic dut (#24632) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes # (issue) - This PR fixes the issue #20896 - It adds multi-asic support for the new test '_test_verify_copp_configuration_' and also fixes the issue with '_test_policer_' as part of the changes introduced by #18326 ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [x] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [x] 202505 ### Approach #### What is the motivation for this PR? - To fix the issue #20896 and also to fix the issue with '_test_policer_' as part of the changes introduced by #18326 #### How did you do it? - Verify if the DUT is multi_asic and modified commands based on the result. #### How did you verify/test it? - Ran all the COPP test cases on T2 multi-asic DUT and made sure all the tests are passed. #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> <img width="387" height="718" alt="image" src="https://github.com/user-attachments/assets/a291b892-2e61-48c2-98c0-2a53329ffe41" /> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: sanjair-git <114024719+sanjair-git@users.noreply.github.com>
Cherry-pick of #24608 to 202511. Original PR: sonic-net/sonic-mgmt#24608 ### Description of PR Summary: Fix failure on qos/test_qos_sai.py:testQosSaiBufferPoolWatermark on Q200 (Cisco-8102-C64). The SMS usage is changing without any traffic; initial watermark fluctuation caused failures. Adjusts margin to 6 pkts for `gb` ASIC. Note: A minor merge conflict on `tests/qos/files/cisco/qos_param_generator.py` was resolved by preserving 202511's existing `extra_cap_margin = 20` for lossless (the change in master from 20 -> 25 is unrelated to this PR) and adding the new `gb` blocks from #24608. ### Type of change - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [x] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? Cherry-pick of #24608 into 202511. #### How did you do it? Cherry-picked commit 667279bef0682e1152ff08052df1d1d2eaaf535a from #24608. Resolved conflict in `tests/qos/files/cisco/qos_param_generator.py`. #### How did you verify/test it? Verified on m64 Cisco-8102-C64 in original PR #24608. #### Any platform specific information? Q200. #### Supported testbed topology if it's a new test case? N/A ### Documentation N/A Signed-off-by: Zhixin Zhu <zhixzhu@cisco.com> Co-authored-by: Zhixin Zhu <zhixzhu@cisco.com>
### Description of PR When running on isolated-v6 testbeds, use IPv6 addressing in the outermost L3 header of IPinIP packets, as IPv4 addresses are not configured and are unresolvable on these testbeds. Signed-off-by: Will Rideout <wrideout@arista.com>
…arnings in global LogAnalyzer ignore list (#24562) Add two global LogAnalyzer ignore patterns for ctrmgrd ERR lines that are emitted when kubeadm reports Docker version incompatibility (Docker 28.x vs old k8s cluster). These warnings bleed into downstream test LogAnalyzer windows, causing false failures. ### Description of PR Summary: When `test_kubesonic_join_and_disjoin` runs (or fails and retries), `ctrmgrd` calls `kubeadm join`, which emits Docker/kubelet version warnings to syslog as ERR lines: ``` ERR ctrmgrd.py: Refer file /tmp/tmpXXXXkube_hints_ for troubleshooting tips ERR ctrmgrd.py: [WARNING SystemVerification]: Docker version is not on the list of validated versions: 28.2.2. Latest validated version: 20.10 ``` These ERR lines appear ~seconds after the kubesonic test completes and fall inside the **next test's** LogAnalyzer window, causing false failures in unrelated tests such as `snmp/test_snmp_queue.py::test_snmp_queues`. Root cause: Docker 28.x is not on kubeadm's validated versions list for the currently deployed k8s version. This is a known infra limitation (k8s upgrade is in progress). Fix: add the two patterns to `loganalyzer_common_ignore.txt` (global, not per-test) because any test following a kubesonic join test may be affected. This is the defense-in-depth complement to #24159 (which fixes kubesonic teardown to call `config kube server disable on`, stopping ctrmgrd from retrying k8s join). ### Type of change - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? Docker 28.x (installed on test hosts) is not on kubeadm's validated versions list for the k8s version currently deployed in the test environment. This causes ctrmgrd to emit ERR-level log lines when kubesonic join/disjoin runs. 
These ERR lines fall inside downstream LogAnalyzer windows and cause false failures. The k8s team is working on an upgrade; this ignore list entry prevents false failures in the interim.
#### How did you do it?
Added 2 regex patterns to `ansible/roles/test/files/tools/loganalyzer/loganalyzer_common_ignore.txt`:
```
r, ".* ERR ctrmgrd\.py: Refer file .* for troubleshooting tips.*"
r, ".* ERR ctrmgrd\.py:.*\[WARNING SystemVerification\]:.*Docker version is not on the list of validated versions.*"
```
Patterns are scoped to ctrmgrd Docker/kubelet version warnings and do not suppress unrelated ctrmgrd errors.
#### How did you verify/test it?
Confirmed via log analysis of nightly job [69e79fb88e43924279229609](https://elastictest.org/scheduler/testplan/69e79fb88e43924279229609) that these two exact lines triggered the LogAnalyzer failure in test_snmp_queues teardown.
#### Any platform specific information?
None - global ignore entry affects all platforms.
#### Supported testbed topology if it's a new test case?
N/A
### Documentation
N/A
ADO: https://msazure.visualstudio.com/One/_workitems/edit/37717660
Signed-off-by: mssonicbld <sonicbld@microsoft.com>
Co-authored-by: Liping Xu <108326363+lipxu@users.noreply.github.com>
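A quick way to sanity-check the scope of the two patterns is to run them against sample lines. In this sketch the first pattern is assumed to use `.*` for the temporary file name, the syslog prefix on the sample lines is hypothetical, and applying the patterns with `re.match` is an assumption, since how LogAnalyzer itself evaluates the ignore list is not shown here.

```python
import re

# The two ignore patterns, with ".*" assumed for the temp-file wildcard.
IGNORE_PATTERNS = [
    r".* ERR ctrmgrd\.py: Refer file .* for troubleshooting tips.*",
    r".* ERR ctrmgrd\.py:.*\[WARNING SystemVerification\]:.*"
    r"Docker version is not on the list of validated versions.*",
]


def is_ignored(line):
    """Return True if any ignore pattern matches the syslog line."""
    return any(re.match(pattern, line) for pattern in IGNORE_PATTERNS)


PREFIX = "May 19 00:00:00.000000 dut"  # hypothetical syslog prefix

hints_line = PREFIX + (" ERR ctrmgrd.py: Refer file /tmp/tmpXXXXkube_hints_"
                       " for troubleshooting tips")
docker_line = PREFIX + (" ERR ctrmgrd.py: [WARNING SystemVerification]: Docker"
                        " version is not on the list of validated versions:"
                        " 28.2.2. Latest validated version: 20.10")
other_line = PREFIX + " ERR ctrmgrd.py: some unrelated failure"  # hypothetical

for line in (hints_line, docker_line, other_line):
    print(is_ignored(line))
```

The unrelated ctrmgrd error is not matched, which is the scoping property the description claims.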
… (#24604) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes #24386 ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [x] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? The mgmt VRF table ID got bumped from 5000 to 6000 (sonic-net/sonic-buildimage#26410). But `verify_show_command`, a function called as part of the module's test setup, was failing because it still expected the mgmt VRF table ID to be 5000. This caused the entirety of `tests/mvrf/test_mgmtvrf.py` tests to fail. #### How did you do it? Updated `verify_show_command` to expect 6000 as the mgmt VRF table ID #### How did you verify/test it? Run `tests/mvrf/test_mgmtvrf.py` on a DUT; the tests should run normally. Without the changes, the following error shows up. 
```sh _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ def verify_show_command(duthost, mvrf=True): show_mgmt_vrf = duthost.shell("show mgmt-vrf")["stdout"] mvrf_interfaces = {} if mvrf: mvrf_interfaces["mgmt"] = r"\d+:\s+mgmt:\s+<NOARP,MASTER,UP,LOWER_UP> mtu\s+\d+\s+qdisc\s+noqueue\s+state\s+UP" mvrf_interfaces["vrf_table"] = "vrf table 5000" mvrf_interfaces["eth0"] = r"\d+:\s+eth0+:\s+<BROADCAST,MULTICAST,UP,LOWER_UP>.*master mgmt\s+state\s+UP " mvrf_interfaces["lo"] = r"\d+:\s+lo-m:\s+<BROADCAST,NOARP,UP,LOWER_UP>.*master mgmt" if "ManagementVRF : Enabled" not in show_mgmt_vrf: raise Exception("'ManagementVRF : Enabled' not in output of 'show mgmt vrf'") for _, pattern in list(mvrf_interfaces.items()): if not re.search(pattern, show_mgmt_vrf): > raise Exception("Unexpected output for MgmtVRF=enabled") E Exception: Unexpected output for MgmtVRF=enabled _ = 'vrf_table' pattern = 'vrf table 5000' mvrf/test_mgmtvrf.py:245: Exception ``` #### Any platform specific information? n/a #### Supported testbed topology if it's a new test case? n/a ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: donggyu-nexthop <donggyu@nexthop.ai> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: donggyu-nexthop <donggyu@nexthop.ai>
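The failing check can be reduced to a small sketch, shown here only to illustrate why the literal `vrf table 5000` pattern broke. The constant and helper names are hypothetical and are not the names used in `tests/mvrf/test_mgmtvrf.py`; the sample output strings are shortened, invented stand-ins for `show mgmt-vrf` output.

```python
import re

# Expected mgmt VRF table ID after sonic-net/sonic-buildimage#26410.
# The constant name is hypothetical; the test itself hard-codes the pattern.
MGMT_VRF_TABLE_ID = 6000


def has_expected_vrf_table(show_mgmt_vrf_output, table_id=MGMT_VRF_TABLE_ID):
    """Return True if the output mentions 'vrf table <table_id>'."""
    # \b keeps "vrf table 6000" from matching a longer number such as 60001.
    pattern = r"vrf table {}\b".format(table_id)
    return re.search(pattern, show_mgmt_vrf_output) is not None


if __name__ == "__main__":
    new_output = "ManagementVRF : Enabled\n    vrf table 6000 addrgenmode eui64"
    old_output = "ManagementVRF : Enabled\n    vrf table 5000 addrgenmode eui64"
    print(has_expected_vrf_table(new_output))
    print(has_expected_vrf_table(old_output))
```

Keeping the table ID in one place would make a future bump a one-line change, though whether that is worth doing in the test is a judgment call for the maintainers.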
…GP convergence check (#24682) Summary: Fix IPv6-only topology support in generic_patch BGP convergence check Fixes # (issue) ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? PR #22895 added BGP session convergence wait before DB comparison in `generic_patch_add_t0()`, but unconditionally checks both IPv4 and IPv6 BGP sessions. On IPv6-only topologies (e.g. `t1-isolated-v6-d56u1-lag`), `tor_data["ip"]["remote"]` is empty, causing is_bgp_session_established() to fail. #### How did you do it? Fix by checking whether each neighbor IP exists before waiting for the BGP session, consistent with the `chk_any_bgp_session()` approach from PR #21591. #### How did you verify/test it? Regression test pass #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: weguo-NV <154216071+weiguo-nvidia@users.noreply.github.com>
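The guard described above might look roughly like the following sketch. Only `tor_data["ip"]["remote"]` is named in the PR description; the `"ipv6"` key and the helper name are assumptions made for illustration, and the actual wait on `is_bgp_session_established()` is left out.

```python
def neighbor_ips_to_wait_for(tor_data):
    """Collect only the neighbor IPs that are actually configured.

    On an IPv6-only topology the IPv4 entry is empty, so it is skipped
    instead of being passed to the BGP session check.
    """
    ips = []
    # "ip" comes from the PR description; "ipv6" is an assumed sibling key.
    for family in ("ip", "ipv6"):
        remote = tor_data.get(family, {}).get("remote")
        if remote:
            ips.append(remote)
    return ips


if __name__ == "__main__":
    v6_only = {"ip": {"remote": ""}, "ipv6": {"remote": "fc00::2"}}
    dual_stack = {"ip": {"remote": "10.0.0.1"}, "ipv6": {"remote": "fc00::2"}}
    print(neighbor_ips_to_wait_for(v6_only))
    print(neighbor_ips_to_wait_for(dual_stack))
```

The caller would then wait for a BGP session only for each returned address, so an IPv6-only testbed never checks a non-existent IPv4 neighbor.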
… stale (#24691) ### Description of PR Summary: When `sudo monit validate` is run just before `sudo monit status`, the status output may still carry the **old** "data collected" timestamp because monit hasn't finished its internal refresh cycle yet. This causes the memory-utilization plugin to read stale baseline data before or after a test run. This PR adds a **freshness-retry** mechanism: 1. `record_monit_baseline_from_validate_output(validate_output)`: parses the System-block "data collected" timestamp from the `sudo monit validate` stdout and saves it as a baseline. Called in both `pytest_runtest_setup` and `pytest_runtest_teardown` (in `__init__.py`) right after `sudo monit validate`. 2. `read_monit_status_with_freshness_retry(cmd)`: executes `sudo monit status`, compares the System-block "data collected" timestamp against the saved baseline, and if they still match (stale), sleeps `MONIT_STATUS_FRESHNESS_WAIT_SECONDS` (60 s) and retries, up to `MONIT_STATUS_FRESHNESS_MAX_RETRIES` (3) times. Used only for the `monit` command entry. Both constants are module-level tunables so they can be overridden in tests. ### Type of change - [x] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [x] 202511 ### Approach #### What is the motivation for this PR? Intermittent false-positive memory alarms were caused by the monit daemon not having refreshed its internal data by the time `sudo monit status` was issued right after `sudo monit validate`. The stale status output contained pre-test memory readings which then incorrectly appeared as the "before test" baseline, making normal memory usage look like an increase. #### How did you do it? 
- Added `record_monit_baseline_from_validate_output()` to capture the System-block "data collected" timestamp immediately after `sudo monit validate`. - Added `read_monit_status_with_freshness_retry()` to compare the current monit status timestamp against the saved baseline; if still stale, sleep and retry (up to 3 times, 60 s each). - Hooked both functions into `pytest_runtest_setup` and `pytest_runtest_teardown` in `__init__.py`. - Only the `monit` command entry uses the freshness-retry path; all other memory commands (`top`, `free`, `docker stats`, FRR) are unchanged. #### How did you verify/test it? - Manually verified on a VS testbed that `_parse_monit_memory_data_collected_timestamp` correctly extracts the System-block timestamp while ignoring Filesystem/Process/Program block timestamps. - Unit-tested the retry logic by mocking `execute_command` to return stale output for the first N calls and fresh output on the final call. #### Any platform specific information? The retry wait time (60 s) matches the monit default poll cycle; can be lowered if the target device uses a shorter cycle. #### Supported testbed topology if it's a new test case? N/A. This is a framework fix for the memory-utilization plugin, not a new test case. ### Documentation No documentation update required; this is an internal framework fix. 
### Verification Elastic test jobs for `generic_config_updater` (branch: `dev/xuliping/20260512_internal-202511_monit-freshness-retry`, image: `internal-202511`): | Testbed | Job Link | |---------|----------| | testbed-bjw2-can-t0-7260-9 | https://elastictest.org/scheduler/testplan/6a03157feb4c0d0f5d30bd70 | | testbed-bjw2-can-t0-7260-1 | https://elastictest.org/scheduler/testplan/6a031580a907302e5e8240cb | | testbed-bjw3-can-t0-7060-7 | https://elastictest.org/scheduler/testplan/6a0315c99f3385605e3ddb9b | | testbed-bjw3-can-t0-7060-6 | https://elastictest.org/scheduler/testplan/6a0315c9ea3a02a739d03786 | 12/05/2026 17:35:56 memory_utilization.read_monit_status_wit L0126 INFO | [MemoryUtilization] status data refreshed on retry 1/3 (System block ts: Tue, 12 May 2026 17:35:34) Signed-off-by: xuliping <xuliping@microsoft.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: Liping Xu <108326363+lipxu@users.noreply.github.com>
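A simplified sketch of the freshness-retry idea follows. It is not the plugin's code: the function names and signatures are illustrative, the baseline is passed in rather than stored in module state, and the parser assumes monit's usual layout of unindented block headers (`System '...'`, `Process '...'`) followed by indented fields.

```python
import time

# Constant names and values as given in the PR description.
MONIT_STATUS_FRESHNESS_WAIT_SECONDS = 60
MONIT_STATUS_FRESHNESS_MAX_RETRIES = 3


def parse_system_timestamp(monit_output):
    """Return the System block's 'data collected' value, or None."""
    in_system_block = False
    for line in monit_output.splitlines():
        stripped = line.strip()
        if stripped and not line.startswith((" ", "\t")):
            # Unindented line: a new block header such as "System 'sonic'".
            in_system_block = stripped.startswith("System ")
        elif in_system_block and stripped.startswith("data collected"):
            return stripped[len("data collected"):].strip()
    return None


def read_status_with_freshness_retry(run_status, baseline_ts, sleep=time.sleep):
    """Re-read monit status while its timestamp still equals the baseline.

    `run_status` is any callable returning the text of `sudo monit status`;
    `sleep` is injectable so tests do not have to wait.
    """
    output = run_status()
    for _ in range(MONIT_STATUS_FRESHNESS_MAX_RETRIES):
        if parse_system_timestamp(output) != baseline_ts:
            break  # status data has been refreshed since the baseline
        sleep(MONIT_STATUS_FRESHNESS_WAIT_SECONDS)
        output = run_status()
    return output
```

If the data is still stale after the last retry, this sketch returns the stale output rather than raising; the PR description does not say which behavior the plugin chose.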
…tform tests (#24690) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> Summary: This change increases `config_reload_timeout` from 180s to 240s in `platform_tests/test_reload_config.py` for Nokia-M0-7215 and Nokia-7215. The goal is to avoid false test failures caused by longer `config reload -y` completion time after recent platform specific changes. test_reload_configuration_checks is failing on Nokia-7215 because `config reload -y` may not finish within the current 180 sec timeout. The command is triggered asynchronously and although the handler is returned the reload flow can still be in progress when the test reaches its timeout. In 202511, cherry-pick of PR sonic-net/sonic-utilities#4390 ( PR sonic-net/sonic-utilities#4174 in master ) added extra logic in `_restart_services()` for `armhf-nokia_ixs7215_52x-r0`, including an explicit 15 sec sleep, swss & syncd stop/reset/restart operations and management interface recovery handling. This introduces additional delay on top of the existing config reload sequence which pushes its completion beyond the current 180 sec timeout. Fixes # (issue) ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [x] 202511 ### Approach - This is a test timeout adjustment only with scope limited to Nokia related HWSKUs. - No functional behavior is changed in config reload itself. - This only aligns the wait time in the test to match the longer reload execution path. #### What is the motivation for this PR? #### How did you do it? #### How did you verify/test it? #### Any platform specific information? #### Supported testbed topology if it's a new test case? 
### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: fountzou <ioannis.fountzoulas@nokia.com> Signed-off-by: mssonicbld <sonicbld@microsoft.com> Co-authored-by: fountzou <169114916+fountzou@users.noreply.github.com>
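The shape of the adjustment can be sketched as below. The function and constant names are hypothetical, and treating 180 s as the timeout for every other HWSKU is an assumption; the PR description only states the 180 s to 240 s change for the two Nokia HWSKUs.

```python
# Values from the PR description; names are illustrative only.
BASE_CONFIG_RELOAD_TIMEOUT = 180      # assumed to apply to other HWSKUs
NOKIA_7215_CONFIG_RELOAD_TIMEOUT = 240
NOKIA_7215_HWSKUS = ("Nokia-M0-7215", "Nokia-7215")


def config_reload_timeout_for(hwsku):
    """Return how long the test waits for `config reload -y` to complete."""
    if hwsku in NOKIA_7215_HWSKUS:
        # Extra time for the added sleep and swss/syncd restart handling.
        return NOKIA_7215_CONFIG_RELOAD_TIMEOUT
    return BASE_CONFIG_RELOAD_TIMEOUT


if __name__ == "__main__":
    print(config_reload_timeout_for("Nokia-7215"))
    print(config_reload_timeout_for("Some-Other-HWSKU"))
```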
…and remove stale parametrize-keyed entries (#24673) Manual cherry-pick of #24626 to 202511 (auto-cherry-pick blocked by `Cherry Pick Conflict_202511` -- surrounding `decap/test_subnet_decap.py` entries diverge between master and 202511; this PR replays only the `decap/test_decap.py`-scoped portion, which is the entire content of #24626). Same diff as master commit `19fdea04` (+3 / -63). ## What this changes 1. Adds `'Arista-720DT' in hwsku` to the top-level `decap/test_decap.py:` skip block. Arista-720DT (TD3-X2 / BCM56873) does not honor `SAI_TUNNEL_DSCP_MODE_UNIFORM_MODEL`; platform-level concession in `aristanetworks/sonic-qual.msft#1176`. 2. Removes 8 stale `decap/test_decap.py::test_decap[ttl=*, dscp=*, vxlan=*]:` entries that have matched zero collected items since PR #20304 (2025-08-28) refactored `tests/decap/conftest.py` to a single non-parametrized collection. ## Verification Empirically verified on `testbed-bjw2-can-720dt-6` (m0, internal-202511): - Baseline: `1 error in 8.56s` (`test_decap` proceeds past conditional_mark, errors at `duthosts` fixture) - With this patch: `1 skipped, 9 warnings in 2.98s` -- SKIPPED before any fixture fires ## Related - Master PR: #24626 - Tracking: `aristanetworks/sonic-qual.msft#1176` - Refactor that orphaned the parametrize entries: #20304 Signed-off-by: Xichen Lin <lukelin0907@gmail.com>
…4635) <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. --> Summary: Fixes # (issue) Manual cherry-pick of sonic-net/sonic-mgmt#24490. Add testbed-specific delays for ACL and Everflow to allow a longer wait on slower platforms. <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [ ] 202511 Some platforms need a longer time for ACL programming, leading to test failures. Instead of increasing the delay for all platforms, move this to the inventory file, where platform-specific delays can be specified. If no additional delay is specified, it defaults to the current value. Use the wait time specified in the inventory if available. Run the ACL and Everflow test suites. <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> --------- <!-- Please make sure you've read and understood our contributing guidelines; https://github.com/sonic-net/SONiC/blob/gh-pages/CONTRIBUTING.md Please provide following information to help code review process a bit easier: --> ### Description of PR <!-- - Please include a summary of the change and which issue is fixed. - Please also include relevant motivation and context. Where should reviewer start? background context? - List any dependencies that are required for this change. 
--> Summary: Fixes # (issue) ### Type of change <!-- - Fill x for your type of change. - e.g. - [x] Bug fix --> - [ ] Bug fix - [ ] Testbed and Framework(new/improvement) - [ ] New Test case - [ ] Skipped for non-supported platforms - [ ] Test case improvement ### Back port request - [ ] 202205 - [ ] 202305 - [ ] 202311 - [ ] 202405 - [ ] 202411 - [ ] 202505 - [ ] 202511 ### Approach #### What is the motivation for this PR? #### How did you do it? #### How did you verify/test it? #### Any platform specific information? #### Supported testbed topology if it's a new test case? ### Documentation <!-- (If it's a new feature, new test case) Did you update documentation/Wiki relevant to your implementation? Link to the wiki page? --> Signed-off-by: Tejaswini Chadaga <tchadaga@microsoft.com>
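The inventory-with-default lookup from the ACL/Everflow commit above can be pictured with this sketch. Everything named here is hypothetical: the inventory key, the default value, and the helper are placeholders, since the commit message does not give the real names.

```python
# Hypothetical default; the commit only says the fallback is "the current value".
DEFAULT_ACL_WAIT_SECONDS = 5

# Hypothetical inventory variable name, used purely for illustration.
ACL_WAIT_INVENTORY_KEY = "acl_rule_wait_seconds"


def get_acl_wait_seconds(inventory_vars, default=DEFAULT_ACL_WAIT_SECONDS):
    """Return the per-testbed ACL wait time, or the default if none is set."""
    value = inventory_vars.get(ACL_WAIT_INVENTORY_KEY)
    if value is None:
        return default
    # Inventory values may arrive as strings, so normalize to int.
    return int(value)


if __name__ == "__main__":
    print(get_acl_wait_seconds({}))
    print(get_acl_wait_seconds({ACL_WAIT_INVENTORY_KEY: "30"}))
```

Testbeds that do not set the variable keep their existing timing, so only slower platforms pay for the longer wait.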