
feat(amdgpu): wave-scope subgroupOr / subgroupAnd / subgroupBarrier#26

Open

lohiaj wants to merge 4 commits into amd-integration from feat/jlohia/amdgpu-subgroup-primitives
Conversation


@lohiaj lohiaj commented May 4, 2026

Summary

Adds wave-scope subgroup boolean reductions and barrier to the AMDGPU LLVM
codegen path. qd.simt.subgroup.reduce_or_i32(value) and reduce_and_i32(value)
now produce single-instruction-equivalent wave reductions on AMDGPU
(v_cmp_ne_u32 + s_cmp_lg_u64 / s_cmp_eq_u64 after LLVM lowering),
matching what the SPIRV codegen already emits via
OpGroupNonUniformBitwiseOr / OpGroupNonUniformBitwiseAnd.

qd.simt.subgroup.barrier() is also wired up on AMDGPU (was failing at
codegen with LLVMRuntime function subgroupBarrier not found).

Motivation

The Python frontend qd.simt.subgroup exposes wave-scope primitives
(reduce_or, reduce_and, barrier, etc.) that compile cleanly on Vulkan/
Metal via the SPIRV codegen but error at codegen time on the LLVM-runtime
archs (CPU/CUDA/AMDGPU) with LLVMRuntime function subgroupOr not found.
The error message is misleading — users assume the function doesn't exist
anywhere, not that it just isn't patched on their backend.

Without these primitives, AMDGPU users have to write LDS-based fallbacks
that cost ~75-100 cycles per iteration (1 LDS write + barrier + 64 LDS
reads with sequential OR-reduction) for what should be a 4-instruction
ballot. This PR closes the gap on AMDGPU for the boolean case (sufficient
for wave-vote convergence checks, the most common use case).

Implementation

Follows the existing AMDGPU intrinsic patches in this file (e.g.
block_barrier -> amdgcn_s_barrier, cuda_shfl_down_sync_i32 ->
amdgcn_ds_bpermute-based custom impl):

  1. C++ stubs in runtime.cpp (no-op bodies) so the LLVM linker resolves
    the symbol.
  2. patch_intrinsic in llvm_context.cpp (AMDGPU section) rewrites the
    stub body at codegen time to use:
    • amdgcn.icmp.i32 for the wave-wide ballot of arg != 0
    • amdgcn.ballot(true) to read the EXEC mask (for AND case)
    • icmp against zero (OR) or against EXEC (AND) for the boolean result
  3. Function is marked always_inline so the patcher's body becomes part
    of the calling kernel after linking.
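
The ballot arithmetic in step 2 can be modeled in plain Python (a semantic sketch only, not the actual LLVM codegen; `wave_reduce_or_i32` / `wave_reduce_and_i32` and the explicit `exec_mask` parameter are illustrative names):

```python
# Host-side model of the ballot-based wave reductions. The real
# implementation emits amdgcn.icmp / amdgcn.ballot calls in LLVM IR.

def wave_reduce_or_i32(lane_values, exec_mask, wave_size=64):
    """OR-reduce booleanised i32 values across the active lanes of one wave."""
    # amdgcn.icmp.i32(arg, 0, NE): bitmask of active lanes where arg != 0
    ballot = 0
    for lane in range(wave_size):
        if (exec_mask >> lane) & 1 and lane_values[lane] != 0:
            ballot |= 1 << lane
    # OR result: some lane voted true  <=>  ballot != 0
    return 1 if ballot != 0 else 0

def wave_reduce_and_i32(lane_values, exec_mask, wave_size=64):
    """AND-reduce booleanised i32 values across the active lanes of one wave."""
    ballot = 0
    for lane in range(wave_size):
        if (exec_mask >> lane) & 1 and lane_values[lane] != 0:
            ballot |= 1 << lane
    # AND result: every active lane voted true  <=>  ballot == EXEC
    # (amdgcn.ballot(true) reads the EXEC mask of currently active lanes)
    return 1 if ballot == exec_mask else 0
```

Comparing the ballot against the live EXEC mask, rather than an all-ones constant, is what keeps the AND case correct under partial-wave participation.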

Generated AMDGCN at the call site

```
v_cmp_ne_u32_e32 vcc, 0, v4 ; wave-wide ballot of (arg != 0)
s_cmp_lg_u64 vcc, 0 ; ballot != 0?
s_cselect_b64 s[0:1], -1, 0 ; predicate
v_cndmask_b32_e64 v4, 0, 1, s[0:1] ; zero-extend i1 -> i32
```

Files changed (204 lines, no deletions)

```
python/quadrants/lang/simt/subgroup.py | +22 Python frontend
quadrants/inc/internal_ops.inc.h | +7 PER_INTERNAL_OP registrations
quadrants/ir/type_system.cpp | +6 PLAIN_OP type signatures
quadrants/runtime/llvm/llvm_context.cpp | +58 AMDGPU patch_intrinsic
quadrants/runtime/llvm/runtime_module/runtime.cpp | +22 stub functions for linker
tests/python/test_simt.py | +89 3 new tests (single + multi-wave)
```

Test plan

`tests/python/test_simt.py` adds 3 AMDGPU-gated tests:

  1. `test_subgroup_reduce_or_i32_single_wave` — `block_dim=64`, one wave.
    • All zeros input → all outputs 0.
    • One non-zero lane → all outputs 1 (correct broadcast of OR result).
  2. `test_subgroup_reduce_and_i32_single_wave` — same setup.
    • All ones input → all outputs 1.
    • One zero lane → all outputs 0 (proves AND fires on a single dissent).
  3. `test_subgroup_reduce_or_i32_multi_wave` — `block_dim=128`, two waves.
    • Lane 64 (wave 1) is the only non-zero lane.
    • Wave 0 (lanes 0-63) outputs all zeros — proves the reduction is
      wave-scope, not block-scope.
    • Wave 1 (lanes 64-127) outputs all ones — proves the non-zero
      correctly propagates within wave 1.
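
The wave-scope expectation checked by test 3 can be sketched as a host-side oracle (illustrative helper, not the actual test code; assumes `wave_size=64` lanes per wave):

```python
def expected_outputs_reduce_or(block_values, wave_size=64):
    """Per-lane expected output of a wave-scope OR reduction:
    each contiguous group of wave_size lanes reduces independently,
    so a vote in one wave must not leak into a neighbouring wave."""
    out = []
    for wave_start in range(0, len(block_values), wave_size):
        wave = block_values[wave_start:wave_start + wave_size]
        wave_or = 1 if any(v != 0 for v in wave) else 0
        out.extend([wave_or] * len(wave))
    return out

# block_dim=128: lane 64 (first lane of wave 1) is the only non-zero lane
block = [0] * 128
block[64] = 1
expected = expected_outputs_reduce_or(block)
assert expected[:64] == [0] * 64    # wave 0 unaffected (wave-scope)
assert expected[64:] == [1] * 64    # wave 1 sees the vote
```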

All 3 tests `PASSED` on gfx942.

Targeted regression sweep on the test files most likely to catch
codegen / IR regressions (`test_simt.py` + `test_basics.py` +
`test_internal_func.py` + `test_shared_array_codegen_bug.py`):
39 passed, 21 skipped, 0 failed.

Compatibility / risk

  • Zero behavior change for any code that doesn't use `subgroupOr_i32`,
    `subgroupAnd_i32`, or `subgroupBarrier`. Stubs are added but unused;
    the patcher only runs when the function is referenced.
  • No deletions of existing code paths.
  • No SPIRV changes — the polymorphic `subgroupOr` / `subgroupAnd`
    used by the SPIRV path is untouched. New names are
    `subgroupOr_i32` / `subgroupAnd_i32` (type-suffixed, matching the
    `cuda_shfl_down_sync_i32` / `cuda_shfl_down_sync_f32` convention).

Limitations / follow-ups (acknowledged scope cuts)

  1. Boolean-only. i32 input is treated as 0 vs non-zero. Generalising
    to true bitwise OR/AND of arbitrary i32 values needs DPP-based tree
    reduction (~50 lines of additional codegen). This is sufficient for
    the most common use case (wave-vote convergence checks); generalising
    is a clean follow-up.
  2. AMDGPU-only. CUDA path falls back to no-op stubs. CUDA users
    already have `cuda_ballot_i32` for this purpose.
  3. OR / AND only. Add / Mul / Min / Max / Xor variants are
    straightforward extensions but not implemented here to keep PR scope
    small.
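
For reference, the DPP-based tree reduction mentioned in item 1 would follow the standard log2(wave_size) butterfly pattern; a host-side sketch of the data movement (Python model of the lane exchanges, not DPP codegen):

```python
def wave_bitwise_or_tree(lane_values):
    """Full bitwise OR of arbitrary i32 lane values via a log2-step
    butterfly: at each step every lane ORs in the value held by
    lane ^ offset -- the exchange pattern DPP row/bank swizzles
    implement in hardware."""
    n = len(lane_values)              # wave size, assumed a power of two
    vals = list(lane_values)
    offset = 1
    while offset < n:
        vals = [vals[i] | vals[i ^ offset] for i in range(n)]
        offset *= 2
    return vals                       # every lane holds the full reduction
```

Six exchange steps cover a 64-lane wave, which is where the "~50 lines of additional codegen" estimate comes from.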

Related

Adds wave-scope subgroup boolean reductions and barrier to the AMDGPU LLVM
codegen path. `qd.simt.subgroup.reduce_or_i32(value)` and `reduce_and_i32(value)`
now produce single-instruction-equivalent wave reductions on AMDGPU
(amdgcn.icmp + ballot + s_or_b64 / s_and_b64 with EXEC after LLVM lowering),
matching what the SPIRV codegen emits via OpGroupNonUniformBitwiseOr / And.

`qd.simt.subgroup.barrier()` is also wired up on AMDGPU (was failing at
codegen with `LLVMRuntime function subgroupBarrier not found`).

Implementation follows the existing AMDGPU patch_intrinsic pattern in
llvm_context.cpp (e.g. block_barrier -> amdgcn_s_barrier,
cuda_shfl_down_sync_i32 -> amdgcn_ds_bpermute custom impl):

  1. C++ stubs in runtime.cpp so the LLVM linker resolves the symbols
  2. AMDGPU patch_intrinsic rewrites the stub bodies at codegen time using
     amdgcn.icmp.i32 (ballot of arg != 0), amdgcn.ballot(true) (EXEC mask),
     and icmp against 0 (OR) or against EXEC (AND)
  3. Functions marked always_inline so the body inlines at the call site

Generated AMDGCN at the call site is the expected 4-instruction sequence:
  v_cmp_ne_u32_e32 vcc, 0, v4
  s_cmp_lg_u64 vcc, 0
  s_cselect_b64 s[0:1], -1, 0
  v_cndmask_b32_e64 v4, 0, 1, s[0:1]

v1 limitations (acknowledged):
  - Boolean-only (i32 input treated as 0 vs non-zero). Generalising to
    bitwise OR/AND of arbitrary i32 values needs DPP-based tree reduction
    (~50 lines, clean follow-up).
  - AMDGPU-only. CUDA path falls back to no-op stubs (CUDA users already
    have cuda_ballot_i32 for this purpose).
  - OR / AND only. Add / Mul / Min / Max / Xor variants are straightforward
    extensions but kept out of v1 to keep PR scope small.

Tests: 3 new AMDGPU-gated tests in tests/python/test_simt.py:
  - test_subgroup_reduce_or_i32_single_wave  (block_dim=64, 1 wave)
  - test_subgroup_reduce_and_i32_single_wave (block_dim=64, 1 wave)
  - test_subgroup_reduce_or_i32_multi_wave   (block_dim=128, 2 waves)
    explicitly verifies WAVE-SCOPE (not block-scope) semantics: a non-zero
    lane in wave 1 must NOT propagate to wave 0's outputs.

All 3 pass. Targeted regression sweep (test_simt + test_basics +
test_internal_func + test_shared_array_codegen_bug): 39 passed,
21 skipped, 0 failed.

Zero behavior change for code that doesn't use the new primitives.

lohiaj commented May 4, 2026

The 4 failing CI checks (Linters, Linux ubuntu-22.04 Build, pyright_linter / Pyright Linter, test_gpu / Start AMD GPU Runner) are pre-existing on amd-integration and unrelated to this PR.

Verified by checking PR #15 (the most recently merged PR into amd-integration, force_inline by @deepsek): the exact same 4 checks failed there too, and the PR was merged anyway.

Per-failure breakdown:

  1. Linters (black + clang-format) — black flagged tests/python/test_fn_attrs.py only ("1 file reformatted, 438 files left unchanged"); clang-format flagged quadrants/runtime/llvm/llvm_runtime_executor.cpp:723. Neither file is touched by this PR. My modified files (subgroup.py, test_simt.py, llvm_context.cpp, runtime.cpp, internal_ops.inc.h, type_system.cpp) all pass black 25.1.0 (-l 120), ruff 0.4.8, and pylint 3.3.4 with the project's exact .pre-commit-config.yaml settings.

  2. Linux ubuntu-22.04 Build — fails with RHI Error: Can not create Vulkan instance. Vulkan-on-CI infra issue, no code in this PR touches the RHI/Vulkan path.

  3. pyright_linter — flags torch private import errors in _func_base.py, _py_tensor.py, field.py, util.py. None of those files are in this PR's diff.

  4. test_gpu / Start AMD GPU Runner — failed in 3 seconds (runner didn't start). Infrastructure.

The functional GPU tests for this PR were validated locally on an MI300X / gfx942 box: all 3 new tests in test_simt.py PASS, and the targeted regression sweep (test_simt.py + test_basics.py + test_internal_func.py + test_shared_array_codegen_bug.py) reports 39 passed, 21 skipped, 0 failed.

Happy to fix the pre-existing test_fn_attrs.py black drift in a follow-up if it's helpful, but keeping this PR focused on the one feature.


yaoliu13 commented May 5, 2026

/run-ci


yaoliu13 commented May 5, 2026

/run-ci


yaoliu13 commented May 5, 2026

1405358.1 and 4746

@vecheruk-amd vecheruk-amd self-requested a review May 6, 2026 03:25

@vecheruk-amd vecheruk-amd left a comment


LGTM overall. AMDGPU intrinsic patching looks correct and the multi-wave test gives confidence in wave-scope semantics.
Minor nits: please add reduce_or_i32 / reduce_and_i32 to __all__ in subgroup.py, and add a short note that non-AMDGPU stubs currently return 0 (backend-limited behavior).


yaoliu13 commented May 6, 2026

/run-ci

s_barrier is the workgroup-scope barrier and would deadlock when
subgroupBarrier() is called from non-uniform control flow with
block_dim > wave_size -- non-participating waves never arrive at the
barrier (per Dipto's review). subgroupBarrier is meant to match
SPIR-V OpControlBarrier with Subgroup execution scope.

Switch the lowering to llvm.amdgcn.wave.barrier: a discardable
compiler + memory-ordering barrier scoped to the current wave. It
emits no hardware sync (the 64 lanes of an AMDGPU wave already
execute in lockstep) and carries IntrConvergent + IntrHasSideEffects
to prevent illegal CFG transforms and memory-op reordering.

Add two AMDGPU regression tests in tests/python/test_simt.py:
- test_subgroup_barrier_partial_participation: exercises the exact
  deadlock pattern from the review thread (block_dim=128, only
  lanes with `if i < 64:` call the barrier). With s_barrier this
  hung; with wave.barrier it completes.
- test_subgroup_barrier_in_uniform_path: read-after-write across
  the barrier, verifies compiler ordering is preserved.

Also addresses vecheruk-amd's review nits:
- Add reduce_or_i32 / reduce_and_i32 to subgroup.py __all__.
- Document that non-AMDGPU LLVM-runtime backends (CPU, CUDA today)
  fall through to the runtime stubs (return 0 / no-op) and that
  Vulkan/Metal users should keep using the polymorphic reduce_or
  / reduce_and.

Tests: AMDGPU subgroup tests 5/5 pass. Targeted regression sweep
(test_simt + test_basics + test_internal_func +
test_shared_array_codegen_bug) clean: 41 passed, 21 skipped,
0 failed.

lohiaj commented May 6, 2026

Pushed e46931bd7 addressing both rounds of review:

Dipto's s_barrier -> wave.barrier fix (the key one)

subgroupBarrier now lowers to llvm.amdgcn.wave.barrier instead of llvm.amdgcn.s.barrier in llvm_context.cpp. As Dipto called out in the internal thread, s_barrier is the workgroup-scope barrier and would have deadlocked on partial-wave participation - the canonical foot-gun being:

```
@qd.kernel
def k():
    qd.loop_config(block_dim=128)  # 2 waves
    for i in range(128):
        if i < 64:
            qd.simt.subgroup.barrier()  # only wave 0 arrives → wave 0 hangs forever
```

wave.barrier is the right call: per LLVM IntrinsicsAMDGPU.td, it's a discardable barrier carrying IntrConvergent + IntrHasSideEffects — compiler + memory-ordering only, emits no hardware sync, scoped to the current wave (the 64 lanes of an AMDGPU wave already execute in lockstep so there's nothing for the hardware to wait on). No partial-participation hazard.

Added two regression tests to tests/python/test_simt.py to prevent recurrence:

  • test_subgroup_barrier_partial_participation - runs the exact block_dim=128 / if i < 64: pattern from the thread. With s_barrier this hung; with wave.barrier it completes.
  • test_subgroup_barrier_in_uniform_path - read-after-write across the barrier in a uniform path, sanity-checks compiler ordering is preserved.

vecheruk-amd's review nits

  1. reduce_or_i32, reduce_and_i32 added to __all__ in subgroup.py.
  2. Module docstring + barrier() docstring + a corresponding comment block in runtime.cpp explaining backend availability - AMDGPU has the full patcher; CPU/CUDA LLVM-runtime backends currently fall through to the runtime stubs (return 0 / no-op) and that's intentional placeholder behaviour for follow-up; Vulkan/Metal users should keep using the polymorphic reduce_or / reduce_and.

Tests

  • AMDGPU subgroup tests: 5/5 PASS (test_subgroup_reduce_or_i32_single_wave, test_subgroup_reduce_and_i32_single_wave, test_subgroup_reduce_or_i32_multi_wave, test_subgroup_barrier_partial_participation, test_subgroup_barrier_in_uniform_path).
  • Targeted regression sweep (test_simt + test_basics + test_internal_func + test_shared_array_codegen_bug): 41 passed, 21 skipped, 0 failed (up from 39 in the prior round because of the 2 new barrier tests).

Diff summary: +134 / -16 across 4 files. Net PR is now +338 / -16 across 6 files.

Ready for re-review @vecheruk-amd @dipto-amd.

Pre-commit `clang-format` v19.1.7 wants
  patch_intrinsic("subgroupBarrier", llvm::Intrinsic::amdgcn_wave_barrier,
                  false);
instead of the wrap I had. Pure whitespace; semantically identical.

lohiaj commented May 6, 2026

Pushed 2c2300259: clang-format fix for the new patch_intrinsic("subgroupBarrier", …) call I added in e46931bd7. Two-line whitespace-only diff:

```
-      patch_intrinsic("subgroupBarrier",
-                      llvm::Intrinsic::amdgcn_wave_barrier, false);
+      patch_intrinsic("subgroupBarrier", llvm::Intrinsic::amdgcn_wave_barrier,
+                      false);
```

Verified locally with the project's pinned clang-format v19.1.7: --dry-run --Werror passes on lines 583–610 (the range I touched). Subgroup tests still 5/5 PASS. Broader regression sweep still 41 passed / 21 skipped / 0 failed.

About the remaining Linters failures on this CI run

The job will still flag clang-format / black violations on three files that this PR doesn't touch:

Same situation as the prior CI run on eaba867ac, and the same pattern that PR #15 was merged with (CI status failure, but the failing files are unrelated drift). Confirmed by diffing the file lists between the two Linters runs — my push has REDUCED the flagged-file count (the prior run on eaba867ac flagged 12 files; this run flags 3 — my clang-format fix cleared the rest).

Happy to push a chore: format pre-existing drift commit on top if it's preferred to keep the PR's CI fully green, but it would touch four files outside this PR's feature scope. Otherwise the same merge-despite-pre-existing-Linters-failure precedent from PR #15 should apply.

@deepsek deepsek self-requested a review May 6, 2026 17:22

@deepsek deepsek left a comment


lgtm after the change. Let's get a sign off from @diptorupd too.

@deepsek deepsek requested a review from diptorupd May 6, 2026 17:25

yaoliu13 commented May 6, 2026

/run-ci


@diptorupd diptorupd left a comment


LGTM aside from the nits about wavesize.


yaoliu13 commented May 7, 2026

/run-ci


yaoliu13 commented May 7, 2026

/run-ci

Per Dipto's review on PR #26: keep the implementation flexible enough
to be predicated on wave size so it is upstream-friendly (the
community cares about RDNA, all wave32) and forward-compatible with
MI450 (expected wave32 per AMD).

The wave-vote ballot intrinsics (`llvm.amdgcn.icmp` and
`llvm.amdgcn.ballot`) are overloaded by their return-type argument:
i64 on wave64 targets (CDNA1/2/3, gfx9xx) and i32 on wave32 targets
(RDNA, expected MI450). The previous lowering hard-coded i64, which
would either fail to lower or produce wrong code on a wave32 target.

Resolution at codegen time:

  unsigned get_amdgpu_wave_size(llvm::Function *F):
    1. If F has a `+wavefrontsize32` / `+wavefrontsize64` feature in
       its target-features attribute, honor that.
    2. Else fall back to the mcpu-based default: gfx9xx -> 64,
       gfx10xx+ -> 32.

Quadrants's AMDGPU codegen path currently sets target-features to ""
(see llvm_context_pass.h:88), so step 2 is what's actually used today.
The fallback list assumes `gfx9xx` is always wave64 -- true for every
shipped MI300/MI325/MI355 part. If MI450 ships under a `gfx9...` mcpu
prefix yet defaults to wave32, Quadrants's AMDGPU codegen path should
also be updated to set "+wavefrontsize32" explicitly so step 1 picks
it up; that's outside this PR's scope.

The mask type is then used uniformly:
  - llvm.amdgcn.icmp.{mask_ty}.i32(arg, 0, ICMP_NE)  ->  ballot
  - For OR: ballot != 0
  - For AND: ballot == amdgcn.ballot.{mask_ty}(true)  (active-lane mask)

No semantic change on wave64 hardware -- the i64 path is identical to
what landed in eaba867. The new branch is purely additive for
wave32.
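
The two-step resolution above can be modeled as plain decision logic (Python sketch; the real helper inspects the llvm::Function's target-features attribute and the subtarget mcpu):

```python
def get_amdgpu_wave_size(target_features: str, mcpu: str) -> int:
    """Model of the wave-size resolution: an explicit
    +wavefrontsize{32,64} feature wins; otherwise fall back
    to the mcpu-based default."""
    # Step 1: honor an explicit feature in target-features
    if "+wavefrontsize32" in target_features:
        return 32
    if "+wavefrontsize64" in target_features:
        return 64
    # Step 2: mcpu fallback -- gfx9xx defaults to wave64 (CDNA),
    # gfx10xx and later default to wave32 (RDNA)
    return 64 if mcpu.startswith("gfx9") else 32
```

With Quadrants currently passing an empty target-features string, every call today resolves through the mcpu fallback, which is why the gfx9xx assumption is called out above.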

Tests
  - All 5 AMDGPU subgroup tests pass on MI300X/wave64:
      test_subgroup_reduce_or_i32_single_wave
      test_subgroup_reduce_and_i32_single_wave
      test_subgroup_reduce_or_i32_multi_wave
      test_subgroup_barrier_partial_participation
      test_subgroup_barrier_in_uniform_path
  - All 5 Vulkan subgroup tests pass.
  - Targeted regression sweep (test_simt + test_basics +
    test_internal_func + test_shared_array_codegen_bug):
    41 passed, 21 skipped, 0 failed.
  - Wave32 path is correct-by-construction (symmetric to wave64) but
    untested locally -- no RDNA hardware in the bench setup.

lohiaj commented May 7, 2026

@dipto-amd thanks again for the wavesize-flexibility ask. Pushed 427919ea3 on top of the wave-barrier commit:

  • New get_amdgpu_wave_size(llvm::Function *F) helper: prefers the function's +wavefrontsize{32,64} feature in its target-features attribute, falls back to the mcpu-based default (gfx9xx -> 64, gfx10xx+ -> 32) when the attribute is empty.
  • subgroupOr_i32 / subgroupAnd_i32 lowering now picks the ballot mask type from that helper (i64Ty on wave64, i32Ty on wave32) and threads it through llvm.amdgcn.icmp.{mask_ty}.i32 and llvm.amdgcn.ballot.{mask_ty}. No behavior change on wave64.
  • subgroupBarrier doesn't need wavesize predication. llvm.amdgcn.wave.barrier is wave-scope by definition (no return type, discarded at codegen on every subtarget).

Forward-looking note: Quadrants's AMDGPU codegen path currently sets target-features="" (llvm_context_pass.h:88), so the mcpu fallback is what actually fires today. If MI450 ships under a gfx9... mcpu prefix but defaults to wave32, Quadrants's codegen path will need to set "+wavefrontsize32" explicitly so the feature-attribute branch picks it up. That'd be a separate small follow-up PR, out of scope here.

Validation

  • All 5 AMDGPU subgroup tests pass on MI300X (wave64): 3 reduce + 2 barrier (incl. test_subgroup_barrier_partial_participation which is the exact deadlock pattern from your review).
  • All 5 Vulkan subgroup tests pass.
  • Targeted regression sweep (test_simt + test_basics + test_internal_func + test_shared_array_codegen_bug): 41 passed, 21 skipped, 0 failed.
  • Wave32 path is correct-by-construction (symmetric to wave64) but I have no RDNA hardware locally to exercise it.

Ready for another look whenever you have a moment.


yaoliu13 commented May 7, 2026

/run-ci
