Generate_GPU_ALPAKA in ROperator_Conv.hxx has a per-group im2col + GEMM loop that handles grouped convolution by partitioning input channels and weights into G groups. The code is present, but every Conv ONNX model in test/input_models/ has group=1, so this branch is never code-generated and the grouped path has never been tested.
The fix is to add ONNX models with group=2 and group=4, along with reference outputs and TEST_F cases. A combined batch>1 and group>1 model (e.g. batch=4, group=2) can also be added to validate that the outer batch loop and inner group loop nest correctly.
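
As a rough illustration of what the new tests would need to check, here is a minimal CPU sketch of a naive grouped convolution with the batch loop outside and the group loop inside. It is not the generated ALPAKA code and not part of the repository: the function name GroupedConvRef is hypothetical, and it assumes NCHW layout, stride 1, no padding, no dilation and no bias. Something along these lines could be used to cross-check the reference outputs for the new models.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical reference helper (not from ROperator_Conv.hxx).
// Assumptions: NCHW layout, stride 1, no padding, no dilation, no bias.
// x: [N, C, H, W], w: [M, C/G, kH, kW]  ->  y: [N, M, H-kH+1, W-kW+1]
std::vector<float> GroupedConvRef(const std::vector<float>& x,
                                  const std::vector<float>& w,
                                  std::size_t N, std::size_t C,
                                  std::size_t H, std::size_t W,
                                  std::size_t M, std::size_t kH,
                                  std::size_t kW, std::size_t G) {
    const std::size_t cPerG = C / G;  // input channels seen by each group
    const std::size_t mPerG = M / G;  // output channels produced by each group
    const std::size_t oH = H - kH + 1;
    const std::size_t oW = W - kW + 1;
    std::vector<float> y(N * M * oH * oW, 0.0f);

    for (std::size_t n = 0; n < N; ++n) {          // outer batch loop
        for (std::size_t g = 0; g < G; ++g) {      // inner group loop
            for (std::size_t mo = 0; mo < mPerG; ++mo) {
                const std::size_t m = g * mPerG + mo;  // global output channel
                for (std::size_t oh = 0; oh < oH; ++oh) {
                    for (std::size_t ow = 0; ow < oW; ++ow) {
                        float acc = 0.0f;
                        for (std::size_t ci = 0; ci < cPerG; ++ci) {
                            // Only channels belonging to group g are read.
                            const std::size_t c = g * cPerG + ci;
                            for (std::size_t kh = 0; kh < kH; ++kh) {
                                for (std::size_t kw = 0; kw < kW; ++kw) {
                                    acc += x[((n * C + c) * H + oh + kh) * W + ow + kw] *
                                           w[((m * cPerG + ci) * kH + kh) * kW + kw];
                                }
                            }
                        }
                        y[((n * M + m) * oH + oh) * oW + ow] = acc;
                    }
                }
            }
        }
    }
    return y;
}
```

With group=2, each output channel depends only on its own half of the input channels, so a wrong group offset in either the input or the weight indexing shows up as values leaking between channel halves. Running the same check with batch>1 catches offsets that are correct for the first sample only.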