Skip to content

Cooperative matrix support#316

Merged
kvark merged 4 commits intomainfrom
claude/revive-cooperative-matrix-ops-2BQyi
Mar 20, 2026
Merged

Cooperative matrix support#316
kvark merged 4 commits intomainfrom
claude/revive-cooperative-matrix-ops-2BQyi

Conversation

@kvark
Copy link
Owner

@kvark kvark commented Mar 20, 2026

Summary

Add cooperative matrix operations support to blade, enabling efficient GPU matrix multiplication via VK_KHR_cooperative_matrix on Vulkan and simdgroup_matrix on Metal.

Changes

  • Cooperative matrix support: Auto-detect VK_KHR_cooperative_matrix during Vulkan device initialization and enable the feature when available. On Metal, the capability is reported as always available (Apple GPUs support simdgroup matrix ops).
  • New matmul example: End-to-end matrix multiplication example using cooperative matrix intrinsics in WGSL, with a CPU reference check and graceful fallback to a naive kernel when cooperative matrix is not supported by the hardware.
  • Naga upgrade: Bump naga from git-pinned v28 to published v29 (<crates.io>) for cooperative matrix WGSL support.
  • ContextDesc::device_id changed to Option<u32>: Previously 0 meant auto-select, preventing users from choosing a device with actual ID 0. Now None = auto, Some(id) = explicit selection.

Test plan

  • cargo check passes
  • matmul example runs on lavapipe (falls back to naive kernel when cooperative matrix is unsupported)
  • matmul example runs on hardware with VK_KHR_cooperative_matrix support
  • Verify Metal builds

@kvark kvark enabled auto-merge (rebase) March 20, 2026 23:00
claude added 4 commits March 20, 2026 23:09
Wire up VK_KHR_cooperative_matrix through the full blade stack:
- Add `cooperative_matrix` flag to ContextDesc and Capabilities
- Query and enable the Vulkan extension and features (including
  the required VulkanMemoryModel) during device initialization
- Enable naga's COOPERATIVE_MATRIX validation capability
- The SPIR-V backend automatically emits the required capabilities
  and extensions (SPV_KHR_cooperative_matrix, SPV_KHR_vulkan_memory_model)

Naga already has full IR support for cooperative matrix types
(CooperativeMatrix, CooperativeLoad, CooperativeMultiplyAdd,
CooperativeStore) and WGSL front-end parsing. This change makes
blade able to create shaders that use these operations.

https://claude.ai/code/session_01UgNYSynoC6pHF9eaFURzKV
- Remove cooperative_matrix from ContextDesc; detect it
  automatically like binding_array and dual_source_blending
- Metal: detect cooperative matrix via Apple7/Mac2/Metal3 GPU
  families (simdgroup_matrix requires MSL 2.3+, blade uses 2.4)
- Add cooperative-matmul example: headless 64x64 f32 matrix
  multiply using 8x8 cooperative tiles, verified against CPU
- Add changelog entry

https://claude.ai/code/session_01UgNYSynoC6pHF9eaFURzKV
naga 29.0.0 was published to crates.io and includes cooperative matrix
support that was previously only available via the git pin.

https://claude.ai/code/session_01UgNYSynoC6pHF9eaFURzKV
Previously device_id == 0 meant "auto-select", which prevented users
from selecting an actual device with ID 0. Using Option<None> for auto
and Option<Some(id)> for explicit selection resolves this ambiguity.

https://claude.ai/code/session_01UgNYSynoC6pHF9eaFURzKV
@kvark kvark force-pushed the claude/revive-cooperative-matrix-ops-2BQyi branch from c96e694 to 94f76cd Compare March 20, 2026 23:09
@kvark kvark merged commit 69e0c54 into main Mar 20, 2026
12 checks passed
@kvark kvark deleted the claude/revive-cooperative-matrix-ops-2BQyi branch March 20, 2026 23:17
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants