Merged
Wire up VK_KHR_cooperative_matrix through the full blade stack:

- Add `cooperative_matrix` flag to ContextDesc and Capabilities
- Query and enable the Vulkan extension and features (including the required VulkanMemoryModel) during device initialization
- Enable naga's COOPERATIVE_MATRIX validation capability
- The SPIR-V backend automatically emits the required capabilities and extensions (SPV_KHR_cooperative_matrix, SPV_KHR_vulkan_memory_model)

Naga already has full IR support for cooperative matrix types (CooperativeMatrix, CooperativeLoad, CooperativeMultiplyAdd, CooperativeStore) and WGSL front-end parsing. This change lets blade create shaders that use these operations.

https://claude.ai/code/session_01UgNYSynoC6pHF9eaFURzKV
- Remove cooperative_matrix from ContextDesc; detect it automatically like binding_array and dual_source_blending
- Metal: detect cooperative matrix via Apple7/Mac2/Metal3 GPU families (simdgroup_matrix requires MSL 2.3+, blade uses 2.4)
- Add cooperative-matmul example: headless 64x64 f32 matrix multiply using 8x8 cooperative tiles, verified against CPU
- Add changelog entry

https://claude.ai/code/session_01UgNYSynoC6pHF9eaFURzKV
naga 29.0.0 was published to crates.io and includes cooperative matrix support that was previously only available via the git pin.

https://claude.ai/code/session_01UgNYSynoC6pHF9eaFURzKV
Previously device_id == 0 meant "auto-select", which prevented users from selecting an actual device with ID 0. Making device_id an Option<u32>, with None for auto-select and Some(id) for explicit selection, resolves this ambiguity.

https://claude.ai/code/session_01UgNYSynoC6pHF9eaFURzKV
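To illustrate why `Option<u32>` removes the ambiguity, here is a minimal sketch of the selection logic. `DeviceInfo` and `select_device` are hypothetical names made up for this illustration and are not blade's actual types or functions; "auto-select" is assumed here to mean "take the first enumerated device", which may differ from what blade really does.

```rust
/// Hypothetical stand-in for an enumerated GPU; not blade's actual type.
#[derive(Debug, Clone, PartialEq)]
struct DeviceInfo {
    id: u32,
    name: &'static str,
}

/// Pick a device from the enumerated list.
/// `None` means auto-select (assumed here: the first device listed).
/// `Some(id)` means the caller wants exactly that ID, including ID 0.
fn select_device(devices: &[DeviceInfo], device_id: Option<u32>) -> Option<&DeviceInfo> {
    match device_id {
        None => devices.first(),
        Some(id) => devices.iter().find(|d| d.id == id),
    }
}
```

With the old `u32` field, a request for device 0 could not be told apart from "no preference". In this sketch `select_device(&devices, Some(0))` and `select_device(&devices, None)` are distinct requests and can return different devices.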
c96e694 to 94f76cd
Summary
Add cooperative matrix operations support to blade, enabling efficient GPU matrix multiplication via `VK_KHR_cooperative_matrix` on Vulkan and `simdgroup_matrix` on Metal.

Changes

- Query `VK_KHR_cooperative_matrix` during Vulkan device initialization and enable the feature when available. On Metal, the capability is reported as always available (Apple GPUs support simdgroup matrix ops).
- `matmul` example: End-to-end matrix multiplication example using cooperative matrix intrinsics in WGSL, with a CPU reference check and graceful fallback to a naive kernel when cooperative matrix is not supported by the hardware.
- `ContextDesc::device_id` changed to `Option<u32>`: Previously `0` meant auto-select, preventing users from choosing a device with actual ID 0. Now `None` = auto, `Some(id)` = explicit selection.

Test plan

- `cargo check` passes
- `matmul` example runs on lavapipe (falls back to naive kernel when cooperative matrix is unsupported)
- `matmul` example runs on hardware with `VK_KHR_cooperative_matrix` support
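For readers who want a feel for the CPU reference check mentioned above, the following is a rough CPU-only sketch, not the code from the example itself. It multiplies two 64x64 f32 matrices both naively and in 8x8 tiles (the sizes named in the commit message) and compares the results. All function names are hypothetical, and the tiled loop only mimics the blocking of a cooperative-matrix kernel on the CPU; it does not use any GPU intrinsics.

```rust
const N: usize = 64; // matrix dimension, taken from the commit message
const TILE: usize = 8; // tile size, taken from the commit message

/// Naive row-major N x N multiply, used as the reference result.
fn matmul_naive(a: &[f32], b: &[f32]) -> Vec<f32> {
    let mut c = vec![0.0f32; N * N];
    for i in 0..N {
        for j in 0..N {
            let mut acc = 0.0f32;
            for k in 0..N {
                acc += a[i * N + k] * b[k * N + j];
            }
            c[i * N + j] = acc;
        }
    }
    c
}

/// Same product computed tile by tile: each TILE x TILE block of C
/// accumulates the products of matching tiles of A and B.
fn matmul_tiled(a: &[f32], b: &[f32]) -> Vec<f32> {
    let mut c = vec![0.0f32; N * N];
    for ti in (0..N).step_by(TILE) {
        for tj in (0..N).step_by(TILE) {
            for tk in (0..N).step_by(TILE) {
                for i in ti..ti + TILE {
                    for j in tj..tj + TILE {
                        let mut acc = c[i * N + j];
                        for k in tk..tk + TILE {
                            acc += a[i * N + k] * b[k * N + j];
                        }
                        c[i * N + j] = acc;
                    }
                }
            }
        }
    }
    c
}

/// Largest element-wise absolute difference between two results.
fn max_abs_diff(x: &[f32], y: &[f32]) -> f32 {
    x.iter().zip(y).map(|(p, q)| (p - q).abs()).fold(0.0, f32::max)
}
```

A check in this style would compute both results from the same inputs and require `max_abs_diff` to stay below a small tolerance. A real GPU comparison generally needs a tolerance rather than exact equality, because the hardware may accumulate in a different order.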