Skip to content

Add matrix-free action assembly (stiffness_action / mass_action)#242

Merged
lxmota merged 2 commits intomainfrom
feature/matrix-free-action
Mar 15, 2026
Merged

Add matrix-free action assembly (stiffness_action / mass_action)#242
lxmota merged 2 commits intomainfrom
feature/matrix-free-action

Conversation

@lxmota
Copy link
Contributor

@lxmota lxmota commented Mar 15, 2026

Summary

  • Adds assemble_matrix_free_action! — a new assembly path that accumulates K·v (or M·v) directly as a vector per element, without forming the full element stiffness/mass matrix
  • Adds abstract function stubs stiffness_action and mass_action to Physics.jl for physics implementations to dispatch on
  • Exports assemble_matrix_free_action!, stiffness_action, and mass_action

Motivation

The existing assemble_matrix_action! path accumulates a full SMatrix{24,24} element stiffness matrix (576 Float64 = 4.6 KB) per GPU thread before multiplying by v_el. This far exceeds the GPU VGPR budget (~256–512 registers), causing register spilling to global memory and ~570 ms/call on an RX 7600 for a 530k-DOF torsion problem in Carina.jl.

The new matrix-free path accumulates Kv_el = SVector{24} directly by calling func_action(physics, ..., v_el, ...) per quadrature point, keeping only ~42 doubles of intermediates. Benchmark results on RX 7600 (530k DOFs):

Operation Old (assemble_matrix_action!) New (assemble_matrix_free_action!) Speedup
Stiffness matvec ~570 ms ~64 ms ~9×
Mass matvec ~124 ms ~4 ms ~31×
Combined (K + c_M·M)·v ~68 ms ~10×

Interface

Physics implementations define:

FEC.stiffness_action(physics, interps, x_el, t, dt, u_el, u_el_old, v_el,
                     state_old_q, state_new_q, props_el)  SVector
FEC.mass_action(physics, interps, x_el, t, dt, u_el, u_el_old, v_el,
                state_old_q, state_new_q, props_el)  SVector

Assembly call:

FEC.assemble_matrix_free_action!(assembler, FEC.stiffness_action, Uu, Vu, p)

    New physics function interface for element-level matrix-vector products:
    - stiffness_action(physics, ..., v_el, ...) → SVector: computes K_q·v_el
      per quadrature point as G·A_v·(G'·v_el), avoiding the 24×24 K_q matrix
    - mass_action(physics, ..., v_el, ...) → SVector: computes M_q·v_el as
      JxW·ρ·(N_vec·v_el)·N_vec, avoiding the 24×24 M_q matrix

    New assembly path:
    - assemble_matrix_free_action!(assembler, func_action, Uu, Vu, p)
    - CPU implementation: _assemble_block_matrix_free_action!
    - GPU kernel: _assemble_block_matrix_free_action_kernel!
      Accumulates Kv_el (SVector{24}) per element instead of K_el (SMatrix{24,24}),
      eliminating the GPU register spilling bottleneck.
@lxmota lxmota requested a review from cmhamel March 15, 2026 20:27
@codecov
Copy link

codecov bot commented Mar 15, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 62.43%. Comparing base (4de17f2) to head (71365ed).
⚠️ Report is 3 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #242      +/-   ##
==========================================
+ Coverage   61.90%   62.43%   +0.53%     
==========================================
  Files          60       60              
  Lines        3344     3381      +37     
==========================================
+ Hits         2070     2111      +41     
+ Misses       1274     1270       -4     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

@lxmota lxmota merged commit 238fef9 into main Mar 15, 2026
14 of 18 checks passed
@lxmota lxmota deleted the feature/matrix-free-action branch March 15, 2026 20:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants