`@noinline` `average_bulk_microphysics_tendencies` to reduce register pressure by petebachant · Pull Request #713 · CliMA/CloudMicrophysics.jl

petebachant · 2026-05-08T00:05:55Z

This kernel is now the hottest in prog EDMF 1M AMIP by a long shot, and this change produces a ~10% speedup (kernel analysis notebook). Disclaimer: the explanatory comments were written by Claude; I don't yet have a deep understanding of what's going on here!
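For readers unfamiliar with the technique named in the title, here is a minimal sketch of marking a function `@noinline` so that a hot loop calls it instead of having its body inlined at the call site. The function names and the arithmetic are hypothetical stand-ins, not the code changed in this PR, and the sketch does not by itself demonstrate any change in register usage.

```julia
# Hypothetical stand-in for an expensive per-point computation.
# `@noinline` asks the compiler to keep this as a real function call
# rather than pasting its body into every caller.
@noinline function expensive_tendency(q, T)
    return q * exp(-T / 300) + sqrt(abs(q)) * T
end

# Hypothetical hot loop that calls the non-inlined function once per element.
function hot_loop(qs, Ts)
    total = 0.0
    for i in eachindex(qs, Ts)
        total += expensive_tendency(qs[i], Ts[i])
    end
    return total
end

result = hot_loop([1.0, 4.0], [300.0, 0.0])
```

Whether this helps or hurts depends on the target: the call overhead may cost time on CPU, while on GPU a smaller kernel body may need fewer registers.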

codecov · 2026-05-08T00:19:20Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 92.02%. Comparing base (14ef3e6) to head (96c5bcf).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files

@@           Coverage Diff           @@
##             main     #713   +/-   ##
=======================================
  Coverage   92.02%   92.02%           
=======================================
  Files          54       54           
  Lines        2321     2321           
=======================================
  Hits         2136     2136           
  Misses        185      185

Components	Coverage Δ
src	`92.99% <100.00%> (ø)`
ext	`69.47% <ø> (ø)`

dennisYatunin

Claude's explanation is a bit suspicious given that the quadrature loop isn't being unrolled, but a 10% speedup sounds great! I'll think about how we can turn this into a simpler example for ClimaCore's compiler stress tests.

petebachant · 2026-05-11T21:02:23Z

Hypothesis: if Claude is correct, the issue comes from the quadrature loop being unrolled, so if we de-unroll it we may be able to get these benefits without the performance hit on CPU.

This might be possible by dropping the quadrature order from the type and moving it into a value: the field type can be an integer, and its value is the number of quadrature loop iterations.
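As a rough illustration of that idea, the sketch below contrasts a quadrature order carried as a type parameter with one stored as an `Int` field. All names here (`StaticQuadrature`, `DynamicQuadrature`, `n_points`, `quadrature_average`) are hypothetical and are not the CloudMicrophysics.jl or ClimaAtmos.jl API; the midpoint rule is only a stand-in for the real quadrature.

```julia
# Hypothetical types for illustration only.

# Order encoded in the type: N is a compile-time constant, so the compiler
# specializes on each N and may fully unroll loops over 1:N.
struct StaticQuadrature{N} end

# Order stored as a value: the field type is Int, and the loop trip count
# is not a type-level constant.
struct DynamicQuadrature
    n_points::Int
end

n_points(::StaticQuadrature{N}) where {N} = N
n_points(q::DynamicQuadrature) = q.n_points

# Midpoint-rule average of f over [0, 1], standing in for averaging
# tendencies over quadrature points.
function quadrature_average(f, quad)
    n = n_points(quad)
    total = 0.0
    for i in 1:n
        x = (i - 0.5) / n
        total += f(x)
    end
    return total / n
end

static_result = quadrature_average(x -> x^2, StaticQuadrature{4}())
dynamic_result = quadrature_average(x -> x^2, DynamicQuadrature(4))
```

Both calls compute the same average; the difference is only in what the compiler knows about the loop length, which is the property the hypothesis above depends on.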

trontrytel · 2026-05-20T21:21:47Z

Is this PR something that should be merged or closed?

petebachant · 2026-05-20T21:26:47Z

I opened CliMA/ClimaAtmos.jl#4503, which retains the GPU performance gains, moves the changes to Atmos, and avoids the 1.12 regression, but that one is a little uglier. Any preference from your end?

noinline functions to reduce register pressure

f631a4a

petebachant requested a review from dennisYatunin May 8, 2026 00:06

petebachant added this to Performance May 8, 2026

petebachant moved this to In review in Performance May 8, 2026

dennisYatunin approved these changes May 8, 2026

petebachant self-assigned this May 11, 2026

Move comments inside docstrings

96c5bcf

petebachant moved this from In review to In progress in Performance May 18, 2026

petebachant mentioned this pull request May 20, 2026

Switch inlining for quadrature point evaluation based on device CliMA/ClimaAtmos.jl#4503

Draft

1 task

petebachant wants to merge 2 commits into main from pb/perf

3 participants
