
[AIROCMLIR-217] Replace MITuna with tuningRunner in CI#2310

Open
mirza-halilcevic wants to merge 41 commits into develop from tuning-runner-ci


@mirza-halilcevic mirza-halilcevic commented Mar 24, 2026

Motivation

tuningRunner.py is now more stable and useful and offers multi-GPU support, so it should replace MITuna in CI.

Technical Details

  • Call tuningRunner.py instead of tuna-script in Jenkins CI
  • Remove MITuna traces from the project
  • The per-verification timeout is now configurable and defaults to disabled; this is to avoid killing legitimately slow kernels, and we have a global CI timeout anyway.
  • Added NumaNodeLock (shared/exclusive) so CPU verification gets exclusive memory bandwidth on its NUMA node, preventing contention from concurrent compile threads
  • Log info-level message on successful config completion when the progress bar is disabled (CI mode), preventing false Jenkins activity timeouts
  • rocmlir-gen behavior change: fail instead of falling back to CPU verification when GPU verification is requested but unsupported
  • relDiff and RMS thresholds relaxed for CPU verification: some Attention configs were failing because they do not support GPU verification, and the thresholds were too tight given that the CPU reference is expected to produce a larger discrepancy in results. Attention was the op affected here, but other ops are bound to face similar issues under CPU verification, so the thresholds are increased across the board (relDiff = 0.0001; RMS = 0.15).
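
To illustrate the shared/exclusive idea behind NumaNodeLock, here is a minimal in-process sketch. It is not the implementation in this PR: the class name `SharedExclusiveLock`, its method names, and the use of `threading.Condition` are assumptions made for illustration, and the real lock may coordinate across processes and be keyed by NUMA node. In this sketch, compile threads would take the lock in shared mode and CPU verification would take it in exclusive mode.

```python
import threading
from contextlib import contextmanager


class SharedExclusiveLock:
    """Hypothetical stand-in for NumaNodeLock (single process, threads only)."""

    def __init__(self):
        self._cond = threading.Condition()
        self._shared_holders = 0      # threads currently holding shared access
        self._exclusive_held = False  # True while one thread holds exclusive access
        self._exclusive_waiting = 0   # pending exclusive requests

    @contextmanager
    def shared(self):
        """Many holders at once, e.g. concurrent compile threads."""
        with self._cond:
            # Yield to pending exclusive requests so verification is not starved.
            while self._exclusive_held or self._exclusive_waiting:
                self._cond.wait()
            self._shared_holders += 1
        try:
            yield
        finally:
            with self._cond:
                self._shared_holders -= 1
                self._cond.notify_all()

    @contextmanager
    def exclusive(self):
        """Single holder and no shared holders, e.g. CPU verification."""
        with self._cond:
            self._exclusive_waiting += 1
            while self._exclusive_held or self._shared_holders:
                self._cond.wait()
            self._exclusive_waiting -= 1
            self._exclusive_held = True
        try:
            yield
        finally:
            with self._cond:
                self._exclusive_held = False
                self._cond.notify_all()


# Assumed usage: one lock per NUMA node (the node ids here are placeholders).
locks = {node: SharedExclusiveLock() for node in (0, 1)}
```

A compile thread on node 0 would wrap its work in `with locks[0].shared():`, while CPU verification on the same node would use `with locks[0].exclusive():` and therefore wait until in-flight compiles on that node release the lock.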

Test Plan

CI passes.

Test Result

  • PR CI
  • Weekly CI

Submission Checklist


codecov Bot commented Mar 24, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop    #2310      +/-   ##
===========================================
- Coverage    79.50%   79.19%   -0.32%     
===========================================
  Files          100      123      +23     
  Lines        31016    40444    +9428     
  Branches      4819     6638    +1819     
===========================================
+ Hits         24659    32026    +7367     
- Misses        4245     5594    +1349     
- Partials      2112     2824     +712     
Flag Coverage Δ
gfx950 79.01% <ø> (?)
mfma ?
navi4x 79.14% <ø> (?)

Flags with carried forward coverage won't be shown.
see 102 files with indirect coverage changes


@dorde-antic dorde-antic left a comment

It passed "Tune selected MLIR configs…" using tuningRunner (and not the tuna script), but I would still recommend running this in a custom tuning-only weekly job before merging.

@mirza-halilcevic mirza-halilcevic marked this pull request as ready for review March 26, 2026 12:41
Comment thread mlir/utils/jenkins/Jenkinsfile
Comment thread mlir/utils/jenkins/Jenkinsfile Outdated
@justinrosner justinrosner mentioned this pull request Apr 6, 2026
2 tasks
Comment thread mlir/utils/jenkins/Jenkinsfile

4 participants