Add AccPhase attribute to pto.tmatmul / pto.tmatmul.acc#638
Add AccPhase attribute to pto.tmatmul / pto.tmatmul.acc#638zhangstevenunity wants to merge 1 commit intomainfrom
Conversation
The CCEC TMATMUL / TMATMUL_ACC entry points accept a templated `AccPhase` parameter (`Unspecified` / `Partial` / `Final`) that distinguishes K-loop final accumulation from intermediate iterations. The PTO IR previously had no way to carry this distinction, so the emit-C pattern always rendered the un-templated form and a downstream scheduler would have to guard each call with a switch over the phase flag at runtime. * Add PTO_AccPhaseEnum / PTO_AccPhaseAttr to PTOAttrs.td, mirroring the encoding of the existing PTO_STPhaseEnum (Unspecified=0, Partial=2, Final=3). * Attach a DefaultValuedAttr<PTO_AccPhaseAttr, ...Unspecified> $accPhase to TMatmulOp and TMatmulAccOp. Default = Unspecified keeps every existing producer's IR byte-identical. * PTOToEmitC: when the attribute is non-default, populate emitc.call_opaque's template_args with `AccPhase::Final` / `AccPhase::Partial` so the rendered call becomes `TMATMUL<AccPhase::Final>(...)` etc. * Add a lit test exercising all six (acc x phase) forms. Note: this only gives the IR the *expressive capacity*. No upstream pass yet writes the attribute, so today every `pto.tmatmul` still lowers to `TMATMUL(...)`. A follow-up K-loop scheduler change is needed to set Partial / Final on the relevant calls. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
Code Review
This pull request introduces an accumulation phase attribute (AccPhase) for TMatmulOp and TMatmulAccOp in the PTO dialect, enabling the lowering process to generate templated TMATMUL calls for different loop iterations. The changes include defining the new enum attribute, updating the operation definitions, and modifying the PTOToEmitC conversion patterns. A review comment suggests refactoring the accumulation phase handling in the lowering logic to use a switch statement for better maintainability and to ensure all enum cases are explicitly covered.
| if (phase == pto::AccPhase::Unspecified) | ||
| return ArrayAttr{}; | ||
| StringRef tmpl = phase == pto::AccPhase::Final ? "AccPhase::Final" | ||
| : "AccPhase::Partial"; | ||
| return rewriter.getArrayAttr( | ||
| {emitc::OpaqueAttr::get(rewriter.getContext(), tmpl)}); |
There was a problem hiding this comment.
For improved maintainability and robustness against future changes, consider using a switch statement instead of the current if-ternary structure. This approach will ensure that any new members added to the pto::AccPhase enum are explicitly handled, preventing potential silent miscompilations where a new phase might be incorrectly treated as Partial. Using llvm_unreachable in the default case is a standard practice in LLVM/MLIR for ensuring all enum cases are covered.
switch (phase) {
case pto::AccPhase::Unspecified:
return ArrayAttr{};
case pto::AccPhase::Partial:
return rewriter.getArrayAttr(
{emitc::OpaqueAttr::get(rewriter.getContext(), "AccPhase::Partial")});
case pto::AccPhase::Final:
return rewriter.getArrayAttr(
{emitc::OpaqueAttr::get(rewriter.getContext(), "AccPhase::Final")});
}
llvm_unreachable("unhandled pto::AccPhase");
Codex Review该评论由 review 机器人自动更新。
Summary发现 1 个阻断问题:PR 新增的 Findings
|
The CCEC TMATMUL / TMATMUL_ACC entry points accept a templated
AccPhaseparameter (Unspecified/Partial/Final) that distinguishes K-loop final accumulation from intermediate iterations. The PTO IR previously had no way to carry this distinction, so the emit-C pattern always rendered the un-templated form and a downstream scheduler would have to guard each call with a switch over the phase flag at runtime.AccPhase::Final/AccPhase::Partialso the rendered call becomesTMATMUL<AccPhase::Final>(...)etc.Note: this only gives the IR the expressive capacity. No upstream pass yet writes the attribute, so today every
pto.tmatmulstill lowers toTMATMUL(...). A follow-up K-loop scheduler change is needed to set Partial / Final on the relevant calls.