Skip to content

Add AccPhase attribute to pto.tmatmul / pto.tmatmul.acc#638

Draft
zhangstevenunity wants to merge 1 commit intomainfrom
codex/tmatmul-acc-phase
Draft

Add AccPhase attribute to pto.tmatmul / pto.tmatmul.acc#638
zhangstevenunity wants to merge 1 commit intomainfrom
codex/tmatmul-acc-phase

Conversation

@zhangstevenunity
Copy link
Copy Markdown
Collaborator

The CCEC TMATMUL / TMATMUL_ACC entry points accept a templated AccPhase parameter (Unspecified / Partial / Final) that distinguishes K-loop final accumulation from intermediate iterations. The PTO IR previously had no way to carry this distinction, so the emit-C pattern always rendered the un-templated form and a downstream scheduler would have to guard each call with a switch over the phase flag at runtime.

  • Add PTO_AccPhaseEnum / PTO_AccPhaseAttr to PTOAttrs.td, mirroring the encoding of the existing PTO_STPhaseEnum (Unspecified=0, Partial=2, Final=3).
  • Attach a DefaultValuedAttr<PTO_AccPhaseAttr, ...Unspecified> $accPhase to TMatmulOp and TMatmulAccOp. Default = Unspecified keeps every existing producer's IR byte-identical.
  • PTOToEmitC: when the attribute is non-default, populate emitc.call_opaque's template_args with AccPhase::Final / AccPhase::Partial so the rendered call becomes TMATMUL<AccPhase::Final>(...) etc.
  • Add a lit test exercising all six (acc x phase) forms.

Note: this only gives the IR the expressive capacity. No upstream pass yet writes the attribute, so today every pto.tmatmul still lowers to TMATMUL(...). A follow-up K-loop scheduler change is needed to set Partial / Final on the relevant calls.

The CCEC TMATMUL / TMATMUL_ACC entry points accept a templated
`AccPhase` parameter (`Unspecified` / `Partial` / `Final`) that
distinguishes K-loop final accumulation from intermediate iterations.
The PTO IR previously had no way to carry this distinction, so the
emit-C pattern always rendered the un-templated form and a downstream
scheduler would have to guard each call with a switch over the phase
flag at runtime.

* Add PTO_AccPhaseEnum / PTO_AccPhaseAttr to PTOAttrs.td, mirroring
  the encoding of the existing PTO_STPhaseEnum (Unspecified=0,
  Partial=2, Final=3).
* Attach a DefaultValuedAttr<PTO_AccPhaseAttr, ...Unspecified> $accPhase
  to TMatmulOp and TMatmulAccOp. Default = Unspecified keeps every
  existing producer's IR byte-identical.
* PTOToEmitC: when the attribute is non-default, populate
  emitc.call_opaque's template_args with `AccPhase::Final` /
  `AccPhase::Partial` so the rendered call becomes
  `TMATMUL<AccPhase::Final>(...)` etc.
* Add a lit test exercising all six (acc x phase) forms.

Note: this only gives the IR the *expressive capacity*. No upstream
pass yet writes the attribute, so today every `pto.tmatmul` still
lowers to `TMATMUL(...)`. A follow-up K-loop scheduler change is
needed to set Partial / Final on the relevant calls.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces an accumulation phase attribute (AccPhase) for TMatmulOp and TMatmulAccOp in the PTO dialect, enabling the lowering process to generate templated TMATMUL calls for different loop iterations. The changes include defining the new enum attribute, updating the operation definitions, and modifying the PTOToEmitC conversion patterns. A review comment suggests refactoring the accumulation phase handling in the lowering logic to use a switch statement for better maintainability and to ensure all enum cases are explicitly covered.

Comment on lines +4196 to +4201
if (phase == pto::AccPhase::Unspecified)
return ArrayAttr{};
StringRef tmpl = phase == pto::AccPhase::Final ? "AccPhase::Final"
: "AccPhase::Partial";
return rewriter.getArrayAttr(
{emitc::OpaqueAttr::get(rewriter.getContext(), tmpl)});
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

For improved maintainability and robustness against future changes, consider using a switch statement instead of the current if-ternary structure. This approach will ensure that any new members added to the pto::AccPhase enum are explicitly handled, preventing potential silent miscompilations where a new phase might be incorrectly treated as Partial. Using llvm_unreachable in the default case is a standard practice in LLVM/MLIR for ensuring all enum cases are covered.

  switch (phase) {
  case pto::AccPhase::Unspecified:
    return ArrayAttr{};
  case pto::AccPhase::Partial:
    return rewriter.getArrayAttr(
        {emitc::OpaqueAttr::get(rewriter.getContext(), "AccPhase::Partial")});
  case pto::AccPhase::Final:
    return rewriter.getArrayAttr(
        {emitc::OpaqueAttr::get(rewriter.getContext(), "AccPhase::Final")});
  }
  llvm_unreachable("unhandled pto::AccPhase");

@reedhecre
Copy link
Copy Markdown

Codex Review

该评论由 review 机器人自动更新。

Summary

发现 1 个阻断问题:PR 新增的 accPhasepto-view-to-memref 重建 tmatmul/tmatmul.acc 时被丢弃,导致 EmitC 侧无法按 Partial/Final 生成模板调用,功能在标准流水线中会静默失效。

Findings

  1. P1 `accPhase` 属性在 `pto-view-to-memref` 中被丢弃,导致新加模板语义失效 lib/PTO/Transforms/PTOViewToMemref.cpp:1705

accPhase 被加到 pto.tmatmul/pto.tmatmul.acc 后,PTOViewToMemref 仍在 Stage 3 里用 replaceOpWithNewOp 仅按操作数重建这两个 op,没有复制原属性。结果是显式的 {accPhase = #pto<acc_phase partial/final>} 会在该 pass 后退化为默认 Unspecified。而本 PR 的 EmitC lowering (op.getAccPhase()) 正依赖该属性决定是否生成 TMATMUL<AccPhase::...> / TMATMUL_ACC<AccPhase::...>,因此在标准 ptoas pipeline 中新功能会被静默失效,并可能导致 K-loop 累加阶段语义错误(中间轮次/最终轮次不再区分)。

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants