Skip to content

refactor: enforce tilebuf-only pipeline#612

Open
Zhendong404 wants to merge 1 commit intohw-native-sys:mainfrom
Zhendong404:squash-tilebuf-planmem
Open

refactor: enforce tilebuf-only pipeline#612
Zhendong404 wants to merge 1 commit intohw-native-sys:mainfrom
Zhendong404:squash-tilebuf-planmem

Conversation

@Zhendong404
Copy link
Copy Markdown
Contributor

No description provided.

Copy link
Copy Markdown

@gemini-code-assist gemini-code-assist Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request transitions the PTOAS compilation pipeline to a tilebuf-native model, replacing legacy memref dependencies with a unified tile_buf/view/ptr system. This involves introducing a semantic extraction layer and updating PlanMemory and InsertSync to consume these native types. Feedback identifies several critical issues: a unit mismatch between bits and bytes in address planning, incorrect stride calculations for column-major subviews in the sync translator, and the breakage of global workspace planning due to hardcoded local-memory logic. Additionally, the reviewer noted that scalar load/store operations are missing from liveness analysis and pointed out redundant logic for buffer alias extraction that should be unified.

Comment on lines 111 to +112
SmallVector<uint64_t> offsets = getAllocatedOffsets(
op, currentMemRefType, buffer2Offsets, fallbackNextOffset);
op, tileType, buffer2Offsets, fallbackNextOffset);
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

critical

There is a unit mismatch between PlanMemory and AllocToPointerCast. PlanMemory calculates offsets in bits (using semantics.constBits), while AllocToPointerCast's fallback logic and alignment constants use bytes. If a buffer is planned, its offset in buffer2Offsets will be in bits, but it is used here as a byte address, leading to incorrect hardware addresses. Ensure that offsets are consistently handled in bytes or bits throughout the pipeline, or perform the necessary conversion here.

Comment on lines +66 to 71
SmallVector<int64_t, 4> strides(shape.size(), 1);
for (int i = static_cast<int>(shape.size()) - 2; i >= 0; --i) {
if (shape[i + 1] == ShapedType::kDynamic)
return {-1, -1};
strides[i] = strides[i + 1] * shape[i + 1];
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

The stride calculation for pto.subview assumes a row-major layout. If the source tile has a column-major layout (specified in its TileBufType config), the calculated totalOffset will be incorrect, leading to wrong hazard detection in sync insertion. The calculation should check the blayout attribute of the source tile type and adjust the stride accumulation accordingly.

Comment on lines +852 to 861
auto memorySpaceAttr = getPlanningBufferSpaceAttr(operand);
if (isLocalMemPlan() && isLocalBuffer(memorySpaceAttr)) {
if (!memorySpaceAttr.has_value())
llvm::report_fatal_error("local buffer must have memory space");
return GetBufferInfo(op, operand,
memorySpaceAttr.value().getAddressSpace());
return GetBufferInfo(op, operand, memorySpaceAttr.value().getAddressSpace(),
out);
}
llvm_unreachable("buffer must has BufferInfo !");
op->emitError("expects local tile buffer result for PlanMemory");
return failure();
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

GenerateBufferInfo is currently hardcoded to only support local memory planning. If planMode is GLOBAL_WORKSPACE_PLAN, the function will emit an error and return failure for any buffer, effectively breaking global workspace planning. Additionally, getPlanningBufferSpaceAttr does not handle GM views, which are the primary targets for global workspace planning. The logic should be updated to handle non-local buffers when in global workspace planning mode and to correctly resolve the GM address space for views.

Comment on lines 435 to 437
} else if (auto tprintOp = dyn_cast<pto::TPrintOp>(op)) {
// TPrintOp only reads from buffer, similar to LoadOp
OpKillHandle(curOpInfo, live, op->getBlock());
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

high

pto::LoadScalarOp and pto::StoreScalarOp are not handled in RecursionIR. Since they read/write buffers, they should trigger OpKillHandle or UpdateOpGenInfo to ensure accurate liveness analysis. Skipping them may lead to incorrect memory planning if they are the last users of a buffer.

    } else if (auto tprintOp = dyn_cast<pto::TPrintOp>(op)) {
      // TPrintOp only reads from buffer, similar to LoadOp
      OpKillHandle(curOpInfo, live, op->getBlock());
    } else if (auto loadScalarOp = dyn_cast<pto::LoadScalarOp>(op)) {
      OpKillHandle(curOpInfo, live, op->getBlock());
    } else if (auto storeScalarOp = dyn_cast<pto::StoreScalarOp>(op)) {
      UpdateStoreOpInfo(curOpInfo, storeScalarOp.getPtr(), live);

Comment on lines +32 to +46
// Returns (alias_result, source) for planning aliases, including tile views.
std::optional<std::pair<Value, Value>> getBufferAliasInfo(Operation *op) {
if (auto genericAlias = getOperationAliasInfo(op))
return genericAlias;

if (auto bindOp = dyn_cast<pto::BindTileOp>(op))
return std::make_pair(bindOp.getResult(), bindOp.getSource());
if (auto subviewOp = dyn_cast<pto::SubViewOp>(op))
return std::make_pair(subviewOp.getResult(), subviewOp.getSource());
if (auto bitcastOp = dyn_cast<pto::BitcastOp>(op))
return std::make_pair(bitcastOp.getResult(), bitcastOp.getSrc());
if (auto treshapeOp = dyn_cast<pto::TReshapeOp>(op))
return std::make_pair(treshapeOp.getResult(), treshapeOp.getSrc());
return std::nullopt;
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

getBufferAliasInfo is redundant with getOperationAliasInfo defined in Utils.cpp. Both functions now handle the same set of PTO ops (BindTileOp, SubViewOp, BitcastOp, TReshapeOp). These should be unified to avoid duplicate logic and potential inconsistencies.

std::optional<std::pair<Value, Value>> getBufferAliasInfo(Operation *op) {
  return getOperationAliasInfo(op);
}

@reedhecre
Copy link
Copy Markdown

reedhecre commented May 1, 2026

Codex Review

该评论由 review 机器人自动更新。

  • PR: refactor: enforce tilebuf-only pipeline #612 refactor: enforce tilebuf-only pipeline
  • Author: Zhendong404
  • Base/Head: main / squash-tilebuf-planmem
  • Head SHA: 19bbcd3eca5f
  • Trigger: PR 有新提交
  • Generated At: 2026-05-06T02:13:36Z
  • Previous Head SHA: c3e1f7cffb2d
  • Status: completed

Summary

PR breaks existing memref-based IR/tests and miscomputes subview alias offsets for non-row-major tiles.

Findings

  1. P1 Memref-based IR is no longer accepted include/PTO/IR/PTOOps.td:37

This removes AnyMemRef from PTODpsType and the drivers also stop registering MemRefDialect. The repo still has a large lit corpus with memref<...> inputs (for example the _gss.pto companions under test/lit/pto/), so those files will now fail to parse/verify instead of round-tripping. That is a CI/regression blocker, not just a pipeline refactor.

  1. P2 Subview alias offsets now assume row-major layout lib/PTO/Transforms/InsertSync/PTOIRTranslator.cpp:66

getStaticOffsetAndSize() rebuilds strides from the logical shape and ignores the tile's actual layout/preserved stride. For pto.subview on col-major or non-compact tiles, the computed baseAddresses will be wrong, which feeds the sync/memory alias analysis and can suppress required waits or insert spurious ones.

@Zhendong404 Zhendong404 marked this pull request as ready for review May 1, 2026 14:22
@Zhendong404 Zhendong404 force-pushed the squash-tilebuf-planmem branch from 5daaa8e to c3e1f7c Compare May 1, 2026 14:23
@Zhendong404 Zhendong404 force-pushed the squash-tilebuf-planmem branch from c3e1f7c to 19bbcd3 Compare May 6, 2026 02:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants