Add FA PTO lit regression cases #609

Open
zhangstevenunity wants to merge 5 commits into main from codex/add-fa-lit-tests

@zhangstevenunity
Collaborator

No description provided.

@zhangstevenunity zhangstevenunity marked this pull request as ready for review April 30, 2026 07:32
@zhangstevenunity
Collaborator Author

/run a3 test/lit/pto/fa_perf.pto

@reedhecre

Received /run a3 test/lit/pto/fa_perf. The A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.

@reedhecre

A3 board test succeeded

  • Trigger: manual
  • Source commit: 10c08b6ca232
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_153506_manual_pr609.log
  • Results TSV: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_153506_manual_pr609.tsv
  • Manual command: /run a3 test/lit/pto/fa_perf
  • Triggered by: zhangstevenunity
  • Specified test case: test/lit/pto/fa_perf
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)

@HecreReed
Collaborator

/run a3 test/lit/pto/fa.pto

@reedhecre

Received /run a3 test/lit/pto/fa. The A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces new test files for the pto service. The review identified critical issues regarding incorrect pipe initialization parameters, specifically the dir_mask and tensor view shapes. Additionally, several opportunities for code cleanup were noted, such as removing redundant constant definitions and moving loop-invariant constants outside of loops.

Comment thread test/lit/pto/fa.pto Outdated
%qk_slot_desc = pto.make_tensor_view %21, shape = [%c128, %c256], strides = [%c256, %c1] : !pto.tensor_view<128x256xf32>
pto.aiv_initialize_pipe{id = 25, dir_mask = 1, slot_size = 131072} (gm_slot_tensor = %qk_slot_desc : !pto.tensor_view<128x256xf32>)
%pv_slot_desc = pto.make_tensor_view %22, shape = [%c128, %c128_0], strides = [%c128_0, %c1] : !pto.tensor_view<128x128xf32>
pto.aiv_initialize_pipe{id = 27, dir_mask = 1, slot_size = 65536} (gm_slot_tensor = %pv_slot_desc : !pto.tensor_view<128x128xf32>)
critical

The dir_mask for pipe 27 in vector_kernel appears to be incorrect. This kernel acts as a consumer for pipe 27 (as shown by tpop_from_aic), so the dir_mask should be 2 (consumer), not 1 (producer).

    pto.aiv_initialize_pipe{id = 27, dir_mask = 2, slot_size = 65536} (gm_slot_tensor = %pv_slot_desc : !pto.tensor_view<128x128xf32>)
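A check like the reviewer's can be done mechanically. The following is a minimal sketch, not part of ptoas: it assumes dir_mask = 1 means producer and dir_mask = 2 means consumer (as the comment above states), and it relies on the caller to say which pipe ids the kernel pops from. The function name wrong_dir_masks and the regular expression are hypothetical.

```python
import re

# Matches e.g. `pto.aiv_initialize_pipe{id = 27, dir_mask = 1, slot_size = 65536}`.
INIT_RE = re.compile(r"initialize_pipe\{id = (\d+), dir_mask = (\d+)")

# Assumed meaning of the mask values, taken from the review comment.
PRODUCER, CONSUMER = 1, 2


def wrong_dir_masks(ir_text, consumed_pipe_ids):
    """Return (pipe_id, found_mask, expected_mask) for each suspicious init."""
    problems = []
    for match in INIT_RE.finditer(ir_text):
        pipe_id, mask = int(match.group(1)), int(match.group(2))
        expected = CONSUMER if pipe_id in consumed_pipe_ids else PRODUCER
        if mask != expected:
            problems.append((pipe_id, mask, expected))
    return problems


snippet = """
pto.aiv_initialize_pipe{id = 25, dir_mask = 1, slot_size = 131072} (...)
pto.aiv_initialize_pipe{id = 27, dir_mask = 1, slot_size = 65536} (...)
"""
# The caller states that this kernel pops from pipe 27.
print(wrong_dir_masks(snippet, consumed_pipe_ids={27}))  # [(27, 1, 2)]
```

Whether a given pipe really is consumed by the kernel still has to come from reading the IR (for example the tpop_from_aic calls); the sketch only compares that claim with the declared mask.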

Comment thread test/lit/pto/fa_perf.pto
Comment on lines +304 to +305
%pv_slot_desc = pto.make_tensor_view %22, shape = [%c64, %c128_0], strides = [%c128_0, %c1] : !pto.tensor_view<64x128xf32>
pto.aiv_initialize_pipe{id = 27, dir_mask = 1, slot_size = 65536} (gm_slot_tensor = %pv_slot_desc : !pto.tensor_view<64x128xf32>)
critical

There appear to be two inconsistencies in the initialization of pipe 27:

  1. The dir_mask should be 2 (consumer), not 1, because this kernel consumes from pipe 27 (see tpop_from_aic calls).
  2. The gm_slot_tensor shape is 64x128xf32, which mismatches the producer's (cube_kernel) shape of 128x128xf32 for the same pipe. The global memory layout of a pipe slot should be consistent.
    %pv_slot_desc = pto.make_tensor_view %22, shape = [%c128, %c128_0], strides = [%c128_0, %c1] : !pto.tensor_view<128x128xf32>
    pto.aiv_initialize_pipe{id = 27, dir_mask = 2, slot_size = 65536} (gm_slot_tensor = %pv_slot_desc : !pto.tensor_view<128x128xf32>)
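The shape mismatch is easy to see with a little arithmetic. The sketch below assumes slot_size is a byte count; the thread does not say so, but the numbers quoted for pipes 25 and 27 are consistent with that reading. The helper tensor_view_bytes is hypothetical and handles only static shapes with f16 or f32 elements.

```python
import re

ELEMENT_BYTES = {"f16": 2, "f32": 4}


def tensor_view_bytes(type_str):
    """Bytes covered by a static type such as '!pto.tensor_view<128x128xf32>'."""
    inner = re.search(r"<([^>]+)>", type_str).group(1)
    *dims, elem = inner.split("x")
    size = ELEMENT_BYTES[elem]
    for dim in dims:
        size *= int(dim)
    return size


for ty in (
    "!pto.tensor_view<128x256xf32>",  # pipe 25, declared slot_size = 131072
    "!pto.tensor_view<128x128xf32>",  # pipe 27 as suggested, slot_size = 65536
    "!pto.tensor_view<64x128xf32>",   # pipe 27 as written in fa_perf.pto
):
    print(ty, tensor_view_bytes(ty))
```

Under that assumption the 64x128xf32 view covers 32768 bytes, half of the declared slot_size of 65536, while the 128x128xf32 view matches it exactly.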

Comment thread test/lit/pto/fa.pto Outdated
Comment on lines +100 to +103
%c256_13 = arith.constant 256 : index
%c0_14 = arith.constant 0 : index
%51 = arith.addi %c256_13, %c0_14 : index
%52 = pto.partition_view %41, offsets = [%c0, %51], sizes = [%c128_0, %c128_1] : !pto.tensor_view<?x?xf16> -> !pto.partition_tensor_view<128x128xf16>
medium

The constant %c256_13 is redefined inside the loop, and then used in a redundant addition with zero. You can simplify this by using the existing %c256 constant (defined at line 9) directly in the pto.partition_view operation.

      %52 = pto.partition_view %41, offsets = [%c0, %c256], sizes = [%c128_0, %c128_1] : !pto.tensor_view<?x?xf16> -> !pto.partition_tensor_view<128x128xf16>

Comment thread test/lit/pto/fa.pto Outdated
%42 = pto.make_tensor_view %arg1, shape = [%c2048, %c128_0], strides = [%c128_0, %c1] : !pto.tensor_view<?x?xf32>
scf.for %arg2 = %14 to %18 step %c1 {
%43 = arith.muli %arg2, %c128 : index
%c394752_i64 = arith.constant 394752 : i64
medium

The constant %c394752_i64 is loop-invariant and can be defined once outside the scf.for loop at line 297. This applies to other similar constant definitions inside this loop as well (e.g., lines 318, 342, 349, etc.).

Comment thread test/lit/pto/fa_perf.pto
Comment on lines +97 to +100
%c256_13 = arith.constant 256 : index
%c0_14 = arith.constant 0 : index
%51 = arith.addi %c256_13, %c0_14 : index
%52 = pto.partition_view %41, offsets = [%c0, %51], sizes = [%c128_0, %c128_1] : !pto.tensor_view<?x?xf16> -> !pto.partition_tensor_view<128x128xf16>
medium

The constant %c256_13 is redefined inside the loop, and then used in a redundant addition with zero. You can simplify this by using the existing %c256 constant (defined at line 9) directly in the pto.partition_view operation.

      %52 = pto.partition_view %41, offsets = [%c0, %c256], sizes = [%c128_0, %c128_1] : !pto.tensor_view<?x?xf16> -> !pto.partition_tensor_view<128x128xf16>

Comment thread test/lit/pto/fa_perf.pto
%42 = pto.make_tensor_view %arg1, shape = [%c2048, %c128_0], strides = [%c128_0, %c1] : !pto.tensor_view<?x?xf32>
scf.for %arg2 = %14 to %18 step %c1 {
%43 = arith.muli %arg2, %c128 : index
%c394752_i64 = arith.constant 394752 : i64
medium

The constant %c394752_i64 is loop-invariant and can be defined once outside the scf.for loop at line 340. This applies to other similar constant definitions inside this loop as well.

@reedhecre

A3 board test succeeded

  • Trigger: manual
  • Source commit: 10c08b6ca232
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_154005_manual_pr609.log
  • Results TSV: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_154005_manual_pr609.tsv
  • Manual command: /run a3 test/lit/pto/fa
  • Triggered by: HecreReed
  • Specified test case: test/lit/pto/fa
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)

@reedhecre

reedhecre commented Apr 30, 2026

Codex Review

This comment is updated automatically by the review bot.

  • PR: #609 Add FA PTO lit regression cases
  • Author: zhangstevenunity
  • Base/Head: main / codex/add-fa-lit-tests
  • Head SHA: 9f12bae0e2ba
  • Trigger: new commits pushed to the PR
  • Generated At: 2026-05-07T11:50:41Z
  • Previous Head SHA: 5100115c10e7
  • Status: completed

Summary

PR #609 adds FA lit cases, but they only assert successful compilation and do not actually guard the FA preload/split lowering behavior the PR is trying to regress.

Findings

  1. P2: The new FA lit cases are compile-smoke tests only and never assert the fixed lowering behavior (test/lit/pto/fa_perf.pto:1)

Both test/lit/pto/fa_perf.pto and test/lit/pto/fa_perf_smoke.pto run ptoas --enable-insert-sync and discard the output to /dev/null, so they only catch hard parse/assert failures. The branch history already had to fix FA-specific preload ordering and PV split-mode details, but those are lowering-contract bugs that can still produce compilable output. As written, a future regression that emits the wrong split axis / wrong pipe entry shape / missing inserted syncs will still pass these tests, so the new cases do not actually lock in the regression they were added to cover.
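To make the finding concrete: a test that locks in lowering behavior has to assert on the emitted text instead of discarding it. In lit suites this is normally done by piping the output into FileCheck. The Python below is only a sketch that mimics FileCheck's basic idea of ordered substring matching; the function name check_in_order is hypothetical and the sample strings are placeholders, not real ptoas output.

```python
def check_in_order(output, patterns):
    """Return the first pattern that is missing or out of order, else None."""
    position = 0
    for pattern in patterns:
        found = output.find(pattern, position)
        if found < 0:
            return pattern
        # Later patterns must appear after the end of this match.
        position = found + len(pattern)
    return None


# Placeholder text standing in for compiler output; not real ptoas output.
emitted = "init_pipe A\nsync X\nsplit axis=0\n"

print(check_in_order(emitted, ["init_pipe", "sync", "split"]))  # None
print(check_in_order(emitted, ["split", "sync"]))               # sync
```

A smoke test that writes to /dev/null corresponds to calling this with an empty pattern list: it passes for any output that compiles.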

@HecreReed
Collaborator

/run a3 ../lit/pto/fa.pto

@reedhecre

Received /run a3 ../lit/pto/fa. The A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.

@reedhecre

A3 board test succeeded

  • Trigger: manual
  • Source commit: 10c08b6ca232
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_155905_manual_pr609.log
  • Results TSV: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_155905_manual_pr609.tsv
  • Manual command: /run a3 ../lit/pto/fa
  • Triggered by: HecreReed
  • Specified test case: ../lit/pto/fa
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)

@HecreReed
Collaborator

/run a3 test/lit/pto/fa_perf.pto

@reedhecre

Received /run a3 test/lit/pto/fa_perf.pto. The A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.

@reedhecre

A3 board test failed

  • Trigger: manual
  • Source commit: 10c08b6ca232
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_162805_manual_pr609.log
  • Manual command: /run a3 test/lit/pto/fa_perf.pto
  • Triggered by: HecreReed
  • Direct PTO: test/lit/pto/fa_perf.pto
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)
  • Failed stage: generate-direct-pto-fa_perf / exit=1

Log tail

/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":664:18): error: unexpected 'addr' operand: only supported when --pto-level=level3
loc("/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":665:18): error: unexpected 'addr' operand: only supported when --pto-level=level3
loc("/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":689:21): error: unexpected 'addr' operand: only supported when --pto-level=level3
loc("/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":690:21): error: unexpected 'addr' operand: only supported when --pto-level=level3
loc("/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":691:21): error: unexpected 'addr' operand: only supported when --pto-level=level3
loc("/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":692:21): error: unexpected 'addr' operand: only supported when --pto-level=level3
loc("/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":693:21): error: unexpected 'addr' operand: only supported when --pto-level=level3
loc("/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":715:13): error: unexpected 'addr' operand: only supported when --pto-level=level3
loc("/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_162805_manual_pr609/repo/test/lit/pto/fa_perf.pto":723:13): error: unexpected 'addr' operand: only supported when --pto-level=level3
===== END STAGE generate-direct-pto-fa_perf rc=1 @ 2026-04-30 16:30:17 =====
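All nine diagnostics in this log tail carry the same message and differ only in source location. A small script can collapse such a log into one entry per message. This is a sketch with a hypothetical helper name (summarize) and a regular expression written against the loc("path":line:col): error: form seen above; the sample paths are shortened for readability.

```python
import re

# Matches diagnostics of the form: loc("path":LINE:COL): error: MESSAGE
DIAG_RE = re.compile(
    r'loc\("(?P<path>[^"]+)":(?P<line>\d+):(?P<col>\d+)\): error: (?P<msg>.+)'
)


def summarize(log_text):
    """Group error diagnostics by message and collect their source lines."""
    lines_by_msg = {}
    for match in DIAG_RE.finditer(log_text):
        message = match.group("msg").strip()
        lines_by_msg.setdefault(message, []).append(int(match.group("line")))
    return lines_by_msg


# Two lines copied from the log above, with the path prefix shortened.
log = (
    'loc("repo/test/lit/pto/fa_perf.pto":689:21): error: unexpected \'addr\' '
    "operand: only supported when --pto-level=level3\n"
    'loc("repo/test/lit/pto/fa_perf.pto":715:13): error: unexpected \'addr\' '
    "operand: only supported when --pto-level=level3\n"
)

for message, lines in summarize(log).items():
    print(f"{len(lines)}x {message} (lines {lines})")
```

Seen this way, the failure is a single cause: the test uses 'addr' operands, which the compiler reports as supported only with --pto-level=level3.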

@HecreReed
Collaborator

/run a3 test/lit/pto/fa_perf.pto --pto-level=level3

@reedhecre

Received /run a3 test/lit/pto/fa_perf.pto --pto-level=level3. The A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.

@reedhecre

A3 board test failed

  • Trigger: manual
  • Source commit: 10c08b6ca232
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_163105_manual_pr609.log
  • Manual command: /run a3 test/lit/pto/fa_perf.pto --pto-level=level3
  • Triggered by: HecreReed
  • Direct PTO: test/lit/pto/fa_perf.pto
  • PTOAS arguments: --pto-level=level3
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)
  • Failed stage: board-validation / exit=1

Log tail

odel: posix
InstalledDir: /usr/local/Ascend/cann-8.5.0/bin
[2026-04-30 16:33:23] ASCEND_HOME_PATH=/usr/local/Ascend/cann-8.5.0
[2026-04-30 16:33:23] Detected A3 board from simulator dir fallback: /usr/local/Ascend/cann-8.5.0/aarch64-linux/simulator/Ascend910B1/lib
[2026-04-30 16:33:23] SIM_SOC_VERSION=Ascend910A
[2026-04-30 16:33:23] PTOAS_BOARD_IS_A3=1
[2026-04-30 16:33:23] === NPU Device Check ===
uid=1038(zhongxuan) gid=1038(zhongxuan) groups=1038(zhongxuan),10(wheel)
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  0 Apr 20 21:14 /dev/davinci0
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  1 Apr 20 21:14 /dev/davinci1
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 10 Apr 20 21:14 /dev/davinci10
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 11 Apr 20 21:14 /dev/davinci11
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 12 Apr 20 21:14 /dev/davinci12
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 13 Apr 20 21:14 /dev/davinci13
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 14 Apr 20 21:14 /dev/davinci14
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 15 Apr 20 21:14 /dev/davinci15
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  2 Apr 20 21:14 /dev/davinci2
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  3 Apr 20 21:14 /dev/davinci3
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  4 Apr 20 21:14 /dev/davinci4
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  5 Apr 20 21:14 /dev/davinci5
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  6 Apr 20 21:14 /dev/davinci6
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  7 Apr 20 21:14 /dev/davinci7
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  8 Apr 20 21:14 /dev/davinci8
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  9 Apr 20 21:14 /dev/davinci9
crw-rw---- 1 HwHiAiUser HwHiAiUser 510,  0 Apr 20 21:14 /dev/davinci_manager
[2026-04-30 16:33:23] ERROR: /dev/davinciauto not found
===== END STAGE board-validation rc=1 @ 2026-04-30 16:33:23 =====
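The final error suggests that the device id was the literal string "auto" and was appended to /dev/davinci without being resolved to a number. That is an inference from the log alone; the validation script is not shown in this thread. As a sketch of what resolving "auto" could look like, with a hypothetical function name and a lowest-index policy chosen only for illustration:

```python
import re


def resolve_device_node(device_id, dev_entries):
    """Map a device id (or 'auto') to a /dev/davinciN node from dev_entries."""
    indices = sorted(
        int(m.group(1))
        for entry in dev_entries
        if (m := re.fullmatch(r"/dev/davinci(\d+)", entry))
    )
    if not indices:
        raise FileNotFoundError("no /dev/davinciN nodes found")
    if device_id == "auto":
        # Policy assumption: pick the lowest-numbered device.
        return f"/dev/davinci{indices[0]}"
    if int(device_id) not in indices:
        raise FileNotFoundError(f"/dev/davinci{device_id} not found")
    return f"/dev/davinci{device_id}"


# A subset of the entries listed in the log above.
entries = ["/dev/davinci0", "/dev/davinci10", "/dev/davinci2", "/dev/davinci_manager"]
print(resolve_device_node("auto", entries))  # /dev/davinci0
print(resolve_device_node("10", entries))    # /dev/davinci10
```

In a real script the entry list could come from glob.glob("/dev/davinci*"); /dev/davinci_manager is skipped because it has no numeric suffix.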

@learning-chip
Contributor

learning-chip commented Apr 30, 2026

"A3 board test succeeded"

What is the driver program (the C++ main entry point) used to run this on-board test?

I tried launching the kernel with torch-npu, using ir_ref/launch_kernel.

But I got a runtime error (I am using the ptoas 0.36 release to generate the cpp):

Traceback (most recent call last):
  File "/workdir/pto-dsl/examples/aot/flash_attention/ir_ref/launch_kernel/./run.py", line 599, in <module>
    main()
  File "/workdir/pto-dsl/examples/aot/flash_attention/ir_ref/launch_kernel/./run.py", line 575, in main
    test_flash(default_lib, device, num_tiles=first_nt)
  File "/workdir/pto-dsl/examples/aot/flash_attention/ir_ref/launch_kernel/./run.py", line 377, in test_flash
    torch.npu.synchronize()
  File "/usr/local/python3.11.14/lib/python3.11/site-packages/torch_npu/npu/utils.py", line 72, in synchronize
    return torch_npu._C._npu_synchronize()
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: npuSynchronizeDevice:build/CMakeFiles/torch_npu.dir/compiler_depend.ts:575 NPU function error: AclrtSynchronizeDeviceWithTimeout, error code is 507015
[ERROR] 2026-04-30-08:36:53 (PID:14525, Device:0, RankID:-1) ERR00100 PTA call acl api failed
[Error]: The aicore execution is abnormal. 
        Rectify the fault based on the error information in the ascend log.
EZ9999: Inner Error!
EZ9999[PID: 14525] 2026-04-30-08:36:53.485.397 (EZ9999):  The error from device(chipId:0, dieId:0), serial number is 78, there is an exception of fftsplus aivector error, core id is 27, error code = 0, dump info: pc start: 0x1240000035a0, current: 0x124000002284, vec error info: 0x8600008021, mte error info: 0x1e06000084, ifu error info: 0x7fe30f8e00040, ccu error info: 0x40e0080004000097, cube error info: 0, biu error info: 0, aic error mask: 0x6500020bd00028c, para base: 0x12c100000080.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:347]
        TraceBack (most recent call last):
       The extend info: errcode:(0, 0x4000, 0) errorStr: CCU instruction address check error. fixp_error0 info: 0x6000084, fixp_error1 info: 0x1e, fsmId:0, tslot:6, thread:0, ctxid:0, blk:0, sublk:0, subErrType:4.[FUNC:PrintCoreInfo][FILE:device_error_core_proc.cc][LINE:360]
       Kernel task happen error, retCode=0x26, [aicore exception].[FUNC:PreCheckTaskErr][FILE:davinci_kernel_task.cc][LINE:1493]
       rtDeviceSynchronizeWithTimeout execution failed, reason=aicore exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:61]
       wait for compute device to finish failed, runtime result = 507015.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]

[W430 08:36:53.748261061 compiler_depend.ts:595] Warning: NPU warning, error code is 507015[Error]: 
[Error]: The aicore execution is abnormal. 
        Rectify the fault based on the error information in the ascend log.
EE9999: Inner Error!
EE9999[PID: 14525] 2026-04-30-08:36:53.489.962 (EE9999):  rtDeviceSynchronizeWithTimeout execution failed, reason=aicore exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:61]
        TraceBack (most recent call last):
       wait for compute device to finish failed, runtime result = 507015.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
 (function npuSynchronizeUsedDevices)
[W430 08:36:53.749083207 compiler_depend.ts:577] Warning: NPU warning, error code is 507015[Error]: 
[Error]: The aicore execution is abnormal. 
        Rectify the fault based on the error information in the ascend log.
EE9999: Inner Error!
EE9999[PID: 14525] 2026-04-30-08:36:53.490.887 (EE9999):  rtDeviceSynchronizeWithTimeout execution failed, reason=aicore exception[FUNC:FuncErrorReason][FILE:error_message_manage.cc][LINE:61]
        TraceBack (most recent call last):
       wait for compute device to finish failed, runtime result = 507015.[FUNC:ReportCallError][FILE:log_inner.cpp][LINE:148]
 (function npuSynchronizeDevice)

In comparison, the manual C++ version in cpp_ref/split_pipe runs fine.

My test environment is the Dockerfile used by huawei-csl/pto-dsl#130.

@HecreReed
Collaborator

/run a3 test/lit/pto/fa_perf.pto --pto-level=level3

@reedhecre

Received /run a3 test/lit/pto/fa_perf.pto --pto-level=level3. The A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.

@HecreReed
Collaborator

[Quoted: learning-chip's comment above, including the full traceback]

It was not actually successful just now; the reported success was caused by a bug in the GitHub robot.

@reedhecre

A3 board test failed

  • Trigger: manual
  • Source commit: 8624ed577953
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_170205_manual_pr609.log
  • Manual command: /run a3 test/lit/pto/fa_perf.pto --pto-level=level3
  • Triggered by: HecreReed
  • Direct PTO: test/lit/pto/fa_perf.pto
  • PTOAS arguments: --pto-level=level3
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)
  • Failed stage: board-validation / exit=1

Log tail

odel: posix
InstalledDir: /usr/local/Ascend/cann-8.5.0/bin
[2026-04-30 17:04:25] ASCEND_HOME_PATH=/usr/local/Ascend/cann-8.5.0
[2026-04-30 17:04:25] Detected A3 board from simulator dir fallback: /usr/local/Ascend/cann-8.5.0/aarch64-linux/simulator/Ascend910B1/lib
[2026-04-30 17:04:25] SIM_SOC_VERSION=Ascend910A
[2026-04-30 17:04:25] PTOAS_BOARD_IS_A3=1
[2026-04-30 17:04:25] === NPU Device Check ===
uid=1038(zhongxuan) gid=1038(zhongxuan) groups=1038(zhongxuan),10(wheel)
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  0 Apr 20 21:14 /dev/davinci0
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  1 Apr 20 21:14 /dev/davinci1
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 10 Apr 20 21:14 /dev/davinci10
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 11 Apr 20 21:14 /dev/davinci11
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 12 Apr 20 21:14 /dev/davinci12
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 13 Apr 20 21:14 /dev/davinci13
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 14 Apr 20 21:14 /dev/davinci14
crw-rw---- 1 HwHiAiUser HwHiAiUser 509, 15 Apr 20 21:14 /dev/davinci15
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  2 Apr 20 21:14 /dev/davinci2
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  3 Apr 20 21:14 /dev/davinci3
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  4 Apr 20 21:14 /dev/davinci4
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  5 Apr 20 21:14 /dev/davinci5
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  6 Apr 20 21:14 /dev/davinci6
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  7 Apr 20 21:14 /dev/davinci7
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  8 Apr 20 21:14 /dev/davinci8
crw-rw---- 1 HwHiAiUser HwHiAiUser 509,  9 Apr 20 21:14 /dev/davinci9
crw-rw---- 1 HwHiAiUser HwHiAiUser 510,  0 Apr 20 21:14 /dev/davinci_manager
[2026-04-30 17:04:25] ERROR: /dev/davinciauto not found
===== END STAGE board-validation rc=1 @ 2026-04-30 17:04:25 =====

@HecreReed
Collaborator

/run a3 test/lit/pto/fa.pto --pto-level=level3

@reedhecre

Received /run a3 test/lit/pto/fa.pto --pto-level=level3. The A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.

@reedhecre

A3 board test failed

  • Trigger: manual
  • Source commit: 8624ed577953
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_171705_manual_pr609.log
  • Manual command: /run a3 test/lit/pto/fa.pto --pto-level=level3
  • Triggered by: HecreReed
  • Direct PTO: test/lit/pto/fa.pto
  • PTOAS arguments: --pto-level=level3
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)
  • Failed stage: board-validation / exit=127

Log tail

026-04-30 17:19:15 =====
direct PTO generated: test/lit/pto/fa.pto -> test/samples/ManualPto/fa-pto.cpp testcase=fa
pto-isa vendor cache hit: repo=https://gitcode.com/cann/pto-isa.git requested_commit=662d7f2a916d6bbde3109ce4a16ed5c28f5d900a actual_commit=662d7f2a916d6bbde3109ce4a16ed5c28f5d900a

===== STAGE board-validation @ 2026-04-30 17:19:16 =====
task-submit cwd=/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_171705_manual_pr609/payload
task-submit env-file=/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_171705_manual_pr609/board-validation.env
task-submit run-script:
set -euo pipefail
cd /home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_171705_manual_pr609/payload
export DEVICE_ID=${TASK_DEVICE:-auto}
bash ./test/npu_validation/scripts/run_remote_npu_validation.sh
task-submit wrapped-command: bash -lc "set -euo pipefail; cd /home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_171705_manual_pr609/payload; export DEVICE_ID=${TASK_DEVICE:-auto}; bash ./test/npu_validation/scripts/run_remote_npu_validation.sh"
task-submit submit-cmd: /usr/local/bin/task-submit --device auto --max-time 0 --env-file /home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_171705_manual_pr609/board-validation.env 'bash -lc "set -euo pipefail; cd /home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_171705_manual_pr609/payload; export DEVICE_ID=${TASK_DEVICE:-auto}; bash ./test/npu_validation/scripts/run_remote_npu_validation.sh"'
task_20260430_171916_34230069245
task-submit task-id: task_20260430_171916_34230069245
Waiting for task execution: task_20260430_171916_34230069245 (Ctrl+C to terminate the task)
/bin/bash: line 1: npu-lock: command not found
=== Task failed (exit=127) ===
task-submit wait rc=127
completed (exit=127)
===== END STAGE board-validation rc=127 @ 2026-04-30 17:19:17 =====
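Exit status 127 is what bash returns when a command cannot be found, which matches the "npu-lock: command not found" line. A preflight check before submitting the task would surface this earlier. The sketch below uses a hypothetical helper name (missing_commands); npu-lock is the only tool named in the log, and whether the validation script needs other tools is not shown there.

```python
import shutil


def missing_commands(required, which=shutil.which):
    """Return the commands from `required` that cannot be found on PATH."""
    return [name for name in required if which(name) is None]


# `npu-lock` is the tool named in the log above.
missing = missing_commands(["npu-lock"])
if missing:
    print("missing commands:", ", ".join(missing))
```

The which parameter exists only so the lookup can be replaced in tests; by default it uses shutil.which from the standard library.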

@HecreReed
Collaborator

/run a3 test/lit/pto/fa.pto --pto-level=level3

@reedhecre

Received /run a3 test/lit/pto/fa.pto --pto-level=level3. The A3 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.

@reedhecre

A3 board test failed

  • Trigger: manual
  • Source commit: 8624ed577953
  • Result summary: OK 0 / FAIL 1 / SKIP 0
  • Log: /home/zhongxuan/ptoas-board-monitor/runtime/logs/20260430_172606_manual_pr609.log
  • Manual command: /run a3 test/lit/pto/fa.pto --pto-level=level3
  • Triggered by: HecreReed
  • Direct PTO: test/lit/pto/fa.pto
  • PTOAS arguments: --pto-level=level3
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)
  • Failed stage: board-validation / exit=1

Failed cases

  • fa (run, exit=2)

@reedhecre

A3 board test failure details: PR #609

fa

stage=run info=exit=2

/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:201:5: error: use of undeclared identifier 'TALLOC'
    TALLOC<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v40, v75);
    ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:239:5: error: use of undeclared identifier 'TALLOC'
    TALLOC<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v40, v86);
    ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:279:7: error: use of undeclared identifier 'TALLOC'
      TALLOC<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v102);
      ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:317:7: error: use of undeclared identifier 'TALLOC'
      TALLOC<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v40, v113);
      ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:348:7: error: use of undeclared identifier 'TALLOC'
      TALLOC<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v123);
      ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:386:7: error: use of undeclared identifier 'TALLOC'
      TALLOC<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v40, v134);
      ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:419:5: error: use of undeclared identifier 'TALLOC'
    TALLOC<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v143);
    ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:443:5: error: use of undeclared identifier 'TALLOC'
    TALLOC<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v149);
    ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:576:5: error: no matching function for call to 'TFREE'
    TFREE<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v62);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:620:5: error: no matching function for call to 'TFREE'
    TFREE<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v70);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:635:5: error: no matching function for call to 'TFREE'
    TFREE<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v46, v79);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:675:5: error: no matching function for call to 'TFREE'
    TFREE<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v85);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:692:5: error: no matching function for call to 'TFREE'
    TFREE<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v46, v92);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:730:5: error: no matching function for call to 'TFREE'
    TFREE<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v98);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:750:7: error: no matching function for call to 'TFREE'
      TFREE<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v46, v105);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:788:7: error: no matching function for call to 'TFREE'
      TFREE<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v111);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:805:7: error: no matching function for call to 'TFREE'
      TFREE<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v46, v117);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:843:7: error: no matching function for call to 'TFREE'
      TFREE<TPipe<0, Direction::DIR_C2V, 131072, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 256>, pto::Stride<32768, 32768, 32768, 256, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v43, v123);
      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/npu_validation/ManualPto/fa/fa_kernel.cpp:863:5: error: no matching function for call to 'TFREE'
    TFREE<TPipe<2, Direction::DIR_C2V, 65536, 8, 8, true>, GlobalTensor<float, pto::Shape<1, 1, 1, 128, 128>, pto::Stride<16384, 16384, 16384, 128, 1>, pto::Layout::ND>, TileSplitAxis::TILE_NO_SPLIT>(v46, v129);
    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1728:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'Split'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/payload/pto-isa/include/pto/common/pto_instr.hpp:1736:22: note: candidate template ignored: invalid explicitly-specified argument for template parameter 'WaitEvents'
PTO_INST RecordEvent TFREE(Pipe &pipe, WaitEvents &... events)
                     ^
fatal error: too many errors emitted, stopping now [-ferror-limit=]
20 errors generated.
gmake[2]: *** [CMakeFiles/fa_kernel.dir/build.make:76: CMakeFiles/fa_kernel.dir/fa_kernel.cpp.o] Error 1
gmake[2]: *** Waiting for unfinished jobs....
gmake[1]: *** [CMakeFiles/Makefile2:85: CMakeFiles/fa_kernel.dir/all] Error 2
gmake: *** [Makefile:91: all] Error 2
[2026-04-30 17:28:23] ERROR: testcase failed (exit 2): fa
[2026-04-30 17:28:23] === SUMMARY ===
[2026-04-30 17:28:23] OK=0 FAIL=1 SKIP=0
[2026-04-30 17:28:23] RESULTS_TSV=/home/zhongxuan/ptoas-board-monitor/runtime/runs/20260430_172606_manual_pr609/remote_npu_validation_results.tsv
[npu-lock] Released the lock on device 4
=== Task failed (exit=1) ===
task-submit wait rc=1
completed (exit=1)
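
A note on the `TFREE` diagnostics above: both candidates in `pto_instr.hpp` are rejected because of the explicit template argument list, not because of the call arguments. The sketch below reproduces that failure mode in isolation. It assumes one overload takes `<Pipe, Split, WaitEvents...>` and the other `<Pipe, WaitEvents...>`, which is inferred only from the parameter names quoted in the compiler notes. `Axis`, `FakePipe`, `FakeTensor`, `Event`, and `Free` are hypothetical stand-ins, not pto-isa names.

```cpp
#include <cassert>

// Hypothetical stand-ins; none of these names come from pto-isa.
enum class Axis { NoSplit };
struct FakePipe {};
struct FakeTensor {};
struct Event {};

// Overload A: explicit arguments are <Pipe, Split>; wait events are deduced.
template <typename Pipe, Axis Split, typename... WaitEvents>
int Free(Pipe &, WaitEvents &...)
{
    return 1;
}

// Overload B: the only explicit argument is <Pipe>; wait events are deduced.
template <typename Pipe, typename... WaitEvents>
int Free(Pipe &, WaitEvents &...)
{
    return 2;
}

inline void DemoCalls()
{
    FakePipe pipe;
    Event ev;

    // Only overload A is viable: B would need a type where Axis::NoSplit appears.
    const int viaSplit = Free<FakePipe, Axis::NoSplit>(pipe, ev);
    // Only overload B is viable: A cannot deduce Split from the call arguments.
    const int viaPlain = Free<FakePipe>(pipe, ev);
    assert(viaSplit == 1);
    assert(viaPlain == 2);

    // The shape used in the failing log has a tensor type in second position:
    //   Free<FakePipe, FakeTensor, Axis::NoSplit>(pipe, ev);
    // Overload A rejects it (a type where the non-type 'Split' is expected) and
    // overload B rejects it (a value inside the type pack 'WaitEvents'), so the
    // call would not compile. It is left commented out for that reason.
}
```

If this reading is right, the generated `fa_kernel.cpp` passes an extra `GlobalTensor` template argument that the `pto-isa` header in this payload does not accept. Together with the undeclared `TALLOC`, that would suggest a mismatch between the code generator and the header version, rather than a problem in the test case itself.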

@zhangstevenunity
Collaborator Author

pto-isa-feature-subtile/tests/npu/a2a3/src/st/testcase/fa_ptoas_gm_pipe_smoke/main.cpp
/**
Copyright (c) 2026 Huawei Technologies Co., Ltd.
This program is free software, you can redistribute it and/or modify it under the terms and conditions of
CANN Open Software License Agreement Version 2.0 (the "License").
Please refer to the License for details. You may not use this file except in compliance with the License.
THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.
See LICENSE in the root of the software repository for the full text of the License.
*/

#include "acl/acl.h"
#include "runtime/rt.h"
#include "test_common.h"
#include <gtest/gtest.h>
#include <cstdint>
#include <string>
#include <vector>

using namespace std;
using namespace PtoTestCommon;

template <int32_t tilingKey>
void LaunchFaPtoasGmPipe(uint8_t *ffts, uint8_t *q, uint8_t *kt, uint8_t *v, uint8_t *pFifo, uint8_t *out,
                         uint8_t *qkFifo, uint8_t *pvFifo, void *stream);

class FaPtoasGmPipeTest : public testing::Test {
protected:
    void SetUp() override {}
    void TearDown() override {}
};

static std::string GetGoldenDir()
{
    const testing::TestInfo *testInfo = testing::UnitTest::GetInstance()->current_test_info();
    return "../" + std::string(testInfo->test_suite_name()) + "." + testInfo->name();
}

TEST_F(FaPtoasGmPipeTest, case_half_128x4096)
{
    constexpr uint32_t s0 = 128;
    constexpr uint32_t s1 = 4096;
    constexpr uint32_t head = 128;
    constexpr uint32_t blocks = 1;
    constexpr size_t blockPFifoHalf = 262144;
    constexpr size_t blockQkFifoFloats = 262144;
    constexpr size_t blockPvFifoFloats = 131072;

    size_t qSize = static_cast<size_t>(s0) * head * sizeof(aclFloat16);
    size_t ktSize = static_cast<size_t>(s1) * head * sizeof(aclFloat16);
    size_t vSize = static_cast<size_t>(s1) * head * sizeof(aclFloat16);
    size_t outElems = static_cast<size_t>(s0) * head;
    size_t outSize = outElems * sizeof(float);
    size_t pFifoSize = static_cast<size_t>(blocks) * blockPFifoHalf * sizeof(aclFloat16);
    size_t qkFifoSize = static_cast<size_t>(blocks) * blockQkFifoFloats * sizeof(float);
    size_t pvFifoSize = static_cast<size_t>(blocks) * blockPvFifoFloats * sizeof(float);

    aclInit(nullptr);
    aclrtSetDevice(0);

    aclrtStream stream;
    aclrtCreateStream(&stream);

    uint8_t *qHost = nullptr;
    uint8_t *ktHost = nullptr;
    uint8_t *vHost = nullptr;
    uint8_t *outHost = nullptr;
    aclrtMallocHost(reinterpret_cast<void **>(&qHost), qSize);
    aclrtMallocHost(reinterpret_cast<void **>(&ktHost), ktSize);
    aclrtMallocHost(reinterpret_cast<void **>(&vHost), vSize);
    aclrtMallocHost(reinterpret_cast<void **>(&outHost), outSize);

    uint8_t *qDevice = nullptr;
    uint8_t *ktDevice = nullptr;
    uint8_t *vDevice = nullptr;
    uint8_t *outDevice = nullptr;
    uint8_t *pFifoDevice = nullptr;
    uint8_t *qkFifoDevice = nullptr;
    uint8_t *pvFifoDevice = nullptr;
    aclrtMalloc(reinterpret_cast<void **>(&qDevice), qSize, ACL_MEM_MALLOC_HUGE_FIRST);
    aclrtMalloc(reinterpret_cast<void **>(&ktDevice), ktSize, ACL_MEM_MALLOC_HUGE_FIRST);
    aclrtMalloc(reinterpret_cast<void **>(&vDevice), vSize, ACL_MEM_MALLOC_HUGE_FIRST);
    aclrtMalloc(reinterpret_cast<void **>(&outDevice), outSize, ACL_MEM_MALLOC_HUGE_FIRST);
    aclrtMalloc(reinterpret_cast<void **>(&pFifoDevice), pFifoSize, ACL_MEM_MALLOC_HUGE_FIRST);
    aclrtMalloc(reinterpret_cast<void **>(&qkFifoDevice), qkFifoSize, ACL_MEM_MALLOC_HUGE_FIRST);
    aclrtMalloc(reinterpret_cast<void **>(&pvFifoDevice), pvFifoSize, ACL_MEM_MALLOC_HUGE_FIRST);

    ReadFile(GetGoldenDir() + "/q.bin", qSize, qHost, qSize);
    ReadFile(GetGoldenDir() + "/kt.bin", ktSize, ktHost, ktSize);
    ReadFile(GetGoldenDir() + "/v.bin", vSize, vHost, vSize);

    aclrtMemcpy(qDevice, qSize, qHost, qSize, ACL_MEMCPY_HOST_TO_DEVICE);
    aclrtMemcpy(ktDevice, ktSize, ktHost, ktSize, ACL_MEMCPY_HOST_TO_DEVICE);
    aclrtMemcpy(vDevice, vSize, vHost, vSize, ACL_MEMCPY_HOST_TO_DEVICE);
    aclrtMemset(outDevice, outSize, 0, outSize);
    aclrtMemset(pFifoDevice, pFifoSize, 0, pFifoSize);
    aclrtMemset(qkFifoDevice, qkFifoSize, 0, qkFifoSize);
    aclrtMemset(pvFifoDevice, pvFifoSize, 0, pvFifoSize);

    uint64_t ffts = 0;
    uint32_t fftsLen = 0;
    rtGetC2cCtrlAddr(&ffts, &fftsLen);

    LaunchFaPtoasGmPipe<1>(reinterpret_cast<uint8_t *>(ffts), qDevice, ktDevice, vDevice, pFifoDevice, outDevice,
                           qkFifoDevice, pvFifoDevice, stream);
    aclrtSynchronizeStream(stream);

    aclrtMemcpy(outHost, outSize, outDevice, outSize, ACL_MEMCPY_DEVICE_TO_HOST);
    WriteFile(GetGoldenDir() + "/output.bin", outHost, outSize);

    std::vector<float> golden(outElems);
    std::vector<float> actual(outElems);
    ReadFile(GetGoldenDir() + "/golden.bin", outSize, golden.data(), outSize);
    ReadFile(GetGoldenDir() + "/output.bin", outSize, actual.data(), outSize);

    bool ret = ResultCmp<float>(golden, actual, 0.01f);

    aclrtFree(qDevice);
    aclrtFree(ktDevice);
    aclrtFree(vDevice);
    aclrtFree(outDevice);
    aclrtFree(pFifoDevice);
    aclrtFree(qkFifoDevice);
    aclrtFree(pvFifoDevice);
    aclrtFreeHost(qHost);
    aclrtFreeHost(ktHost);
    aclrtFreeHost(vHost);
    aclrtFreeHost(outHost);
    aclrtDestroyStream(stream);
    aclrtResetDevice(0);
    aclFinalize();

    EXPECT_TRUE(ret);
}
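
The FIFO sizes in this test line up with the `TPipe` parameters printed in the failing build log earlier in the thread. The sketch below checks that arithmetic. It assumes, without confirmation from the pto-isa sources, that the third `TPipe` template argument is the slot size in bytes and the fourth is the slot count. `SlotBytes`, `FifoElems`, and `CheckFifoSizing` are hypothetical helper names, and the element width of `float` is hard-coded as 4 bytes.

```cpp
#include <cassert>
#include <cstddef>

// Bytes occupied by one FIFO slot holding a rows x cols tile.
constexpr std::size_t SlotBytes(std::size_t rows, std::size_t cols, std::size_t elemBytes)
{
    return rows * cols * elemBytes;
}

// Number of elements a FIFO needs to hold slotCount slots of slotBytes each.
constexpr std::size_t FifoElems(std::size_t slotBytes, std::size_t slotCount, std::size_t elemBytes)
{
    return slotBytes * slotCount / elemBytes;
}

inline void CheckFifoSizing()
{
    constexpr std::size_t kFloatBytes = 4; // assumed width of float

    // QK FIFO: 128 x 256 float tiles, as in TPipe<0, DIR_C2V, 131072, 8, ...>.
    assert(SlotBytes(128, 256, kFloatBytes) == 131072);
    assert(FifoElems(131072, 8, kFloatBytes) == 262144); // blockQkFifoFloats

    // PV FIFO: 128 x 128 float tiles, as in TPipe<2, DIR_C2V, 65536, 8, ...>.
    assert(SlotBytes(128, 128, kFloatBytes) == 65536);
    assert(FifoElems(65536, 8, kFloatBytes) == 131072); // blockPvFifoFloats
}
```

Under that assumption, `blockQkFifoFloats` and `blockPvFifoFloats` are exactly eight slots each, so the host-side allocations would have no slack beyond what the pipes address.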

@zhangstevenunity
Collaborator Author

pto-isa-feature-subtile/tests/npu/a2a3/src/st/testcase/fa_ptoas_gm_pipe_smoke/fa_ptoas_gm_pipe_smoke_kernel.cpp
/**
Copyright (c) 2026 Huawei Technologies Co., Ltd.
This program is free software, you can redistribute it and/or modify it under the terms and conditions of
CANN Open Software License Agreement Version 2.0 (the "License").
Please refer to the License for details. You may not use this file except in compliance with the License.
THIS SOFTWARE IS PROVIDED ON AN "AS IS" BASIS, WITHOUT WARRANTIES OF ANY KIND, EITHER EXPRESS OR IMPLIED,
INCLUDING BUT NOT LIMITED TO NON-INFRINGEMENT, MERCHANTABILITY, OR FITNESS FOR A PARTICULAR PURPOSE.
See LICENSE in the root of the software repository for the full text of the License.
*/

#include "fa_perf_smoke_c220.inc"

template <int32_t tilingKey>
void LaunchFaPtoasGmPipe(uint8_t *ffts, uint8_t *q, uint8_t *kt, uint8_t *v, uint8_t *pFifo, uint8_t *out,
                         uint8_t *qkFifo, uint8_t *pvFifo, void *stream)
{
    if constexpr (tilingKey == 1) {
        call_both<<<1, nullptr, stream>>>(
            reinterpret_cast<int64_t *>(ffts), reinterpret_cast<half *>(q), reinterpret_cast<half *>(kt),
            reinterpret_cast<half *>(v), reinterpret_cast<half *>(pFifo), reinterpret_cast<float *>(out),
            reinterpret_cast<float *>(qkFifo), reinterpret_cast<float *>(pvFifo));
    }
}

template void LaunchFaPtoasGmPipe<1>(uint8_t *ffts, uint8_t *q, uint8_t *kt, uint8_t *v, uint8_t *pFifo,
                                     uint8_t *out, uint8_t *qkFifo, uint8_t *pvFifo, void *stream);
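
The launcher above combines an `if constexpr` dispatch on `tilingKey` with an explicit instantiation for key 1. A host-only sketch of the same pattern, with the device launch replaced by a plain assignment, could look like this. `LaunchSketch` and `DemoLaunch` are hypothetical names, and C++17 is assumed for `if constexpr`.

```cpp
#include <cassert>
#include <cstdint>

// Host-only stand-in for the launcher: writes a marker instead of launching a kernel.
template <int32_t tilingKey>
void LaunchSketch(int *out)
{
    if constexpr (tilingKey == 1) {
        *out = 42; // stands in for the call_both<<<...>>> launch
    }
    // Any other key leaves the body empty, so the call does nothing.
}

// Explicit instantiation, so other translation units can link against key 1.
template void LaunchSketch<1>(int *out);

inline void DemoLaunch()
{
    int marker = 0;
    LaunchSketch<1>(&marker);
    assert(marker == 42);

    int untouched = 0;
    LaunchSketch<2>(&untouched); // implicitly instantiated here; body is empty
    assert(untouched == 0);
}
```

In a single translation unit like this sketch, an unsupported key instantiates to a silent no-op. In the split layout above, where the test file only sees the declaration, a key without an explicit instantiation would instead fail at link time.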

@HecreReed
Collaborator

/run a5

@reedhecre

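Received /run a5; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.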

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 1c38346a560b
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260507_192705_manual_pr609.log
  • Manual command: /run a5
  • Triggered by: HecreReed
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)
  • Failed stage: sample-build-and-test / exit=1

Log tail

erated: test_intercore_sync_a5_functional-pto.cpp
Sync(test_intercore_sync_a5_ptoisa_vec.py) OK   generated: test_intercore_sync_a5_ptoisa_vec-pto.cpp
Sync(test_intercore_sync_a5.py) OK   generated: test_intercore_sync_a5-pto.cpp
Sync(test_mem_inject_sync_basic.py) OK   generated: test_mem_inject_sync_basic-pto.cpp
Sync(test_set_wait_unified_api.py) OK   generated: test_set_wait_unified_api-pto.cpp
Sync(test_tmov_col_major_16x1_align_a5.pto) OK   generated: test_tmov_col_major_16x1_align_a5.cpp
Sync(test_tmov_col_major_16x1_align_a5.py) OK   generated: test_tmov_col_major_16x1_align_a5-pto.cpp
Sync(test_tmov_row_major_1x16_control_a5.pto) OK   generated: test_tmov_row_major_1x16_control_a5.cpp
Sync(test_tmov_row_major_1x16_control_a5.py) OK   generated: test_tmov_row_major_1x16_control_a5-pto.cpp
Sync(tmatmulk_autosync_a5.py) OK   generated: tmatmulk_autosync_a5-pto.cpp
TileSetGetValue(tile_getval_mat_invalid.py) XFAIL python failed as expected
TileSetGetValue(tileSetGetValue.py) OK   generated: tileSetGetValue-pto.cpp
TInsert(tinsert_fp.py) OK   generated: tinsert_fp-pto.cpp
TInsert(tinsert.py) OK   generated: tinsert-pto.cpp
TPrefetch(tprefetch.py) OK   generated: tprefetch-pto.cpp
Trans(trans.py) OK   generated: trans-pto.cpp
Trap(trap.py) OK   generated: trap-pto.cpp
TTri(ttri.py) OK   generated: ttri-pto.cpp
VectorAddition(vadd_pto_ir.py) OK   generated: vadd_pto_ir-pto.cpp
VectorAddition(vadd_validshape_hyper.py) OK   generated: vadd_validshape_hyper-pto.cpp
VectorAddition(vectorAddition.py) OK   generated: vectorAddition-pto.cpp
Xors(xors.py) OK   generated: xors-pto.cpp
Xor(xor.py)  OK   generated: xor-pto.cpp
-----------------------------
OK=217  FAIL=2  SKIP=21
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-05-07 19:31:33 =====

@HecreReed
Collaborator

/run a5

@reedhecre

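Received /run a5; the A5 board tester will process this request.

The page refreshes automatically, so you can check the current stage, queue status, and latest results directly.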

@reedhecre

A5 board test failed

  • Trigger: manual
  • Source commit: 1c38346a560b
  • Result summary: OK 0 / FAIL 0 / SKIP 0
  • Log: /root/ptoas-board-monitor-a5/logs/20260507_193405_manual_pr609.log
  • Manual command: /run a5
  • Triggered by: HecreReed
  • Trigger comment: Add FA PTO lit regression cases #609 (comment)
  • Failed stage: sample-build-and-test / exit=1

Log tail

erated: test_intercore_sync_a5_functional-pto.cpp
Sync(test_intercore_sync_a5_ptoisa_vec.py) OK   generated: test_intercore_sync_a5_ptoisa_vec-pto.cpp
Sync(test_intercore_sync_a5.py) OK   generated: test_intercore_sync_a5-pto.cpp
Sync(test_mem_inject_sync_basic.py) OK   generated: test_mem_inject_sync_basic-pto.cpp
Sync(test_set_wait_unified_api.py) OK   generated: test_set_wait_unified_api-pto.cpp
Sync(test_tmov_col_major_16x1_align_a5.pto) OK   generated: test_tmov_col_major_16x1_align_a5.cpp
Sync(test_tmov_col_major_16x1_align_a5.py) OK   generated: test_tmov_col_major_16x1_align_a5-pto.cpp
Sync(test_tmov_row_major_1x16_control_a5.pto) OK   generated: test_tmov_row_major_1x16_control_a5.cpp
Sync(test_tmov_row_major_1x16_control_a5.py) OK   generated: test_tmov_row_major_1x16_control_a5-pto.cpp
Sync(tmatmulk_autosync_a5.py) OK   generated: tmatmulk_autosync_a5-pto.cpp
TileSetGetValue(tile_getval_mat_invalid.py) XFAIL python failed as expected
TileSetGetValue(tileSetGetValue.py) OK   generated: tileSetGetValue-pto.cpp
TInsert(tinsert_fp.py) OK   generated: tinsert_fp-pto.cpp
TInsert(tinsert.py) OK   generated: tinsert-pto.cpp
TPrefetch(tprefetch.py) OK   generated: tprefetch-pto.cpp
Trans(trans.py) OK   generated: trans-pto.cpp
Trap(trap.py) OK   generated: trap-pto.cpp
TTri(ttri.py) OK   generated: ttri-pto.cpp
VectorAddition(vadd_pto_ir.py) OK   generated: vadd_pto_ir-pto.cpp
VectorAddition(vadd_validshape_hyper.py) OK   generated: vadd_validshape_hyper-pto.cpp
VectorAddition(vectorAddition.py) OK   generated: vectorAddition-pto.cpp
Xors(xors.py) OK   generated: xors-pto.cpp
Xor(xor.py)  OK   generated: xor-pto.cpp
-----------------------------
OK=217  FAIL=2  SKIP=21
=============================
===== END STAGE sample-build-and-test rc=1 @ 2026-05-07 19:38:33 =====


4 participants