[pull] main from NVIDIA:main by pull[bot] · Pull Request #601 · phu0ngng/TransformerEngine

pull · 2026-05-11T10:32:43Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)

Can you help keep this open source service alive? 💖 Please sponsor : )


* Fix bug in NVFP4 quantize test where we set scale instead of amax
  Refactor test tensor wrapper by removing recipe-specific logic whenever possible.
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Only get fp32 scale when tensor is expected to have fp32 scale
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Create dedicated class for managing GPU/CPU buffers
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Fix bugs in C++ test tensor infrastructure
  - Fix syntax error in switch case (:: -> :)
  - Fix double-underscore typo in variable name
  - Fix wrong buffer passed to set_amax_columnwise
  - Fix unique_ptr assignment from raw pointer (use reset())
  - Remove dead duplicate NVTE_MXFP8_1D_SCALING branch in get_scales()
  - Rename cpu_data -> cpu_buffer to match Buffer class API
  - Remove const from Tensor::to_cpu/from_cpu and their callers, since both methods write to the CPU buffer
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Debug compilation errors
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Remove type check when accessing raw pointers
  CPU and GPU types are inconsistent, so the type checks cause too many problems.
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Debug distributed C++ tests
  Also adopt review suggestions from @greptile-apps.
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Remove unused header
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
* Copy-paste error
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
* Use shared buffer for FP8 row-wise scale-inv and col-wise scale-inv
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Typo
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
* Debug merge conflicts with #2931
  Also do some cleanup and improve documentation.
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Address code review feedback
  - Restore amax buffer size assertion in compare_rowwise_amax
  - Remove set_tensor_amax alias in favor of set_amax
  - Extract fill_uniform_buffer helper to anonymous namespace, eliminating duplication in fill_uniform_{rowwise,columnwise}_scale_inv
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
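
For readers unfamiliar with the "use reset()" fix mentioned in the commit message, here is a minimal sketch of the general C++ pattern. `HostBuffer` and `resize` are hypothetical names chosen for illustration; they are not taken from the TransformerEngine test code, and the actual Buffer class may look quite different.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for a CPU-side test buffer. This is an
// illustration only, not the Buffer class from the commit.
struct HostBuffer {
  std::unique_ptr<unsigned char[]> data;
  std::size_t size = 0;

  // Reallocate the buffer to hold n zero-initialized bytes.
  // Writing `data = new unsigned char[n];` would not compile, because
  // std::unique_ptr has no assignment operator taking a raw pointer.
  // reset() takes ownership of the new allocation and frees the old one.
  void resize(std::size_t n) {
    if (n > 0) {
      data.reset(new unsigned char[n]());
    } else {
      data.reset();  // release the old allocation, hold nullptr
    }
    size = n;
  }
};
```

Calling `resize` repeatedly on the same object does not leak, since each `reset()` deletes the previously owned array before adopting the new one.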

pull Bot locked and limited conversation to collaborators May 11, 2026

pull Bot added the ⤵️ pull label May 11, 2026

pull Bot merged commit 25934ac into phu0ngng:main May 11, 2026
8 of 10 checks passed

pull Bot had a problem deploying to github-pages May 11, 2026 10:34 Failure

pull[bot] merged 1 commit into phu0ngng:main from NVIDIA:main
