[pull] main from NVIDIA:main by pull[bot] · Pull Request #600 · phu0ngng/TransformerEngine

pull · 2026-05-09T04:32:04Z

See Commits and Changes for more details.

Created by pull[bot] (v2.0.0-alpha.4)



* Remove internal PyTorch testing helper
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* Review suggestion from @greptile-apps
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>


* fix nvfp4 convert_and_update_tensor shape check
  Signed-off-by: 乙划 <zht108229@antgroup.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* add headers and check 2D shapes
  Signed-off-by: 乙划 <zht108229@antgroup.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Fix
  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
* add unittest and doctring
  Signed-off-by: 乙划 <zht108229@antgroup.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* [PyTorch] Fix NVFP4 shape check for N-D tensors in convert_and_update_tensor
  Introduce get_2d_dims() in common.h/cpp to flatten an N-D shape to 2D dims (flat_first, flat_last), replacing the ad-hoc compressShapeTo2D helper from the contributor PR. The helper takes NVTEShape as its core argument (stack-allocated) with a header-only vector<T> overload, and supports a transpose flag for the shape[1:] flattening direction.
  Use get_2d_dims in NVFP4Quantizer::convert_and_update_tensor to compare row-wise and column-wise shapes under 2D equivalence, fixing a false mismatch when the logical shape is 3D (columnwise data is always stored 2D). Also restructure the if-block to treat row-wise data as the ground truth when present.
  Fixes #2607
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* [PyTorch] Add test for updating N-D quantized tensors via copy_
  Replaces the NVFP4-only test_nvfp4_3d_shape_quantization in test_nvfp4_quantize_exact.py with a broader test_update_nd_tensor in TestQuantizedTensor that covers all quantization formats. The test constructs an N-D quantized tensor, updates it with copy_, and checks both shape preservation and numerical accuracy. The "nvfp4_2d" variant is appended to the parametrize list inline to cover both NVFP4 quantization modes without affecting the shared _quantization_list. Also adds "fp8_blockwise" to quantization_tols in utils.py.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* [PyTorch] Propagate get_2d_dims helper across C++ extensions
  Replace ad-hoc loops computing flat_first_dim/flat_last_dim and equivalent product(shape)/shape.back() patterns in quantizer.cpp, cast.cpp, gemm.cpp, normalization.cpp, swizzle.cpp, and transpose.cpp.
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* [PyTorch] Drop intermediate variable in cast.cpp get_2d_dims call
  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci
* Check that tensor shape is not too large
  Suggestion from @greptile-apps.
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
* [pre-commit.ci] auto fixes from pre-commit.com hooks
  for more information, see https://pre-commit.ci

---------

Signed-off-by: 乙划 <zht108229@antgroup.com>
Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: 乙划 <zht108229@antgroup.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Przemyslaw Tredak <ptrendx@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
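
For readers unfamiliar with the shape check described in the commit above, the idea can be sketched in Python. This is an illustration only, not the C++ helper from common.h/cpp: the Python function names, the exact meaning of the transpose flag (taken here as keeping shape[0] and flattening shape[1:]), the handling of an empty shape, and the orientation of the 2D column-wise buffer are all assumptions.

```python
from math import prod
from typing import Sequence, Tuple


def get_2d_dims(shape: Sequence[int], transpose: bool = False) -> Tuple[int, int]:
    """Flatten an N-D shape to (flat_first, flat_last).

    Assumed semantics (not confirmed against the real C++ helper):
      transpose=False -> (prod(shape[:-1]), shape[-1])
      transpose=True  -> (shape[0], prod(shape[1:]))
    An empty shape is treated as (1, 1) in this sketch.
    """
    dims = [int(d) for d in shape]
    if any(d < 0 for d in dims):
        raise ValueError(f"negative dimension in shape {tuple(dims)}")
    if not dims:
        return (1, 1)
    if transpose:
        return (dims[0], prod(dims[1:]))
    return (prod(dims[:-1]), dims[-1])


def shapes_match_2d(rowwise_shape: Sequence[int],
                    columnwise_shape: Sequence[int]) -> bool:
    """Compare two shapes after flattening both to 2D.

    Hypothetical helper: assumes both shapes are expressed in the same
    (non-transposed) orientation.
    """
    return get_2d_dims(rowwise_shape) == get_2d_dims(columnwise_shape)


# A 3D logical shape versus a buffer that is stored as 2D.
logical = (4, 8, 16)
stored_2d = (32, 16)

naive_equal = tuple(logical) == tuple(stored_2d)   # element-wise comparison
flattened_equal = shapes_match_2d(logical, stored_2d)
```

Under these assumptions, the element-wise comparison reports a mismatch for the 3D case even though both shapes describe the same 32 x 16 layout, while the flattened comparison accepts it. That is the kind of false mismatch the commit message says the fix addresses.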

timmoon10 and others added 2 commits May 8, 2026 17:16

pull Bot locked and limited conversation to collaborators May 9, 2026

pull Bot added the ⤵️ pull label May 9, 2026

pull Bot merged commit 0e28953 into phu0ngng:main May 9, 2026

pull Bot had a problem deploying to github-pages May 9, 2026 04:33 Failure

pull[bot] merged 2 commits into phu0ngng:main from NVIDIA:main
