[pull] main from NVIDIA:main #600
Merged
* Remove internal PyTorch testing helper

  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Review suggestion from @greptile-apps

  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
* fix nvfp4 convert_and_update_tensor shape check

  Signed-off-by: 乙划 <zht108229@antgroup.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* add headers and check 2D shapes

  Signed-off-by: 乙划 <zht108229@antgroup.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* Fix

  Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
  Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>

* add unittest and docstring

  Signed-off-by: 乙划 <zht108229@antgroup.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* [PyTorch] Fix NVFP4 shape check for N-D tensors in convert_and_update_tensor

  Introduce get_2d_dims() in common.h/cpp to flatten an N-D shape to 2D dims (flat_first, flat_last), replacing the ad-hoc compressShapeTo2D helper from the contributor PR. The helper takes NVTEShape as its core argument (stack-allocated) with a header-only vector<T> overload, and supports a transpose flag for the shape[1:] flattening direction.

  Use get_2d_dims in NVFP4Quantizer::convert_and_update_tensor to compare row-wise and column-wise shapes under 2D equivalence, fixing a false mismatch when the logical shape is 3D (columnwise data is always stored 2D). Also restructure the if-block to treat row-wise data as the ground truth when present.

  Fixes #2607

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* [PyTorch] Add test for updating N-D quantized tensors via copy_

  Replaces the NVFP4-only test_nvfp4_3d_shape_quantization in test_nvfp4_quantize_exact.py with a broader test_update_nd_tensor in TestQuantizedTensor that covers all quantization formats. The test constructs an N-D quantized tensor, updates it with copy_, and checks both shape preservation and numerical accuracy.

  The "nvfp4_2d" variant is appended to the parametrize list inline to cover both NVFP4 quantization modes without affecting the shared _quantization_list. Also adds "fp8_blockwise" to quantization_tols in utils.py.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* [PyTorch] Propagate get_2d_dims helper across C++ extensions

  Replace ad-hoc loops computing flat_first_dim/flat_last_dim and equivalent product(shape)/shape.back() patterns in quantizer.cpp, cast.cpp, gemm.cpp, normalization.cpp, swizzle.cpp, and transpose.cpp.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* [PyTorch] Drop intermediate variable in cast.cpp get_2d_dims call

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
  Signed-off-by: Tim Moon <tmoon@nvidia.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

* Check that tensor shape is not too large

  Suggestion from @greptile-apps.

  Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* [pre-commit.ci] auto fixes from pre-commit.com hooks

  for more information, see https://pre-commit.ci

---------

Signed-off-by: 乙划 <zht108229@antgroup.com>
Signed-off-by: Przemyslaw Tredak <ptrendx@gmail.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: 乙划 <zht108229@antgroup.com>
Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Przemyslaw Tredak <ptrendx@gmail.com>
Co-authored-by: greptile-apps[bot] <165735046+greptile-apps[bot]@users.noreply.github.com>
Co-authored-by: Tim Moon <tmoon@nvidia.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
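
For readers unfamiliar with the flattening described in the get_2d_dims commit, here is a rough Python sketch of the idea. It is not the C++ helper from common.h/cpp. The function names, the handling of an empty shape, and the reading of the transpose flag as "keep shape[0], flatten shape[1:]" are assumptions made for illustration. The comparison also assumes column-wise data is stored as the 2D transpose of the row-wise data, and it ignores any FP4 packing of the last dimension.

```python
from math import prod
from typing import Sequence, Tuple


def get_2d_dims(shape: Sequence[int], transpose: bool = False) -> Tuple[int, int]:
    """Flatten an N-D shape to (flat_first, flat_last).

    Illustrative stand-in only, not the C++ helper itself.
    transpose=False: flatten shape[:-1] and keep shape[-1].
    transpose=True:  keep shape[0] and flatten shape[1:] (assumed reading).
    """
    dims = list(shape)
    if not dims:
        # Assumption: treat a 0-D shape as a 1x1 matrix.
        return (1, 1)
    if transpose:
        return (dims[0], prod(dims[1:]))
    return (prod(dims[:-1]), dims[-1])


def shapes_match_2d(rowwise_shape: Sequence[int],
                    columnwise_shape: Sequence[int]) -> bool:
    """Compare row-wise and column-wise shapes under 2D equivalence.

    Assumes the column-wise shape is the transpose of the flattened
    row-wise shape, i.e. (flat_last, flat_first).
    """
    first, last = get_2d_dims(rowwise_shape)
    col_first, col_last = get_2d_dims(columnwise_shape, transpose=True)
    return (col_first, col_last) == (last, first)


# A 3D logical shape and its 2D column-wise storage no longer look mismatched.
print(get_2d_dims((2, 3, 4)))                  # (6, 4)
print(shapes_match_2d((2, 3, 4), (4, 6)))      # True
```

A direct element-by-element comparison of (2, 3, 4) against (4, 6) would report a mismatch, which is the kind of false positive the commit message describes for 3D logical shapes.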
See Commits and Changes for more details.
Created by pull[bot] (v2.0.0-alpha.4)