[pull] main from NVIDIA:main #601

Merged
pull[bot] merged 1 commit into phu0ngng:main from NVIDIA:main on May 11, 2026

pull[bot] commented on May 11, 2026

See Commits and Changes for more details.


Created by pull[bot] (v2.0.0-alpha.4)


* Fix bug in NVFP4 quantize test where we set scale instead of amax

Refactor test tensor wrapper by removing recipe-specific logic whenever possible.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Only get fp32 scale when tensor is expected to have fp32 scale

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Create dedicated class for managing GPU/CPU buffers

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Fix bugs in C++ test tensor infrastructure

- Fix syntax error in switch case (:: -> :)
- Fix double-underscore typo in variable name
- Fix wrong buffer passed to set_amax_columnwise
- Fix unique_ptr assignment from raw pointer (use reset())
- Remove dead duplicate NVTE_MXFP8_1D_SCALING branch in get_scales()
- Rename cpu_data -> cpu_buffer to match Buffer class API
- Remove const from Tensor::to_cpu/from_cpu and their callers,
  since both methods write to the CPU buffer

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug compilation errors

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove type check when accessing raw pointers

CPU and GPU types are inconsistent, so the type checks cause too many problems.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Debug distributed C++ tests

Also adopt review suggestions from @greptile-apps.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Remove unused header

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* Copy-paste error

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* Use shared buffer for FP8 row-wise scale-inv and col-wise scale-inv

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Typo

Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>

* Debug merge conflicts with #2931

Also do some cleanup and improve documentation.

Signed-off-by: Tim Moon <tmoon@nvidia.com>

* Address code review feedback

- Restore amax buffer size assertion in compare_rowwise_amax
- Remove set_tensor_amax alias in favor of set_amax
- Extract fill_uniform_buffer helper to anonymous namespace,
  eliminating duplication in fill_uniform_{rowwise,columnwise}_scale_inv

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Signed-off-by: Tim Moon <tmoon@nvidia.com>
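
The helper extraction described in the last item can be illustrated with a minimal, host-only sketch. Everything here is an assumption made for illustration: `SketchTensor`, the `std::vector<float>` storage, the value range, and the function signatures are hypothetical and are not taken from the actual Transformer Engine test code. Only the idea of routing both `fill_uniform_{rowwise,columnwise}_scale_inv` functions through one file-local helper comes from the commit message.

```cpp
#include <cassert>
#include <random>
#include <vector>

namespace {

// Hypothetical shared helper with internal linkage (anonymous namespace):
// fills a buffer with uniformly distributed values between lo and hi.
void fill_uniform_buffer(std::vector<float>& buf, std::mt19937& gen,
                         float lo, float hi) {
  assert(lo < hi);
  std::uniform_real_distribution<float> dist(lo, hi);
  for (float& v : buf) {
    v = dist(gen);
  }
}

}  // namespace

// Hypothetical stand-in for a test tensor with two scale-inverse buffers.
struct SketchTensor {
  std::vector<float> rowwise_scale_inv;
  std::vector<float> columnwise_scale_inv;
};

// Both public entry points delegate to the same helper, so the fill logic
// exists in exactly one place. The range 0.5 to 2.0 is an arbitrary choice.
void fill_uniform_rowwise_scale_inv(SketchTensor& t, std::mt19937& gen) {
  fill_uniform_buffer(t.rowwise_scale_inv, gen, 0.5f, 2.0f);
}

void fill_uniform_columnwise_scale_inv(SketchTensor& t, std::mt19937& gen) {
  fill_uniform_buffer(t.columnwise_scale_inv, gen, 0.5f, 2.0f);
}
```

Placing the helper in an anonymous namespace keeps it private to its translation unit, so it cannot collide with a same-named symbol elsewhere in the test binary.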

---------

Signed-off-by: Tim Moon <tmoon@nvidia.com>
Signed-off-by: Tim Moon <4406448+timmoon10@users.noreply.github.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
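
Two of the C++ fixes listed in the commit message, the `unique_ptr` assignment via `reset()` and the removal of `const` from `to_cpu`/`from_cpu`, can be sketched as follows. This is a host-only illustration under stated assumptions: `SketchBuffer` is a hypothetical class, and the "device" side is simulated with a `std::vector` so the example runs without CUDA. It is not the `Buffer` class introduced by the PR.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <memory>
#include <vector>

// Hypothetical stand-in for a GPU/CPU buffer pair used in tests.
class SketchBuffer {
 public:
  explicit SketchBuffer(std::size_t bytes) : size_(bytes), device_(bytes, 0) {
    // A std::unique_ptr cannot be assigned from a raw pointer:
    //   cpu_buffer_ = new unsigned char[bytes];  // does not compile
    // reset() takes ownership of the new allocation explicitly.
    cpu_buffer_.reset(new unsigned char[bytes]());
  }

  // Non-const: copies the (simulated) device data into the CPU buffer.
  void to_cpu() {
    assert(cpu_buffer_ != nullptr);
    std::memcpy(cpu_buffer_.get(), device_.data(), size_);
  }

  // Non-const: in this sketch it overwrites the simulated device storage.
  void from_cpu() {
    assert(cpu_buffer_ != nullptr);
    std::memcpy(device_.data(), cpu_buffer_.get(), size_);
  }

  unsigned char* cpu_buffer() { return cpu_buffer_.get(); }
  const std::vector<unsigned char>& device() const { return device_; }
  std::size_t size() const { return size_; }

 private:
  std::size_t size_;
  std::vector<unsigned char> device_;            // stands in for GPU memory
  std::unique_ptr<unsigned char[]> cpu_buffer_;  // owning host staging buffer
};
```

Declaring the copy methods non-const makes the mutation visible in the interface, which in turn forces callers that hold a `const` reference to be updated, as the commit message notes for the real code.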
pull[bot] locked and limited conversation to collaborators on May 11, 2026
pull[bot] added the ⤵️ pull label on May 11, 2026
pull[bot] merged commit 25934ac into phu0ngng:main on May 11, 2026
8 of 10 checks passed
pull[bot] had a problem deploying to github-pages on May 11, 2026 at 10:34 (Failure)