Fused dLN + add in backwards pass #3384
Merged: asolergi-nv merged 32 commits into NVIDIA:main from CarlosGomes98:cgomes/ds_fuse_dLN_add on Mar 9, 2026
Commits
2593e53 add fused te fused layernorm (CarlosGomes98)
d74e006 revert changes to normalization parameter, add fusion flag instead (CarlosGomes98)
6c1364f Refactor TEFusedResidualRMSNorm properly wrapping it for compatibilit… (CarlosGomes98)
57596cf add more spots where tuple outputs break mcore (CarlosGomes98)
ba5c4a5 remove excessive comments (CarlosGomes98)
20393df add quantization (CarlosGomes98)
5d1c460 add rmsnorm residual fusion test (CarlosGomes98)
c58a797 fix tests (CarlosGomes98)
5d2bddc dont use residual_add when not necessary (CarlosGomes98)
6779413 Remove quantization for now (CarlosGomes98)
d490afe formatting changes (CarlosGomes98)
7dac784 add check tuple has len 2 to pre_mlp_layernorm (CarlosGomes98)
678b8e9 fix formatting (CarlosGomes98)
f366acb Add checks for tuple length in MultiTokenPredictionLayer and Transfor… (CarlosGomes98)
bc1cf5b Revert changes to attention.py (CarlosGomes98)
934d02d remove unnecessary unpacking (CarlosGomes98)
2d9ba48 guard has_residual behind TENorm check (CarlosGomes98)
818e0e7 autoformat (CarlosGomes98)
d1f59cc add missing copyright header (CarlosGomes98)
a1ad51e remove quantize arg from test (CarlosGomes98)
c5c9b25 add arg to golden_dict (CarlosGomes98)
e89bb23 compact TENorm (CarlosGomes98)
650355f format (CarlosGomes98)
1866618 leverage spec to simplify build of has_residual layernorm (CarlosGomes98)
bc0cfe7 format (CarlosGomes98)
fac5e6a fix rebase (CarlosGomes98)
4766110 fix issue with build_module used with layernorm (CarlosGomes98)
3fcb878 fix docs error (CarlosGomes98)
71bd099 Update megatron/core/transformer/transformer_config.py (ericharper)
c5325b9 Merge branch 'main' into cgomes/ds_fuse_dLN_add (ericharper)
bd7395a Update megatron/core/transformer/transformer_config.py (ericharper)
3d5501a Merge branch 'main' into cgomes/ds_fuse_dLN_add (Phlip79)