NVFUSER_DUMP=segmented_fusion prints transforms #5912
This is for multi-GPU debugging. Multi-GPU scheduling happens before segmentation, and the shardings are encoded as loop transforms.
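To illustrate why printing transforms matters here, the following is a toy model, not nvFuser code. It assumes that a sharding is expressed as an outer split of a logical axis by the device count, with the outer piece marked as device-parallel. The `IterDomain` class, the helper names, and the `"DIDx"` label are hypothetical stand-ins chosen for the sketch. A math-only printout would show just the logical axes, while a printout with transforms would also show the split.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IterDomain:
    """Toy stand-in for one tensor axis (hypothetical, not nvFuser's class)."""
    name: str
    extent: int
    parallel: str = "Serial"


def outer_split(axis, factor):
    """Split an axis into (outer, inner) where the outer extent equals `factor`."""
    if axis.extent % factor != 0:
        raise ValueError(f"extent {axis.extent} is not divisible by {factor}")
    outer = IterDomain(f"{axis.name}o", factor)
    inner = IterDomain(f"{axis.name}i", axis.extent // factor)
    return outer, inner


def shard(logical, axis_index, num_devices):
    """Return a loop domain in which one logical axis is sharded across devices.

    The logical domain is left untouched; only the returned loop domain
    carries the split, which is why a transform-aware printout is needed
    to see the sharding.
    """
    loop = list(logical)
    outer, inner = outer_split(loop[axis_index], num_devices)
    # Assumption: the device-parallel axis is labeled "DIDx" in this sketch.
    outer = IterDomain(outer.name, outer.extent, "DIDx")
    loop[axis_index:axis_index + 1] = [outer, inner]
    return loop


logical = [IterDomain("i0", 8), IterDomain("i1", 4)]
loop = shard(logical, axis_index=0, num_devices=2)
print("logical:", [(d.name, d.extent) for d in logical])
print("loop:   ", [(d.name, d.extent, d.parallel) for d in loop])
```

In this model the logical domain stays `[i0, i1]` regardless of sharding, so the sharding is only visible once the loop domain and the split that produced it are printed.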
!test
Review updated until commit d19ab85
PR Reviewer Guide
Here are some key observations to aid the review process:
🧪 No relevant tests
🔒 No security concerns identified
⚡ Recommended focus areas for review
Debug output change
The change from printMath() to print() in the SegmentedFusion::print() function is intended to show transforms for multi-GPU debugging. While this appears to be a straightforward debug improvement, the reviewer should verify that print() indeed provides the expected transform information and that the output format is appropriate for debugging multi-GPU scheduling scenarios.
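One way to do the verification suggested above is to run a fusion with the dump enabled and read the output. The sketch below only prepares the environment and launches a child process. It assumes that `NVFUSER_DUMP` accepts a comma-separated list of options and that `segmented_fusion` is the option name, as in the PR title. The script name `my_multigpu_test.py` in the usage note is hypothetical.

```python
import os
import subprocess
import sys

DUMP_VAR = "NVFUSER_DUMP"
DUMP_OPTION = "segmented_fusion"


def env_with_dump(base_env=None, option=DUMP_OPTION):
    """Return a copy of the environment with `option` added to NVFUSER_DUMP.

    Options already present are kept. Assumption: the variable holds a
    comma-separated list of dump options.
    """
    env = dict(os.environ if base_env is None else base_env)
    options = [o for o in env.get(DUMP_VAR, "").split(",") if o]
    if option not in options:
        options.append(option)
    env[DUMP_VAR] = ",".join(options)
    return env


def run_with_dump(script_path):
    """Run a Python script in a child process with the dump option enabled."""
    return subprocess.run(
        [sys.executable, script_path],
        env=env_with_dump(),
        capture_output=True,
        text=True,
        check=False,
    )
```

For example, `result = run_with_dump("my_multigpu_test.py")` followed by inspecting `result.stdout` and `result.stderr` lets a reviewer compare the dump before and after the change. The variable has to be set before the process starts, which is why the sketch launches a child process rather than editing `os.environ` mid-run.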
Greptile Overview
Confidence Score: 5/5
Sequence Diagram
sequenceDiagram
participant User
participant Runtime as FusionKernelRuntime
participant SF as SegmentedFusion
participant CF as CompleteFusion
participant Printer as IrTransformPrinter
User->>Runtime: Set NVFUSER_DUMP=segmented_fusion
Runtime->>Runtime: isDebugDumpEnabled(FusionSegments)?
Runtime->>SF: print()
SF->>SF: debug() << header
SF->>CF: completeFusion()->print()
CF->>CF: Print inputs & outputs
CF->>CF: Print kernel expressions
CF->>Printer: IrTransformPrinter.handle(this)
Note over Printer: NEW: Prints tensor transforms<br/>for multi-GPU debugging
Printer-->>CF: Transform details
CF-->>SF: Complete fusion output
SF->>SF: debug() << footer
SF->>SF: debug() << this (segmented info)
1 file reviewed, no comments
Priya2698 left a comment
LGTM otherwise.
!build
1 file reviewed, no comments