Add ROCM kernel skill #343

Open
01xjw wants to merge 8 commits into huggingface:main from 01xjw:add-rocm-kernels-skill

@01xjw 01xjw commented Mar 13, 2026

Add ROCm Triton kernels skill for MI355X/R9700

  • RMSNorm, RoPE 3D, GEGLU, AdaLN kernel patterns
  • Benchmark scripts (micro + e2e for LTX-Video)
  • HuggingFace Kernels integration example
  • Reference docs: optimization guides, templates, troubleshooting
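As context for the first kernel pattern listed above, here is a minimal plain-Python sketch of what RMSNorm computes for a single row. It is not the Triton kernel from this PR; the function name `rms_norm_reference` and the `eps` default are assumptions chosen for illustration. A reference like this is the kind of baseline a GPU kernel's output is typically compared against.

```python
import math


def rms_norm_reference(x, weight, eps=1e-6):
    """Reference RMSNorm for one row: y_i = x_i / sqrt(mean(x^2) + eps) * w_i.

    Hypothetical helper for illustration only; not part of the PR.
    """
    if len(x) != len(weight):
        raise ValueError("x and weight must have the same length")
    # Mean of squares over the normalized (last) dimension.
    mean_sq = sum(v * v for v in x) / len(x)
    # Reciprocal root-mean-square, with eps for numerical stability.
    inv_rms = 1.0 / math.sqrt(mean_sq + eps)
    # Scale each element and apply the learned per-channel weight.
    return [v * inv_rms * w for v, w in zip(x, weight)]


row = [1.0, 2.0, 3.0, 4.0]
out = rms_norm_reference(row, [1.0] * len(row))
```

With unit weights, the output row has a root-mean-square value very close to 1, which makes a convenient sanity check when validating a fused kernel against the reference.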

01xjw added 3 commits March 13, 2026 06:43
@01xjw 01xjw changed the title Add ROCM kernel skill [Draft]Add ROCM kernel skill Mar 13, 2026
@01xjw 01xjw marked this pull request as draft March 13, 2026 11:55
@01xjw 01xjw marked this pull request as ready for review March 16, 2026 07:56
@01xjw 01xjw changed the title [Draft]Add ROCM kernel skill Add ROCM kernel skill Mar 16, 2026
@sayakpaul
Member

Cc: @burtenshaw @danieldk

@burtenshaw
Contributor

Really cool PR! Could you please share a trace from a coding harness like Claude Code, Codex, or opencode?

Member

@sayakpaul sayakpaul left a comment

Thanks for this 🔥

I will let @burtenshaw do the final approval. Some comments:

@@ -0,0 +1,252 @@
# Diffusers Pipeline Integration Guide (ROCm)

Integrating custom Triton kernels into HuggingFace diffusers pipelines on AMD GPUs.
Member

Should we also list any dependencies?

Author

No problem, I will add the dependencies within 24 hours!

@Abdennacer-Badaoui
Member

This is really good 🔥 Thanks @01xjw

@01xjw
Author

01xjw commented Mar 25, 2026

Really cool PR! Could you please share a trace from a coding harness like Claude Code, Codex, or opencode?

Hi @burtenshaw, the results are shown in the blog PR. Could you help review it as well? Thanks!
huggingface/blog#3308

@01xjw
Author

01xjw commented Mar 25, 2026

Thanks for this 🔥

I will let @burtenshaw do the final approval. Some comments:

OK — I’ll add the ROCm kernel skills to this repo following the CLI skills docs.
I’ve already shared the results in the blog PR. Would you like the video or those results included in this PR as well? If so, I’ll attach them here.
huggingface/blog#3308

5 participants