[Megatron-Bridge] Add support for Mamba SFT posttraining with Megatron-Bridge #587

Draft
clairesonglee wants to merge 13 commits into main from dev/clairlee/hybrid-posttraining

Conversation

@clairesonglee
Collaborator

No description provided.

@clairesonglee clairesonglee marked this pull request as draft March 9, 2026 17:34
kailashg26 and others added 10 commits April 24, 2026 21:09
…32B Configs for MI300X & MI355X (#556)

YF: Only SFT related config and Doc changes, bypassing unit CI tests

This PR introduces post-training documentation and updates Qwen3 32B
model configuration files to support AMD MI300X and MI355X accelerators.

---

- **Added `posttraining.md`**
  - New comprehensive guide for post-training workflows
  - Covers setup instructions, configuration details, and usage examples

- **Updated `docs/README.md`**
  - Added a new section referencing post-training documentation
  - Improved documentation organization and navigation

---

- **Updated Qwen3_32B model YAML configs**
  - Added/modified configurations optimized for:
    - MI300X
    - MI355X
  - Adjusted parameters for compatibility and stable execution

---

- Verified updated configs load and execute successfully on MI300X and
MI355X environments
- Confirmed documentation links and structure render correctly
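
A link check like the one described above could be scripted roughly as follows. This is a minimal sketch, not the procedure used in this PR: the helper name `find_broken_links` is hypothetical, and it assumes the docs use inline markdown links with relative paths.

```python
import re
from pathlib import Path

# Matches inline markdown links and captures the target: [text](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")


def find_broken_links(markdown_text: str, base_dir: Path) -> list[str]:
    """Return relative link targets that do not exist under base_dir.

    Hypothetical helper for illustration; external URLs and in-page
    anchors are skipped because only local files can be checked offline.
    """
    broken = []
    for target in LINK_RE.findall(markdown_text):
        if target.startswith(("http://", "https://", "mailto:", "#")):
            continue
        path_part = target.split("#", 1)[0]  # drop any trailing anchor
        if not (base_dir / path_part).exists():
            broken.append(target)
    return broken
```

Running it over `docs/README.md` with `base_dir` set to the `docs` directory would list any relative link, such as the one to `posttraining.md`, whose file is missing.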

---

- [x] Added `posttraining.md`
- [x] Updated `docs/README.md`
- [x] Modified Qwen3_32B YAML configs
- [x] Verified changes locally
Co-authored-by: Mingyu Yang <Mingyu.Yang@amd.com>
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Co-authored-by: Kailash Gogineni <gkailashnath1998@gmail.com>
Co-authored-by: HuangWei-95 <Wei.Huang4@amd.com>
Co-authored-by: HuangWei-95 <weihuan@amd.com>
Co-authored-by: Xiaoming-AMD <Xiaoming.Peng@amd.com>
Co-authored-by: WangLingxun <linxwang@amd.com>
@clairesonglee clairesonglee force-pushed the dev/clairlee/hybrid-posttraining branch from 4e0c462 to 41e0680 Compare April 24, 2026 21:27
@clairesonglee clairesonglee force-pushed the dev/clairlee/hybrid-posttraining branch from 41e0680 to 52e8298 Compare April 24, 2026 21:33
3 participants