[Blog] Muon Optimizer Support in DeepSpeed#7962

Open
delock wants to merge 17 commits into master from gma/muon_blog

Conversation

@delock
Collaborator

@delock delock commented Apr 8, 2026

Authors: @PKUWZP and @delock
Blog post introducing Muon optimizer support in DeepSpeed, covering how it integrates with
ZeRO Stage 2/3, measured convergence and memory results, and the roadmap ahead.

delock and others added 16 commits April 8, 2026 23:56
Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@intel.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Guokai Ma <guokai.ma@intel.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Replace placeholder claims with actual experiment results:
- Add learning-rate (lr) sweep results for both the AdamW and Muon optimizers
- Report measured GPU memory: AdamW 34.5 GiB vs. Muon 31.4 GiB (9% savings)
- Remove the old convergence chart (adamw_vs_muon_3b.png)
- Fix inaccurate claims (previously stated: Muon 19% better, Adam OOM on 2xA100)
- Add a hybrid optimizer explanation and docs for the separate lr config

Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
Signed-off-by: Ma, Guokai <guokai.ma@gmail.com>
@delock delock marked this pull request as ready for review April 10, 2026 13:04
@delock delock requested review from loadams and tjruwase as code owners April 10, 2026 13:04
@chatgpt-codex-connector

Codex usage limits have been reached for code reviews. Please check with the admins of this repo to increase the limits by adding credits.
Credits must be used to enable repository wide code reviews.

…g fixes

Signed-off-by: Guokai Ma <guokai.ma@intel.com>