Merged
4 changes: 3 additions & 1 deletion README.md
@@ -12,6 +12,8 @@

## 📣 News

- [04/12/2026] [**MiniMax-M2.5 / M2.7**](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/examples/models/minimax_m2) are now supported! Both models share the same architecture as MiniMax-M2 and work with the existing bridge out of the box — checkpoint conversion and inference verified on real FP8 checkpoints.

- [04/10/2026] [**Qwen3-ASR**](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/examples/models/audio_lm/qwen3_asr) is now supported! Checkpoint conversion and inference for [Qwen3's ASR model](https://github.com/QwenLM/Qwen3-ASR) are available on **main**.

- [04/09/2026] [**Bailing MoE V2**](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/examples/models/bailing) is now supported! Checkpoint conversion and inference for the Bailing MoE V2 model are available on **main**. Thank you to [@ccclyu](https://github.com/ccclyu) for the community contribution!
@@ -181,7 +183,7 @@ Megatron Bridge provides out-of-the-box bridges and training recipes for a wide
- [Mamba](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/mamba)
- [Ministral](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/ministral3) — [recipes (3B/8B/14B)](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/ministral3/ministral3.py)
- [Mistral](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/mistral)
- [MiniMax-M2](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/minimax_m2)
- [MiniMax-M2 / M2.5 / M2.7](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/minimax_m2)
- [Moonlight](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/deepseek) — [recipes (16B)](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/moonlight/moonlight_16b.py)
- [OlMoE](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/olmoe) — [recipes (7B)](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/olmoe/olmoe_7b.py)
- [Qwen2 / Qwen2.5](https://github.com/NVIDIA-NeMo/Megatron-Bridge/tree/main/src/megatron/bridge/models/qwen) — [recipes](https://github.com/NVIDIA-NeMo/Megatron-Bridge/blob/main/src/megatron/bridge/recipes/qwen/qwen2.py)
1 change: 1 addition & 0 deletions docs/models/llm/README.md
@@ -14,6 +14,7 @@ Megatron Bridge supports the following LLM families:
| **Gemma 3** | [gemma3.md](gemma3.md) | Google Gemma 3 models |
| **GLM-4.5** | [glm45.md](glm45.md) | GLM-4.5 model family |
| **GPT-OSS** | [gpt-oss.md](gpt-oss.md) | Open-source GPT-style models |
| **MiniMax-M2** | — | MiniMax-M2 / M2.5 / M2.7 (456B MoE, FP8) |
| **LLaMA 3** | [llama3.md](llama3.md) | Meta LLaMA 3 models |
| **LLaMA Nemotron** | [llama-nemotron.md](llama-nemotron.md) | NVIDIA LLaMA Nemotron models |
| **Mistral** | [mistral.md](mistral.md) | Mistral AI models |
2 changes: 2 additions & 0 deletions examples/models/minimax_m2/README.md
@@ -2,6 +2,8 @@

This directory contains example scripts for [MiniMax-M2](https://huggingface.co/MiniMaxAI/MiniMax-M2), a large sparse MoE model with 456B total parameters (45.9B active), 256 experts, and FP8 quantization.

> **M2.5 / M2.7 compatibility:** [MiniMax-M2.5](https://huggingface.co/MiniMaxAI/MiniMax-M2.5) and [MiniMax-M2.7](https://huggingface.co/MiniMaxAI/MiniMax-M2.7) share the same architecture (`MiniMaxM2ForCausalLM`) and work with the same bridge. Replace the model ID in the scripts below (e.g. `MiniMaxAI/MiniMax-M2.5`).

## Hardware Requirements

MiniMax-M2 requires **at least 2 nodes (16 GPUs)** for inference and conversion. The model cannot fit on a single 8-GPU node because:
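The two-node requirement above can be sanity-checked with back-of-envelope arithmetic. This is an illustrative sketch only: the 456B parameter count and FP8 storage width come from this README, while the 80 GB-per-GPU figure is an assumption.

```python
# Rough memory estimate for MiniMax-M2 inference (illustrative assumptions).
total_params = 456e9                  # total parameters, experts included
weight_gb = total_params * 1 / 1e9    # FP8 stores 1 byte per weight -> ~456 GB

single_node_gb = 8 * 80               # one node: eight 80 GB GPUs = 640 GB
two_node_gb = 16 * 80                 # two nodes: sixteen GPUs = 1280 GB

# Weights alone fill most of a single node, leaving too little headroom for
# activations, KV cache, and communication buffers; two nodes leave ample room.
headroom_one_node = single_node_gb - weight_gb    # ~184 GB
headroom_two_nodes = two_node_gb - weight_gb      # ~824 GB
```

The exact runtime overhead depends on sequence length, batch size, and parallelism layout, but the weight footprint alone makes the single-node margin impractically thin.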
3 changes: 3 additions & 0 deletions src/megatron/bridge/models/minimax_m2/minimax_m2_bridge.py
@@ -98,6 +98,9 @@ class MiniMaxM2Bridge(MegatronModelBridge):
"""
Megatron Bridge for MiniMax-M2 MoE Causal LM.

Also supports MiniMax-M2.5 and MiniMax-M2.7, which share the same
``model_type`` (``minimax_m2``) and ``MiniMaxM2ForCausalLM`` architecture.

MiniMax-M2 is a sparse MoE model (256 experts, top-8 routing with sigmoid
scoring and expert bias correction). Use the native transformers >= 5.0
implementation (no ``trust_remote_code`` required).
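The routing scheme named in the docstring (top-8 selection over sigmoid scores with an expert-bias correction) can be sketched as follows. This is a hypothetical illustration of the technique, not the bridge's actual implementation; the function name and tensor shapes are assumptions.

```python
import torch


def sigmoid_top8_route(router_logits: torch.Tensor,
                       expert_bias: torch.Tensor,
                       k: int = 8):
    """Sketch of sigmoid-scored top-k MoE routing with expert bias correction.

    router_logits: [num_tokens, num_experts] raw router outputs.
    expert_bias:   [num_experts] learned/updated bias for load balancing.
    """
    # Sigmoid scoring: each expert gets an independent affinity in (0, 1),
    # rather than competing in a softmax over all experts.
    scores = torch.sigmoid(router_logits)

    # Bias correction: the bias steers which experts are *selected* (helping
    # balance load across 256 experts) but is excluded from combine weights.
    _, topk_idx = torch.topk(scores + expert_bias, k, dim=-1)
    topk_scores = torch.gather(scores, -1, topk_idx)

    # Renormalize the selected scores so each token's weights sum to 1.
    topk_weights = topk_scores / topk_scores.sum(dim=-1, keepdim=True)
    return topk_idx, topk_weights
```

With 256 experts and `k=8`, each token dispatches to 8 experts, matching the 45.9B-active-of-456B sparsity pattern described in the example README.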