Add MiniMax-M3 NVFP4 B300 single-node vLLM benchmark (EAGLE3 spec decode)#1929

Merged: Oseltamivir merged 6 commits into main from minimaxm3-fp4-b300-vllm-mtp on Jun 26, 2026

Add MiniMax-M3 NVFP4 B300 single-node vLLM benchmark with EAGLE3 spec…

e0d970e
Claude / Claude Code Review completed Jun 25, 2026 in 8m 1s

Code review found 2 potential issues

Found 6 candidates, confirmed 2. See review comments for details.

Details

| Severity | Count |
|---|---|
| 🔴 Important | 0 |
| 🟡 Nit | 2 |
| 🟣 Pre-existing | 0 |

| Severity | File:Line | Issue |
|---|---|---|
| 🟡 Nit | benchmarks/single_node/fixed_seq_len/minimaxm3_fp4_b300_mtp.sh:25-32 | vLLM patch curl loop silently swallows failures for 2 of 3 files |
| 🟡 Nit | perf-changelog.yaml:4194 | Placeholder PR link XXX in perf-changelog |

Annotations

Check warning on line 32 in benchmarks/single_node/fixed_seq_len/minimaxm3_fp4_b300_mtp.sh

@claude claude / Claude Code Review

vLLM patch curl loop silently swallows failures for 2 of 3 files

The 3-file vLLM patch overlay loop (lines 25-32) runs `curl -fsSL ... -o` without `set -e` and without `|| exit 1`, so a transient 5xx / rate-limit / commit-reachability failure on file #2 (`modelopt.py`) or file #3 (`flashinfer_utils.py`) returns non-zero and the loop continues silently. The post-patch `python3 -c` only imports `TrtLlmNvFp4ExpertsModular` from file #1, so a partial patch is undetected and `vllm serve` boots on the comment's "unsupported path". Match the sister recipe `minimaxm3
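
The failure mode described above can be avoided with a loop that stops at the first failed download. The following is a minimal sketch only, not the script from this PR or the sister recipe: `fetch_one`, `apply_overlay`, the base URL, and the destination directory are all hypothetical names chosen for illustration. It relies only on `curl -f` returning non-zero on an HTTP error.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: fail-fast download loop for a multi-file patch overlay.
# Function names, URL, and paths are illustrative assumptions, not the PR's values.

# fetch_one URL DEST: download one file; -f makes curl exit non-zero on HTTP errors.
fetch_one() {
  curl -fsSL "$1" -o "$2"
}

# apply_overlay BASE_URL DEST_DIR FILE...: fetch each file, abort on the first failure.
apply_overlay() {
  local base_url="$1" dest_dir="$2"
  shift 2
  local f
  for f in "$@"; do
    if ! fetch_one "${base_url}/${f}" "${dest_dir}/${f}"; then
      echo "patch overlay: failed to fetch ${f}" >&2
      return 1
    fi
  done
}
```

A caller would then write something like `apply_overlay "$BASE_URL" "$DEST_DIR" file1.py file2.py file3.py || exit 1`, so a partial patch stops the script before `vllm serve` starts. Extending the post-patch import check to cover a symbol from each patched file would catch the same problem from the other side.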

Check warning on line 4194 in perf-changelog.yaml

@claude claude / Claude Code Review

Placeholder PR link XXX in perf-changelog

The perf-changelog entry for `minimaxm3-fp4-b300-vllm-mtp` ends with `pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX` — the literal `XXX` placeholder was never filled in. Every other recent entry links to a real PR number (e.g. `/pull/1927` directly above); this should be `/pull/1929` to match this PR.
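
A leftover placeholder like this can be caught mechanically before merge. The sketch below is an assumption-laden illustration, not an existing check in the repository: `check_pr_links` is a hypothetical helper, and it assumes each entry carries a `pr-link:` field on its own line ending in `/pull/<number>`.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: flag pr-link fields whose PR number is not numeric.
# Assumes one "pr-link: <url>" field per line; the real YAML layout may differ.

# check_pr_links FILE: print offending lines (with line numbers), return 1 if any exist.
check_pr_links() {
  local file="$1"
  if grep -nE '^[[:space:]]*-?[[:space:]]*pr-link:' "$file" \
      | grep -vE '/pull/[0-9]+[[:space:]]*$'; then
    return 1
  fi
  return 0
}
```

Running `check_pr_links perf-changelog.yaml` in CI would print any line such as `pr-link: .../pull/XXX` along with its line number and return a non-zero status.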