Add MiniMax-M3 NVFP4 B300 single-node vLLM benchmark (EAGLE3 spec decode) #1929
Merged
Claude / Claude Code Review
completed
Jun 25, 2026 in 8m 1s
Code review found 2 potential issues
Found 6 candidates, confirmed 2. See review comments for details.
Details
| Severity | Count |
|---|---|
| 🔴 Important | 0 |
| 🟡 Nit | 2 |
| 🟣 Pre-existing | 0 |

| Severity | File:Line | Issue |
|---|---|---|
| 🟡 Nit | benchmarks/single_node/fixed_seq_len/minimaxm3_fp4_b300_mtp.sh:25-32 | vLLM patch curl loop silently swallows failures for 2 of 3 files |
| 🟡 Nit | perf-changelog.yaml:4194 | Placeholder PR link XXX in perf-changelog |
Annotations
Check warning on line 32 in benchmarks/single_node/fixed_seq_len/minimaxm3_fp4_b300_mtp.sh
claude / Claude Code Review
vLLM patch curl loop silently swallows failures for 2 of 3 files
The 3-file vLLM patch overlay loop (lines 25-32) runs `curl -fsSL ... -o` without `set -e` and without `|| exit 1`, so a transient 5xx / rate-limit / commit-reachability failure on file #2 (`modelopt.py`) or file #3 (`flashinfer_utils.py`) returns non-zero and the loop continues silently. The post-patch `python3 -c` only imports `TrtLlmNvFp4ExpertsModular` from file #1, so a partial patch is undetected and `vllm serve` boots on the comment's "unsupported path". Match the sister recipe `minimaxm3
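As a minimal sketch of the fail-fast behaviour this annotation asks for, the loop could stop at the first failed download instead of continuing. The function names `fetch_one` and `apply_patch_overlay`, and the base URL, destination directory, and file names passed to them, are hypothetical placeholders for illustration; they are not taken from the recipe or from the sister script.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: function names and arguments are illustrative
# placeholders, not the recipe's real values.

# fetch_one URL DEST: download a single file.
# curl -f makes HTTP errors (4xx/5xx) produce a non-zero exit status.
fetch_one() {
    curl -fsSL "$1" -o "$2"
}

# apply_patch_overlay BASE_URL DEST_DIR FILE...
# Downloads each FILE in order and returns 1 at the first failure,
# so later files are never fetched on top of a partial patch.
apply_patch_overlay() {
    local base_url="$1" dest_dir="$2"
    shift 2
    local f
    for f in "$@"; do
        if ! fetch_one "${base_url}/${f}" "${dest_dir}/${f}"; then
            echo "patch download failed: ${f}" >&2
            return 1
        fi
    done
}
```

A caller would typically write `apply_patch_overlay "$BASE_URL" "$DEST_DIR" file1.py file2.py file3.py || exit 1`, so that a failure on any of the three files aborts the script before `vllm serve` starts.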
Check warning on line 4194 in perf-changelog.yaml
claude / Claude Code Review
Placeholder PR link XXX in perf-changelog
The perf-changelog entry for `minimaxm3-fp4-b300-vllm-mtp` ends with `pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX` — the literal `XXX` placeholder was never filled in. Every other recent entry links to a real PR number (e.g. `/pull/1927` directly above); this should be `/pull/1929` to match this PR.
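A placeholder like this can be caught mechanically before merge. The following is a hedged sketch only: the function name `find_placeholder_pr_links` and the exact pattern are assumptions for illustration, and nothing in this review indicates the repository has such a check.

```shell
# Hypothetical guard; the function name and pattern are assumptions.

# find_placeholder_pr_links FILE
# Prints line-numbered pr-link entries whose URL ends in /pull/XXX.
# Returns 0 if at least one placeholder is found, 1 if none are found.
find_placeholder_pr_links() {
    grep -nE 'pr-link:[[:space:]]*https?://[^[:space:]]+/pull/XXX[[:space:]]*$' "$1"
}
```

In a pre-merge script this could be used as `if find_placeholder_pr_links perf-changelog.yaml; then exit 1; fi`, which fails the run and prints the offending line numbers.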