2 changes: 1 addition & 1 deletion .github/configs/nvidia-master.yaml
@@ -2004,7 +2004,7 @@ dsr1-fp8-b300-sglang:
# DeepSeek-V4-Pro on B300 with sglang (non-MTP).
# Uses nightly image with megamoe backend for high-concurrency profiles.
dsv4-fp4-b300-sglang:
-  image: lmsysorg/sglang:nightly-dev-cu13-20260529-a8cfae0b
+  image: lmsysorg/sglang:nightly-dev-cu13-20260624-b2c8f7a2
model: deepseek-ai/DeepSeek-V4-Pro
model-prefix: dsv4
runner: b300
6 changes: 6 additions & 0 deletions perf-changelog.yaml
@@ -4153,3 +4153,9 @@
- "Run the PR #1891 MiniMax-M3 MXFP8 B300 Dynamo-vLLM recipe set on top of current main."
- "Uses the vllm/vllm-openai:minimax-m3-0618-x86_64-cu130 image and the TEP4/TEP8 8k1k topologies not covered by PR #1890."
pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1891

+- config-keys:
+    - dsv4-fp4-b300-sglang
+  description:
+    - "Update B300 FP4 SGLang (non-MTP) image to latest nightly: lmsysorg/sglang:nightly-dev-cu13-20260624-b2c8f7a2 (was nightly-dev-cu13-20260529-a8cfae0b)."
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/1913
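
A reviewer may want to confirm that the new image is actually newer than the one it replaces. The sketch below is one way to do that, assuming the tags follow the layout `nightly-dev-cu13-<YYYYMMDD>-<short commit hash>`. That layout is inferred only from the two tags in this diff and is not a documented convention, and `parse_nightly_tag` is a hypothetical helper, not part of this repository.

```python
import re
from datetime import date

# Assumed tag layout, inferred from the two tags in this diff:
#   nightly-dev-cu13-<YYYYMMDD>-<short commit hash>
TAG_RE = re.compile(r"^nightly-dev-cu13-(\d{4})(\d{2})(\d{2})-([0-9a-f]+)$")


def parse_nightly_tag(image: str) -> tuple[date, str]:
    """Split an image reference into (build date, short commit hash).

    Hypothetical helper: raises ValueError if the tag does not match
    the assumed layout.
    """
    # Everything after the first ':' is treated as the tag.
    _, _, tag = image.partition(":")
    match = TAG_RE.match(tag)
    if match is None:
        raise ValueError(f"unrecognized nightly tag: {image!r}")
    year, month, day, commit = match.groups()
    return date(int(year), int(month), int(day)), commit


old_date, old_commit = parse_nightly_tag(
    "lmsysorg/sglang:nightly-dev-cu13-20260529-a8cfae0b"
)
new_date, new_commit = parse_nightly_tag(
    "lmsysorg/sglang:nightly-dev-cu13-20260624-b2c8f7a2"
)

# The replacement image should carry a later build date than the old one.
assert new_date > old_date
```

Comparing the embedded dates only checks the tag text. It does not verify that the image exists in the registry or that it contains the megamoe backend mentioned in the config comment.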