Merged
4 changes: 2 additions & 2 deletions .github/configs/nvidia-master.yaml
@@ -3250,7 +3250,7 @@ gptoss-fp4-h100-vllm:
- { tp: 8, conc-start: 4, conc-end: 16 }

 minimaxm2.5-fp8-h100-vllm:
-  image: vllm/vllm-openai:v0.16.0
+  image: vllm/vllm-openai:v0.18.0
   model: MiniMaxAI/MiniMax-M2.5
   model-prefix: minimaxm2.5
   runner: h100
@@ -3532,7 +3532,7 @@ gptoss-fp4-h200-vllm:
- { tp: 8, conc-start: 4, conc-end: 32 }

 minimaxm2.5-fp8-h200-vllm:
-  image: vllm/vllm-openai:v0.16.0
+  image: vllm/vllm-openai:v0.18.0
   model: MiniMaxAI/MiniMax-M2.5
   model-prefix: minimaxm2.5
   runner: h200
1 change: 0 additions & 1 deletion benchmarks/single_node/minimaxm2.5_fp8_h100.sh
@@ -42,7 +42,6 @@ $EP \
   --gpu-memory-utilization 0.90 \
   --max-model-len $MAX_MODEL_LEN \
   --max-num-seqs 256 \
-  --disable-log-requests \
   --trust-remote-code \
   --compilation-config '{"cudagraph_mode":"PIECEWISE"}' > $SERVER_LOG 2>&1 &

1 change: 0 additions & 1 deletion benchmarks/single_node/minimaxm2.5_fp8_h200.sh
@@ -37,7 +37,6 @@ vllm serve $MODEL --port $PORT \
 $EP \
   --gpu-memory-utilization 0.95 \
   --max-model-len $MAX_MODEL_LEN \
-  --disable-log-requests \
   --trust-remote-code > $SERVER_LOG 2>&1 &

SERVER_PID=$!
7 changes: 7 additions & 0 deletions perf-changelog.yaml
@@ -1,3 +1,10 @@
+- config-keys:
+    - minimaxm2.5-fp8-h100-vllm
+    - minimaxm2.5-fp8-h200-vllm
+  description:
+    - "Update vLLM image from v0.16.0 to v0.18.0 for minimax h100 and h200 configs"
+  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

🟡 The new changelog entry for minimaxm2.5-fp8-h100-vllm and minimaxm2.5-fp8-h200-vllm uses the placeholder pull/XXX instead of the actual PR number. The pr-link on line 7 should be updated to https://github.com/SemiAnalysisAI/InferenceX/pull/958.


Bug: Placeholder PR link not replaced with actual PR number

The new changelog entry added at the top of perf-changelog.yaml (lines 1-7) contains a placeholder pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX instead of the actual PR number.

How it manifests: The diff clearly shows the new entry was added with the XXX placeholder:

- config-keys:
    - minimaxm2.5-fp8-h100-vllm
    - minimaxm2.5-fp8-h200-vllm
  description:
    - "Update vLLM image from v0.16.0 to v0.18.0 for minimax h100 and h200 configs"
  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX

Why existing code doesn't prevent it: There is no automated validation or CI check that enforces that pr-link values in perf-changelog.yaml must reference real PR numbers rather than the XXX placeholder. The file already contains several other pre-existing XXX placeholders (for entries like dsr1-fp8-h200-sglang, glm5-fp8-mi355x-sglang, minimaxm2.5-fp8-h200-vllm, qwen3.5-bf16-mi325x-sglang, and qwen3.5-fp8-mi325x-sglang), so no pattern-match check exists to flag this.

Why this is different from the other XXX entries: Unlike the pre-existing XXX entries where the PR number may have been unknown at the time of submission, this entry was added as part of PR #958 itself. The PR number was already known and available — it is even referenced in the PR description (Closes #955 and this is PR #958). This is a straightforward oversight where the author forgot to replace the placeholder before submitting.

Impact: The pr-link field is changelog metadata that helps trace which PR introduced each configuration change. A broken link reduces traceability and makes it harder to find the associated discussion, review comments, and rationale for the change. While this does not affect any functionality, it degrades the quality of the changelog as a reference document.

Step-by-step proof:

  1. This PR is "Update minimax h100 & h200 vLLM image to v0.18.0" (#958), as confirmed by the PR metadata
  2. The PR adds a new entry at the top of perf-changelog.yaml
  3. That entry has pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX
  4. Visiting the link results in a 404 (or a random PR if XXX was meant as a literal string)
  5. The correct link should be https://github.com/SemiAnalysisAI/InferenceX/pull/958

Fix: Replace pull/XXX with pull/958 on line 7 of perf-changelog.yaml.
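Since the comment above notes that no automated check catches placeholder links, a minimal sketch of such a check could be added to CI. This is a hypothetical helper (the function name and regexes are assumptions, not part of the repository), using only the stdlib and the `pr-link:` line format shown in the diff:

```python
import re

# Hypothetical CI helper: flag changelog entries whose pr-link does not
# end in /pull/<number> (e.g. the "XXX" placeholder was left in place).
PR_LINK_RE = re.compile(r"^\s*pr-link:\s*(?P<url>\S+)\s*$")
VALID_PR_RE = re.compile(r"/pull/\d+$")

def find_placeholder_links(text: str) -> list[tuple[int, str]]:
    """Return (line_number, url) pairs for every pr-link value in the
    changelog text that does not reference a numeric PR."""
    bad = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        m = PR_LINK_RE.match(line)
        if m and not VALID_PR_RE.search(m.group("url")):
            bad.append((lineno, m.group("url")))
    return bad

if __name__ == "__main__":
    sample = (
        "- config-keys:\n"
        "    - minimaxm2.5-fp8-h100-vllm\n"
        "  pr-link: https://github.com/SemiAnalysisAI/InferenceX/pull/XXX\n"
    )
    print(find_placeholder_links(sample))
```

A simple line-based scan like this avoids a YAML-parser dependency, at the cost of missing pr-link keys written in flow style; for this file's flat layout that trade-off seems acceptable.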

- config-keys:
    - dsr1-fp8-b200-dynamo-trt
    - dsr1-fp8-h200-dynamo-trt