1 change: 1 addition & 0 deletions .github/workflows/nightly-benchmark.yml
@@ -121,6 +121,7 @@ jobs:
- { id: Qwen/Qwen2.5-7B-Instruct, slug: Qwen-Qwen2.5-7B-Instruct, test_class: TestNightlyQwen7bSingle }
- { id: Qwen/Qwen3-30B-A3B, slug: Qwen-Qwen3-30B-A3B, test_class: TestNightlyQwen30bSingle }
- { id: openai/gpt-oss-20b, slug: openai-gpt-oss-20b, test_class: TestNightlyGptOss20bSingle }
- { id: mistralai/Devstral-2-123B-Instruct-2512, slug: mistralai-Devstral-2-123B-Instruct-2512, test_class: TestNightlyDevstral2Single }
Comment thread
smfirmin marked this conversation as resolved.
⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Description: Confirm TestNightlyDevstral2Single exists in the test generator configuration

# Check the model definition and generated test class naming pattern
rg -A 2 'mistralai/Devstral-2-123B-Instruct-2512' e2e_test/benchmarks/test_nightly_perf.py

# Verify the test class generation loop and naming pattern
rg -B 3 -A 5 'TestNightly.*Single.*=' e2e_test/benchmarks/test_nightly_perf.py

# Confirm the pattern: TestNightly{fragment}{Single|Multi}
ast-grep --pattern $'for $_ in $_:
    $$$
    _cls_name = f"TestNightly{$_}{$_}"
    $$$'

Repository: lightseekorg/smg

Length of output: 712


Fix YAML flow-mapping brace spacing to pass yamllint checks.

The flow mapping on line 124 has spaces after { and before }, which violates yamllint's braces rule and will fail CI linting. The test class name TestNightlyDevstral2Single is correct: it matches what the test generator produces from the TestNightly{fragment}{Single|Multi} pattern.
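
The naming convention confirmed by the analysis, `TestNightly{fragment}{Single|Multi}`, can be sketched as follows. This is a minimal illustration and not the code in `test_nightly_perf.py`: the helper names `nightly_class_name` and `matrix_slug` are hypothetical, and the slug rule (replace `/` with `-`) is an assumption inferred from the matrix entries shown in the diff.

```python
def nightly_class_name(fragment: str, multi: bool = False) -> str:
    """Build a class name following TestNightly{fragment}{Single|Multi}."""
    suffix = "Multi" if multi else "Single"
    return f"TestNightly{fragment}{suffix}"


def matrix_slug(model_id: str) -> str:
    """Hypothetical helper: derive the slug used in the workflow matrix."""
    return model_id.replace("/", "-")


model_id = "mistralai/Devstral-2-123B-Instruct-2512"
print(nightly_class_name("Devstral2"))  # TestNightlyDevstral2Single
print(matrix_slug(model_id))            # mistralai-Devstral-2-123B-Instruct-2512
```

Both printed values match the `test_class` and `slug` fields of the new matrix entry.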

🔧 Lint-compliant fix
-          - { id: mistralai/Devstral-2-123B-Instruct-2512, slug: mistralai-Devstral-2-123B-Instruct-2512, test_class: TestNightlyDevstral2Single }
+          - {id: mistralai/Devstral-2-123B-Instruct-2512, slug: mistralai-Devstral-2-123B-Instruct-2512, test_class: TestNightlyDevstral2Single}
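
If several matrix entries need the same change, the edit can be scripted. The sketch below is a plain-text rewrite under stated assumptions, not a YAML-aware tool: `tighten_flow_braces` is a hypothetical helper, and it would also alter braces that appear inside quoted strings, so it only suits simple single-line flow mappings like the ones in this matrix.

```python
import re


def tighten_flow_braces(line: str) -> str:
    """Drop whitespace right after '{' and right before '}' on one line."""
    line = re.sub(r"\{\s+", "{", line)
    line = re.sub(r"\s+\}", "}", line)
    return line


before = "          - { id: openai/gpt-oss-20b, slug: openai-gpt-oss-20b, test_class: TestNightlyGptOss20bSingle }"
print(tighten_flow_braces(before))
```

Leading indentation and the spacing between keys are left untouched; only the padding inside the braces is removed.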
🧰 Tools
🪛 YAMLlint (1.38.0)

[error] 124-124: too many spaces inside braces

(braces)



🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In @.github/workflows/nightly-benchmark.yml at line 124, The flow-mapping on the
line containing the mapping for id: mistralai/Devstral-2-123B-Instruct-2512,
slug: mistralai-Devstral-2-123B-Instruct-2512, and test_class:
TestNightlyDevstral2Single has spaces after the opening brace and before the
closing brace; remove those spaces so the mapping uses "{id: ..., slug: ...,
test_class: ...}" (no space immediately after "{" and no space immediately
before "}") to satisfy yamllint's braces rule.

- { id: meta-llama/Llama-4-Scout-17B-16E-Instruct, slug: meta-llama-Llama-4-Scout-17B-16E-Instruct, test_class: TestNightlyLlama4ScoutSingle }
- { id: meta-llama/Llama-3.3-70B-Instruct, slug: meta-llama-Llama-3.3-70B-Instruct, test_class: TestNightlyLlama70bSingle }
- { id: RedHatAI/Llama-3.3-70B-Instruct-FP8-dynamic, slug: RedHatAI-Llama-3.3-70B-Instruct-FP8-dynamic, test_class: TestNightlyLlama70bFp8Single }
8 changes: 7 additions & 1 deletion e2e_test/benchmarks/test_nightly_perf.py
@@ -102,6 +102,13 @@ def _run_nightly(setup_backend, genai_bench_runner, model_id, worker_count=1, **
("Qwen/Qwen3-30B-A3B", "Qwen30b", 4, ["http", "grpc"], {}),
("openai/gpt-oss-20b", "GptOss20b", 1, ["http", "grpc"], {}),
("minimaxai/minimax-m2", "MinimaxM2", 1, ["http", "grpc"], {}),
(
"mistralai/Devstral-2-123B-Instruct-2512",
"Devstral2",
1,
["http", "grpc"],
{},
),
(
"meta-llama/Llama-4-Maverick-17B-128E-Instruct-FP8",
"Llama4Maverick",
@@ -132,7 +139,6 @@ def _run_nightly(setup_backend, genai_bench_runner, model_id, worker_count=1, **
),
]


# ---------------------------------------------------------------------------
# Dynamic test class generation
# ---------------------------------------------------------------------------
9 changes: 9 additions & 0 deletions e2e_test/infra/model_specs.py
@@ -114,6 +114,15 @@ def _resolve_model_path(hf_path: str) -> str:
"sglang_args": ["--trust-remote-code"],
"vllm_args": ["--trust-remote-code"],
},
# Devstral 2 123B - Nightly benchmarks
"mistralai/Devstral-2-123B-Instruct-2512": {
"model": _resolve_model_path("mistralai/Devstral-2-123B-Instruct-2512"),
"tp": 4,
"features": ["chat", "streaming", "function_calling", "reasoning"],
"startup_timeout": 1200,
"sglang_args": ["--trust-remote-code"],
"vllm_args": ["--trust-remote-code"],
},
# Vision-language model for multimodal benchmarks (MMMU)
"Qwen/Qwen3-VL-8B-Instruct": {
"model": _resolve_model_path("Qwen/Qwen3-VL-8B-Instruct"),
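
Because this change touches three places that must agree on the same model id (the workflow matrix, the nightly model list, and the model spec table), a small consistency check can catch a missing entry early. The sketch below is illustrative only: `NIGHTLY_MODELS`, `MODEL_SPECS`, `REQUIRED_KEYS`, and `missing_specs` are hypothetical names, the data is a stand-in copied from the diff, and the set of required keys is an assumption rather than something the repository enforces.

```python
# Stand-in data copied from the diff; the real module-level names may differ.
NIGHTLY_MODELS = [
    ("mistralai/Devstral-2-123B-Instruct-2512", "Devstral2", 1, ["http", "grpc"], {}),
]

MODEL_SPECS = {
    "mistralai/Devstral-2-123B-Instruct-2512": {
        # The real file wraps this value in _resolve_model_path(...).
        "model": "mistralai/Devstral-2-123B-Instruct-2512",
        "tp": 4,
        "features": ["chat", "streaming", "function_calling", "reasoning"],
        "startup_timeout": 1200,
    },
}

REQUIRED_KEYS = {"model", "tp", "features"}  # assumed minimum, not from the repo


def missing_specs(nightly_models, model_specs):
    """Return nightly model ids that lack a spec or miss a required key."""
    problems = []
    for model_id, *_rest in nightly_models:
        spec = model_specs.get(model_id)
        if spec is None or not REQUIRED_KEYS.issubset(spec):
            problems.append(model_id)
    return problems


print(missing_specs(NIGHTLY_MODELS, MODEL_SPECS))  # []
```

An empty list means every nightly model has a spec with the assumed keys; any id returned would point at an entry added to one file but not the other.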