fix(mcp): report llm_available via the chat-model credential resolver (#200) by wernerkasselman-au · Pull Request #204 · NVIDIA/SkillSpector

wernerkasselman-au · 2026-06-24T23:37:52Z

What this fixes

Closes the bug I filed as #200. The MCP server decided whether the semantic pass could run by calling resolve_provider_credentials(), which only ever resolves the active provider's credentials. The model the graph actually constructs, though, comes from create_chat_model(), and that one falls back to a standard OpenAI client (OPENAI_API_KEY / OPENAI_BASE_URL) when the active provider is not configured. So on an OpenAI-only setup (the default NVIDIA provider with no NVIDIA_INFERENCE_KEY, but OPENAI_API_KEY set), the gate came out false, the server reported scan_mode: static-only, and it skipped the LLM pass that the CLI would have run on the very same environment.

The change

One line of behaviour: gate llm_available on resolve_chat_model_credentials() (providers/__init__.py:111-117) instead of resolve_provider_credentials(). That resolver already mirrors the OpenAI fallback that create_chat_model() relies on, so the MCP accounting now lines up with the model path the graph actually takes. I left a comment at the call site explaining why the fallback-aware resolver is the right one, so nobody narrows it back to the active provider later.

Tests

Updated the three existing accounting tests to patch the new resolver (resolve_chat_model_credentials), since that is the name the server now consults.
Added test_run_scan_reports_llm_available_via_openai_fallback, a regression test that patches the active-provider-only resolver to None and the chat-model resolver to a credential, then asserts the server reports the LLM as available. On the old code this fails (it consulted the active-only resolver), so it pins the fix.

uv run ruff check src/ tests/        # clean
uv run ruff format --check src/ tests/  # clean
uv run pytest tests/unit/test_mcp_server.py -q  # 6 passed

One thing I want to flag, in case it was deliberate: if the MCP path was scoped to the active provider on purpose (say, to keep the server from quietly using a developer's ambient OPENAI_API_KEY), then this change is the wrong call and I would rather hear that and close it. Otherwise the accounting should tell the truth about what the scan can do.

Refs #200

run_scan gated the LLM pass on resolve_provider_credentials(), which resolves only the active provider's credentials. create_chat_model() falls back to a standard OpenAI client (OPENAI_API_KEY / OPENAI_BASE_URL) when the active provider is unconfigured, so an OpenAI-only setup reported llm_available=false and the semantic pass was skipped even though the CLI would have run it. Gate on resolve_chat_model_credentials(), which includes that fallback, so the MCP server's accounting matches the model path the graph actually takes. Update the existing accounting tests to patch the new resolver and add a regression test for the OpenAI-fallback case. Refs NVIDIA#200 Signed-off-by: Werner Kasselman <145896621+wernerkasselman-au@users.noreply.github.com>

rng1995

Approving — correct fix for #200. Gating MCP llm_available on resolve_chat_model_credentials() (the chat-model resolver) instead of resolve_provider_credentials() (active-provider only) makes availability match what create_chat_model actually does, including the OpenAI fallback. The regression test patches the resolver and asserts llm_available is reported truthfully. Note this edits mcp_server.py alongside #196 — expect a rebase.

vcolombo · 2026-07-02T21:25:43Z

Ran into #200 independently while embedding SkillSpector in an agent runtime, and I'd like to suggest going one step further than resolve_chat_model_credentials(): gating on llm_utils.is_llm_available() instead.

The credential-resolver fix correctly handles the OpenAI-fallback case from #200, but it still reports llm_available=false for every credential-less provider — the CLI providers (claude_cli, codex_cli, gemini_cli) return None from resolve_credentials() by design, so under the MCP server they're judged unavailable even though get_chat_model() would happily serve them via their is_available() capability check. The CLI already gets this right; run_scan doesn't.

is_llm_available() is capability-aware (it delegates to is_available() for CLI-capable providers and falls back to credential resolution — including the OpenAI fallback — for HTTP providers), so:

llm_available, _ = is_llm_available()

fixes #200 and the CLI-provider case in one move, and keeps run_scan consistent with what get_chat_model() will actually do. It also future-proofs the gate for any external duck-typed provider (has_cli_capability explicitly invites those).

I have a tested implementation of this (including the monkeypatch updates to tests/unit/test_mcp_server.py, which currently patch the old resolver name) that I'm happy to contribute here or as a follow-up — whichever is easier for this PR.

(Also relevant to #243 — an injected embedded provider is credential-less by definition, so it depends on the gate being capability-aware.)

wernerkasselman-au · 2026-07-02T22:12:17Z

Thanks, good catch. I agree run_scan should use is_llm_available() as the single availability gate rather than duplicating credential-specific logic in the MCP wrapper.

For this PR as it stands, that is behaviorally equivalent to resolve_chat_model_credentials() because is_llm_available() is still credential-backed, so the #200 OpenAI fallback fix remains covered. But it is the better abstraction once the CLI-capable providers from #234 land, and it also keeps the MCP server aligned with report.py.

I’m fine either way on how to land it: I can update this PR to use is_llm_available() and adjust the test_mcp_server.py monkeypatches, or you can contribute your tested implementation as a follow-up.

@rng1995 deferring to you on which route you’d prefer for this PR.

wernerkasselman-au mentioned this pull request Jun 25, 2026

style: apply ruff 0.15.2 formatting to drifted files (unblocks CI format check) #207

Open

rng1995 mentioned this pull request Jun 27, 2026

fix(mcp): reject local targets over HTTP transport #196

Open

5 tasks

rng1995 approved these changes Jun 27, 2026

View reviewed changes

vcolombo mentioned this pull request Jul 2, 2026

[Feature] Let an embedding application inject an LLM provider #243

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

fix(mcp): report llm_available via the chat-model credential resolver (#200)#204

fix(mcp): report llm_available via the chat-model credential resolver (#200)#204
wernerkasselman-au wants to merge 1 commit into
NVIDIA:mainfrom
wernerkasselman-au:fix/mcp-llm-available-fallback-creds

wernerkasselman-au commented Jun 24, 2026

Uh oh!

rng1995 left a comment

Uh oh!

vcolombo commented Jul 2, 2026

Uh oh!

wernerkasselman-au commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

Uh oh!

Conversation

wernerkasselman-au commented Jun 24, 2026

What this fixes

The change

Tests

Uh oh!

rng1995 left a comment

Choose a reason for hiding this comment

Uh oh!

vcolombo commented Jul 2, 2026

Uh oh!

wernerkasselman-au commented Jul 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants