fix(mcp): report llm_available via the chat-model credential resolver (#200)#204
Conversation
run_scan gated the LLM pass on resolve_provider_credentials(), which resolves only the active provider's credentials. create_chat_model() falls back to a standard OpenAI client (OPENAI_API_KEY / OPENAI_BASE_URL) when the active provider is unconfigured, so an OpenAI-only setup reported llm_available=false and the semantic pass was skipped even though the CLI would have run it. Gate on resolve_chat_model_credentials(), which includes that fallback, so the MCP server's accounting matches the model path the graph actually takes. Update the existing accounting tests to patch the new resolver and add a regression test for the OpenAI-fallback case. Refs NVIDIA#200 Signed-off-by: Werner Kasselman <145896621+wernerkasselman-au@users.noreply.github.com>
rng1995
left a comment
There was a problem hiding this comment.
Approving — correct fix for #200. Gating MCP llm_available on resolve_chat_model_credentials() (the chat-model resolver) instead of resolve_provider_credentials() (active-provider only) makes availability match what create_chat_model actually does, including the OpenAI fallback. The regression test patches the resolver and asserts llm_available is reported truthfully. Note this edits mcp_server.py alongside #196 — expect a rebase.
|
Ran into #200 independently while embedding SkillSpector in an agent runtime, and I'd like to suggest going one step further than The credential-resolver fix correctly handles the OpenAI-fallback case from #200, but it still reports
llm_available, _ = is_llm_available()fixes #200 and the CLI-provider case in one move, and keeps I have a tested implementation of this (including the monkeypatch updates to (Also relevant to #243 — an injected embedded provider is credential-less by definition, so it depends on the gate being capability-aware.) |
|
Thanks, good catch. I agree For this PR as it stands, that is behaviorally equivalent to I’m fine either way on how to land it: I can update this PR to use @rng1995 deferring to you on which route you’d prefer for this PR. |
What this fixes
Closes the bug I filed as #200. The MCP server decided whether the semantic pass could run by calling
resolve_provider_credentials(), which only ever resolves the active provider's credentials. The model the graph actually constructs, though, comes fromcreate_chat_model(), and that one falls back to a standard OpenAI client (OPENAI_API_KEY/OPENAI_BASE_URL) when the active provider is not configured. So on an OpenAI-only setup (the default NVIDIA provider with noNVIDIA_INFERENCE_KEY, butOPENAI_API_KEYset), the gate came outfalse, the server reportedscan_mode: static-only, and it skipped the LLM pass that the CLI would have run on the very same environment.The change
One line of behaviour: gate
llm_availableonresolve_chat_model_credentials()(providers/__init__.py:111-117) instead ofresolve_provider_credentials(). That resolver already mirrors the OpenAI fallback thatcreate_chat_model()relies on, so the MCP accounting now lines up with the model path the graph actually takes. I left a comment at the call site explaining why the fallback-aware resolver is the right one, so nobody narrows it back to the active provider later.Tests
resolve_chat_model_credentials), since that is the name the server now consults.test_run_scan_reports_llm_available_via_openai_fallback, a regression test that patches the active-provider-only resolver toNoneand the chat-model resolver to a credential, then asserts the server reports the LLM as available. On the old code this fails (it consulted the active-only resolver), so it pins the fix.One thing I want to flag, in case it was deliberate: if the MCP path was scoped to the active provider on purpose (say, to keep the server from quietly using a developer's ambient
OPENAI_API_KEY), then this change is the wrong call and I would rather hear that and close it. Otherwise the accounting should tell the truth about what the scan can do.Refs #200