
ci: add MCP tool name validation against live servers #95

Open
jordigilh wants to merge 3 commits into RHEcosystemAppEng:main from jordigilh:feat/mcp-tool-validation-ci

Conversation

@jordigilh
Contributor

Summary

  • Adds a CI check that starts each pack's MCP servers (from mcps.json) via podman and cross-references the allowed-tools declared in SKILL.md frontmatter against the actual tools exposed by the server
  • Catches tool name mismatches (e.g., pod_list vs pods_list) that pass all static linters but silently break skills at runtime
  • Uses a Kind cluster to provide a valid kubeconfig for MCP server startup
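The cross-reference step described above can be sketched roughly as follows. This is a simplified illustration with hypothetical helper names — the actual script parses full YAML frontmatter rather than using this minimal regex:

```python
# Hypothetical sketch: extract `allowed-tools` from SKILL.md frontmatter
# and diff it against the tool names a live MCP server reports.
import re

def parse_allowed_tools(skill_md):
    """Extract an inline allowed-tools list from frontmatter (simplified)."""
    match = re.search(r"^allowed-tools:\s*\[(.*?)\]", skill_md, re.MULTILINE)
    if not match:
        return []
    return [t.strip().strip("'\"") for t in match.group(1).split(",") if t.strip()]

def find_mismatches(declared, exposed):
    """Return declared tool names the server does not actually expose."""
    return [tool for tool in declared if tool not in exposed]

skill = "---\nallowed-tools: [pod_list, pods_log]\n---\n"
live_tools = {"pods_list", "pods_log"}
print(find_mismatches(parse_allowed_tools(skill), live_tools))  # ['pod_list']
```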

Motivation

While working on PRs #79 and #80, we discovered 6 tool name mismatches between our SKILL.md declarations and the actual MCP server tool registry. These went undetected by the existing skill-linter and compliance-check because they only validate static structure, not runtime tool availability.

SKILL.md declared      Actual MCP tool
pod_list               pods_list
pod_logs               pods_log
get_metric_names       (doesn't exist)
get_metric_metadata    (doesn't exist)
get_series             (doesn't exist)
query                  prometheus_query

Components

  • scripts/validate-mcp-tools.sh — Starts container-based MCP servers via podman, sends JSON-RPC initialize + tools/list, and validates each skill's allowed-tools
  • .github/workflows/mcp-tool-validation.yml — GitHub Actions workflow using Kind cluster + podman, triggers on changes to mcps.json or SKILL.md files
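The two JSON-RPC calls the script sends can be sketched as the wire messages below. This is a hedged sketch of the MCP stdio handshake — the exact protocolVersion and clientInfo values are assumptions, not necessarily what the script emits:

```python
# Sketch of the JSON-RPC messages sent over the MCP server's stdio
# transport: initialize, the initialized notification, then tools/list.
import json

initialize = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2024-11-05",  # assumed version string
        "capabilities": {},
        "clientInfo": {"name": "validate-mcp-tools", "version": "0.1"},
    },
}
initialized = {"jsonrpc": "2.0", "method": "notifications/initialized"}
list_tools = {"jsonrpc": "2.0", "id": 2, "method": "tools/list", "params": {}}

# Each message is newline-delimited JSON on the server's stdin:
wire = "\n".join(json.dumps(m) for m in (initialize, initialized, list_tools))
```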

Test plan

  • Tested locally against rh-developer pack — correctly detected 6 mismatches in incident-triage (pre-fix) and passed debug-scc/debug-rbac (post-fix)
  • Verify workflow runs successfully in CI with Kind cluster
  • Verify credential-gated MCP servers (github, lightspeed-mcp) are gracefully skipped

Made with Cursor

jordigilh and others added 2 commits May 8, 2026 13:28
Adds a CI check that starts each pack's MCP servers (from mcps.json)
and cross-references the `allowed-tools` declared in SKILL.md
frontmatter against the actual tools exposed by the server.

This catches tool name mismatches (e.g., pod_list vs pods_list) that
would silently break skills at runtime but pass all static linters.

Components:
- scripts/validate-mcp-tools.sh: bash script that starts container-
  based MCP servers via podman, sends JSON-RPC initialize + tools/list,
  and validates each skill's allowed-tools against the response
- .github/workflows/mcp-tool-validation.yml: GitHub Actions workflow
  using Kind cluster + podman, triggers on changes to mcps.json or
  SKILL.md files

Co-authored-by: Cursor <cursoragent@cursor.com>
Replace bash script with a pure-Python implementation for cross-OS
portability. Key improvements:

- Response-based MCP communication via subprocess + select (no sleeps)
- Explicit image pulling with error handling
- JSON-RPC pagination support (follows nextCursor)
- Levenshtein-based "did you mean?" suggestions for mismatched tools
- Graceful handling of servers that exit immediately (missing creds)
- File path and line number in error output
- Logging of skipped non-container MCP servers

Workflow hardening:
- Pin all GitHub Actions to SHA
- Add permissions: contents: read
- Add concurrency group with cancel-in-progress
- Add timeout-minutes: 10
- Explicit KUBECONFIG capture from Kind
- Add workflow_dispatch pack input for manual runs
- Inline pack detection (remove dependency on detect-changed-packs.sh)

Co-authored-by: Cursor <cursoragent@cursor.com>
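The nextCursor pagination mentioned in the commit message can be sketched like this, where `send_request` is a hypothetical stand-in for the script's JSON-RPC transport:

```python
# Keep calling tools/list until the response no longer carries a
# nextCursor, accumulating tool names across pages.
def list_all_tools(send_request):
    tools, cursor = [], None
    while True:
        params = {"cursor": cursor} if cursor else {}
        result = send_request("tools/list", params)
        tools.extend(t["name"] for t in result.get("tools", []))
        cursor = result.get("nextCursor")
        if not cursor:
            return tools

# Fake transport returning two pages, for illustration:
pages = {
    None: {"tools": [{"name": "pods_list"}], "nextCursor": "p2"},
    "p2": {"tools": [{"name": "pods_log"}]},
}
fake = lambda method, params: pages[params.get("cursor")]
print(list_all_tools(fake))  # ['pods_list', 'pods_log']
```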
@dmartinol
Collaborator

Hey @jordigilh thanks for your contribution to proactively catch these mismatches!
To avoid risky changes, we decided to postpone commits that could affect the skills' execution until after Summit, hope you can understand.

Apart from that, two comments on this awesome PR:
1. We should improve the suggestion mechanism to propose matching names when a toolset is missing or a toolset/tool name is wrong. Maybe with semantic search using a small embedding model we can avoid coding complex match logic?
2. Try to extend the check to non-containerized commands (uvx, npm).

@jordigilh
Contributor Author

Thanks @dmartinol — totally understand the post-Summit freeze. No rush on merging this one.

On your two suggestions:

  1. Semantic search for suggestions — interesting idea. The current Levenshtein approach catches simple typos (pod_list → pods_list) but misses semantic gaps like query → prometheus_query. An embedding model would handle that better. I'll explore lightweight options (e.g., a small sentence-transformer or even TF-IDF over tool names + descriptions) as a follow-up so we don't add heavy dependencies to CI.

  2. Non-containerized commands (npx, uvx) — agreed, this would future-proof the check. Right now no skills declare allowed-tools referencing tools from npx-based servers, so there's zero false-positive risk from skipping them. But as more skills adopt allowed-tools, we'll want coverage. I'll look into adding Node.js to the CI environment and handling the different startup patterns.

Happy to iterate on both after Summit. Let me know if there's anything else you'd like adjusted in the meantime.
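For reference, the Levenshtein-based suggestion idea discussed above can be sketched in a few lines. This standalone version is illustrative only, not the script's actual implementation:

```python
# Levenshtein edit distance via the standard two-row dynamic program,
# plus a "did you mean?" helper that suggests the closest known tool.
def levenshtein(a, b):
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def suggest(name, known, max_dist=3):
    """Return the closest known tool name, or None if nothing is close."""
    best = min(known, key=lambda k: levenshtein(name, k))
    return best if levenshtein(name, best) <= max_dist else None

known = ["pods_list", "pods_log", "prometheus_query"]
print(suggest("pod_list", known))  # 'pods_list'
print(suggest("query", known))     # None — edit distance alone misses it
```

The last call shows exactly the semantic gap mentioned above: `query` is many edits away from `prometheus_query`, so pure edit distance offers no suggestion.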

@dmartinol
Collaborator

> 2. Non-containerized commands (npx, uvx) — agreed, this would future-proof the check. Right now no skills declare allowed-tools referencing tools from npx-based servers, so there's zero false-positive risk from skipping them. But as more skills adopt allowed-tools, we'll want coverage. I'll look into adding Node.js to the CI environment and handling the different startup patterns.

Thanks for being available to extend this solution! And yes, we'll also extend the coverage of allowed-tools fields early next week.

Collaborator

pls also add this step under the validate make target

Contributor Author

Done in 527a2f7. Here's what this commit adds:

  1. validate_mcp_tools.py added to the main validate target — runs automatically with make validate.

  2. Graceful prerequisite skip — the script checks for podman on PATH and a valid KUBECONFIG at startup. If either is missing it prints a SKIP: message and exits 0, so make validate (and the compliance-check CI workflow that calls it) never breaks in environments without container tooling.

  3. Standalone validate-mcp-tools target — for running the check in isolation, with optional PACK= filter: make validate-mcp-tools PACK=rh-developer.

  4. Improved suggestions — alongside Levenshtein distance, the suggestion engine now uses substring/prefix matching and component overlap (e.g. pods-list → pods_list) to catch more mismatches.
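The graceful prerequisite skip in point 2 can be sketched roughly like this; the helper name, messages, and default kubeconfig path are illustrative, not the script's actual code:

```python
# Check for podman on PATH and a readable kubeconfig; return a skip
# reason instead of failing, so callers can print SKIP and exit 0.
import os
import shutil
import sys

def check_prerequisites():
    """Return a skip reason string, or None when the full check can run."""
    if shutil.which("podman") is None:
        return "podman not found on PATH"
    kubeconfig = os.environ.get("KUBECONFIG", os.path.expanduser("~/.kube/config"))
    if not os.path.isfile(kubeconfig):
        return f"no kubeconfig at {kubeconfig}"
    return None

if __name__ == "__main__":
    reason = check_prerequisites()
    if reason:
        print(f"SKIP: {reason}")
        sys.exit(0)  # never break `make validate` over missing infra
    # ... run the actual MCP tool validation here ...
```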

@jordigilh
Contributor Author

@dmartinol @r2dedios Thanks for the feedback on integrating this into the validate make target. Before pushing changes, I'd like to get your input on the approach since validate_mcp_tools.py has heavier prerequisites (podman + KUBECONFIG/Kind) than the existing validation steps.

Option A: Add it to make validate with graceful skip

The script would detect missing prerequisites (no podman, no KUBECONFIG) and exit 0 with a warning instead of failing. This keeps make validate and compliance-check.yml working as-is — the MCP step simply auto-skips when the infra isn't there. Locally, devs with podman get the full check; those without get a skip message.

  • Pro: Single make validate command always works everywhere
  • Con: Silent skipping can mask the fact that MCP validation didn't actually run

Option B: Standalone target + keep workflows separate

Add make validate-mcp-tools as its own target (supports PACK=rh-developer), but don't add it to make validate. The existing compliance-check.yml stays fast and lightweight (static checks only), while mcp-tool-validation.yml handles the runtime validation with its own podman/Kind setup and targeted triggers (mcps.json, SKILL.md changes).

  • Pro: Clean separation of static vs. runtime validation; no extra CI minutes on docs-only PRs
  • Con: make validate doesn't cover MCP tools — devs need to know to run make validate-mcp-tools separately

I'm leaning toward Option B since MCP validation is fundamentally a runtime check (starts containers, JSON-RPC handshake) while the other validate steps are static. But happy to go with either — what's your preference?

@dmartinol
Collaborator

> I'm leaning toward Option B since MCP validation is fundamentally a runtime check (starts containers, JSON-RPC handshake) while the other validate steps are static. But happy to go with either — what's your preference?

Hey @jordigilh, good catch!
Whichever option we pick (initially I opted for A, since I did not experience long waits during my tests, but B is fine as well), the important thing is to give the developer tools to identify the mismatches and to help with the fix (e.g. extend the suggestion range).

BTW: other tools we should consider are the CLI tools sometimes referenced in SKILL.md, like oc, but those are harder to find with a Python script.

- Add validate_mcp_tools.py to the main `validate` Makefile target
- Add standalone `validate-mcp-tools` target with PACK= filter support
- Add graceful prerequisite detection: script exits 0 with a SKIP
  message when podman or KUBECONFIG is unavailable, so `make validate`
  never breaks in environments without container tooling
- Improve tool name suggestions with substring/prefix matching and
  component overlap scoring alongside the existing Levenshtein distance

Co-authored-by: Cursor <cursoragent@cursor.com>