698 Tools Across 3 Custom Go MCP Servers for Dynatrace — Full API Coverage, Reads and Writes, 3 Months in Production #704
npcomplete777 started this conversation in Show and tell
What I Built
Three standalone MCP servers written in Go, each covering a distinct Dynatrace API surface:

- Platform API: discovery and querying (Grail, DQL, Davis)
- Environment API: runtime state (entities, problems, metrics, topology)
- Configuration API: operational config (dashboards, alerting, auto-tags, management zones)

698 tools total. Every server connects to any MCP client (Claude Desktop, Claude Code, Cursor, etc.) over stdio using the standard JSON-RPC transport. Built with the [mcp-go SDK](https://github.com/mark3labs/mcp-go).
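For readers unfamiliar with the transport: an MCP server over stdio is newline-delimited JSON-RPC 2.0 on stdin/stdout. The production servers use the mcp-go SDK rather than hand-rolled frames, but a stdlib-only Go sketch of the shape (one illustrative tool, `initialize` and error handling omitted) looks roughly like this:

```go
package main

import (
	"bufio"
	"encoding/json"
	"fmt"
	"os"
)

// request/response are minimal JSON-RPC 2.0 frames as exchanged with an MCP
// client over stdio. A real server also handles initialize, tools/call,
// notifications, and error responses.
type request struct {
	JSONRPC string          `json:"jsonrpc"`
	ID      *int            `json:"id,omitempty"`
	Method  string          `json:"method"`
	Params  json.RawMessage `json:"params,omitempty"`
}

type response struct {
	JSONRPC string `json:"jsonrpc"`
	ID      *int   `json:"id,omitempty"`
	Result  any    `json:"result,omitempty"`
}

// handle dispatches one method; only tools/list is sketched here.
func handle(req request) response {
	switch req.Method {
	case "tools/list":
		return response{JSONRPC: "2.0", ID: req.ID, Result: map[string]any{
			"tools": []map[string]string{
				{"name": "dt_env_get_problems", "description": "List open problems in the environment"},
			},
		}}
	default:
		return response{JSONRPC: "2.0", ID: req.ID, Result: map[string]any{}}
	}
}

func main() {
	// The stdio transport: newline-delimited JSON in on stdin, replies on stdout.
	sc := bufio.NewScanner(os.Stdin)
	enc := json.NewEncoder(os.Stdout)
	for sc.Scan() {
		var req request
		if err := json.Unmarshal(sc.Bytes(), &req); err != nil {
			fmt.Fprintln(os.Stderr, "bad frame:", err)
			continue
		}
		enc.Encode(handle(req))
	}
}
```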
Why Three Servers
This maps to how Dynatrace actually works. The Platform API discovers and queries (Grail, DQL, Davis). The Environment API reads runtime state (entities, problems, metrics, topology). The Configuration API writes operational config (dashboards, alerting, auto-tags, management zones).
In Claude Desktop, all three connect independently. One context window has access to all 698 tools simultaneously. You can go from "show me services with high error rates" → "which auto-tags apply to those services" → "create an alerting profile for this subset" → "build a dashboard tracking the SLOs" in a single conversation. Read the environment, reason about it, write changes back. Closed loop.
The Read + Write Distinction
This is the part that matters most. The servers don't just query Dynatrace — they configure it. Create auto-tag rules. Build dashboards. Set up management zones. Tune anomaly detection thresholds. Configure alerting profiles and notification integrations. Write maintenance windows. Define service detection rules.
Full CRUD across the entire configuration surface. Not just GET — POST, PUT, DELETE.
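As a concrete example of the write path, here is a sketch of building an authenticated Configuration API request in Go. The `autoTags` endpoint and the `Api-Token` header are real Dynatrace Configuration API v1 conventions; the base URL, token, and payload are placeholders:

```go
package main

import (
	"bytes"
	"fmt"
	"net/http"
)

// newConfigRequest builds an authenticated Dynatrace Configuration API v1
// request. In the real servers the base URL and token come from environment
// configuration; this helper name is illustrative, not the production code.
func newConfigRequest(method, baseURL, path, token string, body []byte) (*http.Request, error) {
	req, err := http.NewRequest(method, baseURL+"/api/config/v1"+path, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", "Api-Token "+token)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	// PUT updates an existing auto-tag rule; POST would create one.
	body := []byte(`{"name":"team-checkout","rules":[]}`)
	req, err := newConfigRequest("PUT", "https://example.live.dynatrace.com", "/autoTags/my-tag-id", "dt0c01.XXXX", body)
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL.String())
}
```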
Production Use
My brother is a Dynatrace tools admin at a Fortune 500 retailer. He uses these MCP servers for daily configuration work — auto-tagging, dashboard management, anomaly detection tuning, management zone setup. Three months of daily production use. Zero tool failures. Zero hallucination-caused misconfigurations.
I use them on consulting engagements for environment profiling, SLO creation, dashboard generation, and configuration audits.
What I've Learned About Scaling MCP Tool Count
The obvious question: "How do you handle 698 tools without the model hallucinating?"
I'm aware of Anthropic's [published guidance on tool selection degradation](https://www.anthropic.com/engineering/advanced-tool-use) beyond 30-50 tools, and their Tool Search Tool for deferred loading. My experience suggests that with specific architectural patterns, the degradation curve is less severe than the general case. Here's what I've found works:
Tool naming conventions matter more than tool count. Every tool follows a consistent `{domain}_{action}_{resource}` pattern — `dt_env_get_metrics`, `dt_config_create_dashboard`, `dt_platform_query_dql`. Clear, predictable, zero ambiguity between tools. The model doesn't struggle with selection when every name unambiguously describes exactly one operation. Anthropic's own guidance emphasizes clear tool descriptions as critical, and I'd argue that at scale, naming discipline is the single biggest factor in selection accuracy.

Descriptions are the real interface. The model reads tool descriptions to decide what to call. I spent significant time making each description precise — what the tool does, what parameters it requires, what it returns. Vague descriptions cause wrong tool selection. Precise descriptions don't.
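One way to keep that naming discipline honest at 698 tools is to enforce the convention mechanically. A minimal Go sketch, where the domain and action vocabularies are illustrative rather than the full production set:

```go
package main

import (
	"fmt"
	"regexp"
)

// toolName encodes the {domain}_{action}_{resource} convention. The dt_
// prefix and the domain/action lists here are examples, not the real catalog.
var toolName = regexp.MustCompile(`^dt_(platform|env|config)_(get|list|query|create|update|delete)_[a-z0-9_]+$`)

// validName reports whether a tool name follows the convention.
func validName(name string) bool { return toolName.MatchString(name) }

func main() {
	for _, n := range []string{"dt_env_get_metrics", "dt_config_create_dashboard", "getMetrics"} {
		fmt.Printf("%-28s valid=%v\n", n, validName(n))
	}
}
```

Running a check like this in CI over the registered tool list catches naming drift before it ever reaches the model.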
Domain separation keeps the selection space manageable. Splitting into three servers aligned with how the APIs are actually structured (different auth patterns, different base URLs, different response formats) means each server's tools are internally consistent. A conversation about SLOs naturally stays within Platform API tools. A configuration audit naturally stays within Configuration API tools. The model isn't choosing between 698 undifferentiated tools — it's navigating well-separated domains.
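Concretely, the three-server split shows up in Claude Desktop's `claude_desktop_config.json` as three independent stdio entries. A sketch, with hypothetical binary paths and environment variable names:

```json
{
  "mcpServers": {
    "dynatrace-platform": {
      "command": "/usr/local/bin/dt-platform-mcp",
      "env": { "DT_PLATFORM_TOKEN": "..." }
    },
    "dynatrace-environment": {
      "command": "/usr/local/bin/dt-env-mcp",
      "env": { "DT_API_TOKEN": "..." }
    },
    "dynatrace-config": {
      "command": "/usr/local/bin/dt-config-mcp",
      "env": { "DT_API_TOKEN": "..." }
    }
  }
}
```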
Typical conversations touch a small subset. The MCP client loads tool definitions at connection time. The model has the full catalog available but only invokes what's relevant to the current task. A conversation about SLOs touches maybe 5-10 tools. A configuration audit might touch 30-40. The effective selection space for any given task is much smaller than the total catalog.
Expert-in-the-loop matters for write operations. I want to be precise about the "zero hallucination" claim. Both operators — my brother and I — are Dynatrace domain experts. We review tool selections and parameters before write operations execute. We're not handing 698 tools to an unattended agent and walking away. The reliability claim holds in the context of expert-guided operation, not unsupervised autonomous execution. That's an important distinction.
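A minimal sketch of what that expert-in-the-loop gate can look like in Go, assuming the naming convention above so the action verb identifies writes (the verb list and helper names are illustrative):

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// mutating reports whether a tool name implies a write, by inspecting the
// action segment of dt_{domain}_{action}_{resource}.
func mutating(tool string) bool {
	parts := strings.SplitN(tool, "_", 4)
	if len(parts) < 4 {
		return false
	}
	switch parts[2] {
	case "create", "update", "delete":
		return true
	}
	return false
}

// confirm blocks a write until the operator approves it on the terminal.
// Reads pass through without prompting.
func confirm(tool, params string) bool {
	if !mutating(tool) {
		return true
	}
	fmt.Fprintf(os.Stderr, "WRITE %s\nparams: %s\napprove? [y/N] ", tool, params)
	line, _ := bufio.NewReader(os.Stdin).ReadString('\n')
	return strings.TrimSpace(line) == "y"
}

func main() {
	for _, t := range []string{"dt_env_get_metrics", "dt_config_create_dashboard"} {
		fmt.Printf("%-28s write=%v\n", t, mutating(t))
	}
}
```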
On Tool Search and Deferred Loading
Anthropic's Tool Search Tool and `defer_loading` are solving a real problem — context window bloat and selection accuracy at scale. I haven't integrated these yet because Claude Desktop's stdio transport handles the current setup well for expert-guided workflows. But I'm interested in whether Tool Search could further improve an already-functional large-catalog setup, particularly for scenarios involving less domain-expert guidance or fully autonomous operation. If anyone from Anthropic has insight into how Claude Desktop's stdio MCP implementation handles tool definition loading relative to the raw API, I'd be very curious to learn more.

Comparison to Official Vendor MCPs
For context on where vendor MCP implementations currently stand:
[dynatrace-oss/dynatrace-mcp](https://github.com/dynatrace-oss/dynatrace-mcp) — TypeScript, Platform API focused. DQL queries, problems, entities, logs, Davis CoPilot integration. Strong for developer workflows in IDEs. No Configuration API coverage — no auto-tags, no dashboards, no management zones, no alerting profiles. Read-oriented.

These are good starting points. But they cover a fraction of what the platforms can actually do, and they're almost entirely read-only. The gap between "query your environment" and "operate your environment" through MCP is where the real value lives, and it's where I've focused.
Tech Stack
What I'd Like to Discuss
Has anyone else gone deep on a single platform's API surface with MCP? Most implementations I've seen are either shallow (10-30 tools) or wide (many integrations, one tool each). I'm curious if others have taken the deep vertical approach.
Tool discovery and selection at scale. At 698 tools, the model still picks correctly the vast majority of the time with the patterns described above. But I'm interested in whether anyone has experimented with dynamic tool filtering, tool categories, progressive disclosure patterns, or Anthropic's Tool Search Tool to improve selection accuracy as tool counts grow — particularly for less supervised or fully autonomous workflows.
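For reference, the simplest form of dynamic filtering I'd experiment with is keyword-scoped disclosure: expose only the catalog entries matching the current task, instead of the whole catalog at once. A hypothetical Go sketch:

```go
package main

import (
	"fmt"
	"strings"
)

// tool is a trimmed-down tool definition: name plus description.
type tool struct{ name, desc string }

// filterTools is a crude progressive-disclosure sketch: keep only tools whose
// name or description mentions one of the task's keywords.
func filterTools(catalog []tool, keywords []string) []tool {
	var out []tool
	for _, t := range catalog {
		hay := strings.ToLower(t.name + " " + t.desc)
		for _, k := range keywords {
			if strings.Contains(hay, strings.ToLower(k)) {
				out = append(out, t)
				break
			}
		}
	}
	return out
}

func main() {
	catalog := []tool{
		{"dt_platform_query_dql", "Run a DQL query against Grail"},
		{"dt_config_create_dashboard", "Create a dashboard from a JSON definition"},
		{"dt_env_get_problems", "List open problems"},
	}
	for _, t := range filterTools(catalog, []string{"dashboard"}) {
		fmt.Println(t.name)
	}
}
```

The obvious weakness is that keyword matching is brittle; embedding-based retrieval or Anthropic's Tool Search Tool would be the more principled versions of the same idea.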
Read + Write patterns. Most MCP servers are read-only. Once you add write operations (creating dashboards, modifying alerting rules, changing configuration), the stakes go up. I'd be interested in how others are thinking about guardrails, confirmation flows, or approval gates for write operations in MCP.
Happy to answer questions about the architecture, the Go implementation, or lessons learned from running this at scale in production.