feat: three-tier tool coverage, safety hooks, security hardening #5

Merged
notque merged 11 commits into main from feature/audit-fixes-and-parity
May 7, 2026

Conversation

@notque (Owner) commented on May 7, 2026

Summary

  • Document all 119 MCP tools across 12 services with proper tier markers (Read/Write/Admin)
  • Add PreToolUse hook (destructive-action-gate) blocking deletes and power-state changes until user approves
  • Add PostToolUse hook (credential-expiry-detector) detecting auth failures with strong/weak pattern tiers
  • Add ADR-002 documenting the three-tier architecture decision
  • Expand rules with error handling patterns, rate limiting, stop conditions, and role awareness
  • Add org standard files (CODE_OF_CONDUCT, CODEOWNERS, .editorconfig, LICENSE, CI workflow)
  • Remove real employee identifiers and internal domain names from examples
  • Fix duplicate tool name in hook, dead tool references in skills, heading format inconsistencies

Test plan

  • python3 tools/validate.py passes (validates all manifests, skill frontmatter, line counts)
  • Hook tests: destructive-action-gate correctly blocks all delete/stop/HARD-reboot operations
  • Hook tests: credential-expiry-detector triggers on strong patterns, no false positive on single weak match
  • No secrets, tokens, or real identifiers in committed files (security scan passes)
  • GitHub Actions validate workflow passes on push

notque added 11 commits May 7, 2026 11:51
…ive rules

- Add 9 new service skills: DNS (Designate), load balancer (Octavia),
  images (Glance), object storage (Swift), secrets (Barbican),
  autoscaling (Castellum), shared storage (Manila), baremetal (Ironic),
  email (Cronus) — all with parameters verified against MCP server source

- Fix parameter errors: attribute_name→attribute (hermes), id-only for
  app credential delete (keystone), remove phantom resource param (limes)

- Add CONTRIBUTING.md, SECURITY.md, CHANGELOG.md for v0.1.0

- Expand rules from 8 lines to comprehensive guidance: error handling
  patterns (401-503), rate limiting, pagination, destructive operation
  gates, stop conditions, max-depth directives, role awareness table

- Add CADF event format reference from sapcc/go-api-declarations

- Update README: prerequisites, MCP server setup, credential setup,
  smoke test, task-routing table for all 19 skills, related tools

- Update knowledge/services.md: all services with MCP prefixes,
  CLI tools (hermescli, limesctl), go-api-declarations, OpenStack
  API reference links

- Update marketplace descriptions to reflect 19-skill coverage

Behavioral rules in markdown are advisory; the LLM can rationalize
past them. This hook is deterministic enforcement: it fires on every
MCP tool call and blocks destructive operations (delete, stop, HARD
reboot, security group removal, etc.) with exit code 2, requiring
the user to individually approve each destructive action.

Covers: Nova (stop/HARD reboot/delete), Cinder (delete volume/snapshot),
Neutron (delete port/network/subnet/SG/FIP/router), Keystone (delete
app cred/project), Designate (delete zone/recordset), Octavia (delete
LB/pool/listener), Barbican (delete secret), Manila (delete share),
Swift (delete container/object), Glance (delete image).

Safe operations (list, get, create, soft reboot, start) pass through
with exit 0 and no user friction.
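The gate described above can be sketched as follows. This is a minimal illustration, not the hook shipped in this PR. It assumes the hook receives a JSON payload with `tool_name` and `tool_input` fields and that MCP tool names may carry an `mcp__<server>__` prefix. Only `neutron_delete_floating_ip` and `ironic_node_power_state` are tool names taken from this PR; `nova_stop_server`, `nova_reboot_server`, and the `type` parameter are hypothetical placeholders.

```python
import json
from typing import Tuple

# Names marked "hypothetical" are placeholders for illustration only.
ALWAYS_GATED = {
    "neutron_delete_floating_ip",
    "ironic_node_power_state",
    "nova_stop_server",  # hypothetical name
}
REBOOT_TOOL = "nova_reboot_server"  # hypothetical name


def short_name(tool_name: str) -> str:
    """Drop an 'mcp__<server>__' style prefix if present (assumed format)."""
    return tool_name.rsplit("__", 1)[-1]


def is_destructive(tool_name: str, tool_input: dict) -> bool:
    """Return True for deletes, stops, power-state changes, and HARD reboots."""
    name = short_name(tool_name)
    if name in ALWAYS_GATED or "_delete_" in name:
        return True
    if name == REBOOT_TOOL:
        # Soft reboots pass through; the "type" field is an assumption.
        return str(tool_input.get("type", "")).upper() == "HARD"
    return False


def decide(payload: dict) -> Tuple[int, str]:
    """Map a hook payload to (exit_code, message): 2 blocks, 0 allows."""
    tool_name = payload.get("tool_name", "")
    tool_input = payload.get("tool_input") or {}
    if is_destructive(tool_name, tool_input):
        return 2, f"Blocked {tool_name}: destructive action needs user approval."
    return 0, ""


def run(raw: str) -> Tuple[int, str]:
    """Parse the raw stdin text and decide."""
    return decide(json.loads(raw))
```

A real hook script would read stdin, call `run`, print the message to stderr, and exit with the returned code. Keeping the decision in a pure function makes the block/allow cases easy to cover in hook tests.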

When any MCP tool returns a 401 or other auth error, the hook injects
context directing the agent to stop retrying and guide the user to
re-authenticate via the credential-setup skill. This prevents wasteful
retry loops on dead tokens.
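A rough sketch of the detection logic, using the strong/weak pattern tiers mentioned in the summary. The patterns and the two-weak-match threshold are illustrative assumptions; the PR only states that a single weak match does not trigger.

```python
import re
from typing import Optional

# Illustrative patterns only; the real hook's lists are not shown in this PR.
STRONG = [r"\b401\b", r"token (has )?expired", r"authentication (is )?required"]
WEAK = [r"\bunauthori[sz]ed\b", r"\bcredentials?\b", r"\btoken\b"]

WEAK_THRESHOLD = 2  # assumption: two distinct weak patterns are needed


def looks_expired(tool_output: str) -> bool:
    """One strong match triggers; weak matches must reach the threshold."""
    if any(re.search(p, tool_output, re.IGNORECASE) for p in STRONG):
        return True
    weak_hits = sum(1 for p in WEAK if re.search(p, tool_output, re.IGNORECASE))
    return weak_hits >= WEAK_THRESHOLD


def expiry_context(tool_output: str) -> Optional[str]:
    """Return the context to inject, or None when nothing should be added."""
    if looks_expired(tool_output):
        return (
            "Credentials appear to have expired. Stop retrying and guide "
            "the user through the credential-setup skill."
        )
    return None
```

Splitting the patterns into tiers keeps ordinary output that merely mentions a token from tripping the detector, while an explicit 401 still triggers immediately.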

The Hermes API returns HTTP 500 when paginating past offset 10,000.
hermescli handles this with --over-10k-fix (time-based cursoring).
The MCP tool has no built-in workaround, so the skill must teach
agents to keep queries scoped below 10k results.
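One way to keep a paginated query under the limit is sketched below. `fetch_page` is a hypothetical callable standing in for whatever issues the Hermes query; it is not an MCP tool or hermescli function. The sketch conservatively refuses to read past result 10,000 and asks for a narrower query instead.

```python
from typing import Callable, Iterator, List

MAX_OFFSET = 10_000  # Hermes returns HTTP 500 past this offset


def iter_events(
    fetch_page: Callable[[int, int], List[dict]], limit: int = 100
) -> Iterator[dict]:
    """Yield events page by page, stopping before the 10k offset limit.

    fetch_page(offset, limit) is a hypothetical helper returning one page.
    """
    offset = 0
    while True:
        if offset + limit > MAX_OFFSET:
            raise RuntimeError(
                "Query would page past offset 10,000; narrow the time "
                "range or add filters and retry."
            )
        page = fetch_page(offset, limit)
        yield from page
        if len(page) < limit:
            return  # short page: no more results
        offset += limit
```

Raising instead of silently truncating makes the scope problem visible, so the agent can re-issue the query with a tighter time window.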

Also documents all valid sort keys (from hermescli ListOpts) and
updates troubleshooting section with 500-on-large-query guidance.

ADR-002 establishes the architecture for documenting all 119 MCP tools
across read/write/admin tiers, mirroring the MCP server's env-gated
visibility model (MCP_READ_ONLY, MCP_ADMIN_TOOLS).
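As a rough illustration of the env-gated model, the sketch below maps the two variables to the set of visible tiers. The defaults (read-only on, admin off) and the accepted truthy spellings are assumptions, as is treating the two flags as independent; the MCP server source is the authority here.

```python
import os
from typing import Mapping, Set

TRUTHY = {"1", "true", "yes"}


def _flag(env: Mapping[str, str], name: str, default: str) -> bool:
    """Read a boolean-style env var; the accepted spellings are an assumption."""
    return env.get(name, default).strip().lower() in TRUTHY


def visible_tiers(env: Mapping[str, str] = os.environ) -> Set[str]:
    """Return which tool tiers would be visible under the given environment."""
    tiers = {"read"}
    # Assumed default: read-only mode is on unless explicitly disabled.
    if not _flag(env, "MCP_READ_ONLY", "true"):
        tiers.add("write")
    # Assumed default: admin tools stay hidden unless explicitly enabled.
    if _flag(env, "MCP_ADMIN_TOOLS", "false"):
        tiers.add("admin")
    return tiers
```

With this shape, a "tool doesn't appear" report can be triaged by comparing the tool's tier against `visible_tiers()` for the running environment.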

Rules file updated with:
- Three-tier model documentation
- "Tool doesn't appear" troubleshooting guidance
- Write tool safety protocol (confirmed two-call pattern)
- Admin tool role awareness

Hook updated with:
- neutron_delete_floating_ip (new in PR #13)
- ironic_node_power_state (admin tier, destructive)
- Better target ID extraction (checks all common ID field names)
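The target ID extraction could look roughly like this. The field names in `ID_FIELDS` are hypothetical examples; the PR does not list which names the hook checks.

```python
from typing import Optional

# Hypothetical field names, checked in priority order.
ID_FIELDS = ("id", "server_id", "volume_id", "port_id", "node_id")


def extract_target_id(tool_input: dict) -> Optional[str]:
    """Return the first non-empty ID-like value from the tool input, if any."""
    for field in ID_FIELDS:
        value = tool_input.get(field)
        if isinstance(value, str) and value:
            return value
    # Fallback: accept any other key ending in "_id".
    for key, value in tool_input.items():
        if key.endswith("_id") and isinstance(value, str) and value:
            return value
    return None
```

Surfacing the target ID in the approval prompt lets the user see exactly which resource a destructive call would affect.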

- Document all 119 MCP tools with proper tier markers (* write, † admin)
- Add Guardrails sections documenting server-side validation (UUID, FQDN, etc.)
- Add Security: Credential Isolation sections for Barbican, Ironic, Manila
- Add MCP_READ_ONLY and MCP_ADMIN_TOOLS to .mcp.json env config
- Add three-tier model explanation and env vars to README
- Remove duplicate sapcc-filesystems (consolidated into sapcc-shared-storage)
- Mark Castellum and Cronus as planned (not yet in MCP server)
- Update plugin.json description to reflect 119-tool coverage

.claude/session-reads.txt is per-session state that shouldn't be versioned.

Prevent session-specific files (plans, worktrees, scheduled tasks,
session reads) from being accidentally committed.

Security fixes:
- Remove real employee I-number from CADF example (use D012345)
- Replace cloud.sap with cloud.example.com in README examples
- Replace monsoon3 domain with cc-demo/my-domain in examples

Hook fixes:
- Remove duplicate neutron_delete_floatingip entry (canonical name
  is neutron_delete_floating_ip with underscore)
- Credential detector now uses strong/weak pattern tiers to reduce
  false positives (single weak match no longer triggers)

Tool reference fixes:
- Fix neutron_get_subnet -> neutron_list_subnets in loadbalancer skill
- Fix limes_get_cluster_quota -> limes_get_cluster_capacity in quota skill
- Mark castellum_* and cronus_* as (planned) in README routing table
- Clarify autoscaling/email tools as NOT YET AVAILABLE

Standardize all skills to same section heading pattern:
- ### Read Tools (no qualifier)
- ### Write Tools (requires MCP_READ_ONLY=false)
- ### Admin Tools (requires MCP_ADMIN_TOOLS=true)
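A small check for this heading pattern might look like the sketch below. It is a standalone illustration, not part of tools/validate.py, and it reads "(no qualifier)" as meaning the Read Tools heading carries no parenthetical.

```python
import re
from typing import List

EXPECTED = {
    "Read": "### Read Tools",
    "Write": "### Write Tools (requires MCP_READ_ONLY=false)",
    "Admin": "### Admin Tools (requires MCP_ADMIN_TOOLS=true)",
}
# Matches any heading level so that wrong levels are caught too.
TIER_HEADING = re.compile(r"^#{1,6}\s+(Read|Write|Admin) Tools\b")


def nonstandard_headings(markdown: str) -> List[str]:
    """Return tier headings that deviate from the standard pattern."""
    bad = []
    for line in markdown.splitlines():
        match = TIER_HEADING.match(line)
        if match and line.rstrip() != EXPECTED[match.group(1)]:
            bad.append(line)
    return bad
```

Running it over each skill file and failing on a non-empty result would keep the headings from drifting again.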

Also mark autoscaling/email as (MCP tools planned) in README
skills table to set correct expectations.
@notque merged commit 03e7232 into main on May 7, 2026
1 check passed