feat: three-tier tool coverage, safety hooks, security hardening by notque · Pull Request #5 · notque/openstack-agent-toolkit

notque · 2026-05-07T19:14:01Z

Summary

- Document all 119 MCP tools across 12 services with proper tier markers (Read/Write/Admin)
- Add PreToolUse hook (destructive-action-gate) blocking deletes and power-state changes until user approves
- Add PostToolUse hook (credential-expiry-detector) detecting auth failures with strong/weak pattern tiers
- Add ADR-002 documenting the three-tier architecture decision
- Expand rules with error handling patterns, rate limiting, stop conditions, and role awareness
- Add org standard files (CODE_OF_CONDUCT, CODEOWNERS, .editorconfig, LICENSE, CI workflow)
- Remove real employee identifiers and internal domain names from examples
- Fix duplicate tool name in hook, dead tool references in skills, heading format inconsistencies

Test plan

- `python3 tools/validate.py` passes (validates all manifests, skill frontmatter, line counts)
- Hook tests: destructive-action-gate correctly blocks all delete/stop/HARD-reboot operations
- Hook tests: credential-expiry-detector triggers on strong patterns, no false positive on single weak match
- No secrets, tokens, or real identifiers in committed files (security scan passes)
- GitHub Actions validate workflow passes on push

…ive rules
- Add 9 new service skills: DNS (Designate), load balancer (Octavia), images (Glance), object storage (Swift), secrets (Barbican), autoscaling (Castellum), shared storage (Manila), baremetal (Ironic), email (Cronus), all with parameters verified against MCP server source
- Fix parameter errors: attribute_name→attribute (hermes), id-only for app credential delete (keystone), remove phantom resource param (limes)
- Add CONTRIBUTING.md, SECURITY.md, CHANGELOG.md for v0.1.0
- Expand rules from 8 lines to comprehensive guidance: error handling patterns (401-503), rate limiting, pagination, destructive operation gates, stop conditions, max-depth directives, role awareness table
- Add CADF event format reference from sapcc/go-api-declarations
- Update README: prerequisites, MCP server setup, credential setup, smoke test, task-routing table for all 19 skills, related tools
- Update knowledge/services.md: all services with MCP prefixes, CLI tools (hermescli, limesctl), go-api-declarations, OpenStack API reference links
- Update marketplace descriptions to reflect 19-skill coverage

Behavioral rules in markdown are advisory — the LLM can rationalize past them. This hook is deterministic enforcement: it fires on every MCP tool call and blocks destructive operations (delete, stop, HARD reboot, security group removal, etc.) with exit code 2, requiring the user to individually approve each destructive action. Covers: Nova (stop/HARD reboot/delete), Cinder (delete volume/snapshot), Neutron (delete port/network/subnet/SG/FIP/router), Keystone (delete app cred/project), Designate (delete zone/recordset), Octavia (delete LB/pool/listener), Barbican (delete secret), Manila (delete share), Swift (delete container/object), Glance (delete image). Safe operations (list, get, create, soft reboot, start) pass through with exit 0 and no user friction.
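
The gate described above can be sketched roughly as follows. This is a minimal illustration, not the hook shipped in this PR: it assumes the hook receives a JSON event carrying `tool_name` and `tool_input` fields, that tool names may carry a `mcp__<server>__` prefix, and that the reboot tool takes a `type` argument. Only `neutron_delete_floating_ip` and `ironic_node_power_state` are names taken from the PR text; every other tool name is a hypothetical placeholder.

```python
import json
import sys

# Explicit deny-list. Names marked "hypothetical" are placeholders for
# illustration and are not confirmed by the PR text.
DESTRUCTIVE_TOOLS = {
    "neutron_delete_floating_ip",  # named in the PR
    "ironic_node_power_state",     # named in the PR
    "nova_stop_server",            # hypothetical
    "nova_delete_server",          # hypothetical
    "cinder_delete_volume",        # hypothetical
}
REBOOT_TOOL = "nova_reboot_server"  # hypothetical; assumed to take a "type" argument


def is_destructive(tool_name: str, tool_input: dict) -> bool:
    """Return True when the call should be held for explicit user approval."""
    # Assumption: namespaced names look like "mcp__server__tool"; keep the last part.
    short = tool_name.rsplit("__", 1)[-1]
    if short in DESTRUCTIVE_TOOLS:
        return True
    if short == REBOOT_TOOL:
        # Only HARD reboots are gated; soft reboots pass through.
        return str(tool_input.get("type", "")).upper() == "HARD"
    return False


def decide(event: dict) -> int:
    """Map a hook event to an exit code: 2 blocks the call, 0 lets it through."""
    name = event.get("tool_name", "")
    args = event.get("tool_input") or {}
    if is_destructive(name, args):
        print(f"Blocked destructive call {name}: ask the user to approve it.",
              file=sys.stderr)
        return 2
    return 0


def run(stream) -> int:
    """Entry point for a real hook script, e.g. sys.exit(run(sys.stdin))."""
    return decide(json.load(stream))
```

An explicit deny-list keeps the decision deterministic and easy to audit, at the cost of needing an update whenever a new destructive tool is added to the MCP server.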

The Hermes API returns HTTP 500 when paginating past offset 10,000. hermescli handles this with --over-10k-fix (time-based cursoring). The MCP tool has no built-in workaround, so the skill must teach agents to keep queries scoped below 10k results. Also documents all valid sort keys (from hermescli ListOpts) and updates troubleshooting section with 500-on-large-query guidance.
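
One way to keep each query below the 10k ceiling is to narrow the time range until every window is small enough. The sketch below illustrates that idea only; it is not how hermescli's `--over-10k-fix` works. `count_events` is a hypothetical callable standing in for whatever returns the number of matching events in a time range.

```python
from datetime import datetime, timedelta
from typing import Callable, List, Tuple

MAX_RESULTS = 10_000  # offset ceiling described above

Window = Tuple[datetime, datetime]


def split_windows(
    start: datetime,
    end: datetime,
    count_events: Callable[[datetime, datetime], int],  # hypothetical counter
    limit: int = MAX_RESULTS,
    min_span: timedelta = timedelta(seconds=1),
) -> List[Window]:
    """Bisect [start, end) until each window holds fewer than `limit` events.

    A window no wider than `min_span` is returned as-is even if it is still
    over the limit, so the recursion always terminates.
    """
    if count_events(start, end) < limit or end - start <= min_span:
        return [(start, end)]
    mid = start + (end - start) / 2
    return (
        split_windows(start, mid, count_events, limit, min_span)
        + split_windows(mid, end, count_events, limit, min_span)
    )
```

Each returned window can then be queried and paginated on its own. Adding further filters to the query is another way to stay under the limit.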

ADR-002 establishes the architecture for documenting all 119 MCP tools across read/write/admin tiers, mirroring the MCP server's env-gated visibility model (MCP_READ_ONLY, MCP_ADMIN_TOOLS).

Rules file updated with:
- Three-tier model documentation
- "Tool doesn't appear" troubleshooting guidance
- Write tool safety protocol (confirmed two-call pattern)
- Admin tool role awareness

Hook updated with:
- neutron_delete_floating_ip (new in PR #13)
- ironic_node_power_state (admin tier, destructive)
- Better target ID extraction (checks all common ID field names)

- Document all 119 MCP tools with proper tier markers (* write, † admin)
- Add Guardrails sections documenting server-side validation (UUID, FQDN, etc.)
- Add Security: Credential Isolation sections for Barbican, Ironic, Manila
- Add MCP_READ_ONLY and MCP_ADMIN_TOOLS to .mcp.json env config
- Add three-tier model explanation and env vars to README
- Remove duplicate sapcc-filesystems (consolidated into sapcc-shared-storage)
- Mark Castellum and Cronus as planned (not yet in MCP server)
- Update plugin.json description to reflect 119-tool coverage

Security fixes:
- Remove real employee I-number from CADF example (use D012345)
- Replace cloud.sap with cloud.example.com in README examples
- Replace monsoon3 domain with cc-demo/my-domain in examples

Hook fixes:
- Remove duplicate neutron_delete_floatingip entry (canonical name is neutron_delete_floating_ip with underscore)
- Credential detector now uses strong/weak pattern tiers to reduce false positives (single weak match no longer triggers)

Tool reference fixes:
- Fix neutron_get_subnet -> neutron_list_subnets in loadbalancer skill
- Fix limes_get_cluster_quota -> limes_get_cluster_capacity in quota skill
- Mark castellum_* and cronus_* as (planned) in README routing table
- Clarify autoscaling/email tools as NOT YET AVAILABLE
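
The strong/weak tiering in the credential detector could look something like the sketch below. The pattern lists are illustrative assumptions, not the ones in the hook. The rule that two distinct weak matches are enough to trigger is also an assumption: the PR only states that a single weak match no longer triggers.

```python
import re

# Assumed patterns for illustration; the real hook's lists may differ.
STRONG_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"\b401\b", r"token (has )?expired", r"authentication required")
]
WEAK_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (r"unauthorized", r"credentials?", r"auth(entication)? fail")
]

WEAK_THRESHOLD = 2  # assumption: at least two distinct weak signals are needed


def looks_like_auth_failure(tool_output: str) -> bool:
    """Return True when the output is likely an expired or invalid credential."""
    # Any single strong pattern is sufficient on its own.
    if any(p.search(tool_output) for p in STRONG_PATTERNS):
        return True
    # Weak patterns only count in combination, which limits false positives
    # on outputs that merely mention a word like "credential".
    weak_hits = sum(1 for p in WEAK_PATTERNS if p.search(tool_output))
    return weak_hits >= WEAK_THRESHOLD
```

Counting distinct weak patterns rather than total occurrences means an output that repeats one ambiguous word many times still does not trigger the detector.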

Standardize all skills to the same section heading pattern:
- `### Read Tools` (no qualifier)
- `### Write Tools` (requires MCP_READ_ONLY=false)
- `### Admin Tools` (requires MCP_ADMIN_TOOLS=true)

Also mark autoscaling/email as (MCP tools planned) in README skills table to set correct expectations.
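
The relationship between the two environment variables and the three headings can be sketched as below. It illustrates the gating as the headings describe it, not the MCP server's actual code. The defaults (read-only on, admin off), the accepted truthy spellings, and treating the two switches as independent are all assumptions.

```python
from typing import Mapping, Set

_TRUTHY = {"1", "true", "yes"}  # assumed spellings


def _flag(env: Mapping[str, str], name: str, default: bool) -> bool:
    """Read a boolean-like environment value, falling back to `default`."""
    raw = env.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in _TRUTHY


def visible_tiers(env: Mapping[str, str]) -> Set[str]:
    """Return the tool tiers an agent would see under the given environment."""
    tiers = {"read"}  # read tools carry no qualifier
    # Assumption: the server is read-only unless MCP_READ_ONLY is set to false.
    if not _flag(env, "MCP_READ_ONLY", default=True):
        tiers.add("write")
    # Assumption: admin tools are hidden unless MCP_ADMIN_TOOLS is set to true.
    if _flag(env, "MCP_ADMIN_TOOLS", default=False):
        tiers.add("admin")
    return tiers
```

For example, `visible_tiers({"MCP_READ_ONLY": "false"})` yields the read and write tiers. A missing tier is a likely first thing to check when a documented tool does not appear.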

notque added 11 commits May 7, 2026 11:51

docs: link service names to upstream repos in README skills table

15e17cd

feat: add PostToolUse hook for credential expiry detection

5435efe

When any MCP tool returns 401/auth error, injects context directing the agent to stop retrying and guide the user to re-authenticate via credential-setup skill. Prevents wasteful retry loops on dead tokens.

chore: remove session tracking file from repo

14c0609

.claude/session-reads.txt is per-session state that shouldn't be versioned.

chore: add .claude session state to gitignore

b7babf3

Prevent session-specific files (plans, worktrees, scheduled tasks, session reads) from being accidentally committed.

notque merged commit 03e7232 into main May 7, 2026
1 check passed

notque merged 11 commits into main from feature/audit-fixes-and-parity