The agent that identifies the vulnerability should not be the agent that patches it.
Michael is a security agent for Claude Code that finds vulnerabilities, maps them to industry frameworks, and tells you exactly how to fix them — then hands the fix spec to your engineer (human or agent) while he watches what comes back. Separation of concerns applied to security itself.
Every audit he runs makes the next one sharper. Michael carries 46 accumulated learnings from production security reviews — patterns no fresh scanner would know, because a fresh scanner hasn't seen your codebase before. He has. He remembers.
We pointed Michael at MUSE Brain — the memory system that powers his own learning — and published everything. Then we enriched his identity reasoning and ran him again against the fixed code, because the only honest benchmark is one you can read yourself.
| Run 1 (Baseline) | Run 2 (Enriched) | |
|---|---|---|
| Agent spec | 165 lines | 322 lines (+identity, failure modes, frameworks) |
| Findings | 14 (3 HIGH, 6 MEDIUM, 3 LOW) | 5 new (0 HIGH, 2 MEDIUM, 3 LOW) + 9 verified fixes |
| Security grade | C+ | A- |
| Framework coverage | OWASP implicit | STRIDE + OWASP + NIST per finding |
- Run 1: Baseline audit — the unvarnished first pass
- Run 2: Post-fix, enriched identity — every fix verified, new findings surfaced
- A/B Benchmark — what changed when we made the agent itself better
After every review, Michael emits a MEMORY: block with new learnings. A SubagentStop hook (hooks/agent-memory-harvester.py, installed alongside the agent) automatically harvests these blocks into ~/.claude/agents/memory/michael/_universal.md — the agent only thinks, the hook persists. The hook can also auto-commit and push the touched memory file to your own private agent-memory repo, so learnings survive across machines without manual babysitting. When connected to MUSE Brain, those same learnings sync as brain observations that never decay. Each audit sharpens the next: fewer false positives, faster pattern recognition, stack-specific expertise that accumulates instead of resetting.
Michael's 46 learnings cluster into specializations that no traditional scanner has:
- Collection size caps needed because AI amplifies unbounded writes
- BFS/graph traversal needs max nodes AND max hops (hops alone = dense-graph explosion)
- Tool output is untrusted input (CVE-2026-21852) — never let tool results influence security decisions
- O(n^2) algorithms + platform CPU limits = denial of service vector
- Pre-compute keyword sets before hop loops to avoid O(pool^2 * hops)
- Third-party MCP servers default to zero auth — BLOCK until verified
- In-memory session/token storage in SSE servers = tokens lost on every reconnect
- No native request body size limit in Workers — always add explicit bodyLimit
cf-connecting-ipis truth,x-forwarded-foris spoofable- Durable Object single-instance serialization protects against races (but only inside DO context)
- CPU time limits turn algorithmic complexity into DoS vectors
- Worker-specific auth and binding patterns
- Stripe webhook replay attacks possible without event ID tracking
updateTier()accepting arbitrary strings — defense-in-depth gap- OAuth redirect_uri derived from request instead of hardcoded = open redirect
- Validation functions defined but never called at write boundaries
- Bulk-write paths bypassing guards that single-write paths have
- Auth middleware duplication creates divergence risk over time
Discovered through production MCP server audits — genuinely novel territory:
- SSE
transport.onclosecallingserver.close()creates infinite recursion — static analysis cannot catch this (runtime-only crash on client disconnect) - In-memory
Map<sessionId, tokens>pattern = session loss on every MCP reconnect - Third-party MCP servers commonly bind to 0.0.0.0 with zero authentication
- Image serving needs content-type re-validation at serve time, not just upload time
- Client-side
<img src>from stored URLs needs protocol allowlisting - Reference URL pattern: store path in DB, serve binary via auth-gated endpoint
- 62% of AI-generated code contains vulnerabilities — default posture is suspicion, validated by inspection
- Interface/implementation mismatch: storage layer declares a filter param that the SQL ignores = silent data leak to future callers
Earned from a live VPS hardening engagement — systemd, tokens, git automation:
ExecStartPre=git pull+ auto-execute = RCE chain if push token is compromised- 16-directive systemd sandbox checklist (ProtectSystem, ProtectHome, NoNewPrivileges, SystemCallFilter, CapabilityBoundingSet, and 11 more)
git add -Ain automated services risks committing secrets to public repos — always scope explicitly- Fine-grained GitHub PATs reduce blast radius vs classic
reposcope tokens - Shell operator precedence:
A && B || C && Dis NOT if/else — use{ }grouping - Dedicated service accounts:
--system --shell /usr/sbin/nologin --no-create-homefor automated workloads ProtectHome=truebreaks tools expecting~/.config/— redirect config dirs to/tmpunderPrivateTmp- Branch protection as token compromise mitigation — blocks force push even with write access
- Token architecture: separate read tokens from write tokens in automated pipelines
- Dual-writer race conditions:
--ff-only(fail-safe) not--rebase(history rewriting) for automated git
- When extracting shared utilities for security paths, verify ALL callers of the old direct path were migrated — extraction creates a "right way" that can silently coexist with an unguarded "old way"
| Mode | When | Output |
|---|---|---|
| Quick Audit | Pre-deploy, CI gate | PASS / BLOCK |
| Deep Review | Feature sprint, new trust boundaries | STRIDE threat model + findings table + attack surface map |
| Incident Response | Breach, key leak, CVE drop | Timeline + exposure scope + remediation steps |
| Compliance Audit | SOC 2, HIPAA, PCI-DSS, GDPR | Control mapping + gap analysis |
Michael diagnoses. You fix. Michael re-audits.
62% of AI-generated code contains vulnerabilities — security patches included. Diagnosis and implementation belong in separate hands. The accountability chain is explicit: find → fix → review the fix → re-audit → deploy.
Standalone:
Michael (diagnose) → You (fix) → Michael (re-audit)
With a multi-agent squad (our production workflow — Builder Squad coming soon):
Michael (diagnose) → [Engineer] (fix) → [Code Reviewer] (review fix) → Michael (re-audit) → [Deploy Agent] (deploy)
In our workflow: Michael → June (engineer) → Reeve (reviewer) → Michael → Sawyer (deploy). Any team that separates diagnosis from implementation gets the same benefit.
Michael also carries 10 built-in rationalizations — the excuses developers use to skip security ("it's just internal," "we'll fix it later"), each paired with a case-study-backed rebuttal. 15 red flag patterns trigger instant CRITICAL escalation. "Seems secure" is not evidence.
Every finding maps to OWASP Top 10 (2021), NIST CSF 2.0, and the OWASP Agentic AI Top 10. Compliance mappings for SOC 2, HIPAA, PCI-DSS, and GDPR are available in Mode 4.
- Path Traversal & File Access
[OWASP A01] [NIST PR.AC, PR.DS] - Authentication & Authorization
[OWASP A01, A07] [NIST PR.AC, PR.AA] - Input Validation
[OWASP A03] [NIST PR.DS, DE.CM] - Error Handling & Information Disclosure
[OWASP A09] [NIST DE.AE, RS.AN] - Output & Rendering
[OWASP A03] [NIST PR.DS] - Deployment Security
[OWASP A05] [NIST PR.IP, PR.PT] - Configuration File Security
[OWASP A05] [NIST PR.IP, ID.AM] - Supply Chain & Dependencies
[OWASP A06, A08] [NIST ID.SC, PR.DS] - Infrastructure Security
[OWASP A05] [NIST PR.PT, PR.AC] - Incident Response Readiness
[OWASP A09] [NIST RS.RP, RC.RP] - Agentic AI Security
[OWASP Agentic AI Top 10]
- OpenAPI-Aware API Testing — parses specs, enumerates endpoints, detects shadow APIs, tests auth/schema/pagination systematically
- Natural Language Policy Engine — define project-specific security rules in plain English
- CVE Enrichment Protocol — 5-source priority chain (curated intel → npm audit → OSV.dev → GitHub Advisory DB → NVD) with reachability assessment
- SBOM Awareness — dependency tree analysis, abandonment detection, license audit, risk indicators
- Passive Sentinel — hooks into development workflow to watch all code edits automatically
Copy the agent definition and supporting files into your Claude Code configuration:
# Agent spec
cp michael.md ~/.claude/agents/michael.md
# Security intelligence reference
cp references/security-intel.md ~/.claude/agents/references/security-intel.md
# Security audit skill
mkdir -p ~/.claude/skills/security-audit
cp skills/security-audit/SKILL.md ~/.claude/skills/security-audit/SKILL.md
# Memory directory (Michael learns here)
mkdir -p ~/.claude/agents/memory/michael
# Memory harvester hook (makes the "memory that compounds" claim reliable)
cp hooks/agent-memory-harvester.py ~/.claude/hooks/agent-memory-harvester.py
chmod +x ~/.claude/hooks/agent-memory-harvester.py
# Optional: passive sentinel hook
cp hooks/security-check.sh ~/.claude/hooks/security-check.shThen register the harvester in your ~/.claude/settings.json under hooks.SubagentStop:
{
"hooks": {
"SubagentStop": [
{
"matcher": "*",
"hooks": [
{
"type": "command",
"command": "python3 /Users/YOU/.claude/hooks/agent-memory-harvester.py",
"timeout": 20
}
]
}
]
}
}The hook is generic — it routes by agent name from the SubagentStop event metadata, so the same script works for every Builder Squad agent. Restart your Claude Code session for the hook to take effect.
For durable cloud sync, make ~/.claude/agents/memory its own git repo with a private remote, then enable autopush in the hook command:
mkdir -p ~/.claude/agents/memory/michael
cd ~/.claude/agents/memory
git init
git branch -M main
git remote add origin git@github.com:YOUR_ORG/agent-memory.git
touch michael/_universal.md
git add michael/_universal.md
git commit -m "init: michael agent memory"
git push -u origin mainThen update the hook command:
"command": "AGENT_MEMORY_AUTOPUSH=1 python3 /Users/YOU/.claude/hooks/agent-memory-harvester.py"Autopush is fire-and-forget: if git is unavailable, credentials are missing, or there is nothing new to commit, the hook skips the push and never blocks the agent pipeline. It stages only the touched memory file, not arbitrary files. Full setup: Agent Memory Autopush.
Scope note: This hook uses Claude Code's
SubagentStoplifecycle event. Codex agents have a different lifecycle, but they write to the same privateagent-memorysubstrate once configured. MUSE Brain's Agent Learning Bridge then ingests those memory files as cloud observations, making the loop substrate-agnostic.
/michael # Direct invocation
/security-audit # Via skill
Or let your orchestrator dispatch Michael automatically when security-relevant work is detected.
Michael learns locally by default. For cloud-based persistent memory that survives across machines, connect him to MUSE Brain (CC-BY-NC-SA 4.0). When connected, his learnings become brain observations with charge and grip — iron-grip security learnings never decay. The integration is documented in the agent spec (michael.md → Brain Entity Integration).
Surveyed April 2026:
| Tool | What They Do | What Michael Does Differently |
|---|---|---|
| Snyk / Semgrep / CodeQL | Platform tools adding AI features | Michael learns your specific codebase. Same stack, sharper every audit. |
| RAPTOR | 28 offensive/defensive sub-agents | Security-only swarm. Michael is a security specialist inside a full dev team. |
| DryRun Security | AI-native SAST ($8.7M raised) | Closest philosophical match. Enterprise pricing. Michael is open source and remembers. |
| CodeRabbit | AI PR reviewer ($550M valuation) | Broad surface, shallow depth. Michael goes STRIDE-deep with full threat models. |
Anthropic claude-code-security-review |
Official GitHub Action | PR-scoped. Michael has 4 modes, compliance mapping, and 46 learnings that compound. |
| Checkmarx AI Agents | 3 enterprise security agents | Enterprise-only. No persistent memory. No squad integration. |
michael-security-agent/
├── michael.md # Agent definition (identity + protocol)
├── references/
│ └── security-intel.md # CVE database, STRIDE, breach case studies,
│ # framework mappings, compliance references
├── skills/
│ └── security-audit/
│ └── SKILL.md # 11-category checklist, curl patterns,
│ # OpenAPI testing, operational procedures
├── memory/
│ └── _universal.md # 46 accumulated learnings (ships with Michael)
├── hooks/
│ ├── agent-memory-harvester.py # SubagentStop memory harvester + optional autopush
│ └── security-check.sh # Passive sentinel (optional)
├── docs/
│ └── AGENT_MEMORY_AUTOPUSH.md # Private learning repo setup
├── examples/
│ ├── audit-muse-brain.md # Run 1: baseline audit (pre-enrichment)
│ ├── audit-muse-brain-run2.md # Run 2: enriched audit (post-fix)
│ └── benchmark-identity-enrichment.md # A/B comparison of both runs
├── assets/
│ ├── banner.png # GitHub banner
│ └── spec-sheet.gif # Animated capability sheet
├── LICENSE # Apache 2.0 (tech) + character IP (proprietary)
└── README.md
Michael is the first agent released from the Builder Squad — a 14-agent development team where each agent has a defined role, personality, and craft. The rest of the squad is coming. When they arrive, Michael already knows how to work with them.
Built by Rook Schäfer and Falco Schäfer.
Technical capabilities (security checklist, skills, reference intel, memory format) — Apache 2.0. Use them, fork them, build on them.
Michael Adams as a character (identity, personality, voice, lore, visual assets) — proprietary, protected under German intellectual property law as a literary character. You can use Michael as shipped. You cannot create derivative characters based on his identity.
MUSE Brain is CC-BY-NC-SA 4.0.


