A Claude Code Agent Skill that audits a code repository for architectural and design patterns. It walks the tree, builds a multi-language import graph, runs a battery of structural detectors against a 50-pattern catalog, grades each finding against a five-criterion rubric, and emits two artifacts:
- `architecture-audit.md`: human-readable report
- `architecture-audit.json`: machine-readable findings
The skill is offline, polyglot, and intentionally conservative: it refuses to claim a pattern without two independent structural signals, requires file:line evidence for every "Incorrect" verdict, and runs a self-check on the rendered report.
| Tool | What it does | What it doesn't |
|---|---|---|
| ArchUnit / ts-arch / NetArchTest | Enforces architecture rules you have already written | Does not discover them |
| ast-grep / Semgrep | Matches one structural pattern at a time | Doesn't aggregate hits into a verdict per architectural pattern |
| CodeScene / Arcan | History-driven smell detection | Online / commercial / JVM-focused |
| LLM-only "Codebase Analyzer" skills | Narrative summary | No machine-readable output, no graded findings, no per-file evidence |
This skill sits at the intersection of all four: discovery + grading + macro/meso/micro stratification + offline + scriptable. See references/prior-art.md for the full comparison.
Detection coverage and import-graph resolution for:
Python · TypeScript / JavaScript · Java · Kotlin · Scala · Groovy · Go · Rust · C# · Ruby · PHP · Swift · Dart
Each language has an idiomatic-signals cheatsheet under references/language-cheatsheets/.
Ask Claude: "audit the architecture of this repo", "what patterns does this codebase use?", "is this hexagonal/clean/layered/MVC?", "find architectural anti-patterns". The skill is auto-discovered if installed under ~/.claude/skills/ or a project's .claude/skills/.
The scripts are independently runnable. Typical sequence:
```bash
ROOT=/path/to/target/repo
OUT=$ROOT/.architecture-audit
python scripts/scan_repo.py --root "$ROOT" --mode quick --out "$OUT"
python scripts/build_import_graph.py --root "$ROOT" --out "$OUT/graph.json"
python scripts/detect_layers.py --graph "$OUT/graph.json" --out "$OUT/layers.json"
python scripts/detect_patterns.py --root "$ROOT" --inventory "$OUT/inventory.json" \
    --graph "$OUT/graph.json" --layers "$OUT/layers.json" \
    --out "$OUT/findings.json" --mode quick
python scripts/render_report.py --findings "$OUT/findings.json" \
    --template templates/audit-report.md.j2 \
    --out "$ROOT/architecture-audit.md" \
    --out-json "$ROOT/architecture-audit.json"
```

Modes:
- `--mode quick`: folder/name heuristics + ripgrep, no graph traversal beyond cycle finding. This is the default the agent uses for first-pass triage.
- `--mode deep`: full per-file structural checks; same scripts, slower run.
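For scripted use, the same five-step sequence can be assembled in Python. This is a minimal sketch, not part of the skill: `build_pipeline` and `run_pipeline` are hypothetical helper names, and the only flags used are the ones shown in the command sequence above. It assumes the working directory is the skill's root so that the relative `scripts/` and `templates/` paths resolve.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def build_pipeline(root: str, mode: str = "quick") -> List[List[str]]:
    """Return one argv list per script, in the order the README runs them."""
    if mode not in ("quick", "deep"):
        raise ValueError(f"unknown mode: {mode!r}")
    repo = Path(root)
    out = repo / ".architecture-audit"
    graph = str(out / "graph.json")
    layers = str(out / "layers.json")
    findings = str(out / "findings.json")
    py = sys.executable  # reuse the interpreter that is running this helper
    return [
        [py, "scripts/scan_repo.py", "--root", str(repo), "--mode", mode, "--out", str(out)],
        [py, "scripts/build_import_graph.py", "--root", str(repo), "--out", graph],
        [py, "scripts/detect_layers.py", "--graph", graph, "--out", layers],
        [py, "scripts/detect_patterns.py", "--root", str(repo),
         "--inventory", str(out / "inventory.json"),
         "--graph", graph, "--layers", layers,
         "--out", findings, "--mode", mode],
        [py, "scripts/render_report.py", "--findings", findings,
         "--template", "templates/audit-report.md.j2",
         "--out", str(repo / "architecture-audit.md"),
         "--out-json", str(repo / "architecture-audit.json")],
    ]


def run_pipeline(root: str, mode: str = "quick") -> None:
    """Run each step in turn and stop at the first one that fails."""
    for argv in build_pipeline(root, mode):
        subprocess.run(argv, check=True)
```

Calling `run_pipeline("/path/to/target/repo", "deep")` would run the deep variant; `check=True` makes a failing step raise instead of letting later steps consume stale files.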
| Tool | Required? | Behaviour if missing |
|---|---|---|
| `python3` ≥ 3.9 | Yes | Skill cannot run |
| `ripgrep` (`rg`) | Strongly recommended | Falls back to a slower pure-Python grep that respects the same ignore list |
| `networkx` (Python package) | Recommended | Falls back to a DFS cycle finder bounded at depth 8 |
| `jinja2` (Python package) | Recommended | Falls back to an inline templating engine that supports the bundled template's syntax |
| `tokei` | Optional | LOC counted with a naive line count |
| `ast-grep` | Optional | Structural queries fall back to regex via ripgrep |
Every degradation is recorded in findings.json under runtime.degradations and surfaced in the report's "Tooling" appendix.
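To make the `networkx` fallback concrete, here is a rough sketch of a depth-bounded DFS cycle finder together with a degradation check. It is an illustration under assumptions, not the code in `scripts/build_import_graph.py`: the adjacency-dict graph shape, the function names, and the idea that each degradation is recorded as a plain package name are all hypothetical.

```python
import importlib.util
from typing import Dict, List, Set, Tuple

MAX_DEPTH = 8  # the depth bound quoted in the table above


def bounded_cycles(graph: Dict[str, List[str]],
                   max_depth: int = MAX_DEPTH) -> List[List[str]]:
    """Find simple cycles with at most `max_depth` nodes via depth-limited DFS.

    Each cycle is reported once, starting from its smallest node name.
    """
    found: Set[Tuple[str, ...]] = set()

    def visit(start: str, node: str, path: List[str]) -> None:
        for nxt in graph.get(node, []):
            if nxt == start:
                found.add(tuple(path))  # closed a loop back to the start
            elif nxt > start and nxt not in path and len(path) < max_depth:
                visit(start, nxt, path + [nxt])

    for start in sorted(graph):
        visit(start, start, [start])
    return [list(cycle) for cycle in sorted(found)]


def detect_degradations() -> List[str]:
    """List the recommended Python packages that cannot be imported."""
    return [name for name in ("networkx", "jinja2")
            if importlib.util.find_spec(name) is None]


# Assumed shape only: the README names the key path, not the entry format.
runtime_block = {"runtime": {"degradations": detect_degradations()}}
```

Restricting each search to nodes that sort after the start node avoids reporting the same cycle once per rotation, and the depth bound keeps the recursion shallow at the cost of missing longer cycles.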
- Macro (system shape): Monolith, Modular Monolith, Microservices, Event-Driven, Hexagonal / Ports & Adapters, Clean / Onion, Layered N-tier, CQRS, Event Sourcing, Serverless, BFF, Micro-frontends, Plugin Architecture, Pipes & Filters, Actor Model, SOA, Space-based.
- Meso (component): MVC, MVVM, MVP, Repository, Unit-of-Work, Service Layer, Domain Model (rich), DDD Aggregate, Saga, Outbox, Transactional Inbox, Circuit Breaker, Bulkhead, Sidecar, Ambassador, Strangler-Fig, API Gateway, BFF (component view), Feature Flags, Specification, CRUD.
- Micro (GoF / idiom): Factory, Builder, Strategy, Observer, Decorator, Adapter, Facade, Command, Chain of Responsibility, Dependency Injection, Singleton, Template Method.
Anti-patterns the catalog covers: Big Ball of Mud, Distributed Monolith, Anemic Domain Model, God Object / God Module, Circular Dependency, Layering Violation, Leaky Abstraction, Shotgun Surgery, Feature Envy, Divergent Change, Chatty Service, Service Locator (in DI codebases), Singletonitis, Hub-like Dependency, Unstable Dependency, Cyclic Hierarchy, Implicit Cross-Module Dependency, Unrestricted ORM Leak, Sync-Over-Async, Vendor-Locked Core, RPC-Over-Broker.
The active detectors implement the most load-bearing of these (see scripts/detect_patterns.py); the catalog files are the source of truth for what could be detected and what would qualify as a finding.
```
architectural-pattern-auditor/
├── SKILL.md                      # YAML frontmatter + workflow + output contract
├── README.md                     # this file
├── LICENSE                       # MIT
├── references/
│   ├── prior-art.md              # comparison with ArchUnit, ast-grep, Semgrep, CodeScene, Arcan, Claude skills, …
│   ├── patterns-catalog.md       # ~50 patterns: intent, structural fingerprint, graph fingerprint, detector recipe
│   ├── anti-patterns-catalog.md  # 21 anti-patterns with severity floors
│   ├── evaluation-rubric.md      # 5-criteria scoring → verdict bands + confidence/severity formulas
│   └── language-cheatsheets/     # idiomatic signals per language
│       ├── csharp.md  dart.md  go.md  java.md  php.md
│       ├── python.md  ruby.md  rust.md  scala.md  swift.md
│       └── typescript.md
├── scripts/
│   ├── scan_repo.py              # inventory + per-file role guess + framework detection
│   ├── build_import_graph.py     # multi-language import resolver, JVM Gradle/Maven-aware, cycle finder
│   ├── detect_layers.py          # folder → layer assignment + upward-edge counting
│   ├── detect_patterns.py        # per-pattern detector runner + rubric application
│   └── render_report.py          # Jinja2 (with inline fallback) renderer + self-check
├── templates/
│   └── audit-report.md.j2        # 10-section report skeleton
└── assets/
    └── example-output.md         # realistic worked example
```
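The tree describes `detect_layers.py` as doing folder-to-layer assignment plus upward-edge counting. One plausible reading of that, sketched below, is to rank layers and count imports that point from a lower-ranked layer to a higher-ranked one. The layer names, their ranking, and both function names are assumptions made for illustration; the actual script may assign and rank layers differently.

```python
from collections import Counter
from pathlib import PurePosixPath
from typing import Iterable, Optional, Tuple

# Hypothetical ranking: 0 is the innermost layer, higher numbers sit further out.
LAYER_RANK = {"domain": 0, "application": 1, "infrastructure": 2, "presentation": 3}


def layer_of(path: str) -> Optional[str]:
    """Assign a file to the first known layer name found among its folders."""
    for part in PurePosixPath(path).parts[:-1]:  # skip the file name itself
        if part in LAYER_RANK:
            return part
    return None


def count_upward_edges(edges: Iterable[Tuple[str, str]]) -> Counter:
    """Count import edges that go from a lower-ranked layer to a higher one."""
    counts: Counter = Counter()
    for src, dst in edges:
        src_layer, dst_layer = layer_of(src), layer_of(dst)
        if src_layer is None or dst_layer is None:
            continue  # files outside any recognised layer are ignored
        if LAYER_RANK[src_layer] < LAYER_RANK[dst_layer]:
            counts[(src_layer, dst_layer)] += 1
    return counts
```

Under this ranking, a `domain` file importing an `infrastructure` file counts as one upward edge, while `presentation` importing `application` does not.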
Before the renderer accepts a findings file it verifies:
- Every `Incorrect` verdict has ≥1 evidence entry with `file` and `lines`
- Every `Correct` / `Partial` verdict has ≥2 entries in `signals_confirming`
- No finding has `confidence > 0.5` with a single signal
- Every recommendation starts with an imperative verb (`Extract`, `Move`, `Invert`, `Replace`, `Remove`, `Introduce`, `Split`, `Merge`, `Co-locate`, `Forbid`, `Maintain`, `Keep`, `Preserve`, `Extend`, `Confirm`, `Centralise`)
- Every recommendation is one sentence (≤ ~30 words)
self_check_issues is printed to stdout after every render. Treat any entry there as a detector bug, not as a quality issue with the audited repo.
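The five checks listed above could be expressed roughly as follows. This is a sketch under assumptions rather than the renderer's actual code: the field names `file`, `lines`, `signals_confirming`, and `confidence` come from this README, while `id`, `verdict`, `evidence`, `recommendations`, and the function name `self_check` are hypothetical. The one-sentence test is a crude heuristic (word count plus a search for an interior full stop).

```python
from typing import Any, Dict, List

IMPERATIVE_VERBS = {
    "Extract", "Move", "Invert", "Replace", "Remove", "Introduce", "Split",
    "Merge", "Co-locate", "Forbid", "Maintain", "Keep", "Preserve", "Extend",
    "Confirm", "Centralise",
}
MAX_WORDS = 30


def self_check(findings: List[Dict[str, Any]]) -> List[str]:
    """Return one message per violated rule; an empty list means the file passes."""
    issues: List[str] = []
    for index, finding in enumerate(findings):
        tag = finding.get("id", f"finding[{index}]")
        verdict = finding.get("verdict")
        signals = finding.get("signals_confirming", [])
        evidence = finding.get("evidence", [])

        if verdict == "Incorrect" and not any(
            entry.get("file") and entry.get("lines") for entry in evidence
        ):
            issues.append(f"{tag}: Incorrect verdict without file:line evidence")
        if verdict in ("Correct", "Partial") and len(signals) < 2:
            issues.append(f"{tag}: {verdict} verdict with fewer than 2 signals")
        if finding.get("confidence", 0.0) > 0.5 and len(signals) <= 1:
            issues.append(f"{tag}: confidence > 0.5 with at most one signal")

        for rec in finding.get("recommendations", []):
            words = rec.split()
            if not words or words[0] not in IMPERATIVE_VERBS:
                issues.append(f"{tag}: recommendation lacks a leading imperative verb")
            if len(words) > MAX_WORDS or ". " in rec.strip().rstrip("."):
                issues.append(f"{tag}: recommendation is not one short sentence")
    return issues
```

A finding can trip more than one rule at once, for example a `Correct` verdict with a single signal and a confidence of 0.9.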
- Cycle finding without `networkx` is slow. The fallback DFS visits up to depth 8 across all nodes; on a 6k-node graph this can take several minutes. Install `networkx` to drop it to seconds.
- C# / Swift / Ruby resolvers are partial. C# uses a namespace declaration scan; Swift modules are not resolved (link-time only); Ruby resolves only `require_relative`. PRs welcome.
- MVC, Layered, and Service Layer verdicts are still repo-wide. Per-subtree splitting is on the roadmap; today these can over-claim on polyglot monorepos with a single MVC frontend.
- Spring Boot DI is now detected via constructor injection plus `@Service` / `@Component` annotations; older codebases relying on field injection (`@Autowired` on fields) are detected, but pure XML-based wiring is not.
- History-augmented anti-patterns (Shotgun Surgery, Unstable Dependency, Implicit Cross-Module Dependency) require `.git` and are only attempted in `--mode deep` against a non-shallow clone.
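The precondition for the history-augmented detectors can be checked cheaply before running them. The sketch below assumes a conventional checkout where `.git` is a directory; Git marks a shallow clone by keeping a `shallow` file inside that directory. The function name `history_available` is hypothetical, and worktrees or submodules (where `.git` is a file) are deliberately treated as unavailable here.

```python
from pathlib import Path


def history_available(root: str, mode: str) -> bool:
    """True when deep mode is requested and the repo has full Git history."""
    git_dir = Path(root) / ".git"
    return (
        mode == "deep"
        and git_dir.is_dir()                      # a regular, non-bare checkout
        and not (git_dir / "shallow").exists()    # marker file of a shallow clone
    )
```

If this returns `False` for a CI checkout, fetching full history first (for example with `git fetch --unshallow`) removes the shallow marker.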
The skill follows the Agent Skills frontmatter spec. version in SKILL.md is bumped on every change to either the output schema or the detector contract. Adding a new pattern or language cheatsheet does not bump the version.
Issues and PRs welcome. Two areas where contributions are highest-value:
- New language cheatsheets with concrete detector recipes (one file per language; the existing ones are a template).
- Per-subtree detection in macro patterns so the same repo can be `Layered` for one bounded context and `Hexagonal` for another without averaging.