Launchers → Orchestration → Workflows → SOPs → Tools & Primitives (+ Guardrails)
This repository documents a layered architecture pattern for orchestrating AI-assisted workflows in Claude Code. It's not a framework to install — it's a thinking framework to internalize.
We name each layer by its conceptual role and give the Claude Code primitive as the current implementation. The pattern is harness-agnostic: the concepts survive a switch to another agentic tool; only the implementation artifacts change.
Want a compact reference? See the handy gist — a condensed version you can feed directly to your AI agent.
The architecture separates five concerns: how to invoke the stack, what a user asks for, how to sequence the work, how to do one thing well, and mechanical execution. That gives five layers plus a cross-cutting dimension, each making you think harder about the right abstraction level.
┌──────────────────────────────────────────────────────────────────────┐
│ Layer 4: Launchers (e.g. Justfile / Makefile / run.sh / Python) │
│ Management scripts that invoke `claude` with specific │
│ flags (--plugin-dir, --agent, --settings, -p, ...). │
│ Make the stack reproducibly callable from cron/CI/aliases. │
├──────────────────────────────────────────────────────────────────────┤
│ Layer 3: Orchestration (e.g. Custom Commands — .claude/commands/) │
│ "When and in what sequence" — user-facing /slash-commands. │
│ Thin. 5–15 lines. Orchestrate, don't implement. │
├──────────────────────────────────────────────────────────────────────┤
│ Layer 2: Workflows (e.g. Custom Agents — .claude/agents/*.md) │
│ "How to sequence capabilities" — specialist pipelines. │
│ Sequence SOPs, handle errors, ask at decision points. │
├──────────────────────────────────────────────────────────────────────┤
│ Layer 1: SOPs / Capabilities │
│ (e.g. Skills — .claude/skills/*/SKILL.md) │
│ Standard Operating Procedures — what Claude can do, │
│ documented as reusable capabilities. One SOP = one thing │
│ done well. Can bundle L0 tools. │
├──────────────────────────────────────────────────────────────────────┤
│ Layer 0: Tools & Primitives (e.g. Scripts — scripts/*.sh, *.py) │
│ Deterministic substrate. No AI. Testable standalone. │
│ Instrumentable for telemetry. "Below the AI." │
├──────────────────────────────────────────────────────────────────────┤
│ Bonus : Guardrails (e.g. Hooks — .claude/settings.json) │
│ Cross-cutting enforcement. A dimension, not a layer. │
└──────────────────────────────────────────────────────────────────────┘
Numbering direction. We number bottom-up: L0 is the foundation, L4 is the outermost entry point. This matches IndyDevDan's original framing (he numbers Skills=1, Agents=2, Commands=3, Justfile=4) while still giving scripts their own explicit tier as L0. See docs/concepts-vs-implementation.md for the full mapping.
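To make the mapping in the diagram concrete, here is a minimal sketch that classifies a repo-relative path by layer. The path conventions come from the diagram; the function name `classify_layer` and its return labels are illustrative assumptions, not something the plugin ships.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Hypothetical helper: path conventions follow the diagram above,
# but the names and labels here are assumptions for illustration.
LAUNCHER_NAMES = {"justfile", "makefile", "run.sh"}


def classify_layer(path: str) -> str:
    """Return a layer label for a repo-relative POSIX path."""
    p = PurePosixPath(path)
    posix = p.as_posix()
    if posix == ".claude/settings.json":
        return "Guardrails"
    if fnmatchcase(posix, ".claude/commands/*.md"):
        return "L3 Orchestration"
    if fnmatchcase(posix, ".claude/agents/*.md"):
        return "L2 Workflows"
    if fnmatchcase(posix, ".claude/skills/*/SKILL.md"):
        return "L1 SOPs"
    # Scripts count as L0 wherever a scripts/ directory appears,
    # including scripts bundled inside a skill.
    if "scripts" in p.parts[:-1] and p.suffix in {".sh", ".py"}:
        return "L0 Tools"
    if p.name.lower() in LAUNCHER_NAMES:
        return "L4 Launchers"
    return "unclassified"


print(classify_layer(".claude/skills/architecture-audit/SKILL.md"))
```

A path-based rule like this is only a first approximation: a thick command that implements logic is still in the wrong layer even if it lives in the right directory.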
"Don't outsource learning how to build with the most important technology of our lifetime, agents. […] Agentic engineers know what their agents are doing, and they know it so well, they don't have to look. Vibe coders don't know, and they don't look. Don't outsource learning."
— IndyDevDan, My 4-Layer Claude Code Playwright CLI Skill (video; full excerpt and transcript in
docs/reference-cache/indydevdan/)
This plugin is designed to be food for your thinking, not a substitute for it. Its commands ask you to articulate before they act. Its hooks nudge, but never block. Its journal records your reasoning so you — not a stateless agent — own the design.
The teach-back system (/four-layer-architecture:feynman-check,
/four-layer-architecture:socratic,
/four-layer-architecture:brainstorm-architecture) scaffolds two kinds of
understanding:
- Your own system, in your domain: can you explain how your billing workflow sequences its skills? What breaks if a specific step fails?
- The 4-layer meta-pattern: do you actually know the difference between context: fork and the skills: field? Between a hook that replaces a skill and one that augments it?
If you find yourself wanting to skip every teach-back prompt, that is a signal worth taking seriously: either the prompt is wrong, or the part of your architecture it's pointing at is the part you understand least.
See docs/philosophy.md for the long-form argument
— cognitive telescope, not prompt jockey.
Not because it makes AI do more for you. Because it makes you think more clearly.
Every time you decide "this belongs in a skill, not a script" or "this command is too thick — the logic should be in the agent" — you're exercising architectural judgment.
The developers who thrive with AI agents won't be the ones who copy-paste prompts. They'll be the ones who design systems — who understand separation of concerns so deeply that they can decompose any workflow into the right layers.
This repo is here to help you get there.
TODO/WIP — After we finish polishing this repo, this section will contain a one-liner to bootstrap the architecture in your project. For now, read on and build understanding first. That's the point.
If you want to jump straight to the illustrative examples that live in this repo (yes, the repo eats its own dog food), run these after installing the plugin (see below):
/four-layer-architecture:brainstorm-architecture ← recommended first run
/four-layer-architecture:review-my-architecture ← Socratic audit pipeline
/four-layer-architecture:feynman-check [topic] ← you explain, the plugin checks
/four-layer-architecture:socratic [topic] ← the plugin probes, you answer
/four-layer-architecture:explain-layer <file> ← classify a file by layer
The first one is the best starting point: it scans your project, shows a two-pane menu (your own system on one side, the 4-layer meta-pattern on the other), and routes you into either a Feynman-deep or a Socratic round.
The audit command invokes an L3 Orchestration → L2 Workflow → L1 SOP → L0 Tool pipeline that asks Socratic questions about your design decisions. It doesn't give answers. It develops your understanding.
How the chain works technically:
- The command uses context: fork + agent: socratic-reviewer to launch a subagent
- The subagent's skills: [architecture-audit] preloads the skill content
- The skill references its bundled scripts/scan-layers.sh
- Each layer delegates downward, with no upward dependencies
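The "no upward dependencies" rule can be checked mechanically. The sketch below is one possible way to do it: the component names come from the chain above, while `LAYER_RANK`, `upward_dependencies`, and the edge-list representation are assumptions made for illustration.

```python
# Rank of each conceptual layer, following the bottom-up numbering (L0..L4).
LAYER_RANK = {"tool": 0, "sop": 1, "workflow": 2, "orchestration": 3, "launcher": 4}


def upward_dependencies(components, edges):
    """Return every (caller, callee) edge whose callee sits on a higher layer."""
    return [
        (caller, callee)
        for caller, callee in edges
        if LAYER_RANK[components[caller]] < LAYER_RANK[components[callee]]
    ]


# The audit chain described above, expressed as data.
components = {
    "review-my-architecture": "orchestration",
    "socratic-reviewer": "workflow",
    "architecture-audit": "sop",
    "scan-layers.sh": "tool",
}
edges = [
    ("review-my-architecture", "socratic-reviewer"),
    ("socratic-reviewer", "architecture-audit"),
    ("architecture-audit", "scan-layers.sh"),
]

print(upward_dependencies(components, edges))  # prints []
```

This sketch flags only strictly upward edges. Whether same-layer calls (one skill invoking another) should also be flagged is a design decision the text above does not settle.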
Namespacing: every command this plugin adds lives under the
four-layer-architecture: prefix after install. That prevents collisions
with other plugins you may have enabled and makes the provenance of each
command obvious.
The components in plugins/four-layer-architecture/ are packaged as a proper Claude Code plugin with a .claude-plugin/plugin.json manifest and a marketplace index at .claude-plugin/marketplace.json. Pick whichever install method fits your workflow.
Option 1 — Ad-hoc, per-invocation (no install):
# From a clone of this repo:
claude --plugin-dir ./plugins/four-layer-architecture
# Or point at an absolute path from anywhere:
claude --plugin-dir /path/to/agentic-4layer-architecture/plugins/four-layer-architecture
This loads the plugin for a single Claude Code session, with no registration and no persistence. Useful for trying it out or for L4 launcher scripts that want to pin exactly this plugin.
Option 2 — Install via marketplace (persistent):
/plugin marketplace add CLIAI/agentic-4layer-architecture
/plugin install four-layer-architecture@agentic-4layer-architecture
Inside Claude Code, /plugin marketplace add accepts a GitHub owner/repo shortcut, a full git URL, or a local path. Once added, /plugin install pulls the specific plugin from that marketplace by name.
Option 3 — Copy-paste (learn by doing):
Clone the repo and copy plugins/four-layer-architecture/{commands,agents,skills,hooks} into your own project's .claude/ directory. This is the learning path — by the time you've done it, you'll understand every layer. See docs/ecosystem.md for the progression from local .claude/ → plugin → marketplace.
Orchestration (L3) via Custom Commands delegates to Workflows (L2) implemented as Custom Agents, which compose SOPs / Capabilities (L1) implemented as Skills, which bundle Tools & Primitives (L0) — deterministic scripts — for:
- Improved command-line ergonomics
- Instrumentation for telemetry
- Safety assertions and permission checks
- Background work that doesn't need AI reasoning
Launchers (L4) — justfiles, Makefiles, run.sh, Python entrypoints — sit
on top and invoke claude with the right flags (--plugin-dir, --agent,
--settings, -p, …) so the stack is reproducibly callable from cron, CI,
or a teammate's laptop.
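As a rough illustration of an L4 launcher written in Python, the sketch below only assembles the argument list and prints it. The flags `--plugin-dir` and `-p` are the ones named above; the function name `build_claude_argv` and the example prompt are hypothetical.

```python
import shlex
from typing import List, Optional


def build_claude_argv(plugin_dir: str, prompt: Optional[str] = None) -> List[str]:
    """Assemble a reproducible `claude` invocation as an argv list."""
    argv = ["claude", "--plugin-dir", plugin_dir]
    if prompt is not None:
        # -p passes the prompt non-interactively, which suits cron and CI.
        argv += ["-p", prompt]
    return argv


argv = build_claude_argv(
    "./plugins/four-layer-architecture",
    "Summarize the layers in this repo",  # hypothetical prompt
)
print(shlex.join(argv))

# To actually run it (requires the claude CLI on PATH):
# import subprocess
# subprocess.run(argv, check=True)
```

Building an argv list instead of a shell string keeps quoting out of the picture and makes the launcher itself testable without invoking Claude Code.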
Guardrails (Bonus) — Hooks — are the enforcement dimension, injecting validation and automation whenever the other layers aren't sufficient.
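The repo tree lists a validate-no-force-push.sh guardrail. The sketch below shows the same idea in Python; it is not that script's implementation. It assumes the hook receives a JSON payload with `tool_name` and `tool_input.command` fields, and that exit code 0 means "allow" while 2 means "block". Verify both assumptions against the Hooks documentation before relying on them.

```python
import json
import shlex
from typing import Tuple


def is_force_push(command: str) -> bool:
    """Detect `git push` with --force or -f (a deliberately narrow check)."""
    try:
        tokens = shlex.split(command)
    except ValueError:
        return False  # unbalanced quotes: not this guardrail's concern
    if "git" not in tokens or "push" not in tokens:
        return False
    return any(token in ("--force", "-f") for token in tokens)


def evaluate(payload_json: str) -> Tuple[int, str]:
    """Return (exit_code, message) for one hook payload.

    Assumed payload shape: {"tool_name": ..., "tool_input": {"command": ...}}.
    Assumed convention: 0 allows the tool call, 2 blocks it.
    """
    payload = json.loads(payload_json)
    if payload.get("tool_name") != "Bash":
        return 0, ""
    command = payload.get("tool_input", {}).get("command", "")
    if is_force_push(command):
        return 2, "Force push detected; push without --force or ask a human."
    return 0, ""


sample = json.dumps({"tool_name": "Bash", "tool_input": {"command": "git push -f"}})
print(evaluate(sample)[0])  # prints 2
```

Whether a guardrail blocks or only warns is a design choice; returning the decision from a pure function keeps that choice easy to change and easy to test.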
When an L1 SOP bundles an L0 tool (a script), something powerful happens:
- The script is testable standalone: ./scripts/scan-layers.sh /path/to/project works without Claude Code
- The script is instrumentable: add logging, metrics, timing without touching AI prompts
- The script is auditable: security review a bash script, not a probabilistic prompt
- The script is fast: no token cost, no API latency, just execution
- The script sets boundaries: mechanical work stays mechanical
This is where many agentic architectures fail: they put everything in prompts. The 4-layer pattern forces you to ask: "Does this need AI reasoning, or is it just filesystem traversal with a fancy wrapper?"
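A minimal sketch of what "testable standalone" and "instrumentable" can look like for an L0 tool. This is a Python stand-in, not the contents of scan-layers.sh; `scan_layers`, `timed`, and the choice to count Markdown files are assumptions made for illustration.

```python
import time
from pathlib import Path

# Directory names under .claude/ mapped to layer labels (from the diagram).
LAYER_DIRS = {"commands": "L3", "agents": "L2", "skills": "L1"}


def scan_layers(root):
    """Count Markdown files under each layer directory of <root>/.claude."""
    counts = {}
    for dirname, layer in LAYER_DIRS.items():
        layer_dir = Path(root) / ".claude" / dirname
        if layer_dir.is_dir():
            counts[layer] = sum(1 for _ in layer_dir.rglob("*.md"))
        else:
            counts[layer] = 0
    return counts


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed milliseconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms


counts, elapsed_ms = timed(scan_layers, ".")
print(counts, f"{elapsed_ms:.2f} ms")
```

Nothing here needs a model call: the scan is plain filesystem traversal, and the timing wrapper adds telemetry without touching any prompt.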
Understanding this architecture requires hands-on practice with the underlying primitives. Don't just read about it — build with it.
- IndyDevDan: 4-Layer Architecture — Skills → Agents → Commands → Reusability
The video that crystallizes this pattern through Playwright browser automation.
Watch how Dan builds capabilities layer by layer, from raw scripts up to
orchestrated multi-agent workflows.
- Companion reference notes: maps Dan's concepts to Claude Code primitives
- Claude Code Overview: start here (also available as llms.txt)
- CLI reference (local cache): flags that L4 Launchers use, including --plugin-dir, --agent, --settings, --mcp-config, -p, --bare, --permission-mode, --max-turns, --max-budget-usd
- Skills (L3 Orchestration & L1 SOPs): commands have been merged into skills. Use context: fork + agent to delegate to subagents
- Custom Subagents (L2 Workflows): YAML frontmatter with tools, skills, memory, hooks
- Hooks (Guardrails, Bonus): nested format with type: command/http/prompt
- Agent Teams: multi-agent coordination across separate sessions
- Plugins: package and distribute skills, agents, and hooks
- Building, Bundling, and Distributing: from local components to plugins, marketplaces, and cross-harness portability
- Cognitive Horizons: David Shapiro's "I was the bottleneck, not the AI" (transcript, glossary, diagrams)
- docs/philosophy.md: our take on designing with AI, not delegating to it
agentic-4layer-architecture/
├── README.md ← You are here
├── AGENTS.md ← Instructions for AI agents in this repo
├── docs/
│ ├── architecture.md ← Deep dive: the 4 layers + hooks
│ ├── wiring-the-chain.md ← HOW each layer delegates to the next (frontmatter fields)
│ ├── philosophy.md ← Why design matters more than prompting
│ ├── examples.md ← Concrete pattern applications
│ ├── hooks-as-guardrails.md ← Hooks deep dive
│ ├── ecosystem.md ← Building, bundling, distributing as plugins
│ ├── references.md ← All links, resources, further reading
│ ├── tricky-corners.md ← Curated catalog the Socratic command drills on
│ ├── design/ ← Per-topic design docs (frontmatter-dated)
│ ├── understanding/ ← Per-layer curated notes promoted from teach-back rounds
│ └── reference-cache/ ← Snapshots of upstream docs + IndyDevDan transcript
├── .claude-plugin/
│ └── marketplace.json ← Marketplace index (enables `/plugin marketplace add`)
├── plugins/
│ └── four-layer-architecture/ ← Installable Claude Code plugin
│ ├── .claude-plugin/
│ │ └── plugin.json ← Plugin manifest
│ ├── commands/ ← L3 Orchestration (thin)
│ │ ├── review-my-architecture.md ← Socratic architecture audit pipeline
│ │ ├── explain-layer.md ← Layer classification guide
│ │ ├── brainstorm-architecture.md ← Guided entry (two-pane menu)
│ │ ├── feynman-check.md ← You explain, plugin checks
│ │ └── socratic.md ← Plugin probes, you answer
│ ├── agents/ ← L2 Workflows
│ │ ├── socratic-reviewer-agent.md ← Audit dialogue
│ │ ├── layer-guide-agent.md ← Layer classifier
│ │ ├── teach-back-coach-agent.md ← Runs Feynman rounds
│ │ └── socratic-probe-agent.md ← Runs Socratic rounds
│ ├── skills/ ← L1 SOPs / Capabilities
│ │ ├── architecture-audit/ ← Audit SOP + scan-layers.sh
│ │ ├── layer-explainer/ ← Layer classification knowledge
│ │ ├── feynman-protocol/ ← Feynman lite + deep rules
│ │ ├── socratic-protocol/ ← Socratic dialogue rules
│ │ ├── system-scan/ ← SOP + scan-system.sh + detect-change.sh (L0)
│ │ └── teach-back-journal/ ← Journal format + promotion flow
│ ├── hooks/ ← Guardrails + advisory nudges
│ │ ├── hooks.json ← Plugin hook registration (uses ${CLAUDE_PLUGIN_ROOT})
│ │ ├── validate-no-force-push.sh ← PreToolUse guardrail
│ │ ├── check-frontmatter.sh ← PostToolUse guardrail
│ │ ├── check-prerequisites.sh ← SessionStart + first-use nudge
│ │ └── suggest-teach-back.sh ← Advisory nudge on structural edits
│ └── templates/
│ └── four-layer-architecture.local.md.example ← Per-project settings template
├── .claude/
│ └── settings.json ← Project-local hook wiring for contributors working in the repo
└── LICENSE
Every file in plugins/four-layer-architecture/ is both documentation and working code.
Five pipelines demonstrate the pattern in action:
- /four-layer-architecture:review-my-architecture: Socratic audit of your project's 4-layer compliance
- /four-layer-architecture:explain-layer <file>: explains which layer a file belongs to and traces the chain
- /four-layer-architecture:brainstorm-architecture: guided entry, a two-pane menu leading to a Feynman-deep or Socratic round
- /four-layer-architecture:feynman-check [topic]: you explain your system; the plugin checks against code + docs
- /four-layer-architecture:socratic [topic]: the plugin runs a 3–5 turn probing dialogue on one tricky topic
Launchers say how to start. Orchestration says what the user asked for. Workflows say when and in what order. SOPs say how to do each thing well. Tools & Primitives just do it. Guardrails make sure everyone behaves.
If that sentence makes sense to you, you understand the pattern. If it doesn't yet — the architecture deep-dive and the concepts vs implementation mapping will get you there.
It's tempting to use this architecture to just... let AI do everything. Copy-paste a command structure, never think about why the layers exist, treat it as a recipe.
Don't.
The value isn't in the commands you create. It's in the architectural decisions you make while creating them. Every "should this be a skill or a script?" question develops your judgment. Every "is this agent too coupled to one pipeline?" question grows your design sense.
IndyDevDan didn't become effective with Claude Code by copying someone's
.claude/ directory. He became effective by understanding the primitives
deeply — reading the docs, experimenting with frontmatter options,
discovering what tools and skills do to agent behavior, learning when
hooks are the right tool vs when skills are sufficient.
That's the path. There are no shortcuts worth taking. See docs/philosophy.md for the full argument.
This is a living document. If you've applied the 4-layer pattern and discovered something worth sharing — patterns, anti-patterns, examples, hard-won insights — contributions are welcome.
The best contributions won't be "here's my .claude/ directory."
They'll be "here's what I learned about decomposition while building X."
MIT — Use freely. Think deeply. Grow intentionally.
Built with the conviction that the best AI tooling makes humans sharper, not lazier.