A ~250-line, single-file-per-concern tutorial that builds a multi-agent swarm directly on LangGraph primitives, with no
`langgraph-swarm` wrapper hiding the mechanics.
Two specialist agents share one conversation:
- a Math agent that owns `add`/`multiply`/`divide` and refuses to do arithmetic in its head,
- a Football agent that answers from its own knowledge and refuses to compute.
When the user's topic crosses domains mid-conversation, the active agent
calls a handoff tool, the other agent takes over on the next turn, and the
message history carries forward unbroken. The entire routing mechanism is
a single `Command(update={"agent_name": ...})` returned from a tool.
That's it: no orchestrator, no router LLM.
sequenceDiagram
actor User
participant G as Graph (shared state)
participant M as Math agent
participant F as Football agent
User->>G: "What is 23 * 17?"
G->>M: state.agent_name = "Math"
M->>M: multiply(23, 17)
M-->>User: "391"
User->>G: "Nice - and who won the 2022 World Cup?"
G->>M: still the active agent
M->>G: handoff tool returns Command(update={"agent_name": "Football"})
Note over G: same thread, same message history -<br/>only the routing key changed
G->>F: state.agent_name = "Football"
F-->>User: "Argentina, on penalties against France"
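The handoff in the diagram can be sketched without any dependencies. This is an illustration of the mechanism only, not the repo's code: the `Command` dataclass below is a stand-in for LangGraph's real `Command`, and `transfer_to_football` and `apply_command` are hypothetical names chosen for the example.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Command:
    """Stand-in for LangGraph's Command: carries a partial state update."""
    update: dict[str, Any] = field(default_factory=dict)


def transfer_to_football() -> Command:
    """Hypothetical handoff tool: asks the graph to make Football the active agent."""
    return Command(update={"agent_name": "Football"})


def apply_command(state: dict[str, Any], command: Command) -> dict[str, Any]:
    """Merge the command's update into state; keys it does not mention are kept."""
    return {**state, **command.update}


state = {"agent_name": "Math", "messages": ["What is 23 * 17?", "391"]}
new_state = apply_command(state, transfer_to_football())

print(new_state["agent_name"])                     # Football
print(new_state["messages"] is state["messages"])  # True: history is untouched
```

Only the routing key changes; the message list is carried over as-is, which is why the next agent sees the full conversation.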
- How a handoff actually works: a `@tool` that returns `Command(update=...)`, folded into graph state by a hand-written `tool_node` (nodes.py).
- How the active agent is chosen each turn: one `state["agent_name"]` lookup at the top of `llm_call`; the agent registry is a plain dict (agents.py).
- Why the system prompts matter as much as the tool schemas: each agent's prompt encodes its domain and its handoff trigger; that's what makes the swarm route correctly without an external supervisor (prompts.py).
- How tool calls are parallelized: `asyncio.gather` over every tool call in the last AIMessage, with results folded back as `ToolMessage`s (nodes.py).
- How per-thread memory is wired: a process-wide `InMemorySaver` singleton, keyed by `thread_id` from the runnable config; swap it for Postgres/SQLite without touching any node (graph.py).
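Two of the points above, the per-turn agent lookup and thread-keyed memory, can be sketched together in plain Python. This is a simplified model under stated assumptions rather than the repo's implementation: the `_CHECKPOINTS` dict stands in for a checkpointer such as `InMemorySaver`, `run_turn` and `load_state` are hypothetical helpers, the prompts are placeholders, and defaulting new threads to the Math agent is an assumption. The `{"configurable": {"thread_id": ...}}` config shape follows LangGraph's convention.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class Agent:
    name: str
    system_prompt: str  # placeholder text, not the repo's prompts


# Plain-dict registry, keyed by the same string stored in state["agent_name"].
AGENTS: dict[str, Agent] = {
    "Math": Agent("Math", "Use the math tools; hand off football questions."),
    "Football": Agent("Football", "Answer football questions; hand off arithmetic."),
}

# Stand-in for a checkpointer: one saved state per thread_id.
_CHECKPOINTS: dict[str, dict[str, Any]] = {}

DEFAULT_AGENT = "Math"  # assumption: new threads start with the Math agent


def load_state(config: dict[str, Any]) -> dict[str, Any]:
    thread_id = config["configurable"]["thread_id"]
    return _CHECKPOINTS.get(thread_id, {"agent_name": DEFAULT_AGENT, "messages": []})


def run_turn(user_message: str, config: dict[str, Any]) -> dict[str, Any]:
    state = load_state(config)
    agent = AGENTS[state["agent_name"]]  # the single routing lookup
    reply = f"[{agent.name}] would answer: {user_message}"  # no real LLM call here
    state = {**state, "messages": state["messages"] + [user_message, reply]}
    _CHECKPOINTS[config["configurable"]["thread_id"]] = state
    return state


config_a = {"configurable": {"thread_id": "a"}}
run_turn("What is 23 * 17?", config_a)
state_a = run_turn("And 2 + 2?", config_a)
state_b = run_turn("Who won in 2022?", {"configurable": {"thread_id": "b"}})

print(len(state_a["messages"]), len(state_b["messages"]))  # 4 2
```

Because nodes only ever see `state`, replacing the dict-backed store with a persistent one would not require touching `run_turn`'s routing logic, which mirrors the claim about swapping the saver.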
Not a library, not a production template, not a benchmark. It's a learning
artifact: every primitive a swarm needs is in code you can read top-to-
bottom in under an hour. If you want a maintained create_swarm(...)
one-liner, see the comparison below.
swarm-langgraph/
├── main.py — entry point; runs one example turn
├── state.py — MainAgentState (TypedDict; routing key is `agent_name`)
├── tools.py — math tools + Command-returning handoff tools
├── prompts.py — per-agent system prompts (handoff triggers live here)
├── agents.py — Agent dataclass + registry; eager LLM build + tool binding
├── nodes.py — llm_call, tool_node (parallel), should_continue
└── graph.py — StateGraph build, compiled-graph singleton, run_graph
git clone https://github.com/HabaAndrei/swarm-langgraph.git
cd swarm-langgraph
cp .env.example .env # paste your OPENAI_API_KEY
uv sync # if .venv isn't built yet
uv run python main.py
The official library lives at
https://reference.langchain.com/python/langgraph-swarm. It gives you the
same routing pattern in ~10–30 lines via create_swarm + create_handoff_tool,
on top of create_agent for each persona.
Prefer langgraph-swarm when the swarm is plumbing for your product —
you want a maintained API, you don't want to own the bug surface of the
tool-execution loop, and you're happy with the prebuilt handoff semantics.
Two examples it handles for you that this repo reimplements by hand:
parallel tool execution, and propagating Command(update=…) from a handoff
tool into graph state.
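For a sense of what reimplementing those two pieces involves, here is a dependency-free sketch of a parallel tool node. It is not the repo's `tool_node`: messages are plain dicts rather than LangChain message objects, `Command` is a stand-in dataclass, the tool names are hypothetical, and the node returns only the new tool messages on the assumption that the graph appends them to the history.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Command:
    """Stand-in for LangGraph's Command: carries a partial state update."""
    update: dict[str, Any] = field(default_factory=dict)


async def multiply(a: int, b: int) -> int:
    return a * b


async def transfer_to_football() -> Command:
    return Command(update={"agent_name": "Football"})


TOOLS_BY_NAME = {"multiply": multiply, "transfer_to_football": transfer_to_football}


async def tool_node(state: dict[str, Any]) -> dict[str, Any]:
    calls = state["messages"][-1]["tool_calls"]
    # Run every requested tool concurrently; gather keeps results in call order.
    results = await asyncio.gather(
        *(TOOLS_BY_NAME[call["name"]](**call["args"]) for call in calls)
    )
    update: dict[str, Any] = {}
    tool_messages = []
    for call, result in zip(calls, results):
        if isinstance(result, Command):
            update.update(result.update)  # fold the handoff into the state update
            content = f"handed off via {call['name']}"
        else:
            content = str(result)
        tool_messages.append(
            {"role": "tool", "tool_call_id": call["id"], "content": content}
        )
    update["messages"] = tool_messages
    return update


ai_message = {
    "role": "ai",
    "tool_calls": [
        {"id": "c1", "name": "multiply", "args": {"a": 23, "b": 17}},
        {"id": "c2", "name": "transfer_to_football", "args": {}},
    ],
}
result = asyncio.run(tool_node({"agent_name": "Math", "messages": [ai_message]}))
print(result["agent_name"], result["messages"][0]["content"])  # Football 391
```

Each tool call still gets exactly one tool message tied to its `tool_call_id`, including the handoff call.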
Prefer this manual approach when:
- You're learning. Every moving part is visible in one file each: how an LLM is bound to tools, how `Command` mutates state, how a conditional edge decides between tool-calling and ending, how parallel tool execution is orchestrated with `asyncio.gather`. None of that is hidden behind a `create_swarm(...)` call.
- You need custom plumbing the prebuilts don't expose. Examples here: per-call LangSmith `session_id` metadata injection (see `llm_call` in nodes.py); a `tool_node` that batches all of an AIMessage's tool calls into one `asyncio.gather` and folds `Command` returns inline; a single agent registry (`AGENTS`) that's trivially iterable for things like dynamic system-prompt assembly or per-persona model swaps.
- You want to stay on LangGraph core only. `langgraph-swarm` is a separate package on its own release cadence. Skipping it removes one pin from your dependency tree.
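As an example of how small these visible parts can be, the conditional edge mentioned in the first bullet might look roughly like the following. It is a sketch only: messages are modelled as plain dicts, `END` is a local stand-in for LangGraph's end sentinel, and the node name `"tool_node"` is taken from the file listing rather than from the actual graph wiring.

```python
from typing import Any

END = "__end__"  # local stand-in for LangGraph's END sentinel


def should_continue(state: dict[str, Any]) -> str:
    """Route to the tool node if the last AI message requested tools, else stop."""
    last_message = state["messages"][-1]
    if last_message.get("tool_calls"):
        return "tool_node"
    return END


wants_tools = {"messages": [{"role": "ai", "tool_calls": [{"name": "multiply"}]}]}
plain_answer = {"messages": [{"role": "ai", "content": "391"}]}

print(should_continue(wants_tools))   # tool_node
print(should_continue(plain_answer))  # __end__
```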
Costs of going manual:
- You own the bug surface. The first draft of this code had a handful of small bugs (LLM-variable overwrite that misattributed every trace to one persona; `tools_by_name` built from the wrong list; `dict + dict` in the tool dispatcher; leftover keys from a different graph). They're all fixed here, but they're the kind of thing the library wouldn't have let you write.
- You're betting on LangGraph core APIs (`StateGraph`, `Command`, `RetryPolicy`, `InMemorySaver`) staying stable. Today they are.
- Future swarm features that land in `langgraph-swarm` (e.g. richer handoff metadata, dynamic agent discovery) won't come to you for free.
- Production agent where routing is incidental → `langgraph-swarm`.
- Tutorial, prototype with bespoke routing, or a project where the swarm is the product → this repo.
If you've already used langgraph-swarm and hit something you couldn't
customize, the modules here are a reasonable starting point to fork from
— the dependency direction is state ← prompts ← tools ← agents ← nodes ← graph ← main, so you can replace any single layer without touching
the others.