Own your psychological twin — don't rent it. A local, file-based container that models a real person — their decisions, preferences, and the affect behind them, with bi-temporal validity and supersession — and an OKF v0.1-native knowledge graph that AI agents read and write natively. One SQLite file, zero dependencies, full data ownership, multi-device sync via git.
Chinese documentation · Architecture · SKILL.md
git clone https://github.com/stancsz/second-brain.git
cd second-brain
python3 scripts/brain_cli.py add "First note" "It works." # creates ~/.secondbrain/brain.db
python3 scripts/brain_cli.py search "works"

No pip install, no Docker, no cloud account, no API key — just Python 3.8+ (standard library only).
Two ways to give your agent memory — same brain, same files:
- 🧩 As a native Claude Code skill — clone into `~/.claude/skills/` and run `install.sh`. Your agent gains "remember this" / "what do I know about X" with auto-capture and proactive recall. → Use with Claude Code
- 🔌 As a standalone MCP server — run `python3 scripts/brain_mcp.py` and it speaks the Model Context Protocol over stdio, so Claude Desktop, Cursor, Continue, or any MCP client can read and write the brain. Still zero dependencies. → Run as an MCP server
The digital-mind platforms want to rent you back to yourself. Delphi, Personal.ai, and the wave behind them will happily build an AI version of you — your voice, your decisions, your personality — and keep it on their servers, behind their API, on their pricing. Your psychological twin becomes a subscription. The day the terms change, the price jumps, or the company gets acquired, the model of who you are changes hands with it. That is the most personal data there will ever be, and the entire category is racing to host it for you.
`second-brain` bets on the other side: your twin is a folder of plain markdown in your own git. Every memory of the person — what they decided, what they prefer, how they felt, and when each of those was true — lives in files you can read, diff, version, and carry forever. No vendor owns the model of you. There is no migration plan, because there is no one to migrate from. Nobody else lets you own the psychological twin instead of renting it. That is the whole point.
It is important to own your data and own your knowledge. Your memory is your intellectual and emotional history — the decisions you made, the things you learned, the people you cared about, the way you changed. That record belongs to you, not to a platform.
`second-brain` keeps it that way: plain files you can read, copy, version, and carry with you forever.
second-brain is a personal knowledge store designed to be read and written by AI agents as easily as by humans. Concepts (notes) are persisted as an OKF v0.1-conformant Bundle — a directory of markdown files with YAML frontmatter — backed by a SQLite cache at ~/.secondbrain/brain.db. Everything uses the Python standard library only — no pip install, no docker compose up, no cloud account.
Concepts are linked together through [[wikilinks]] in their content, building a knowledge graph automatically as you write. The store supports full-text search, typed relations, tags, collections, soft delete, and round-trip export/rebuild to Markdown. Multi-device sync works via a git remote (the Bundle is your source of truth; SQLite is a disposable cache). Psychological memory is native: Concepts carry optional subjects (who or what the memory is about), temporal validity (when it's true), affect (emotional valence/arousal), and supersession (later facts replace earlier ones).
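To make the link-extraction step concrete, here is a minimal sketch of pulling `[[wikilink]]` targets out of a Concept body. It is an illustration only, not the project's parser: the function name `extract_wikilinks` and the support for a `[[Target|label]]` alias form are assumptions.

```python
import re
from typing import List

# Matches [[Target]] and [[Target|label]]. The alias form is an assumption
# borrowed from common wiki syntax; the README does not specify it.
WIKILINK = re.compile(r"\[\[([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_wikilinks(content: str) -> List[str]:
    """Return unique link targets in order of first appearance."""
    targets: List[str] = []
    for match in WIKILINK.finditer(content):
        target = match.group(1).strip()
        if target and target not in targets:
            targets.append(target)
    return targets


links = extract_wikilinks("See [[RAG]] and [[Vector Search|search]], then [[RAG]] again.")
```

Each extracted target would then be matched against existing Concept titles to create a relation.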
The SKILL.md in this repository makes second-brain a drop-in Claude Code skill: any agent that loads the skill can save, search, and link Concepts in your brain during a conversation.
The category around you is racing to build your digital twin — and to host it. second-brain is the same ambition with the ownership inverted: the twin is yours, on disk, in your git. It serves two jobs that share the same files.
Point the auto-capture hooks at your real sessions and the brain quietly distills the durable signal of a person — their decisions, preferences, knowledge, and the affect behind them — into titled Concepts, never raw chat. What makes it a twin and not a pile of notes:
- Subjects (`sb_subject`) scope every memory to who or what it's about, so a persona sub-graph is one query away.
- Structured affect (`sb_affect`) records the emotional valence/arousal/emotion behind a memory — the difference between knowing a fact about someone and knowing how they felt about it.
- Bi-temporal validity (`sb_valid_from` / `sb_valid_to`) + supersession mean the model of the person stays current as they change — `supersede()` closes the old window instead of deleting it — yet stays historically queryable: `recall --as-of <date>` reconstructs who they were on any past date. People are not snapshots; this models the trajectory.
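The validity-window idea behind supersession can be sketched in a few lines of stdlib Python. This is a simplified illustration under stated assumptions: the `facts` table, its columns, and the two helper functions are hypothetical, and only the validity window is modelled (the real schema lives in `scripts/schema.sql` and may differ).

```python
import sqlite3


def open_demo():
    """Create an in-memory table with a hypothetical validity-window layout."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE facts ("
        " id INTEGER PRIMARY KEY,"
        " subject TEXT NOT NULL,"
        " claim TEXT NOT NULL,"
        " valid_from TEXT NOT NULL,"  # ISO date, inclusive
        " valid_to TEXT)"             # ISO date, exclusive; NULL means still true
    )
    return conn


def supersede(conn, subject, new_claim, on_date):
    """Close the open window for `subject`, then start a new one on `on_date`."""
    conn.execute(
        "UPDATE facts SET valid_to = ? WHERE subject = ? AND valid_to IS NULL",
        (on_date, subject),
    )
    conn.execute(
        "INSERT INTO facts (subject, claim, valid_from) VALUES (?, ?, ?)",
        (subject, new_claim, on_date),
    )


def recall_as_of(conn, subject, date):
    """Return the claims whose validity window contains `date`."""
    rows = conn.execute(
        "SELECT claim FROM facts WHERE subject = ? AND valid_from <= ?"
        " AND (valid_to IS NULL OR valid_to > ?) ORDER BY valid_from",
        (subject, date, date),
    ).fetchall()
    return [row[0] for row in rows]


conn = open_demo()
supersede(conn, "alice", "prefers tabs", "2023-01-01")
supersede(conn, "alice", "prefers spaces", "2024-06-01")
```

Nothing is deleted: the older row stays in the table with a closed window, which is what keeps past dates queryable.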
Over time this accumulates into an owned, evolving container that models how a real human actually thinks, feels, and changes — the foundation an agent needs to faithfully shadow a specific person instead of impersonating a generic one. And because it's plain OKF markdown, you can open the model of yourself in a text editor and read it.
The same primitives, pointed at a fictional subject, give an agent a durable character: a consistent personality with its own history, preferences, and emotional coloring, persisted as files you own rather than a system prompt you re-paste every session.
The capture pipeline, the psychological fields, and the bi-temporal history are built and working today (see Features). A turnkey "mimic me" agent on top of them is the direction, not a finished product — and on raw mimicry fidelity, the well-funded cloud platforms are further along. What second-brain guarantees right now, and what none of them offer, is that the substrate is yours: the model of a person is never rented, never opaque, and never locked to a vendor that can change the terms or disappear.
| | Where "you" live | You own the model? | Psychological structure | If they vanish / change terms |
|---|---|---|---|---|
| Delphi / Personal.ai (digital-mind platforms) | Their cloud | No — hosted, per their terms & pricing | Yes, but proprietary and opaque | The model of you goes with them |
| Letta / MemGPT | Your infra or their cloud | Partially (self-hostable runtime) | Agent-state memory blocks — not a person-model | You keep a runtime, not a portable twin |
| Mem0 / Zep | Vendor cloud / API | No / partial | Bi-temporal graph (Zep), but built for agent memory | Vendor-controlled; Zep's community edition was deprecated |
| second-brain | Your git, plain markdown | Yes, fully | Subjects + bi-temporal validity + affect + supersession, in OKF you can read | Nothing changes — they're your files |
The hosted platforms lead on capability — voice, fidelity, scale. second-brain leads on the one axis that can't be added later: you own the twin. That intersection — owned, local, file-based, psychologically structured — is, as far as we can find, empty except for this project.
Most "AI memory" products store your data in a third-party cloud, behind an API, and behind a vendor that can change pricing, terms, or shut down at any time. Even local-first tools like Obsidian don't speak natively to agents — you end up with one tool for humans and a separate, paid API for AI.
second-brain is a deliberately minimal alternative:
- One file. A SQLite database you can open with any tool, copy with `cp`, back up with `rsync`, version with `git`.
- Standard schema. The schema is checked in as `scripts/schema.sql` and is plain SQL — no proprietary format, no migration service.
- Agent-native. Every operation is a single CLI command. Agents read and write the brain with the same interface a human does.
- Zero dependencies. If you have Python 3.8+ and a SQLite build, you can run it.

- OKF Bundle as source of truth. Concepts (notes) are persisted as markdown files organized in an OKF v0.1-conformant Bundle. The SQLite database is a fully rebuildable cache — delete it anytime and `rebuild` from the files losslessly.
- Flat knowledge graph. Concepts carry tags, an optional collection, and typed relations. No folder hierarchy to maintain.
- `[[wikilinks]]`. Cross-references are written in the body and resolved at write time — relations cannot drift from the text.
- Pending links. Forward references to not-yet-existing Concepts are stored in an indexed table and promoted to real relations when the target is created.
- Full-text search. SQLite FTS5 with soft-delete awareness. Returns sub-100ms results on 50K Concepts.
- Soft delete by default. `delete` is reversible; `delete --hard` is permanent. Tombstones propagate over git sync.
- Typed relations. `references`, `contradicts`, `expands`, `related` with optional strength.
- Graph traversal. Recursive CTE-based traverse from any Concept.
- Multi-device sync via git. Export your brain as a Bundle, commit to git, pull/rebase/push, then rebuild on another device. Git is the only sync backbone; all other clouds are one-way mirrors. Works fully offline.
- Conflict parking. Concurrent edits to the same Concept on two devices park as `*.conflict.md` instead of clobbering. Human resolves the conflict once; both edits preserved until then.
- Import / export. Round-trip to JSON, Markdown (Obsidian-compatible, OKF-native), and CSV.
- Distill & archive. Goal-based filter (`distill --query "X"`) writes a focused working brain without touching the old one (pass `--activate` to swap). Cold-storage (`archive --older-than-days 180`) moves untouched Concepts out and VACUUMs the working brain. `merge-brain --from <archive>` brings them back.
- Psychological memory foundation. Concepts carry optional subjects (who/what the memory is about; enables persona sub-graphs), temporal validity (`sb_valid_from/to`; enables historical queries), affect (emotional valence/arousal/emotion type), and supersession (newer facts replace older ones). These fields ride in OKF frontmatter and enable emotion-aware recall, perspective-aware memory synthesis, and — taken together — an owned, evolving model of a real person rather than a flat pile of notes (see What it's for).
- Logs stay logs; the brain stays clean. A `Stop` hook archives the full raw transcript of every session to plain files under `~/.secondbrain/logs/` — never into the brain. The brain (`brain.db`) holds only distilled know-how: titled Concepts the agent extracts at session end (decisions, preferences, facts, reusable knowledge), plus anything you explicitly save. Searching your knowledge never returns a wall of raw chat.
- Proactive recall. A `UserPromptSubmit` hook searches the clean brain against each prompt and injects relevant Concepts into the agent's context before it answers — so you don't have to ask "what do I know about X".
- `/history` slash command. Browse past conversations in your brain, then dive into the chosen one.
- Standalone MCP server. `scripts/brain_mcp.py` exposes the brain over the Model Context Protocol (stdio, stdlib-only) so Claude Desktop, Cursor, Continue, and any MCP client can use it — same files as the CLI and skill. See Run as an MCP server.
- Phase 2 (planned). Optional vector search via `sqlite-vec` and selective encryption for private/psychological Concepts.
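The pending-links behaviour in the list above can be illustrated with a small in-memory sketch. The three tables and the `add_concept` helper are hypothetical stand-ins chosen for the example; the project's actual schema is in `scripts/schema.sql`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE concepts (id INTEGER PRIMARY KEY, title TEXT UNIQUE NOT NULL);
CREATE TABLE relations (source_id INTEGER, target_id INTEGER, type TEXT);
CREATE TABLE pending_links (source_id INTEGER, target_title TEXT);
CREATE INDEX idx_pending_title ON pending_links (target_title);
""")


def add_concept(title, link_targets=()):
    """Insert a Concept, resolve its outgoing links, and promote waiting ones."""
    new_id = conn.execute(
        "INSERT INTO concepts (title) VALUES (?)", (title,)
    ).lastrowid

    # Outgoing links: resolve now if the target exists, otherwise park them.
    for target in link_targets:
        row = conn.execute(
            "SELECT id FROM concepts WHERE title = ?", (target,)
        ).fetchone()
        if row:
            conn.execute(
                "INSERT INTO relations VALUES (?, ?, 'references')", (new_id, row[0])
            )
        else:
            conn.execute("INSERT INTO pending_links VALUES (?, ?)", (new_id, target))

    # Incoming links: promote anything that was waiting for this title.
    waiting = conn.execute(
        "SELECT source_id FROM pending_links WHERE target_title = ?", (title,)
    ).fetchall()
    for (source_id,) in waiting:
        conn.execute(
            "INSERT INTO relations VALUES (?, ?, 'references')", (source_id, new_id)
        )
    conn.execute("DELETE FROM pending_links WHERE target_title = ?", (title,))
    return new_id


vs_id = add_concept("Vector Search", ["RAG"])  # RAG does not exist yet
pending_before = conn.execute("SELECT COUNT(*) FROM pending_links").fetchone()[0]
rag_id = add_concept("RAG")                    # promotes the parked link
```

The index on `target_title` keeps the promotion lookup cheap as the number of parked links grows.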
git clone https://github.com/stancsz/second-brain.git
cd second-brain
python3 scripts/brain_cli.py stats   # first run creates ~/.secondbrain/brain.db

Optional, to invoke as `brain`:
ln -s "$(pwd)/scripts/brain_cli.py" /usr/local/bin/brain
# or
alias brain='python3 ~/path/to/second-brain/scripts/brain_cli.py'

The only runtime requirement is Python 3.8+ with sqlite3 (included in the standard library). The schema uses FTS5, JSON1, and recursive CTEs; these are built into the Python-bundled SQLite since 3.9, otherwise SQLite 3.41+ is required.
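If you are unsure whether your Python's SQLite build has these features, a quick probe like the following can tell you. It is a standalone sketch, not part of the repository; the function name `sqlite_capabilities` is made up for the example.

```python
import sqlite3


def sqlite_capabilities():
    """Probe the SQLite library bundled with this Python for required features."""
    conn = sqlite3.connect(":memory:")

    def works(sql):
        # A missing module or function raises a sqlite3.Error subclass.
        try:
            conn.execute(sql).fetchall()
            return True
        except sqlite3.Error:
            return False

    caps = {
        "version": sqlite3.sqlite_version,
        "fts5": works("CREATE VIRTUAL TABLE probe USING fts5(body)"),
        "json1": works("SELECT json_extract('{\"a\": 1}', '$.a')"),
        "recursive_cte": works(
            "WITH RECURSIVE n(x) AS (SELECT 1 UNION ALL SELECT x + 1 FROM n WHERE x < 3)"
            " SELECT x FROM n"
        ),
    }
    conn.close()
    return caps


caps = sqlite_capabilities()
print(caps)
```

Any `False` value means the schema is likely to fail on this interpreter and a newer SQLite build is needed.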
To use it as a Claude Code skill (with auto-capture and the /history command), see Use with Claude Code below — or just run bash install.sh after cloning.
# Capture
python3 scripts/brain_cli.py add "RAG" "Retrieval-augmented generation" \
--collection AI --tags rag,llm
# Recall
python3 scripts/brain_cli.py search "RAG"
# Link (the [[RAG]] in content auto-resolves to a references relation;
# if RAG doesn't exist yet, it goes to pending_links and resolves on first match)
python3 scripts/brain_cli.py add "Vector Search" "See [[RAG]]" --collection AI
# Traverse the graph
python3 scripts/brain_cli.py related <id>
python3 scripts/brain_cli.py traverse <id> --depth 2
# Brain health
python3 scripts/brain_cli.py summary
# Distill a focused working brain (old brain stays as a point-in-time backup)
python3 scripts/brain_cli.py distill --query "RAG" --output focused.db --activate
# Cold-store untouched Concepts (180d+) and shrink the working brain
python3 scripts/brain_cli.py archive --output archive-2026.db --older-than-days 180
# Bring archived Concepts back
python3 scripts/brain_cli.py merge-brain --from archive-2026.db
# Browse past conversation logs (also available as the /history slash command)
ls -1t ~/.secondbrain/logs/*/*/*.jsonl 2>/dev/null | head
# Export an Obsidian-compatible vault (one .md file per note, into a directory)
python3 scripts/brain_cli.py export --format markdown --output ./brain-vault

This repository is itself a Claude Code skill — SKILL.md defines triggers and behavior.
second-brain is agent-native, so the fastest install is to let the agent do it. Paste this into Claude Code (or any coding agent with shell access):
Install the second-brain skill from https://github.com/stancsz/second-brain
into my personal Claude Code skills. Cloning the repo does NOT configure the
skill — you MUST also run install.sh, then verify the Stop and PreCompact
hooks actually landed in ~/.claude/settings.json, then end-to-end test
add+search. Do not report success until all three are confirmed.
A common failure mode is agents stopping after git clone and the CLI smoke test, leaving auto-capture wired to nothing. The agent must complete every step below, in order:
- Clone into the skills directory.

  mkdir -p ~/.claude/skills
  git clone https://github.com/stancsz/second-brain.git ~/.claude/skills/second-brain

  (Use `.claude/skills/second-brain` instead for a single project.)

- Smoke-test the CLI — this also creates `~/.secondbrain/brain.db` on first run.

  python3 ~/.claude/skills/second-brain/scripts/brain_cli.py stats

- Run the installer to wire up the auto-capture hooks and the `/history` command without clobbering existing settings.

  bash ~/.claude/skills/second-brain/install.sh

  `install.sh` merges the `Stop`/`PreCompact` hooks into your `settings.json` (it does not overwrite the file), symlinks `commands/history.md`, and prints a `SECONDBRAIN_CLI` env-var hint.

- Verify end to end.

  python3 ~/.claude/skills/second-brain/scripts/brain_cli.py add "Install test" "second-brain is live"
  python3 ~/.claude/skills/second-brain/scripts/brain_cli.py search "live"

- Reload the skill. Restart Claude Code (or start a new session) so it picks up the new skill, hooks, and slash command.
Verify the hooks are actually wired. Run this on your own to confirm — the agent's "I ran install.sh" is not the same as the hooks being present:
cat ~/.claude/settings.json | python3 -m json.tool | grep -A 4 -E '"(Stop|PreCompact)"'

You should see a Stop and a PreCompact entry, each with a hooks array whose command ends in hooks/capture_conversation.py. If those entries are missing, auto-capture is not working yet — re-run bash ~/.claude/skills/second-brain/install.sh and check ~/.claude/settings.json directly.
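The same check can be done structurally instead of with `grep`. The sketch below assumes the hooks live under a top-level `"hooks"` key, with each event holding a list of entries that carry their own `hooks` array; that layout and both helper names are assumptions for illustration, so compare against your actual `settings.json`.

```python
def hook_commands(settings, event):
    """Collect every command string registered under settings['hooks'][event]."""
    commands = []
    for entry in settings.get("hooks", {}).get(event, []):
        for hook in entry.get("hooks", []):
            if "command" in hook:
                commands.append(hook["command"])
    return commands


def capture_wired(settings, events=("Stop", "PreCompact")):
    """Map each event to True if some command ends in the capture hook script."""
    suffix = "hooks/capture_conversation.py"
    return {
        event: any(cmd.rstrip().endswith(suffix) for cmd in hook_commands(settings, event))
        for event in events
    }


# Sample data standing in for a parsed settings file (assumed layout).
sample = {
    "hooks": {
        "Stop": [
            {"hooks": [{"type": "command",
                        "command": "python3 /path/to/second-brain/hooks/capture_conversation.py"}]}
        ]
    }
}
status = capture_wired(sample)
```

To run it against the real file, load it with `json.loads(Path("~/.claude/settings.json").expanduser().read_text())` and pass the result to `capture_wired`.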
After that, say "remember this" or "what do I know about X" in any session and the skill takes over.
SKILL.md defines the triggers and behavior; cloning the repo into a skills directory is all Claude Code needs to load the skill (the auto-capture hooks still require install.sh). Three ways:
Project scope (one project):
mkdir -p .claude/skills
git clone https://github.com/stancsz/second-brain.git .claude/skills/second-brain

Personal scope (all your projects):
mkdir -p ~/.claude/skills
git clone https://github.com/stancsz/second-brain.git ~/.claude/skills/second-brain

Submodule (if you want to pin a version):
git submodule add https://github.com/stancsz/second-brain.git .claude/skills/second-brain

Once installed, the agent will catch phrases like "remember this", "what do I know about X", "catch me up on project Y", "记一下" ("jot this down"), "我之前写过 X 吗" ("have I written about X before?"), and act on them using your brain.
The design splits cleanly in two: logs stay logs, the brain stays clean.
- Raw transcripts → log files. Every session is archived to `~/.secondbrain/logs/YYYY/MM/` as plain JSONL. These are logs — browse them with `/history`, `grep` them, delete them, git-ignore them. They never touch the brain.
- Know-how → the brain. At session end the hook asks the agent to extract the durable bits (decisions, preferences, facts, reusable knowledge — and, over many sessions, the durable signal of you: how you decide, what you prefer, the affect behind it) into clean, titled Concepts. The brain accumulates distilled knowledge, not bulk chat — so `search` and proactive recall stay sharp. This is the shadow-distillation loop in practice: each session leaves the model of the person a little more complete.
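As a rough picture of the log side of this split, the sketch below appends turns to a JSONL file under a `YYYY/MM` directory. The file name, the record fields (`role`, `text`), and both helper functions are assumptions for illustration; the real hook's format may differ. It writes to a temporary directory, not to `~/.secondbrain/logs/`.

```python
import json
import tempfile
from datetime import date
from pathlib import Path


def log_path(logs_dir, session_id, day):
    """Build <logs_dir>/YYYY/MM/<session_id>.jsonl (file name is a guess)."""
    return Path(logs_dir) / f"{day:%Y}" / f"{day:%m}" / f"{session_id}.jsonl"


def append_turn(path, role, text):
    """Append one JSON object per line, creating parent directories as needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps({"role": role, "text": text}, ensure_ascii=False) + "\n")


demo_dir = tempfile.mkdtemp()
path = log_path(demo_dir, "session-001", date(2026, 1, 15))
append_turn(path, "user", "remember this: we chose SQLite")
append_turn(path, "assistant", "Saved.")

# Reading the log back is one json.loads per line.
turns = [json.loads(line) for line in path.read_text(encoding="utf-8").splitlines()]
```

Because each line is an independent JSON object, the files stay friendly to `grep`, `tail`, and partial reads.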
The easiest way to wire it up is install.sh, which merges the hooks into your
existing settings (no overwrite) and substitutes the real path:
bash <repo>/install.sh

To wire it up by hand instead, merge the entries from settings.example.json
into your own ~/.claude/settings.json (personal) or .claude/settings.json
(project). Don't cp it over an existing file — that would discard your other
settings. Replace /path/to/second-brain with the real repo path.
This wires up three hooks:
- `Stop` (`hooks/capture_conversation.py`) — logs the transcript to disk, then nudges the agent once to distill durable know-how into the brain. Quiet, never wedges the session, and audits to `hooks/capture_conversation.log`. The nudge is gated by a smart trigger: trivial sessions (short, or no decision/remember markers) skip the block and just leave the log on disk. When the trigger fires, a heuristic pre-pass surfaces up to 10 candidate lines for the agent to filter — the agent reviews the candidates, not the full transcript. Result: less latency, higher precision.
- `PreCompact` (optional) — snapshots the log before context compaction in long sessions. Never distills. Comment out if noisy.
- `UserPromptSubmit` (`hooks/recall_memories.py`) — proactive recall: searches the clean brain against each prompt and injects relevant notes into context.
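For the by-hand route, the "merge, don't overwrite" step can be sketched as a pure function over two parsed JSON documents. This is not what `install.sh` does internally (that is not shown here); the function name `merge_hooks` and the assumed `{"hooks": {event: [entries]}}` layout are illustrative.

```python
import copy


def merge_hooks(existing, addition):
    """Return a copy of `existing` with hook entries from `addition` appended.

    Other settings are left untouched and exact duplicate entries are skipped,
    so running the merge twice gives the same result as running it once.
    """
    merged = copy.deepcopy(existing)
    hooks = merged.setdefault("hooks", {})
    for event, entries in addition.get("hooks", {}).items():
        current = hooks.setdefault(event, [])
        for entry in entries:
            if entry not in current:
                current.append(copy.deepcopy(entry))
    return merged


# Hypothetical sample data: one pre-existing hook plus an unrelated setting.
mine = {"someOtherSetting": True,
        "hooks": {"Stop": [{"hooks": [{"type": "command", "command": "echo done"}]}]}}
capture = {"hooks": [{"type": "command",
                      "command": "python3 /path/to/second-brain/hooks/capture_conversation.py"}]}
example = {"hooks": {"Stop": [capture], "PreCompact": [capture]}}

merged = merge_hooks(mine, example)
```

Write the result back with `json.dump(merged, fh, indent=2)` only after keeping a copy of the original file.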
Env switches: SECONDBRAIN_SKIP_CAPTURE=1 (disable the capture hook),
SECONDBRAIN_SKIP_DISTILL=1 (log only, no distill nudge),
SECONDBRAIN_SKIP_RECALL=1 (disable proactive recall),
SECONDBRAIN_LOGS_DIR=/path (move the log directory).
Smart-trigger knobs (all default-off; tune if you want more or less auto-distill):
- `SECONDBRAIN_MIN_USER_CHARS=1500` — min user-text chars before distill fires
- `SECONDBRAIN_MIN_TURNS=4` — min user-prompt turns before distill fires
- `SECONDBRAIN_LONG_SESSION_TURNS=20` — above this, the marker check is skipped (long sessions get distilled regardless)
- `SECONDBRAIN_MAX_CANDIDATES=10` — max candidate lines surfaced to the agent
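One plausible reading of how these knobs interact is sketched below. Several things here are assumptions rather than the hook's actual logic: that the values shown act as fallbacks, that both minimums must be met, the marker words in `MARKERS`, and the function names.

```python
DEFAULTS = {
    "SECONDBRAIN_MIN_USER_CHARS": 1500,
    "SECONDBRAIN_MIN_TURNS": 4,
    "SECONDBRAIN_LONG_SESSION_TURNS": 20,
    "SECONDBRAIN_MAX_CANDIDATES": 10,
}
MARKERS = ("remember", "decided", "prefer")  # hypothetical marker words


def knob(name, env):
    """Read an integer knob from an environment mapping, with a fallback."""
    return int(env.get(name, DEFAULTS[name]))


def has_marker(text):
    lowered = text.lower()
    return any(marker in lowered for marker in MARKERS)


def should_distill(user_turns, env):
    """Decide whether a session is substantial enough to trigger distillation."""
    chars = sum(len(turn) for turn in user_turns)
    if chars < knob("SECONDBRAIN_MIN_USER_CHARS", env):
        return False
    if len(user_turns) < knob("SECONDBRAIN_MIN_TURNS", env):
        return False
    if len(user_turns) > knob("SECONDBRAIN_LONG_SESSION_TURNS", env):
        return True  # long sessions skip the marker check
    return any(has_marker(turn) for turn in user_turns)


def candidates(user_turns, env):
    """Surface at most MAX_CANDIDATES marker-bearing turns for the agent."""
    hits = [turn for turn in user_turns if has_marker(turn)]
    return hits[: knob("SECONDBRAIN_MAX_CANDIDATES", env)]


filler = "x" * 400
fires = should_distill([filler] * 3 + ["We decided on SQLite. " + filler], {})
```

In real use the `env` argument would be `os.environ`; passing a plain dict keeps the sketch easy to test.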
# disable everything for one session
SECONDBRAIN_SKIP_CAPTURE=1 SECONDBRAIN_SKIP_RECALL=1 claude

The repo ships a slash command at commands/history.md that browses your
conversation logs (the files under ~/.secondbrain/logs/, not the brain).
Wire it up with a symlink (install.sh does this for you):
# Personal scope
mkdir -p ~/.claude/commands
ln -s <repo>/commands/history.md ~/.claude/commands/history.md

Then type /history — the agent lists your recent session logs and opens the one
you pick, rendered readably. You can also say "show me my last 3 conversations".
From there it can distill a log into the brain on request.
Beyond the Claude Code skill, second-brain ships a standalone MCP server — scripts/brain_mcp.py — so any MCP-speaking client (Claude Desktop, Cursor, Continue, Cline, or your own host) can read and write the same brain. Like the rest of the project, it is stdlib-only: the server implements MCP's stdio transport (newline-delimited JSON-RPC 2.0) directly, with no SDK and no pip install.
Run it directly to confirm it starts:
python3 scripts/brain_mcp.py   # speaks MCP on stdio; Ctrl-D / Ctrl-C to exit

Register it with an MCP client — e.g. Claude Desktop's claude_desktop_config.json:
{
"mcpServers": {
"second-brain": {
"command": "python3",
"args": ["/absolute/path/to/second-brain/scripts/brain_mcp.py"]
}
}
}

The server exposes these tools, all backed by the same ~/.secondbrain/brain.db the CLI and skill use:
| Tool | What it does |
|---|---|
| `brain_add` | Save a Concept (with tags, collection, [[wikilinks]]) |
| `brain_search` | Full-text search the brain |
| `brain_show` | Show one Concept in full, with its relations |
| `brain_list` | List Concepts, filterable by collection/tag |
| `brain_related` | Related Concepts for a given id |
| `brain_recall_subject` | The persona sub-graph for one subject (person/topic) |
| `brain_recall_as_of` | Point-in-time recall — facts valid on a given ISO date |
| `brain_stats` | Brain health: counts, top tags, collections |
Same files, three front doors: the brain CLI for humans, the Claude Code skill for auto-capture, and this MCP server for every other agent host.
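For anyone writing their own host, the wire format is simple enough to sketch: each message is one JSON-RPC 2.0 object on a single line. The helper below builds a `tools/call` request frame. The argument names (`query`, `content`) are guesses, since the tools' input schemas are not listed here, and a real client must complete MCP's `initialize` handshake before calling tools.

```python
import json
from itertools import count

_ids = count(1)  # monotonically increasing JSON-RPC request ids


def frame(method, params=None):
    """Serialize one JSON-RPC 2.0 request as a single newline-terminated line."""
    message = {"jsonrpc": "2.0", "id": next(_ids), "method": method}
    if params is not None:
        message["params"] = params
    # json.dumps escapes newlines inside strings, so the frame stays on one line.
    return json.dumps(message, ensure_ascii=False) + "\n"


def call_tool(name, arguments):
    """Build an MCP tools/call request for the named tool."""
    return frame("tools/call", {"name": name, "arguments": arguments})


# "query" and "content" are assumed argument names, not confirmed by the README.
search_line = call_tool("brain_search", {"query": "RAG"})
add_line = call_tool("brain_add", {"title": "Note", "content": "line one\nline two"})
```

A host would write these lines to the server's stdin and read one JSON response line per request from its stdout.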
| Tool | Data location | Agent-readable | Lock-in | Backup | Cross-session memory | Install |
|---|---|---|---|---|---|---|
| Notion AI | Notion cloud | No | High | Vendor-controlled | No | Browser |
| ChatGPT Memory | OpenAI cloud | No | Total (black box) | Vendor-controlled | Yes (opaque) | Browser |
| Claude Projects | Anthropic cloud | No | High | Vendor-controlled | Yes (per-project) | Browser |
| mem0 | Vendor Postgres | Yes (paid API) | Medium (SDK bound) | Vendor-controlled | Yes (API) | pip install + key |
| Obsidian | Local `.md` | No (plugin required) | None | Manual | No (DIY) | Desktop app |
| Logseq | Local `.md` | No | None | Manual | No | Desktop app |
| Anytype | Local (P2P) | No | None | Manual sync | No | Desktop app |
| Quivr / privateGPT | Local vector DB | Via API | None | Manual | No | Docker + models |
| Apple Notes / Keep / OneNote | Vendor cloud | No | High | Vendor-controlled | No | OS-bundled |
| Evernote | Vendor cloud | No | High (historic) | Vendor-controlled | No | Desktop / web |
| second-brain | Local SQLite | Yes (CLI) | None | `cp` / `git push` | Yes (agent-native) | `git clone` |
What only second-brain offers in this list:
- Full data ownership. The store is a plain SQLite file. `sqlite3 brain.db` opens it. The schema is in this repository as `scripts/schema.sql`. There is no export flow because there is no vendor to export from.
- Versionable. The whole brain is one file. `git init` it, `git push` it to a private GitHub repo, get free history, diff, and disaster recovery.
- Agent-native. The CLI is the API. There is no second interface for "AI mode" that you have to pay for separately.
- Air-gap and compliance ready. Zero network calls, zero external dependencies, zero telemetry. Passes the "can legal/security read the whole thing in an afternoon?" test. Runs identically on an internet-connected laptop and a fully isolated private network.
- You use AI agents (Claude Code, Cursor, Aider, Continue, custom) and want them to remember across sessions.
- You want a knowledge base that survives any single vendor disappearing.
- You are comfortable with a 200-line Python CLI and a SQLite file.
- You want one tool that humans and agents both drive, with the same data.
- Your organisation does not allow third-party memory or data-retention services. `second-brain` never phones home, sends no telemetry, and stores nothing outside the file you point it at. Compliance, legal, and security teams can audit the entire codebase in an afternoon — it is 400 lines of stdlib Python and a SQL schema.
- You are running agents in an air-gapped or offline environment. Every dependency ships with Python's standard library. No package registry, no cloud API, no license server. Once the repo is cloned, it works indefinitely with zero network access — on a developer laptop, a private build server, a factory floor, or an isolated government network.
- You are building or deploying local agents and need a memory layer that stays local. Most "agent memory" solutions are SaaS APIs (mem0, Zep, LangMem) or require running a vector database (Qdrant, Weaviate, Chroma). `second-brain` is a single SQLite file: no daemon to keep alive, no Docker image to pull, no API key to rotate.
- You want a polished WYSIWYG note-taking app for non-technical users → use Obsidian or Notion.
- You need a team wiki with permissions and comments → use Notion or Confluence.
- You need to store millions of documents and run vector search at scale → use a dedicated vector database; `second-brain` is for personal-scale knowledge.
- You cannot run Python locally → use a hosted note service.
See references/architecture.md for:
- The data model (3 tables + FTS + `pending_links`)
- FTS5 correctness notes (the v2 bugs and their v2.1 fixes)
- Wikilink resolution rules (frozen at write time)
- Soft delete semantics
- Phase 2 MCP interface contract
- v1 → v2 migration
- Performance targets
The recommended setup is to put ~/.secondbrain/brain.db under version control in a private GitHub repository. The database is a single file; even at 50K Concepts it is typically under 100 MB, which is fine for git push.
For continuous backup, pair with litestream to replicate the WAL stream to S3, Backblaze, or any S3-compatible object store. Schema migrations and disaster recovery are standard SQLite operations.
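Before committing the database file to git, it can help to take a consistent snapshot instead of copying a file that may be mid-write. A minimal sketch using the standard library's online backup API follows; the `snapshot` helper and the file names are illustrative, and the demo runs against a throwaway database in a temporary directory.

```python
import sqlite3
import tempfile
from pathlib import Path


def snapshot(src_path, dest_path):
    """Copy a SQLite database with the online backup API."""
    src = sqlite3.connect(str(src_path))
    dest = sqlite3.connect(str(dest_path))
    try:
        src.backup(dest)  # Connection.backup is available since Python 3.7
    finally:
        dest.close()
        src.close()


# Demo against a throwaway database, not ~/.secondbrain/brain.db.
work = Path(tempfile.mkdtemp())
live = work / "brain.db"
conn = sqlite3.connect(str(live))
conn.execute("CREATE TABLE notes (title TEXT)")
conn.execute("INSERT INTO notes VALUES ('First note')")
conn.commit()
conn.close()

copy_path = work / "brain-snapshot.db"
snapshot(live, copy_path)

check = sqlite3.connect(str(copy_path))
rows = check.execute("SELECT title FROM notes").fetchall()
check.close()
```

The snapshot file is then what you would `git add` and push.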
- v2.1 (current). FTS5, soft delete, write-time-frozen wikilinks, `pending_links` table, recursive traverse.
- Shipped. Standalone MCP server (`scripts/brain_mcp.py`) — stdlib-only, stdio transport.
- Phase 2. Vector search via `sqlite-vec`, automatic `inferred`-source links above a similarity threshold.
- North star — the shadow-distilled human. The psychological-memory layer (subjects, affect, bi-temporal validity, supersession) and the auto-distill loop are the foundation for an agent that can faithfully shadow a specific real person over time. The work ahead: richer persona synthesis from the sub-graph, drift/consistency checks across superseded facts, and a recall surface that reconstructs "who this person was as of date X" for grounding a mimic agent.
- Ideas. Markdown round-trip sync, Obsidian-compatible export refinements, encrypted local replicas.
Issues and pull requests welcome. The schema is the API — please open an issue before adding tables or columns.
MIT © 2026 second-brain contributors