中文 | English
Personal knowledge-base system that encapsulates core workflows as Skills to automatically transform reusable source materials into a structured Wiki. Paired with the qmd MCP search engine, it enables automated ingestion, intelligent querying, and health maintenance for long-term knowledge preservation and efficient retrieval.
Project article: https://blog.wenjiexu.site/en/posts/llm-personal-wiki/
- 🔄 Automatic sync detection — detect new, modified and deleted files under the
ObsidianRaw/03_Resources/directory - 📥 Intelligent ingestion — extract concepts and entities from source materials, produce concise summaries, and build indexes
- 🔍 Hybrid retrieval — combine lexical (keyword) search and vector (semantic) search for accurate recall
- 📊 Health checks — detect orphan pages, concept gaps, contradiction annotations, cognitive quality, and sync issues
PersonalWiki/
├── ObsidianRaw/ # Raw source materials (organized with the PARA method, read-only)
│ ├── 00_Inbox/ # Inbox (temporary, not processed)
│ ├── 01_Projects/ # Projects (action-oriented, not processed)
│ ├── 02_Areas/ # Areas (ongoing responsibilities, not processed)
│ ├── 03_Resources/ # Resources ← the only ingestion source
│ └── 04_Archive/ # Archive (historical records, not processed)
├── Wiki/ # The knowledge layer (maintained by the AI Agent)
│ ├── Index.md # Navigation index for the Wiki
│ ├── Log.md # Operation log (machine-parsable)
│ ├── Concepts/ # Concept pages (theories, frameworks)
│ ├── Entities/ # Entity pages (people, organizations, tools)
│ ├── Sources/ # Source summaries (tracks original file status)
│ └── Outputs/ # Analytical outputs (high-value synthesis from queries)
├── .claude/skills/ # Skills definitions (implemented workflows)
│ ├── syncing-wiki/ # Full sync workflow
│ ├── ingesting-resources/ # Ingesting resources workflow
│ ├── querying-wiki/ # Querying the Wiki workflow
│ ├── checking-wiki-health/ # Health checking workflow
│ └── detecting-resources-sync/ # Resources sync detection workflow
├── AGENTS.md # Comprehensive operational specification
├── VERSION # Architecture specification version
└── README.md # This file
Only files under ObsidianRaw/03_Resources/ are ingested.
| Folder | Purpose | Processed? |
|---|---|---|
00_Inbox/ |
Temporary inbox, needs curation | ❌ |
01_Projects/ |
Action-oriented project notes | ❌ |
02_Areas/ |
Ongoing responsibility notes | ❌ |
03_Resources/ |
Reusable reference materials (target for ingestion) | ✅ |
04_Archive/ |
Historical records | ❌ |
The ingestion pipeline skips:
- The
Personal/folder (sensitive data) - Files that contain only images (no text)
- Shopping lists or configuration dumps (no knowledge value)
sync and process
Runs the complete workflow in one pass: detect → confirm → process → verify → log → report.
detect new files
Produces a sync report comparing ObsidianRaw/03_Resources/ and Wiki/Sources/:
- New files — queued for ingestion
- Changed files — require re-ingestion
- Deleted files — require handling in Wiki
# Ingest a single file
ingest Academic/NewTopic.md
# Ingest a directory
ingest "Academic/Research Notes/Theory/"
# Ingest all new files
ingest --all-newAsk a question and the system will search the Wiki and answer:
What are the development stages of resilience theory?
What are typical applications of Bayesian networks?
run health check
Generates a Wiki health report: orphan pages, concept gaps, contradiction annotations, cognitive quality, and unsynced sources.
1. sync and process → detect and ingest in one pass (recommended)
2. run health check → validate the result
1. regularly sync and process → discover updates and rebuild the Wiki
2. weekly health check → keep the Wiki healthy
1. Ask a question → querying-wiki searches the Wiki
2. If the result is high-value → consider backfilling to Wiki/Outputs/
| Trigger phrases | Skill | Function |
|---|---|---|
sync and process, full sync, detect and process |
syncing-wiki |
Full workflow (recommended) |
detect new files, sync status, check updates |
detecting-resources-sync |
Scan Resources and report sync status |
ingest, process file, update wiki |
ingesting-resources |
Ingest source files into the Wiki |
health check, check status, diagnose |
checking-wiki-health |
Run Wiki health checks |
| Free-form question | querying-wiki |
Search the Wiki and answer |
All pages use standard YAML frontmatter:
---
title: Page title
created: YYYY-MM-DD
updated: YYYY-MM-DD
type: concept | entity | source | output
tags: [tag1, tag2]
---Each Source page includes tracking fields for sync detection:
source_path: Academic/Research Notes/Theory/resilience.md
source_hash: 45ea3a15 # first 8 chars of SHA256
source_mtime: 2026-04-08T15:00:05ZThese fields allow the system to detect moved, changed or deleted source files.
Pages link to each other using [[wiki-links]].
| Type | Path | Example |
|---|---|---|
| Concept | Wiki/Concepts/{ConceptName}.md |
Concepts/Resilience.md |
| Entity | Wiki/Entities/{EntityName}.md |
Entities/Bayesian_Network.md |
| Source | Wiki/Sources/{SourceId}.md |
Sources/Resilience_Paper.md |
| Output | Wiki/Outputs/{YYYY-MM-DD}-{topic}.md |
Outputs/2026-04-08-resilience-analysis.md |
The system uses the local qmd MCP tool for Markdown search and supports:
- Lexical search (
lex) — exact keyword matching - Vector search (
vec) — semantic similarity matching - Hybrid search — combine both strategies (recommended)
- Precise retrieval — fetch by path or docid
intent can be used to disambiguate queries when needed.
- An AI agent runtime (e.g., Claude Code, Codex)
qmdcli and MCP tools, local Markdown search engineopenssl, for computing file hashes
- Karpathy: LLM Wiki — original inspiration
- PARA method — organization framework for raw sources
- Obsidian — local Markdown knowledge browser
- qmd — local Markdown search engine