Markdown-specialized embedded database with interval-indexed block storage and hierarchical query support.
mq-db treats Markdown documents as structured, hierarchical databases rather than plain text. It parses Markdown into a flat block list with an interval index (Nested Set / Pre-Post Order), enabling O(1) section hierarchy queries. Documents can be queried with SQL or mq and persisted to a compact custom page-file format.
flowchart TD
A["Markdown File(s)"] -->|"CST Parser (mq-markdown)"| B["Block Tree\n(heading Β· paragraph Β· code Β· list β¦)"]
B -->|"Interval Index + Secondary Indexes"| C["Flat Block Vector\n(pre/post integers)"]
C --> D["BitmapIndex\n(block_type)"]
C --> E["BTreeIndex\n(pre / post)"]
C --> F["HashIndex\n(content / lang / depth)"]
C --> G["Zone Maps\n(per-document stats)"]
C --> H["SQL Engine\n(sqlparser β custom native evaluator)"]
C --> I["mq Engine\n(mq-lang evaluator)"]
Important
This project is under active development and the API may change.
- Flat block storage β every Markdown element becomes a typed
Blockwith row-polymorphic properties - O(1) hierarchy queries β interval index (
pre/post) makes ancestor/descendant checks a single integer comparison - Three-layer secondary indexes β
BitmapIndex(block type),BTreeIndex(pre/post),HashIndex(content/lang/depth) for fast SQL predicate pushdown - Zone Maps β per-document statistics skip irrelevant files before scanning any blocks
- Dual query engines β SQL via a custom
sqlparser-based evaluator, andmqviamq-lang - DDL support β
CREATE TABLE,INSERT INTO,DROP TABLEfor in-memory custom tables mq()scalar function β run an mq program against Markdown content inline in SQL- Custom page-file persistence β 8 KB fixed pages, checksums, atomic writes
- CLI + interactive REPL + TUI β full terminal experience
curl -fsSL https://raw.githubusercontent.com/harehare/mq-db/main/bin/install.sh | bashThe installer will:
- Download the latest release for your platform
- Verify the binary with SHA256 checksum
- Install to
~/.local/bin/ - Update your shell profile (bash, zsh, or fish)
After installation, restart your terminal or run:
source ~/.bashrc # or ~/.zshrc, or ~/.config/fish/config.fishcargo install mq-db# Latest Development Version
cargo install --git https://github.com/harehare/mq-db.git- Linux: x86_64, aarch64
- macOS: x86_64 (Intel), aarch64 (Apple Silicon)
- Windows: x86_64
mq-db index docs/ --recursive --output store.mq-db
mq-db index README.md DESIGN.md
mq-db index docs/ --no-spans # omit source spans (~21 bytes/block saved) β docs/DESIGN.md
β docs/API.md
Indexed 2 files β store.mq-db
mq-db list --db store.mq-db
mq-db list --db store.mq-db --format json # also: csv, tsv, markdown, htmlββββββββ¬βββββββββββββββββββββββββββββββββββββββββββββββββββββ¬βββββββββ¬βββββββββββ
β ID β Path / Title β Blocks β Tags β
ββββββββΌβββββββββββββββββββββββββββββββββββββββββββββββββββββΌβββββββββΌβββββββββββ€
β 0 β docs/DESIGN.md β 142 β β
β 1 β docs/API.md β 87 β api, v2 β
ββββββββ΄βββββββββββββββββββββββββββββββββββββββββββββββββββββ΄βββββββββ΄βββββββββββ
2 documents
mq-db sql "SELECT block_type, count(*) FROM blocks GROUP BY block_type" --db store.mq-db
mq-db sql --file query.sql --db store.mq-db # read SQL from a file
mq-db sql "SELECT ..." --db store.mq-db --format json # also: csv, tsv, markdown, htmlβββββββββββββββ¬βββββββββββ
β block_type β count(*) β
βββββββββββββββΌβββββββββββ€
β paragraph β 48 β
β heading β 21 β
β code β 15 β
βββββββββββββββ΄βββββββββββ
(3 rows)
Hierarchy query with under() β find all content inside a specific section:
mq-db sql "
SELECT b.block_type, b.content
FROM blocks b
WHERE under(b.pre, b.post,
(SELECT pre FROM blocks WHERE block_type = 'heading' AND content = 'Architecture'),
(SELECT post FROM blocks WHERE block_type = 'heading' AND content = 'Architecture'))
ORDER BY b.pre
" --db store.mq-dbmq() scalar function β run an mq program against Markdown content inline:
mq-db sql "SELECT mq('.h1 | to_text', content) AS title FROM blocks WHERE block_type = 'code'" --db store.mq-db# Create from a SELECT result
mq-db sql "CREATE TABLE headings AS SELECT content, depth FROM blocks WHERE block_type = 'heading'" --db store.mq-db
# Create with explicit schema, then insert
mq-db sql "CREATE TABLE notes (id TEXT, body TEXT)" --db store.mq-db
mq-db sql "INSERT INTO notes VALUES ('1', 'Hello world')" --db store.mq-db
# Inspect
mq-db sql "SHOW TABLES" --db store.mq-db
mq-db sql "DESC notes" --db store.mq-db
# Drop
mq-db sql "DROP TABLE notes" --db store.mq-dbmq-db mq ".h1" --db store.mq-db
mq-db mq 'select(.code_lang == "rust")' --db store.mq-db
mq-db mq ".h1" --db store.mq-db --format markdown # also: json, csv, tsv, htmlmq-db repl --db store.mq-db --mode sqlmq-db (.help for commands .quit to exit)
mode: sql (.mode mq | .mode sql)
sql> SELECT content FROM blocks WHERE block_type = 'heading' LIMIT 3;
ββββββββββββββββββββ
β content β
ββββββββββββββββββββ€
β Overview β
β Architecture β
β Query Engine β
ββββββββββββββββββββ
(3 rows)
sql> .mode mq
β mq mode
mq> .h2
## Architecture
## Query Engine
mq-db serve --db store.mq-db # listens on 127.0.0.1:7878
mq-db serve --db store.mq-db --port 8080 # custom port
mq-db serve --db store.mq-db --host 0.0.0.0 --port 8080Three endpoints are available:
| Method | Path | Body | Description |
|---|---|---|---|
GET |
/health |
β | {"status":"ok","documents":<n>} |
POST |
/sql |
{"query":"SELECT β¦"} |
Execute a SQL query, returns JSON rows |
POST |
/mq |
{"code":".h1"} |
Evaluate an mq expression, returns {"results":[β¦]} |
# Health check
curl http://127.0.0.1:7878/health
# SQL via HTTP
curl -s -X POST http://127.0.0.1:7878/sql \
-H 'Content-Type: application/json' \
-d '{"query":"SELECT block_type, count(*) FROM blocks GROUP BY block_type"}'
# mq via HTTP
curl -s -X POST http://127.0.0.1:7878/mq \
-H 'Content-Type: application/json' \
-d '{"code":".h1"}'mq-db lint --db store.mq-db --depth 2β 1 violation (H2 immediately followed by list)
file heading
ββββββββββββββββββββββββββββββββββββββββ ββββββββββββββββββββββββββββββ
docs/DESIGN.md "Quick Start"
mq-db stats --db store.mq-db Documents 5
Blocks 632
Block types
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
ΒΆ paragraph ββββββββββββββββββββββββ 241 (38%)
# heading ββββββββββββββββββββββββ 89 (14%)
{} code ββββββββββββββββββββββββ 73 (12%)
β’ list ββββββββββββββββββββββββ 58 (9%)
Code languages
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
{} rust ββββββββββββββββββββββββ 41 (57%)
{} python ββββββββββββββββββββββββ 18 (25%)
{} bash ββββββββββββββββββββββββ 14 (19%)
mq-db show 0 --db store.mq-db docs/DESIGN.md
title Design Document
blocks 142
pre post type content
ββββ ββββ ββββββββββββββββ ββββββββββββββββββββββββββββββββββββββββββ
0 141 heading H1 Design Document
2 55 heading H2 Architecture
4 21 paragraph The system is built onβ¦
22 37 heading H3 Query Engine
24 36 code fn main() { β¦ }
mq-db tui --db store.mq-db mq-db SQL Tab:switch i:input j/k:nav d/u:scroll q:quit
ββ Documents βββββββββββ¬β SQL βββββββββββββββββββββββββββββββββββββββββββββββββ
β DESIGN.md β SELECT block_type, count(*) FROM blocks GROUP BY b_ β
β 142 blocks ββ Results βββββββββββββββββββββββββββββββββββββββββββββ€
β API.md β βββββββββββββββ¬βββββββββββ β
β 87 blocks API β β block_type β count(*) β β
β README.md β βββββββββββββββΌβββββββββββ€ β
β 34 blocks β β paragraph β 48 β β
ββββββββββββββββββββββββ΄βββββββββββββββββββββββββββββββββββββββββββββββββββββββ
5 docs 632 blocks 3 rows
Keys:
| Key | Action |
|---|---|
i |
Focus query input |
Esc |
Blur input |
Enter |
Run query |
Tab |
Toggle mq / SQL mode |
j / k |
Navigate document list |
d / u |
Scroll results down / up |
g / G |
Jump to top / bottom |
q / Ctrl+C |
Quit |
use mq_db::{DocumentStore, SqlEngine, MqEngine, block::BlockType};
// ββ Build in memory ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
let mut store = DocumentStore::new();
store.add_file("docs/DESIGN.md")?;
store.add_str("# Hello\n\n## Architecture\n\nDetails\n")?;
// Chainable query API β zone-map skip + interval scope + block predicates
let chunks = store.query()
.documents(|doc| doc.zone_maps.heading_contents.contains("Architecture"))
.under_heading("Architecture", Some(2))
.filter(|b| matches!(b.block_type, BlockType::Paragraph | BlockType::Code))
.blocks();
// SQL engine (custom sqlparser-based evaluator β no SQLite dependency)
let engine = SqlEngine::new(&store)?;
let out = engine.execute(
"SELECT content FROM blocks WHERE block_type = 'heading' ORDER BY pre"
)?;
print!("{}", out.to_table());
// mq engine
let results = MqEngine::eval_store(".h1", &store)?;
// Structural lint
let violations = store.query().lint_heading_followed_by(2, &[BlockType::List]);
// ββ Persist / load βββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
store.save("store.mq-db")?;
// Full load β all blocks read into memory, indexes built on first SqlEngine use
let store = DocumentStore::load("store.mq-db")?;
// Lazy open β catalog only; call load_all_blocks() + load_all_indexes() before SQL
let mut store = DocumentStore::open("store.mq-db")?;
store.load_all_blocks()?;
store.load_all_indexes()?;
// Catalog-only β for metadata commands (list, stats) that don't need block data
let store = DocumentStore::load_catalog_only("store.mq-db")?;SELECT id, path, title, tags FROM documents;
SELECT id, document_id, block_type, content, pre, post,
depth, lang, properties FROM blocks;| Function | Description |
|---|---|
under(pre, post, anc_pre, anc_post) |
O(1) interval ancestor check |
mq(program, content) |
Run an mq program against Markdown content |
json_extract(json, path) |
Extract a value from a JSON string |
count(*) / min / max / sum / avg |
Aggregate functions |
lower / upper / length / coalesce |
Scalar utilities |
| Statement | Description |
|---|---|
CREATE TABLE name AS SELECT β¦ |
Create a custom table from a query result |
CREATE TABLE name (col TYPE, β¦) |
Create an empty custom table with explicit schema |
INSERT INTO name VALUES (β¦) |
Insert a row into a custom table |
DROP TABLE name |
Drop a custom table |
SHOW TABLES |
List all custom tables |
DESC name |
Show schema of a custom table |
-- All text/code under a specific section (RAG extraction)
SELECT b.block_type, b.content
FROM blocks b
WHERE under(b.pre, b.post,
(SELECT pre FROM blocks WHERE block_type = 'heading' AND content = 'Architecture'),
(SELECT post FROM blocks WHERE block_type = 'heading' AND content = 'Architecture'))
AND b.block_type IN ('paragraph', 'code')
ORDER BY b.pre;
-- Extract H1 title from code block content via the mq() scalar function
SELECT mq('.h1 | to_text', content) AS title
FROM blocks
WHERE block_type = 'code' AND lang = 'markdown';
-- H2 headings immediately followed by a list (structural lint)
SELECT d.path, h.content AS heading
FROM blocks h
JOIN blocks nxt ON nxt.document_id = h.document_id AND nxt.pre = h.pre + 1
JOIN documents d ON d.id = h.document_id
WHERE h.block_type = 'heading' AND depth = 2 AND nxt.block_type = 'list';
-- Documents containing Python code
SELECT DISTINCT d.path
FROM documents d JOIN blocks b ON b.document_id = d.id
WHERE b.block_type = 'code' AND lang = 'python';Every Markdown element becomes a Block:
struct Block {
id: u32,
document_id: u32,
block_type: BlockType, // Heading, Paragraph, Code, List, β¦
content: String,
span: Option<Span>, // line/column for editor sync
pre: u32, // interval index pre-order
post: u32, // interval index post-order
properties: Properties, // row-polymorphic extra attributes
}| Block type | Properties |
|---|---|
Heading |
{ "depth": 2, "slug": "architecture" } |
Code |
{ "lang": "rust", "meta": "no_run" } |
List |
{ "ordered": false, "level": 1, "checked": null } |
Yaml / Toml |
parsed front-matter keys ("title", "tags", β¦) |
mq-db applies three complementary index layers, cheapest-first.
flowchart LR
Q["SQL Query"] --> ZM["Layer 1\nZone Maps\n(document skip)"]
ZM -->|"relevant docs"| II["Layer 2\nInterval Index\n(section scope)"]
II -->|"candidate blocks"| SI["Layer 3\nSecondary Indexes\n(block lookup)"]
SI -->|"BitmapIndex\nBTreeIndex\nHashIndex"| R["Result Rows"]
ZM -->|"skip"| X1["β irrelevant docs"]
SI -->|"no hint"| FS["Full Scan"]
Built once per document and stored in the .mq-db file. Checked before any block is read:
| Field | Skips documents where⦠|
|---|---|
heading_contents |
The requested heading text is absent |
code_languages |
The requested language tag is absent |
max_heading_depth |
The requested depth cannot exist |
tags |
The tag filter cannot match |
Heading hierarchy encoded as (pre, post) pairs via Pre-Post Order (Nested Set) traversal:
graph TD
doc["# Doc\npre=0 Β· post=11"]
secA["## Section A\npre=2 Β· post=7"]
para1["Paragraph\npre=3 Β· post=4"]
code1["Code\npre=5 Β· post=6"]
secB["## Section B\npre=8 Β· post=11"]
para2["Paragraph\npre=9 Β· post=10"]
doc --> secA
doc --> secB
secA --> para1
secA --> code1
secB --> para2
A is_under B β B.pre < A.pre AND A.post < B.post β O(1), no tree traversal.
| Index | Column(s) | Structure | Complexity |
|---|---|---|---|
BitmapIndex |
block_type |
Inverted list per type | O(1) key + O(k) iterate |
BTreeIndex |
pre, post |
BTreeMap |
O(log n) point, O(log n + k) range |
HashIndex |
content, lang, depth |
HashMap |
O(1) average |
SQL predicate pushdown picks an IndexHint:
flowchart TD
P["SQL WHERE predicate"]
P -->|"block_type = '...'"| B["BitmapIndex"]
P -->|"pre = N"| BT1["BTreeIndex (point)"]
P -->|"pre BETWEEN N AND M"| BT2["BTreeIndex (range)"]
P -->|"content = '...'"| H1["HashIndex"]
P -->|"lang = '...'"| H2["HashIndex"]
P -->|"depth = N"| H3["HashIndex"]
P -->|"other"| FS["Full Scan"]
Custom 8 KB page file:
graph TD
P0["Page 0 β File Header\nmagic 0x4D514442 Β· version Β· page count"]
P1["Page 1 β Catalog\ndoc_id β first_block_page Β· num_blocks Β· ZoneMaps"]
P2["Page 2+ β Block Data\nlinked page chains Β· overflow pages"]
P0 --> P1 --> P2
Writes are atomic: data goes to <path>.tmp then renamed to <path> on success.
