A production-grade educational storage engine built in modern C++17
PersistX demonstrates how real-world databases (PostgreSQL, MySQL/InnoDB, SQLite) manage persistent storage β from raw disk I/O to crash-safe ACID transactions β all in ~12,000 lines of hand-written C++.
- Full ACID Transactions β begin, commit, and abort with strict atomicity and durability guarantees
- ARIES Crash Recovery β three-phase protocol (Analysis β Redo β Undo) with CLR support for crash-safe aborts
- B+ Tree Index β O(log N) point lookups, insertions, deletions, and range scans with leaf-linked-list traversal
- Write-Ahead Logging β all modifications are logged before data pages are flushed; log is force-flushed on commit
- Buffer Pool Manager β fixed-size in-memory page cache with LRU eviction and WAL-enforced dirty page writeback
- Slotted Page Layout β variable-length records with stable
RID = (page_id, slot_id)identifiers that survive compaction - Fuzzy Checkpointing β non-blocking checkpoint protocol that snapshots the Active Transaction Table (ATT) and Dirty Page Table (DPT)
- Interactive Shell β feature-rich REPL with ANSI-colored output, box-drawn tables, B+ tree visualization, and benchmarking tools
PersistX follows a strict layered architecture with single-responsibility components and clean interface boundaries β the same design philosophy used by PostgreSQL, InnoDB, and BusTub.
βββββββββββββββββββββββββββ
β Interactive Shell β REPL interface
β (ANSI terminal UI) β
ββββββββββββββ¬βββββββββββββ
β
ββββββββββββββΌβββββββββββββ
β Transaction Manager β BEGIN / COMMIT / ABORT
β (ATT, CLR undo chain) β Crash-safe abort via CLRs
ββββββββββββββ¬βββββββββββββ
β
ββββββββββββββββββββΌβββββββββββββββββββ
β β β
βββββββββββββΌβββββββββββ ββββββΌββββββββββββ ββββββΌβββββββββββββββ
β B+ Tree Index β β WAL / Log β β Checkpoint β
β (search, insert, β β Manager β β Manager β
β delete, range) β β (append, flush)β β (ATT + DPT snap) β
βββββββββββββ¬βββββββββββ ββββββ¬ββββββββββββ βββββββββββββββββββββ
β β
β βββββββββββββΌββββββββββββ
ββββββββΊ Buffer Manager β In-memory page cache
β (LRU eviction, β WAL enforcement
β pin/unpin, flush) β
βββββββββββββ¬ββββββββββββ
β
βββββββββββββΌββββββββββββ
β Disk Manager β Raw page I/O
β (read / write / β File management
β allocate pages) β
βββββββββββββ¬ββββββββββββ
β
βββββββββββββΌββββββββββββ
β persistx.db β Persistent storage
β persistx.log β WAL file
βββββββββββββββββββββββββ
| Component | File(s) | Responsibility |
|---|---|---|
| Slotted Page | page.hpp/cpp |
Fixed 4 KB page layout with variable-length records, tombstone-aware slot directory, and O(1) space checks |
| Disk Manager | disk_manager.hpp/cpp |
Thread-safe raw page I/O to a single database file |
| Buffer Manager | buffer_manager.hpp/cpp, replacer.hpp/cpp |
In-memory page cache with LRU eviction, pin counting, and WAL-enforced writeback |
| Log Manager | log_manager.hpp/cpp, log_record.hpp |
Append-only WAL with in-memory buffering, group flush, and LSN-indexed record lookup |
| Transaction Manager | transaction_manager.hpp/cpp |
Full transaction lifecycle β BEGIN, COMMIT (force-flush), ABORT (CLR undo chain) |
| Checkpoint Manager | checkpoint_manager.hpp/cpp |
Fuzzy checkpoints β snapshots ATT and DPT to the WAL without blocking transactions |
| Recovery Manager | recovery_manager.hpp/cpp |
ARIES three-phase recovery: Analysis, Redo, Undo with CLR skip logic |
| B+ Tree Index | btree_index.hpp/cpp, btree_page.hpp/cpp |
Ordered index with leaf/internal page layouts, recursive split, and linked-leaf range scan |
| Shell | shell.hpp/cpp, shell_commands.cpp |
Interactive REPL with 20+ commands, ANSI rendering, and benchmark tools |
| Terminal | terminal.hpp/cpp |
ANSI color helpers, box-drawing, table rendering, and Windows VT100 setup |
- C++17 compiler (GCC 9+, Clang 10+, or MSVC 2019+)
- CMake 3.20 or newer
- pthreads (Linux/macOS) or Windows threads
# Clone the repository
git clone https://github.com/yourusername/PersistX.git
cd PersistX
# Configure and build
cmake -S . -B build -G "MinGW Makefiles" # Windows with MinGW
# cmake -S . -B build # Linux / macOS
cmake --build build
# Run the interactive shell
./build/persistx.exe # Windows
# ./build/persistx # Linux / macOSPersistX ships with a comprehensive test suite covering every layer:
./build/test_page.exe # Slotted page layout tests
./build/test_disk_manager.exe # Disk I/O tests
./build/test_buffer_manager.exe # Buffer pool + eviction tests
./build/test_wal.exe # Write-ahead logging tests
./build/test_transaction.exe # Transaction ACID tests
./build/test_checkpoint.exe # Checkpoint protocol tests
./build/test_recovery.exe # ARIES crash recovery tests
./build/test_btree.exe # B+ tree index tests
./build/overall_tests.exe # Full integration test suitePersistX includes a beautifully rendered interactive REPL that exposes every layer of the storage engine:
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β β P E R S I S T X S t o r a g e E n g i n e β β
β v0.1.0 | ACID | ARIES | B+Tree β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
Type help for available commands.
persistx>
| Command | Description |
|---|---|
put <key> <value> |
Insert a key-value pair into the B+ tree |
get <key> |
Look up a value by key |
del <key> |
Delete a key-value pair |
update <key> <value> |
Update the value for an existing key |
scan <begin> <end> |
Range scan β returns all keys in [begin, end] |
| Command | Description |
|---|---|
begin |
Start an explicit transaction |
commit |
Commit the current transaction (force-flush WAL) |
abort |
Abort and rollback via CLR undo chain |
| Command | Description |
|---|---|
status |
System overview β buffer pool, WAL, B+ tree stats |
tree |
Visualize the B+ tree structure level by level |
buffer |
Inspect buffer pool frame table |
wal [n] |
Show last N WAL log entries |
page <id> |
Inspect a specific page (data or index) |
| Command | Description |
|---|---|
flush |
Flush all dirty pages to disk |
checkpoint |
Trigger a fuzzy checkpoint |
fill <n> |
Bulk insert N sequential keys (benchmarking) |
bench <n> |
Benchmark N insert + lookup operations |
crash |
Simulate a crash (_Exit(1) β no flush!) |
recover |
Run ARIES recovery + rebuild B+ tree index |
clear |
Clear the terminal screen |
exit |
Shut down gracefully |
persistx> put 1 "Hello, World!"
β INSERT key=1 β RID(2,0) [LSN=1]
persistx> put 2 "Database Internals"
β INSERT key=2 β RID(2,1) [LSN=3]
persistx> get 1
1 β Hello, World!
persistx> begin
β BEGIN txn#2
persistx[txn#2]> put 3 "ACID transactions"
β INSERT key=3 β RID(2,2) [LSN=5]
persistx[txn#2]> abort
β ABORT txn#2
persistx> get 3
Key 3 not found
persistx> crash
π₯ CRASH β killing process without flushing!
persistx> recover
π Running ARIES recovery...
β ARIES complete (2.35ms)
π§ Rebuilding B+ tree index...
β Index rebuilt (2 keys recovered)
β Recovery complete (4.12ms total)
Every page is exactly 4096 bytes, aligned for direct I/O:
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
β Header (21 bytes) β
β page_id (4) β type (1) β slots (2) β tombstones (2) β
β free_ptr (4) β page_lsn (8) β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ€
β Record Data (grows β from top) β
β ββββββββ ββββββββββββββ ββββββββ β
β β Rec0 β β Rec1 β β Rec2 β ... β
β ββββββββ ββββββββββββββ ββββββββ β
β β free space β β
β [Slot2][Slot1][Slot0] β Slot Directory (grows β) β
ββββββββββββββββββββββββββββββββββββββββββββββββββββββββ
- Tombstone-aware: deleted slots are marked (not removed), enabling O(1)
can_insert()checks and stable RIDs - Compaction: live records are rewritten contiguously while preserving slot indices β RIDs remain valid
- Leaf pages: sorted array of
(int64_t key, RID)entries β max 291 entries/leaf - Internal pages:
child[0] | key[0] | child[1] | ... | child[N]β max 340 keys/node - Leaf linked list: leaves are connected via
next_leaf_idfor efficient range scans - Lens pattern:
BTreeLeafPageandBTreeInternalPageare zero-cost wrappers over rawPage*buffers β no extra allocation - No merge on delete: same policy as PostgreSQL β deleted entries are removed without rebalancing
Every modification follows the LOG β FLUSH β DATA WRITE protocol:
- Serialize the log record (before-image + after-image) into the in-memory buffer
- Force-flush the log to disk before any dirty data page can be written
- The
BufferManager::enforce_wal()method guarantees this invariant on every eviction
Log record types: BEGIN, UPDATE, COMMIT, ABORT, TXN_END, CLR, CHECKPOINT_BEGIN, CHECKPOINT_END
On every startup (clean or crash), PersistX runs the full ARIES protocol:
| Phase | What It Does |
|---|---|
| Analysis | Scan WAL from the last checkpoint to reconstruct the ATT and DPT |
| Redo | Replay all logged changes β idempotent thanks to LSN comparison |
| Undo | Roll back uncommitted transactions using the prevLSN chain, writing CLRs for crash safety |
After ARIES, the B+ tree index is rebuilt by scanning recovered data pages β ensuring total consistency even after an unclean shutdown.
Abort walks the undo chain backward (prevLSN pointers), restoring before-images:
UPDATE(LSN=5) β prevLSN β UPDATE(LSN=3) β prevLSN β BEGIN(LSN=1)
β β
βΌ βΌ
CLR(LSN=7, undo_next=3) CLR(LSN=8, undo_next=1)
Each CLR records which step was already undone, so a crash mid-abort can safely skip completed undo steps.
PersistX/
βββ include/ # Header files (public API)
β βββ common.hpp # Shared types, constants, RID
β βββ page.hpp # Slotted page layout
β βββ disk_manager.hpp # Raw page I/O
β βββ replacer.hpp # LRU eviction policy
β βββ buffer_manager.hpp # In-memory page cache
β βββ log_manager.hpp # WAL management
β βββ log_record.hpp # Log record serialization
β βββ transaction.hpp # Transaction state
β βββ transaction_manager.hpp # Transaction lifecycle
β βββ checkpoint_manager.hpp # Fuzzy checkpoints
β βββ recovery_manager.hpp # ARIES recovery
β βββ btree_page.hpp # B+ tree node layouts
β βββ btree_index.hpp # B+ tree public API
β βββ shell.hpp # Interactive shell
β βββ terminal.hpp # ANSI terminal rendering
βββ src/ # Implementation files
β βββ main.cpp # Entry point
β βββ page.cpp # Slotted page operations
β βββ disk_manager.cpp # File I/O
β βββ replacer.cpp # LRU eviction
β βββ buffer_manager.cpp # Buffer pool management
β βββ log_manager.cpp # WAL append / flush / read
β βββ transaction_manager.cpp # BEGIN / COMMIT / ABORT + CLR
β βββ checkpoint_manager.cpp # Checkpoint protocol
β βββ recovery_manager.cpp # Analysis / Redo / Undo
β βββ btree_page.cpp # Leaf & internal page ops
β βββ btree_index.cpp # Tree search / insert / split
β βββ shell.cpp # REPL loop + data commands
β βββ shell_commands.cpp # Diagnostics & system commands
β βββ terminal.cpp # ANSI rendering utilities
βββ tests/ # Test suite
β βββ test_page.cpp # Page layout correctness
β βββ test_disk_manager.cpp # Disk I/O + allocation
β βββ test_buffer_manager.cpp # Eviction, pinning, WAL
β βββ test_wal.cpp # Log append / flush / read
β βββ test_transaction.cpp # ACID transaction tests
β βββ test_checkpoint.cpp # Checkpoint correctness
β βββ test_recovery.cpp # Crash recovery scenarios
β βββ test_btree.cpp # B+ tree operations
β βββ overall_tests.cpp # End-to-end integration
βββ CMakeLists.txt # Build configuration
βββ README.md
- Page-based storage β all data lives in fixed 4 KB pages for O_DIRECT compatibility
- Separation of concerns β strict layered architecture with zero upward dependencies
- Write-ahead logging β LOG β FLUSH β DATA WRITE, always
- Crash-first design β the system is always recoverable to a consistent state
- Stable record identifiers β
RID = (page_id, slot_id)survives compaction and page reorganization - Zero-copy lenses β B+ tree pages are interpreted in-place, no deserialization overhead
- Thread safety β mutex-protected critical sections with documented latch ordering:
TxnMgr β BufferMgr β LogMgr - No UB β all wire format access uses
memcpyinstead ofreinterpret_castto avoid strict-aliasing violations
- CMU 15-445 Database Systems β BusTub project
- Database Internals by Alex Petrov
- ARIES: A Transaction Recovery Method β Mohan et al., 1992
- PostgreSQL and InnoDB source code
This project is for educational purposes. Built as part of the IEEE NITK Envision 2026 program.