Skip to content

shards-lang/cr-sqlite

 
 

Repository files navigation

cr-sqlite - Convergent, Replicated SQLite

c-tests c-valgrind c-asan py-tests

A run-time loadable extension for SQLite that adds multi-master replication and partition tolerance via CRDTs.

Write to your SQLite database while offline. Others write to theirs. Come online, merge, no conflicts.

This is a fork of vlcn-io/cr-sqlite, rewritten as a pure C extension. The original project used a hybrid C/Rust architecture; this fork eliminates the Rust dependency entirely.

Why a pure-C port?

Upstream is a C shell around a no_std Rust core. Embedding Rust there means a pinned old nightly (nightly-2023-10-05), a sqlite-rs-embedded submodule, cargo, and a custom #[global_allocator] that routes every Rust allocation through sqlite3_malloc so ownership can cross the FFI boundary without copies — plus hand-written panic/eh_personality lang items. That's a lot of machinery to make Rust behave the way C does natively, and the hot path is unsafe/raw-pointer FFI glue anyway, so little of Rust's safety actually applies.

Dropping it buys: a build with just a C compiler + CMake (no nightly, cargo, or submodules), a ~7× smaller loadable (~158 KB vs ~1.1 MB), one language, and clean valgrind/asan.

Honest trade-offs: this is not "C is faster than Rust" — the merge speedup is algorithmic and would be just as fast in Rust. And memory safety now rests on discipline plus tests (155 correctness tests, asan, valgrind) rather than the compiler.

Building

Requires only a C compiler and CMake (>= 3.16). No Rust, no Cargo, no nightly toolchains.

git clone git@github.com:shards-lang/cr-sqlite.git
cd cr-sqlite/core
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build

This produces:

  • build/crsqlite.[dylib|so|dll] — loadable SQLite extension
  • build/sqlite3 — SQLite CLI with cr-sqlite built in
  • build/libcrsqlite_static.a — static library

Or using Make directly:

cd cr-sqlite/core
make loadable

Usage

-- load the extension
.load crsqlite

-- create tables as normal
CREATE TABLE foo (a PRIMARY KEY NOT NULL, b);
CREATE TABLE baz (a PRIMARY KEY NOT NULL, b, c, d);

-- upgrade to CRRs (conflict-free replicated relations)
SELECT crsql_as_crr('foo');
SELECT crsql_as_crr('baz');

-- write data as normal
INSERT INTO foo (a, b) VALUES (1, 2);
INSERT INTO baz (a, b, c, d) VALUES ('a', 'woo', 'doo', 'daa');

-- get changes to sync to other nodes
SELECT * FROM crsql_changes WHERE db_version > 0;

-- apply changes received from another node
INSERT INTO crsql_changes VALUES (...);

-- clean up before closing
SELECT crsql_finalize();

How It Works

Tables upgraded with crsql_as_crr() get:

  • Clock tables (__crsql_clock) tracking per-column version vectors
  • PK lookaside tables (__crsql_pks) mapping primary keys to internal IDs
  • AFTER triggers capturing inserts, updates, and deletes

The crsql_changes virtual table exposes changesets for sync and accepts incoming changes for merge. Conflict resolution uses Last-Write-Wins with causal length tracking for delete/resurrect semantics.

Sync Protocol

Node A                              Node B
  |                                   |
  |-- SELECT * FROM crsql_changes --> |
  |   WHERE db_version > last_seen    |
  |                                   |
  | <-- INSERT INTO crsql_changes --- |
  |     (changes from B)              |

Both nodes converge to the same state regardless of sync order.

Tracking Sync Watermarks (optional)

To support incremental sync without rescanning, cr-sqlite maintains a crsql_tracked_peers(site_id, version, seq, tag, event) table of per-peer (db_version, seq) watermarks:

  • RECEIVED (event = 0) is written automatically: every change merged through crsql_changes advances the watermark for its originating site_id. This happens even for no-op merges (older or losing changes), so a puller never refetches changes it has already seen. Local writes and self-echoes are skipped.
  • SENT (event = 1) is for the application to record — e.g. after successfully shipping changes to a peer — via crsql_set_tracked_peer(site_id, version, seq, tag, event). tag is an application-defined channel (default 0).

Both are monotonic: a watermark only ever moves forward, so out-of-order or duplicate updates are safe no-ops. Reading is plain SQL: SELECT version, seq FROM crsql_tracked_peers WHERE site_id = ? AND event = 0.

API Reference

Functions

Function Description
crsql_as_crr('table') Upgrade a table to a CRR
crsql_as_table('table') Downgrade a CRR back to a regular table
crsql_site_id() Returns this node's unique 16-byte site ID
crsql_db_version() Returns the current database version (lamport clock)
crsql_next_db_version() Returns the next version for writes
crsql_begin_alter('table') Start altering a CRR schema
crsql_commit_alter('table') Finish altering a CRR schema
crsql_automigrate(schema) Auto-migrate schema to match provided SQL
crsql_finalize() Clean up before closing the connection
crsql_config_set(key, val) Set a config option (e.g., 'merge-equal-values')
crsql_config_get(key) Get a config option
crsql_set_tracked_peer(site_id, version, seq, tag, event) Record a per-peer sync watermark (monotonic — see below)

Changes Virtual Table

The crsql_changes virtual table has these columns:

Column Type Description
table TEXT Source table name
pk BLOB Packed primary key values
cid TEXT Column name (or '-1' for sentinel)
val ANY Column value
col_version INTEGER Per-column version counter
db_version INTEGER Database-wide lamport clock
site_id BLOB 16-byte site ID of the writer
cl INTEGER Causal length (odd = alive, even = deleted)
seq INTEGER Sequence within a transaction

Fractional Indexing

For ordered lists (e.g., drag-and-drop reorderable items):

CREATE TABLE items (id PRIMARY KEY NOT NULL, list TEXT, spot TEXT, content TEXT);
SELECT crsql_as_crr('items');
SELECT crsql_fract_as_ordered('items', 'spot', 'list');

-- insert at the end (spot = 1) or beginning (spot = -1)
INSERT INTO items VALUES ('a', 'todo', 1, 'first item');
INSERT INTO items VALUES ('b', 'todo', 1, 'second item');

-- insert after a specific item via the fractindex view
INSERT INTO items_fractindex (id, list, content, after_id) VALUES ('c', 'todo', 'between', 'a');

Altering CRR Tables

SELECT crsql_begin_alter('table_name');
ALTER TABLE table_name ADD COLUMN new_col TEXT;
SELECT crsql_commit_alter('table_name');

CRR Compatibility Requirements

Tables upgraded to CRRs must:

  • Have a non-nullable primary key
  • Not use AUTOINCREMENT
  • Not have unique indices (besides the primary key)
  • Not have checked foreign key constraints
  • Have default values for all NOT NULL non-PK columns

Tests

# C unit tests (via CMake)
cd core
cmake -B build -DCMAKE_BUILD_TYPE=Debug
cmake --build build
cd build && ctest --output-on-failure

# Python correctness tests (155 tests)
cd core
cmake -B build -DCMAKE_BUILD_TYPE=Release
cmake --build build --target crsqlite
mkdir -p dist && cp build/crsqlite.* dist/
cd ../py/correctness
pip install pytest hypothesis
pip install -e .
pytest

# Memory checks (via Make)
cd core && make valgrind
cd core && make asan

Performance

  • Reads are the same speed as regular SQLite.
  • Local writes to CRRs are ~2.5x slower than regular SQLite tables (trigger overhead and clock-table writes).
  • Merging changesetsINSERT INTO crsql_changes, the sync-apply path — is roughly 6x faster than the upstream Rust/C implementation: importing 100k changes (20k rows) takes ~0.14s here versus ~0.85s upstream, with ~2.8x on idempotent re-imports. (Measured with both extensions built -O2 and .loaded into the same vanilla SQLite 3.42.0, Apple Silicon.) The win is the merge path itself, not the language — the pure-C port before this work was actually marginally slower than upstream. It drops the per-row statement prepares and the ephemeral btrees that RETURNING clauses forced SQLite to materialize (host sys time falls from ~0.55s to ~0.004s), memoizes per-row and per-site lookups, and short-circuits idempotent re-imports to pure clock-table reads. Run core/test/perf/bench-import.sh for a repeatable benchmark; see core/CHANGELOG.md for the breakdown.

Research & Prior Art

cr-sqlite was inspired by:

License

Same as the original project.

About

Convergent, Replicated SQLite. Multi-writer and CRDT support for SQLite

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages

  • C 69.2%
  • Python 25.8%
  • JavaScript 2.0%
  • Shell 1.2%
  • Makefile 1.2%
  • CMake 0.6%