A SQLite extension that provides semantic memory capabilities with hybrid search (vector similarity + full-text search).
- Overview
- Sync Behavior
- Loading the Extension
- SQL Functions
- Virtual Table Module
- Configuration Options
- Timestamps
- Examples
sqlite-memory enables semantic search over text content stored in SQLite. It:
- Chunks text content using semantic parsing (markdown-aware)
- Generates embeddings for each chunk using the built-in llama.cpp engine (
"local"provider) or the vectors.space remote service - Stores embeddings and full-text content for hybrid search
- Searches using vector similarity combined with FTS5 full-text search
All memory_add_* functions use content-hash change detection to avoid redundant embedding computation. Each piece of content is hashed before processing — if the hash already exists in the database, the content is skipped.
| Scenario | Behavior |
|---|---|
| New content | Chunked, embedded, and indexed |
| Unchanged content | Skipped (hash match) |
| Modified file | Old entry atomically deleted, new content reindexed |
| Deleted file | Entry removed during directory sync |
Every sync operation is wrapped in a SQLite SAVEPOINT transaction. If any step fails (embedding error, disk issue, constraint violation), the entire operation rolls back. This guarantees:
- No partially-indexed files — content is either fully indexed or not at all
- No orphaned chunks — embeddings and FTS entries are always consistent with
dbmem_content - Safe to retry — a failed sync leaves the database in its previous valid state
This makes all sync functions idempotent and safe to call repeatedly (e.g., on a schedule or at application startup).
.load ./memoryThe extension requires sqlite-vector for vector similarity search:
.load ./vector
.load ./memoryReturns the extension version string.
Parameters: None
Returns: TEXT - Version string (e.g., "0.5.0")
Example:
SELECT memory_version();
-- Returns: "0.5.0"Configures the embedding model to use.
Parameters:
| Parameter | Type | Description |
|---|---|---|
provider |
TEXT | "local" for built-in llama.cpp engine, or any other name (e.g., "openai") for vectors.space remote service |
model |
TEXT | For local: full path to GGUF model file. For remote: model identifier supported by vectors.space |
Returns: INTEGER - 1 on success
Notes:
- When
provideris"local", the extension uses the built-in llama.cpp engine and verifies the model file exists - When
provideris anything other than"local", the extension uses the vectors.space remote embedding service - Remote embedding requires a free API key from vectors.space (set via
memory_set_apikey) - Settings are persisted in
dbmem_settingstable - For local models, the embedding engine is initialized immediately
- Automatic reindex: If a model was previously configured and the new provider/model differs, all existing content is automatically re-embedded with the new model. File-based entries are re-read from disk; text-based entries are re-embedded from stored content. Errors on individual entries are silently skipped (best-effort)
Example:
-- Local embedding model (uses built-in llama.cpp engine)
SELECT memory_set_model('local', '/path/to/nomic-embed-text-v1.5.Q8_0.gguf');
-- Remote embedding via vectors.space (requires free API key)
SELECT memory_set_model('openai', 'text-embedding-3-small');
SELECT memory_set_apikey('your-vectorspace-api-key');Sets the API key for the vectors.space remote embedding service.
Parameters:
| Parameter | Type | Description |
|---|---|---|
key |
TEXT | API key obtained from vectors.space (free account) |
Returns: INTEGER - 1 on success
Notes:
- API key is stored in memory only, not persisted to disk
- Required when using any provider other than
"local" - Get a free API key by creating an account at vectors.space
Example:
SELECT memory_set_apikey('your-vectorspace-api-key');Sets a configuration option.
Parameters:
| Parameter | Type | Description |
|---|---|---|
key |
TEXT | Option name (see Configuration Options) |
value |
ANY | Option value (type depends on the option) |
Returns: INTEGER - 1 on success
Example:
-- Set maximum tokens per chunk
SELECT memory_set_option('max_tokens', 512);
-- Enable engine warmup
SELECT memory_set_option('engine_warmup', 1);
-- Set minimum score threshold
SELECT memory_set_option('min_score', 0.75);Retrieves a configuration option value.
Parameters:
| Parameter | Type | Description |
|---|---|---|
key |
TEXT | Option name |
Returns: ANY - Option value, or NULL if not set
Example:
SELECT memory_get_option('max_tokens');
-- Returns: 400
SELECT memory_get_option('provider');
-- Returns: "local"Syncs text content to memory. Duplicate content (same hash) is skipped automatically.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
content |
TEXT | Yes | Text content to store and index |
context |
TEXT | No | Optional context label for grouping memories |
Returns: INTEGER - 1 on success
Notes:
- Content is chunked based on
max_tokensandoverlay_tokenssettings - Each chunk is embedded and stored in
dbmem_vault - Content hash prevents duplicate storage — calling with the same content is a no-op
- Runs inside a SAVEPOINT transaction (see Sync Behavior)
- Sets
created_attimestamp automatically
Example:
-- Add text without context
SELECT memory_add_text('SQLite is a C-language library that implements a small, fast, self-contained SQL database engine.');
-- Add text with context
SELECT memory_add_text('Important meeting notes from 2024-01-15...', 'meetings');Syncs a file to memory. Unchanged files are skipped; modified files are atomically replaced.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
path |
TEXT | Yes | Full path to the file |
context |
TEXT | No | Optional context label for grouping memories |
Returns: INTEGER - 1 on success
Notes:
- Only processes files matching configured extensions (default:
md,mdx) - File path is stored in
dbmem_content.path - If the file was previously indexed with different content, the old entry (chunks, embeddings, FTS) is deleted and new content is reindexed — all within a single SAVEPOINT transaction (see Sync Behavior)
- Not available when compiled with
DBMEM_OMIT_IO
Example:
SELECT memory_add_file('/docs/readme.md');
SELECT memory_add_file('/docs/api.md', 'documentation');Synchronizes a directory with memory. Adds new files, reindexes modified files, and removes entries for deleted files.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
path |
TEXT | Yes | Full path to the directory |
context |
TEXT | No | Optional context label applied to all files |
Returns: INTEGER - Number of new files processed
Notes:
- Recursively scans subdirectories
- Only processes files matching configured extensions
- Phase 1 — Cleanup: Removes entries for files that no longer exist on disk
- Phase 2 — Scan: Processes all matching files:
- New files are chunked, embedded, and added to the index
- Unchanged files are skipped (content hash match)
- Modified files have their old entries atomically replaced with new content
- Each file is processed inside its own SAVEPOINT transaction (see Sync Behavior)
- Safe to call repeatedly — only changed content triggers embedding computation
- Not available when compiled with
DBMEM_OMIT_IO
Example:
SELECT memory_add_directory('/path/to/docs');
-- Returns: 42 (number of new files processed)
SELECT memory_add_directory('/project/notes', 'project-notes');
-- Safe to call again — unchanged files are skipped
SELECT memory_add_directory('/path/to/docs');
-- Returns: 0 (nothing changed)Deletes a specific memory by its hash.
Parameters:
| Parameter | Type | Description |
|---|---|---|
hash |
INTEGER | The hash identifier of the memory to delete |
Returns: INTEGER - Number of content entries deleted (0 or 1)
Notes:
- Atomically deletes from
dbmem_content,dbmem_vault, anddbmem_vault_fts - Uses SAVEPOINT transaction for atomicity
- Hash can be obtained from
dbmem_contenttable or search results
Example:
-- Get hash from content table
SELECT hash FROM dbmem_content WHERE path LIKE '%readme%';
-- Delete by hash
SELECT memory_delete(1234567890);Deletes all memories with a specific context.
Parameters:
| Parameter | Type | Description |
|---|---|---|
context |
TEXT | The context label to match |
Returns: INTEGER - Number of content entries deleted
Notes:
- Deletes all entries where
contextmatches exactly - Cascades to chunks and FTS entries
Example:
-- Delete all memories with context 'meetings'
SELECT memory_delete_context('meetings');
-- Returns: 15Deletes all memories from the database.
Parameters: None
Returns: INTEGER - 1 on success
Notes:
- Clears
dbmem_content,dbmem_vault, anddbmem_vault_fts - Does not delete settings from
dbmem_settings - Does not clear the embedding cache (
dbmem_cache) - Uses SAVEPOINT transaction for atomicity
Example:
SELECT memory_clear();Clears the embedding cache.
Parameters:
| Parameter | Type | Required | Description |
|---|---|---|---|
provider |
TEXT | No | Provider name to clear cache for |
model |
TEXT | No | Model name to clear cache for |
Returns: INTEGER - Number of cache entries deleted
Notes:
- With 0 arguments: clears the entire embedding cache
- With 2 arguments: clears cache entries for a specific provider/model combination
- The embedding cache stores computed embeddings keyed by (text hash, provider, model) to avoid redundant computation
- Safe to call at any time — does not affect stored memories
Example:
-- Clear entire cache
SELECT memory_cache_clear();
-- Clear cache for a specific provider/model
SELECT memory_cache_clear('openai', 'text-embedding-3-small');A virtual table for performing hybrid semantic search.
Query Format:
SELECT * FROM memory_search WHERE query = 'search text';Columns:
| Column | Type | Description |
|---|---|---|
query |
TEXT (HIDDEN) | Search query (required in WHERE clause) |
hash |
INTEGER | Content hash identifier |
path |
TEXT | Source file path or generated UUID for text content |
context |
TEXT | Context label (NULL if not set) |
snippet |
TEXT | Text snippet from the matching chunk |
ranking |
REAL | Combined similarity score (0.0 - 1.0) |
Notes:
- Requires sqlite-vector extension loaded first
- Performs hybrid search combining vector similarity and FTS5
- Results are ranked by combined score
- Limited by
max_resultssetting (default: 20) - Filtered by
min_scoresetting (default: 0.7) - Updates
last_accessedtimestamp ifupdate_accessis enabled
Example:
-- Basic search
SELECT * FROM memory_search WHERE query = 'database indexing strategies';
-- Search with ranking filter
SELECT path, snippet, ranking
FROM memory_search
WHERE query = 'how to optimize queries'
AND ranking > 0.8;
-- Search within a specific context
SELECT * FROM memory_search
WHERE query = 'meeting action items'
AND context = 'meetings';| Option | Type | Default | Description |
|---|---|---|---|
provider |
TEXT | - | Embedding provider ("local" for llama.cpp, otherwise vectors.space) |
model |
TEXT | - | Model path (local) or identifier (remote) |
dimension |
INTEGER | - | Embedding dimension (auto-detected) |
max_tokens |
INTEGER | 400 | Maximum tokens per chunk |
overlay_tokens |
INTEGER | 80 | Token overlap between consecutive chunks |
chars_per_tokens |
INTEGER | 4 | Estimated characters per token |
save_content |
INTEGER | 1 | Store original content (1=yes, 0=no) |
skip_semantic |
INTEGER | 0 | Skip markdown parsing, treat as raw text |
skip_html |
INTEGER | 1 | Strip HTML tags when parsing |
extensions |
TEXT | "md,mdx" | Comma-separated file extensions to process |
engine_warmup |
INTEGER | 0 | Warm up engine on model load (compiles GPU shaders) |
max_results |
INTEGER | 20 | Maximum search results |
fts_enabled |
INTEGER | 1 | Enable FTS5 in hybrid search |
vector_weight |
REAL | 0.5 | Weight for vector similarity in scoring |
text_weight |
REAL | 0.5 | Weight for FTS in scoring |
min_score |
REAL | 0.7 | Minimum score threshold for results |
update_access |
INTEGER | 1 | Update last_accessed on search |
embedding_cache |
INTEGER | 1 | Cache embeddings to avoid redundant computation |
cache_max_entries |
INTEGER | 0 | Max cache entries (0 = no limit). When exceeded, oldest entries are evicted |
search_oversample |
INTEGER | 0 | Search oversampling multiplier (0 = no oversampling). When set, retrieves N * multiplier candidates from each index before merging down to N final results |
The extension tracks two timestamps for each memory:
- Set automatically when content is added via
memory_add_text,memory_add_file, ormemory_add_directory - Stored as Unix timestamp (seconds since 1970-01-01 00:00:00 UTC)
- Never updated after initial creation
- Updated when content appears in search results (if
update_access=1) - Stored as Unix timestamp (seconds since 1970-01-01 00:00:00 UTC)
- Can be disabled by setting
update_accessto 0
Displaying timestamps in local time:
SELECT
path,
datetime(created_at, 'unixepoch', 'localtime') as created,
datetime(last_accessed, 'unixepoch', 'localtime') as accessed
FROM dbmem_content;-- Load extensions
.load ./vector
.load ./memory
-- Check version
SELECT memory_version();
-- Configure local embedding model
SELECT memory_set_model('local', '/models/nomic-embed-text-v1.5.Q8_0.gguf');
-- Configure options
SELECT memory_set_option('max_tokens', 512);
SELECT memory_set_option('min_score', 0.75);
-- Add content
SELECT memory_add_text('SQLite is a C library that provides a lightweight disk-based database.', 'sqlite-docs');
SELECT memory_add_directory('/docs/sqlite', 'sqlite-docs');
-- Search
SELECT path, snippet, ranking
FROM memory_search
WHERE query = 'how does SQLite store data on disk';
-- View all memories with timestamps
SELECT
hash,
path,
context,
datetime(created_at, 'unixepoch', 'localtime') as created,
datetime(last_accessed, 'unixepoch', 'localtime') as last_used
FROM dbmem_content
ORDER BY last_accessed DESC;
-- Delete by context
SELECT memory_delete_context('old-docs');
-- Clear all
SELECT memory_clear();-- Add memories with different contexts
SELECT memory_add_text('Meeting notes...', 'meetings');
SELECT memory_add_text('API documentation...', 'api-docs');
SELECT memory_add_text('Tutorial content...', 'tutorials');
-- Search within a context
SELECT * FROM memory_search
WHERE query = 'authentication'
AND context = 'api-docs';
-- List all contexts
SELECT context, COUNT(*) as count
FROM dbmem_content
GROUP BY context;
-- Delete a context
SELECT memory_delete_context('old-meetings');-- Total memories and chunks
SELECT
(SELECT COUNT(*) FROM dbmem_content) as total_memories,
(SELECT COUNT(*) FROM dbmem_vault) as total_chunks;
-- Storage usage
SELECT
SUM(length(embedding)) as embedding_bytes,
SUM(length) as content_bytes
FROM dbmem_vault;
-- Memories by context
SELECT
COALESCE(context, '(none)') as context,
COUNT(*) as count
FROM dbmem_content
GROUP BY context;
-- Recently accessed
SELECT path, datetime(last_accessed, 'unixepoch', 'localtime') as last_used
FROM dbmem_content
WHERE last_accessed > 0
ORDER BY last_accessed DESC
LIMIT 10;| Option | Description |
|---|---|
DBMEM_OMIT_IO |
Omit file/directory functions (for WASM) |
DBMEM_OMIT_LOCAL_ENGINE |
Omit llama.cpp local engine (for remote-only builds) |
DBMEM_OMIT_REMOTE_ENGINE |
Omit vectors.space remote engine (for local-only builds) |
SQLITE_CORE |
Compile as part of SQLite core (not as loadable extension) |
All functions return an error if:
- Required parameters are missing or of wrong type
- Database operations fail
- Model file not found (for local provider)
- Embedding dimension mismatch
Errors can be caught using standard SQLite error handling mechanisms.
-- Example error handling in application code
SELECT memory_add_text(123); -- Error: expects TEXT parameter
SELECT memory_delete('abc'); -- Error: expects INTEGER parameter