Merged · Changes from all commits
9 changes: 5 additions & 4 deletions README.md
@@ -72,10 +72,11 @@ pip install cortexadb[docs,pdf] # Optional: For PDF/Docx support
<details>
<summary><b>Technical Architecture & Benchmarks</b></summary>

-### Performance Benchmarks (v0.1.7)
+### Performance Benchmarks (v0.1.8)
-Measured on M2 Mac with 1,000 chunks of text.
+
+CortexaDB `v0.1.8` introduced a new batching architecture. Measured on an M2 Mac with 1,000 chunks of text:
 
-| Operation | v0.1.6 (Sync) | v0.1.7 (Batch) | Improvement |
+| Operation | v0.1.6 (Sync) | v0.1.8 (Batch) | Improvement |
|-----------|---------------|----------------|-------------|
| Ingestion | 12.4s | **0.12s** | **103x Faster** |
| Memory Add| 15ms | 1ms | 15x Faster |
@@ -86,7 +87,7 @@ Measured on M2 Mac with 1,000 chunks of text.
---

## License & Status
-CortexaDB is currently in **Beta (v0.1.7)**. It is released under the **MIT** and **Apache-2.0** licenses.
+CortexaDB is currently in **Beta (v0.1.8)**. It is released under the **MIT** and **Apache-2.0** licenses.
We are actively refining the API and welcome feedback!

---
82 changes: 37 additions & 45 deletions docs/content/docs/api/python.mdx
@@ -72,57 +72,55 @@ report = db.last_replay_report

## Memory Operations

-### `.remember(text, embedding=None, metadata=None)`
+### `.add(text=None, vector=None, metadata=None, collection=None)`

Stores a new memory entry. If an embedder is configured and no embedding is provided, the text is auto-embedded.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
-| `text` | `str` | Required | Text content to store |
-| `embedding` | `list[float]?` | `None` | Pre-computed embedding vector |
+| `text` | `str?` | `None` | Text content to store |
+| `vector` | `list[float]?` | `None` | Pre-computed embedding vector |
| `metadata` | `dict[str, str]?` | `None` | Key-value metadata pairs |
| `collection` | `str?` | `"default"` | Target collection |

**Returns:** `int` - The assigned memory ID

**Example:**
```python
-mid = db.remember("User prefers dark mode")
-mid = db.remember("text", metadata={"source": "onboarding"})
-mid = db.remember("text", embedding=[0.1, 0.2, ...])
+mid = db.add("User prefers dark mode")
+mid = db.add("text", metadata={"source": "onboarding"})
+mid = db.add("text", vector=[0.1, 0.2, ...], collection="agent_a")
```

---

-### `.ask(query, embedding=None, top_k=5, use_graph=False, recency_bias=False)`
+### `.query(text=None, vector=None)`

-Performs a hybrid search across the database.
+Starts a fluent query builder to search across the database.

-**Parameters:**
-
-| Parameter | Type | Default | Description |
-|-----------|------|---------|-------------|
-| `query` | `str` | Required | Search query text |
-| `embedding` | `list[float]?` | `None` | Pre-computed query embedding |
-| `top_k` | `int` | `5` | Number of results to return |
-| `use_graph` | `bool` | `False` | Enable graph expansion via BFS |
-| `recency_bias` | `bool` | `False` | Boost recent memories in scoring |
-
-**Returns:** `list[Hit]`
+**Methods:**
+
+| Method | Description |
+|--------|-------------|
+| `.limit(n)` | Set maximum number of results (default 5) |
+| `.collection(name)` | Filter to a specific collection |
+| `.use_graph()` | Enable hybrid graph traversal |
+| `.recency_bias()` | Boost recent memories in scoring |
+| `.execute()` | Run the query and return `list[Hit]` |

**Example:**
```python
-hits = db.ask("What does the user prefer?")
-hits = db.ask("query", top_k=10, use_graph=True, recency_bias=True)
+hits = db.query("What does the user prefer?").limit(5).use_graph().execute()

for hit in hits:
print(f"ID: {hit.id}, Score: {hit.score:.3f}")
```
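The `.use_graph()` option layers graph traversal on top of plain vector hits. One way such an expansion could work is sketched below — this is an illustrative breadth-first search, not CortexaDB's internal implementation; the `expand_hits` helper and the `edges` adjacency dict are assumptions for the example:

```python
from collections import deque

def expand_hits(seed_ids, edges, max_hops=1):
    """Breadth-first expansion of seed hit IDs over a memory graph.

    `edges` maps a memory ID to the IDs it is connected to
    (e.g. relationships created via `.connect()`).
    """
    seen = set(seed_ids)
    frontier = deque((mid, 0) for mid in seed_ids)
    while frontier:
        mid, depth = frontier.popleft()
        if depth == max_hops:
            continue  # stop expanding past the hop budget
        for neighbor in edges.get(mid, []):
            if neighbor not in seen:
                seen.add(neighbor)
                frontier.append((neighbor, depth + 1))
    return seen

# Seed {1} with edges 1 -> 2 -> 3: one hop pulls in memory 2 only
print(expand_hits([1], {1: [2], 2: [3]}, max_hops=1))  # {1, 2}
```

In a real hybrid search, the expanded IDs would then be re-scored alongside the original vector hits.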

---

-### `.get_memory(mid)`
+### `.get(mid)`

Retrieves a full memory entry by ID.

@@ -138,7 +136,7 @@

**Example:**
```python
-mem = db.get_memory(42)
+mem = db.get(42)
print(mem.id) # 42
print(mem.content) # b"User prefers dark mode"
print(mem.namespace) # "default"
> **Copilot AI (Mar 8, 2026):** The method was correctly renamed from `get_memory` to `get`. However, the example on line 142 still shows `print(mem.namespace) # "default"` — given that the PR is renaming "namespace" to "collection" throughout, you may want to add a comment clarifying that `namespace` is the internal field name (which is noted in the `Memory` type table at line 382), or rename the field for consistency.
>
> Suggested change:
> ```diff
> -print(mem.namespace) # "default"
> +print(mem.collection) # "default"
> ```
@@ -150,7 +148,7 @@ print(mem.embedding) # [0.1, 0.2, ...] or None

---

-### `.delete_memory(mid)`
+### `.delete(mid)`

Permanently deletes a memory and updates all indexes.

Expand All @@ -164,7 +162,7 @@ Permanently deletes a memory and updates all indexes.

**Example:**
```python
-db.delete_memory(42)
+db.delete(42)
```

---
Expand All @@ -189,7 +187,7 @@ db.connect(1, 2, "relates_to")
db.connect(1, 3, "caused_by")
```

-> Both memories must be in the same namespace. Cross-namespace edges are forbidden.
+> Both memories must be in the same collection. Cross-collection edges are forbidden.
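The cross-collection restriction amounts to a guard like the following — a hypothetical sketch only; the `validate_edge` function, the `collection_of` mapping, and the `ValueError` are illustrative assumptions, not CortexaDB's actual internals:

```python
def validate_edge(collection_of, src, dst):
    """Reject edges whose endpoints live in different collections.

    `collection_of` maps memory IDs to their collection names.
    """
    if collection_of[src] != collection_of[dst]:
        raise ValueError(
            f"cross-collection edge {src} -> {dst} is forbidden"
        )

cols = {1: "default", 2: "default", 3: "agent_a"}
validate_edge(cols, 1, 2)   # same collection: passes silently
try:
    validate_edge(cols, 1, 3)
except ValueError as e:
    print(e)  # cross-collection edge 1 -> 3 is forbidden
```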

---

@@ -229,7 +227,7 @@ Chunks text and stores each chunk as a memory.
| `chunk_size` | `int` | `512` | Target chunk size in characters |
| `overlap` | `int` | `50` | Overlap between chunks |
| `metadata` | `dict?` | `None` | Metadata to attach to all chunks |
-| `namespace` | `str?` | `None` | Target namespace |
+| `collection` | `str?` | `None` | Target collection |
> **Copilot AI (Mar 8, 2026):** The parameter in the table was updated from `namespace` to `collection`, but the method signature on line 217 still reads `.ingest(text, strategy="recursive", chunk_size=512, overlap=50, metadata=None, namespace=None)`. The `namespace=None` in the signature should also be updated to `collection=None` for consistency.

**Returns:** `list[int]` - Memory IDs of stored chunks
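The interaction of `chunk_size` and `overlap` can be illustrated with a plain fixed-window chunker — a sketch only, since CortexaDB's `recursive` and `markdown` strategies split on document structure rather than fixed offsets:

```python
def fixed_chunks(text, chunk_size=512, overlap=50):
    """Split `text` into windows of `chunk_size` characters,
    each starting `chunk_size - overlap` characters after the last."""
    step = chunk_size - overlap
    return [text[i:i + chunk_size] for i in range(0, len(text), step)]

chunks = fixed_chunks("a" * 1000, chunk_size=512, overlap=50)
print(len(chunks))     # 3 windows, starting at offsets 0, 462, 924
print(len(chunks[0]))  # 512
```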

@@ -248,7 +246,7 @@ Loads a file, chunks it, and stores each chunk.
| `chunk_size` | `int` | `512` | Target chunk size |
| `overlap` | `int` | `50` | Overlap between chunks |
| `metadata` | `dict?` | `None` | Metadata for all chunks |
-| `namespace` | `str?` | `None` | Target namespace |
+| `collection` | `str?` | `None` | Target collection |
> **Copilot AI (Mar 8, 2026):** The parameter in the table was updated from `namespace` to `collection`, but the method signature on line 236 still reads `.load(file_path, strategy="markdown", chunk_size=512, overlap=50, metadata=None, namespace=None)`. The `namespace=None` in the signature should also be updated to `collection=None` for consistency.

**Supported formats:** `.txt`, `.md`, `.json`, `.docx` (requires `cortexadb[docs]`), `.pdf` (requires `cortexadb[pdf]`)

@@ -260,34 +258,28 @@ db.load("paper.pdf", strategy="recursive", chunk_size=1024)

---

-### `.ingest_document(text, chunk_size=512, overlap=50, metadata=None, namespace=None)`
-
-Legacy method for chunking and storing text. Uses fixed chunking.
-
----

-## Namespace
+## Collections

-### `.namespace(name, readonly=False)`
+### `.collection(name, readonly=False)`

-Returns a scoped view of the database for a specific namespace.
+Returns a scoped view of the database for a specific collection.

**Parameters:**

| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
-| `name` | `str` | Required | Namespace name |
+| `name` | `str` | Required | Collection name |
| `readonly` | `bool` | `False` | If `True`, write operations raise errors |

-**Returns:** `Namespace`
+**Returns:** `Collection`

**Example:**
```python
-ns = db.namespace("agent_a")
-mid = ns.remember("text")
-hits = ns.ask("query")
-ns.delete_memory(mid)
-ns.ingest_document("long text")
+col = db.collection("agent_a")
+mid = col.add("text")
+hits = col.query("query").execute()
+col.delete(mid)
+col.ingest("long text")
```
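The `readonly=True` flag can be pictured as a thin wrapper that rejects write methods at lookup time — an illustrative sketch only; `ReadonlyGuard`, `FakeCollection`, and the raised `PermissionError` are assumptions, not CortexaDB's actual types:

```python
class ReadonlyGuard:
    """Wrap an object and raise on a configured set of write methods."""
    WRITE_METHODS = {"add", "delete", "connect", "ingest", "load"}

    def __init__(self, inner):
        self._inner = inner

    def __getattr__(self, name):
        if name in self.WRITE_METHODS:
            raise PermissionError(f"collection is readonly: {name}() blocked")
        return getattr(self._inner, name)  # reads pass through

class FakeCollection:
    """Stand-in for a collection view, used only for this demo."""
    def add(self, text):
        return 1
    def query(self, text):
        return []

ro = ReadonlyGuard(FakeCollection())
print(ro.query("ok"))   # reads pass through: []
try:
    ro.add("blocked")   # write attempt raises before the call happens
except PermissionError as e:
    print(e)
```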

---
@@ -382,12 +374,12 @@ Query result from `.ask()`.

### `Memory`

-Full memory entry from `.get_memory()`.
+Full memory entry from `.get()`.
> **Copilot AI (Mar 8, 2026):** The `Memory` type description here was updated to reference `.get()`, which is correct. However, the `Hit` type description a few lines above (line 368) still reads `Query result from .ask().` and should be updated to reference `.query().execute()` instead, to match the new API.

| Field | Type | Description |
|-------|------|-------------|
| `id` | `int` | Memory ID |
-| `namespace` | `str` | Namespace name |
+| `namespace` | `str` | Collection name (internal key) |
| `content` | `bytes` | Raw content |
| `embedding` | `list[float]?` | Vector embedding |
| `metadata` | `dict[str, str]` | Key-value metadata |
27 changes: 24 additions & 3 deletions docs/content/docs/api/rust.mdx
@@ -12,13 +12,15 @@ The high-level API for interacting with the database.
### Opening a Database

```rust
-use cortexadb_core::CortexaDB;
+use cortexadb_core::{CortexaDB, CortexaDBBuilder};
> **Copilot AI (Mar 8, 2026):** `CortexaDBBuilder` is not re-exported from the crate root (`cortexadb_core`). Looking at `crates/cortexadb-core/src/lib.rs:12`, only `CortexaDB`, `CortexaDBConfig`, `CortexaDBError`, `Memory`, and `Stats` are re-exported from `facade`. This import would fail to compile. Either update `lib.rs` to also re-export `CortexaDBBuilder`, or change the import here to `use cortexadb_core::facade::CortexaDBBuilder;`.
>
> Suggested change:
> ```diff
> -use cortexadb_core::{CortexaDB, CortexaDBBuilder};
> +use cortexadb_core::{CortexaDB, facade::CortexaDBBuilder};
> ```

// Simple open with default config
-let db = CortexaDB::open("/path/to/db", 128)?;
+let db = CortexaDBBuilder::new("/path/to/db", 128).build()?;

// Builder pattern for advanced config
-let db = CortexaDB::builder("/path/to/db", config).build()?;
+let db = CortexaDBBuilder::new("/path/to/db", 128)
+    .with_sync_policy(cortexadb_core::engine::SyncPolicy::Async { interval_ms: 1000 })
+    .build()?;
```

---
@@ -142,6 +144,25 @@ println!("Indexed: {}", stats.indexed_embeddings);

---

## Observability / Telemetry

CortexaDB uses the standard Rust [`log`](https://crates.io/crates/log) crate for all internal diagnostics and telemetry. It issues structured `debug!` and `trace!` logs instead of printing to stdout/stderr.

To see CortexaDB metrics and internal operations in your application, initialize a logger (like `env_logger` or `tracing-subscriber`):

```rust
use env_logger;

fn main() {
// Initialize the logger before opening the database
env_logger::init();

// In your terminal, run with: RUST_LOG=cortexadb_core=debug cargo run
}
```

---

## Types

### `Hit`
22 changes: 11 additions & 11 deletions docs/content/docs/getting-started/quickstart.mdx
@@ -30,23 +30,23 @@ db = CortexaDB.open("agent.mem", dimension=128)

```python
# Auto-embedding (requires embedder)
-mid1 = db.remember("The user prefers dark mode.")
-mid2 = db.remember("User works at Stripe.")
+mid1 = db.add("The user prefers dark mode.")
+mid2 = db.add("User works at Stripe.")

# With metadata
-mid3 = db.remember("User's name is Alice.", metadata={"source": "onboarding"})
+mid3 = db.add("User's name is Alice.", metadata={"source": "onboarding"})
```

### 4. Query Memories

```python
# Semantic search
-hits = db.ask("What does the user like?")
+hits = db.query("What does the user like?").execute()
for hit in hits:
print(f"ID: {hit.id}, Score: {hit.score:.3f}")

# Retrieve full memory
-mem = db.get_memory(hits[0].id)
+mem = db.get(hits[0].id)
print(mem.content) # b"The user prefers dark mode."
```

@@ -68,12 +68,12 @@ db.load("document.pdf", strategy="recursive")
db.ingest("Long article text here...", strategy="markdown")
```

-### 7. Use Namespaces
+### 7. Use Collections

```python
-agent_a = db.namespace("agent_a")
-agent_a.remember("Agent A's private memory")
-hits = agent_a.ask("query only agent A's memories")
+agent_a = db.collection("agent_a")
+agent_a.add("Agent A's private memory")
+hits = agent_a.query("query only agent A's memories").execute()
```

---
Expand All @@ -90,10 +90,10 @@ cortexadb-core = { git = "https://github.com/anaslimem/CortexaDB.git" }
### 2. Basic Usage

```rust
-use cortexadb_core::CortexaDB;
+use cortexadb_core::{CortexaDB, CortexaDBBuilder};
> **Copilot AI (Mar 8, 2026):** `CortexaDBBuilder` is not re-exported from the crate root (`cortexadb_core`). Looking at `crates/cortexadb-core/src/lib.rs:12`, only `CortexaDB`, `CortexaDBConfig`, `CortexaDBError`, `Memory`, and `Stats` are re-exported. This import would fail to compile. Either update `lib.rs` to also re-export `CortexaDBBuilder`, or change the import to `use cortexadb_core::facade::CortexaDBBuilder;`.
>
> Suggested change:
> ```diff
> -use cortexadb_core::{CortexaDB, CortexaDBBuilder};
> +use cortexadb_core::CortexaDB;
> +use cortexadb_core::facade::CortexaDBBuilder;
> ```

fn main() -> Result<(), Box<dyn std::error::Error>> {
-let db = CortexaDB::open("/tmp/agent.mem", 128)?;
+let db = CortexaDBBuilder::new("/tmp/agent.mem", 128).build()?;

// Store a memory with an embedding
let embedding = vec![0.1; 128];