# CHANGELOG.md

All notable changes to VecGrep are documented here.

---

## [Unreleased] — feat/byok-embedding-providers

### Added

- **BYOK cloud embedding providers** — bring your own API key to use OpenAI,
Voyage AI, or Google Gemini embeddings instead of the default local model.
Pass `provider="openai"` (or `"voyage"` / `"gemini"`) to `index_codebase`.

| Provider | Model | Dims | API key env var | Install extra |
|---|---|---|---|---|
| `openai` | `text-embedding-3-small` | 1536 | `VECGREP_OPENAI_KEY` | `vecgrep[openai]` |
| `voyage` | `voyage-code-3` | 1024 | `VECGREP_VOYAGE_KEY` | `vecgrep[voyage]` |
| `gemini` | `gemini-embedding-exp-03-07` | 3072 | `VECGREP_GEMINI_KEY` | `vecgrep[gemini]` |

- **Strategy-pattern `EmbeddingProvider` ABC** — `LocalProvider`,
`OpenAIProvider`, `VoyageProvider`, and `GeminiProvider` all implement the
same `embed(texts) → np.ndarray` interface. Adding new providers only
requires subclassing `EmbeddingProvider`.
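A minimal sketch of this interface (class and method names are from the entry above; the internals are illustrative stand-ins, not VecGrep's actual code):

```python
from abc import ABC, abstractmethod

import numpy as np


class EmbeddingProvider(ABC):
    """Strategy interface: every provider maps texts to one vector each."""

    @abstractmethod
    def embed(self, texts: list[str]) -> np.ndarray:
        """Return an (n_texts, dims) float32 array of L2-normalized vectors."""


class LocalProvider(EmbeddingProvider):
    dims = 384  # output size of the default local model

    def embed(self, texts: list[str]) -> np.ndarray:
        # Stand-in for the real ONNX/torch model call.
        vecs = np.ones((len(texts), self.dims), dtype=np.float32)
        norms = np.linalg.norm(vecs, axis=1, keepdims=True)
        return vecs / norms
```

A new provider only needs to subclass `EmbeddingProvider` and implement `embed`; callers never branch on the provider type.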

- **Dynamic vector dimensions** — `VectorStore` now stores the embedding
dimensionality in the meta table and creates the LanceDB schema with the
correct dims for the chosen provider (384 / 1024 / 1536 / 3072). Backward
compatible: existing 384-dim indexes open without migration.

- **Provider lock** — once a project index is built with a provider,
  re-indexing with a different provider requires `force=True`. This prevents
  silent dimension mismatches. The lock is stored in the per-project meta
  table; switching with `force=True` drops and recreates the chunks table.
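The lock check reduces to a guard like this sketch (function name and error type are assumptions, not VecGrep's actual internals):

```python
def check_provider_lock(stored_provider, requested_provider, force=False):
    """Refuse to re-index with a different provider unless force=True."""
    if (
        stored_provider is not None
        and stored_provider != requested_provider
        and not force
    ):
        raise ValueError(
            f"index was built with provider {stored_provider!r}; "
            f"re-indexing with {requested_provider!r} requires force=True"
        )
```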

- **`get_index_status` now reports provider metadata** — the `Provider`,
`Model`, and `Dimensions` fields are printed in the status output.

- **Optional dependency extras in `pyproject.toml`** —
`vecgrep[openai]`, `vecgrep[voyage]`, `vecgrep[gemini]`, `vecgrep[cloud]`
install only the packages needed for the chosen provider.

### Changed

- **Live-sync guard for cloud providers** — `watch=True` is rejected for any
non-local provider; live file-change sync with cloud embeddings would
incur unbounded API costs.

- **`_get_meta` / `_set_meta` helpers on `VectorStore`** — refactored
manual meta-table queries into reusable key/value helpers used throughout
the store.

### Tests

- Added a full BYOK test suite (`tests/test_providers.py`) covering:
- Provider registry (`get_provider`, unknown provider errors)
- `LocalProvider` shape, dtype, and L2-normalization
- Backward-compatible `embed()` free function
- Cloud providers raising `RuntimeError` when API key or package is missing
- `OpenAIProvider`, `VoyageProvider`, `GeminiProvider` with mocked API
responses (shape, dtype, normalization, empty-input edge cases)
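The mocked-response pattern can be illustrated with a self-contained sketch; `OpenAIProviderSketch` and the SDK call shape are assumptions standing in for the real provider and client:

```python
from unittest import mock

import numpy as np


class OpenAIProviderSketch:
    """Toy stand-in for the real provider, for demonstrating the test pattern."""

    def __init__(self, client):
        self.client = client

    def embed(self, texts):
        resp = self.client.embeddings.create(
            model="text-embedding-3-small", input=texts
        )
        vecs = np.array([d.embedding for d in resp.data], dtype=np.float32)
        norms = np.linalg.norm(vecs, axis=1, keepdims=True)
        return vecs / np.where(norms == 0.0, 1.0, norms)


def test_mocked_openai_shape_and_norm():
    # The mock replaces the network call entirely; no API key is needed.
    client = mock.Mock()
    client.embeddings.create.return_value = mock.Mock(
        data=[mock.Mock(embedding=[3.0, 4.0]), mock.Mock(embedding=[0.0, 2.0])]
    )
    out = OpenAIProviderSketch(client).embed(["a", "b"])
    assert out.shape == (2, 2)
    assert np.allclose(np.linalg.norm(out, axis=1), 1.0)
```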

---

## [1.6.0] — 2026-03-02

### Added
---

# README.md

Instead of grepping 50 files and sending 30,000 tokens to Claude, VecGrep returns only the relevant chunks.
## How it works

1. **Chunk** — Parses source files with tree-sitter to extract semantic units (functions, classes, methods)
2. **Embed** — Encodes each chunk using the configured embedding provider:
- **Local** (default) — [`all-MiniLM-L6-v2-code-search-512`](https://huggingface.co/isuruwijesiri/all-MiniLM-L6-v2-code-search-512) via fastembed ONNX (~100ms startup, no API key) or PyTorch, with auto device detection (Apple Silicon, CUDA, CPU)
- **Cloud (BYOK)** — OpenAI, Voyage AI, or Google Gemini via your own API key (higher-quality embeddings, optional)
3. **Store** — Saves embeddings + metadata in LanceDB under `~/.vecgrep/<project_hash>/`; vector dimensions adapt automatically to the chosen provider
4. **Search** — ANN index (IVF-PQ) for fast approximate search on large codebases

Incremental re-indexing via mtime/size checks skips unchanged files.
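The mtime/size check amounts to a comparison like this sketch (the stored-metadata shape is an assumption):

```python
import os


def is_unchanged(path: str, stored: dict) -> bool:
    """True if the file's mtime and size match the values recorded at index time."""
    st = os.stat(path)
    return stored.get("mtime") == st.st_mtime and stored.get("size") == st.st_size
```

Files that pass this check are skipped entirely, so re-indexing a large project after a small edit touches only the changed files.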
After the first index, subsequent searches skip unchanged files automatically.

## Tools

### `index_codebase(path, force=False, watch=False, provider=None)`

Index a project directory. Skips unchanged files on subsequent calls.

```
index_codebase("/path/to/myproject")
# → "Indexed 142 file(s), 1847 chunk(s) added (0 file(s) skipped, unchanged)"

# Use OpenAI embeddings instead of local
index_codebase("/path/to/myproject", provider="openai")
```

**Provider lock**: once a project is indexed with a provider, re-indexing with a different provider requires `force=True` (this rebuilds the vector table with the new embedding dimensions).

**Note:** `watch=True` is only supported with the `local` provider — live sync with cloud providers would incur unbounded API costs.
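This guard is a simple precondition; a sketch (function name is an assumption):

```python
def validate_watch(provider: str, watch: bool) -> None:
    """Reject live-sync mode for any non-local provider."""
    if watch and provider != "local":
        raise ValueError(
            "watch=True is only supported with the local provider; "
            f"live sync with {provider!r} would incur unbounded API costs"
        )
```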

### `search_code(query, path, top_k=8)`

Semantic search. Auto-indexes if no index exists.
Results include the matched source chunks, e.g.:

```
def authenticate_user(token: str) -> User:
    ...
```

### `get_index_status(path)`

Check index statistics, including the embedding provider used.

```
Index status for: /path/to/myproject
Files indexed: 142
Total chunks: 1847
Last indexed: 2026-02-22T07:20:31+00:00
Index size: 28.4 MB
Provider: local
Model: isuruwijesiri/all-MiniLM-L6-v2-code-search-512
Dimensions: 384
```
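The `Dimensions` value shown above is resolved from the index's stored metadata; a sketch of that mapping, assuming a key/value meta dict (the `dims` key name is an assumption):

```python
# Dims per provider, from the BYOK provider table.
PROVIDER_DIMS = {"local": 384, "voyage": 1024, "openai": 1536, "gemini": 3072}


def resolve_dims(meta: dict) -> int:
    # Indexes created before BYOK support have no "dims" key; they are 384-dim,
    # which is why old indexes open without migration.
    return int(meta.get("dims", PROVIDER_DIMS["local"]))
```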

## Configuration

VecGrep can be tuned via environment variables:

### Local provider

| Variable | Default | Description |
|---|---|---|
| `VECGREP_BACKEND` | `onnx` | Local backend: `onnx` (fastembed, fast startup) or `torch` (sentence-transformers, any HF model) |
| `VECGREP_MODEL` | `isuruwijesiri/all-MiniLM-L6-v2-code-search-512` | HuggingFace model ID (local provider only) |

**Backend comparison:**

| Backend | Startup | GPU support | Model support |
|---|---|---|---|
| `onnx` (default) | ~100ms | No | ONNX-exported models only |
| `torch` | ~2–3s | Yes | Any HuggingFace model |

### Cloud providers (BYOK — Bring Your Own Key)

VecGrep supports three cloud embedding providers. Each requires an API key environment variable and the corresponding optional dependency.

| Provider | Env var | Model | Dims | Install extra |
|---|---|---|---|---|
| `openai` | `VECGREP_OPENAI_KEY` | `text-embedding-3-small` | 1536 | `vecgrep[openai]` |
| `voyage` | `VECGREP_VOYAGE_KEY` | `voyage-code-3` | 1024 | `vecgrep[voyage]` |
| `gemini` | `VECGREP_GEMINI_KEY` | `gemini-embedding-exp-03-07` | 3072 | `vecgrep[gemini]` |
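A cloud provider fails fast with a clear error when its key is missing; a sketch of that check, using the env vars from the table (the function name is an assumption):

```python
import os

ENV_VARS = {
    "openai": "VECGREP_OPENAI_KEY",
    "voyage": "VECGREP_VOYAGE_KEY",
    "gemini": "VECGREP_GEMINI_KEY",
}


def require_api_key(provider: str) -> str:
    """Return the provider's API key, or raise if it is not configured."""
    env_var = ENV_VARS[provider]
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"{provider} provider requires {env_var} to be set")
    return key
```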

**Install cloud extras:**

```bash
# Single provider
uv tool install --python 3.12 'vecgrep[openai]'
pip install 'vecgrep[openai]'

# All cloud providers at once
pip install 'vecgrep[cloud]'
```

**Use a cloud provider:**

```bash
# Set your API key
export VECGREP_OPENAI_KEY=sk-...

# Index with OpenAI embeddings
index_codebase("/path/to/myproject", provider="openai")

# Or tell Claude to use it:
# "Index my project at /path/to/myproject using openai embeddings"
```

**Switch providers** (requires `force=True` to rebuild the vector table with the new embedding dimensions):

```
index_codebase("/path/to/myproject", provider="voyage", force=True)
```

**Local backend examples:**

```bash
# Use a different model with the torch backend
export VECGREP_BACKEND=torch
export VECGREP_MODEL=your-org/your-model   # any HuggingFace model ID (placeholder)
```