diff --git a/docs/indexing/index.mdx b/docs/indexing/index.mdx
index d8889eb..39c33b9 100644
--- a/docs/indexing/index.mdx
+++ b/docs/indexing/index.mdx
@@ -28,13 +28,12 @@ LanceDB provides a comprehensive suite of indexing strategies for different data
| Index | Use Case | Description |
| :--------- | :------- | :---------- |
-| `HNSW` (Vector) | High recall and low latency vector searches. Ideal for applications requiring fast approximate nearest neighbor queries with high accuracy. | Hierarchical Navigable Small World—a graph-based approximate nearest neighbor algorithm.
Distance metrics: `l2` `cosine` `dot`
Quantizations: `PQ` `SQ`|
| `IVF` (Vector) | Large-scale vector search with configurable accuracy/speed trade-offs. Supports binary vectors with hamming distance. | Inverted File Index—a partition-based approximate nearest neighbor algorithm that groups similar vectors into partitions for efficient search.
Distance metrics: `l2` `cosine` `dot` `hamming`
Quantizations: `None/Flat` `PQ` `SQ` `RQ`|
| `IVF_HNSW` (Vector) | Large-scale vector search requiring both high recall and efficient partitioning. Combines the scalability of IVF with the search quality of HNSW. | Hybrid index combining IVF partitioning with HNSW graphs built within each partition. Provides improved search quality over pure IVF while maintaining scalability.
Distance metrics: `l2` `cosine` `dot`
Quantizations: `SQ`, `PQ`|
+| `FTS` (Full-text search) | String columns (e.g., title, description, content) requiring keyword-based search with BM25 ranking. | Full-text search index using the BM25 ranking algorithm. Supports configurable tokenization, stemming, stop word removal, and language-specific processing. |
| `BTree` (Scalar) | Numeric, temporal, and string columns with mostly distinct values. Best for highly selective queries on columns with many unique values. | Sorted index storing sorted copies of scalar columns with block headers in a btree cache. Header entries map to blocks of rows (4096 rows per block) for efficient disk reads. |
| `Bitmap` (Scalar) | Low-cardinality columns with few thousand or fewer distinct values. Accelerates equality and range filters. | Stores a bitmap for each distinct value in the column, with one bit per row indicating presence. Memory-efficient for low-cardinality data. |
| `LabelList` (Scalar) | List columns (e.g., tags, categories, keywords) requiring array containment queries. | Scalar index for `List` columns using an underlying bitmap index structure to enable fast array membership lookups. |
-| `FTS` (Full-text) | String columns (e.g., title, description, content) requiring keyword-based search with BM25 ranking. | Full-text search index using BM25 ranking algorithm. Tokenizes text with configurable tokenization, stemming, stop word removal, and language-specific processing. |
TypeScript currently doesn't support `IvfSq` (IVF with Scalar Quantization).
diff --git a/docs/indexing/vector-index.mdx b/docs/indexing/vector-index.mdx
index 379624d..2dae75c 100644
--- a/docs/indexing/vector-index.mdx
+++ b/docs/indexing/vector-index.mdx
@@ -1,7 +1,7 @@
---
title: "Vector Indexes"
sidebarTitle: "Vector Index"
-description: "Build and optimize LanceDB vector indexes, including IVF_HNSW_SQ, IVF_RQ, IVF_PQ, and binary indexes."
+description: "Build and optimize LanceDB vector indexes, including IVF, HNSW and binary quantized indexes."
icon: "arrow-up-right-dots"
---
import {
@@ -18,33 +18,39 @@ import {
PyVectorIndexCheckStatus as VectorIndexCheckStatus,
} from '/snippets/indexing.mdx';
-LanceDB offers two main vector indexing algorithms: **Inverted File (IVF)** and **Hierarchically Navigable Small Worlds (HNSW)**. You can create multiple vector indexes within a Lance table. This guide walks through common configurations and build patterns.
+You can create and manage multiple vector indexes on any Lance dataset. LanceDB offers two kinds of vector indexing algorithms: **Inverted File (IVF)** and **Hierarchical Navigable Small World (HNSW)**.
-### Option 1: Self-Hosted Indexing
+
+**IVF + HNSW**
-**Manual, Sync or Async:** If using LanceDB Open Source, you will have to build indexes manually, as well as reindex and tune indexing parameters. The Python SDK lets you do this *synchronously and asynchronously*.
+In LanceDB, HNSW is not exposed as a top-level vector index. Instead, it's available as a sub-index inside IVF partitions. What this means in practice is that vectors are first partitioned by IVF, then each selected partition is searched using an HNSW graph (with quantization via `IVF_HNSW_PQ` / `IVF_HNSW_SQ`). This combines IVF's scalability with HNSW's higher-recall ANN search within partitions.
+
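The two-stage flow described above (pick partitions first, then search inside them) can be sketched in plain Python. This is a toy illustration, not LanceDB's implementation: the centroids are supplied by hand rather than trained, the within-partition search is brute force standing in for an HNSW graph walk, and `build_partitions`, `search`, and the `nprobes` argument are hypothetical names chosen for this example.

```python
import math


def l2(x, y):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def build_partitions(vectors, centroids):
    """Assign each row id to the partition of its nearest centroid."""
    partitions = {i: [] for i in range(len(centroids))}
    for row_id, vec in enumerate(vectors):
        nearest = min(range(len(centroids)), key=lambda i: l2(vec, centroids[i]))
        partitions[nearest].append(row_id)
    return partitions


def search(query, vectors, centroids, partitions, k=1, nprobes=1):
    """Probe the `nprobes` closest partitions, then rank their rows."""
    probe_order = sorted(range(len(centroids)), key=lambda i: l2(query, centroids[i]))
    candidates = [row for i in probe_order[:nprobes] for row in partitions[i]]
    # Assumption: brute force stands in for the per-partition HNSW search.
    candidates.sort(key=lambda row: l2(query, vectors[row]))
    return candidates[:k]


vectors = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
centroids = [[0.0, 0.5], [10.0, 10.5]]  # hand-picked for the example
partitions = build_partitions(vectors, centroids)
nearest = search([9.0, 10.0], vectors, centroids, partitions, k=1, nprobes=1)
```

Only rows in the probed partitions are ever compared against the query, which is where the scalability comes from. Probing more partitions trades speed for recall.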
-### Option 2: Automated Indexing
+### Manual Indexing
-**Automatic and Async:** Indexing is automatic in LanceDB Cloud/Enterprise. As soon as data is updated, our system automates index optimization. *This is done asynchronously*.
+If you use LanceDB OSS, you must create the vector index manually by calling `table.create_index()`. Updating the index as new data arrives and tuning its parameters are also manual steps.
-Here is what happens in the background - when a table contains a single vector column named `vector`, LanceDB automatically:
+### Automatic Indexing
-- Infers the vector column from the schema
-- Creates an optimized `IVF_PQ` index without manual configuration
-- The default distance is `l2` or euclidean
+ Enterprise-only
+Vector indexing is managed **automatically** in LanceDB Cloud/Enterprise. As soon as data is updated, the system updates the index and optimizes it. *This is done asynchronously as a background process*.
-Finally, LanceDB Cloud/Enterprise will analyze your data distribution to **automatically configure indexing parameters**.
+When you create a table in LanceDB Enterprise, LanceDB automatically:
-
-You can create a new index with different parameters using `create_index` - this replaces any existing index
+- Infers the vector columns from the schema
+- Creates an optimized `IVF_PQ` index without manual configuration
+- Configures indexing parameters
+The default distance is `l2` (Euclidean).
+
+
+You can call `create_index()` with different parameters to create a new index -- this replaces any existing index.
Although the `create_index` API returns immediately, the building of the vector index is asynchronous. To wait until all data is fully indexed, you can specify the `wait_timeout` parameter.
## Choose the Right Index
-Use this table as a quick starting point:
+Use this table as a quick starting point for choosing the right index type and quantization method for your use case:
| If your top priority is... | Use this index | Why | Typical compressed size vs. raw vectors |
| :--- | :--- | :--- | :--- |
@@ -59,7 +65,7 @@ If your vector search frequently includes metadata filters (`where(...)`), prefe
Compression ratios are practical rules of thumb and can vary with vector distribution, metric, and configuration.
For small dimensions, choose `IVF_PQ` for accuracy, not for guaranteed higher compression than `IVF_RQ`.
-### Indexing Tuning by Index Type
+### Index Tuning
Start with these values, then tune for your workload:
diff --git a/docs/search/vector-search.mdx b/docs/search/vector-search.mdx
index f7118d9..82008d3 100644
--- a/docs/search/vector-search.mdx
+++ b/docs/search/vector-search.mdx
@@ -21,19 +21,30 @@ Ensure you always use the same distance metric that your embedding model was tra
The right metric improves both search accuracy and query performance. Currently, LanceDB supports the following metrics:
-| Metric | Description | Default |
-| :-------- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------ |
-| `l2` | [Euclidean distance](https://en.wikipedia.org/wiki/Euclidean_distance) - measures the straight-line distance between two points in vector space. Calculated as the square root of the sum of squared differences between corresponding vector components. | ✓ |
-| `cosine` | [Cosine similarity](https://en.wikipedia.org/wiki/Cosine_similarity) - measures the cosine of the angle between two vectors, ranging from -1 to 1. Computed as the dot product divided by the product of vector magnitudes. Use for unnormalized vectors. | x |
-| `dot` | [Dot product](https://en.wikipedia.org/wiki/Dot_product) - calculates the sum of products of corresponding vector components. Provides raw similarity scores without normalization, sensitive to vector magnitudes. Use for normalized vectors for best performance. | x |
-| `hamming` | [Hamming distance](https://en.wikipedia.org/wiki/Hamming_distance) - counts the number of positions where corresponding bits differ between binary vectors. Only applicable to binary vectors stored as packed uint8 arrays. | x |
+| Distance metric | Mathematical form | Notes |
+|---|---|---|
+| `l2` | $\lVert x-y\rVert_2=\sqrt{\sum_i (x_i-y_i)^2}$ | Measures the straight-line distance between two points in vector space. Calculated as the square root of the sum of squared differences between corresponding vector components. |
+| `cosine` | $1-\frac{x\cdot y}{\lVert x\rVert_2\lVert y\rVert_2}$ | Measures directional difference between vectors. Computed as 1 minus cosine similarity (the dot product normalized by both vector magnitudes), so vector length does not affect the score. Use for unnormalized vectors. |
+| `dot` | $x\cdot y=\sum_i x_i y_i$ | Calculates the sum of products of corresponding vector components. Provides raw similarity scores without normalization, sensitive to vector magnitudes. Use for normalized vectors for best performance. |
+| `hamming` | $\sum_i \mathbf{1}[x_i\neq y_i]$ | Counts the number of positions where corresponding bits differ between binary vectors. Only applicable to binary vectors stored as packed uint8 arrays. |
+
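The formulas in the table can be checked by hand with a few lines of plain Python. This is a minimal sketch of the textbook definitions only; the function names are made up for this example, and the scores LanceDB actually reports for a query may be scaled or transformed differently, so treat the values as illustrations of the math rather than expected query output.

```python
import math


def l2_distance(x, y):
    """Square root of the sum of squared component differences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def cosine_distance(x, y):
    """1 minus cosine similarity; unaffected by vector length."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)


def dot_product(x, y):
    """Sum of component-wise products; sensitive to magnitude."""
    return sum(a * b for a, b in zip(x, y))


def hamming_distance(x, y):
    """Number of differing bits between two packed byte sequences."""
    return sum(bin(a ^ b).count("1") for a, b in zip(x, y))


# Scaling a vector changes the dot product but not the cosine distance.
same_direction = cosine_distance([1.0, 2.0], [2.0, 4.0])
bit_diff = hamming_distance(bytes([0b10110000]), bytes([0b10010001]))
```

For vectors normalized to unit length, `dot_product` equals the cosine similarity, which is why `dot` suits normalized embeddings.
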
+For indexed search, supported distance metrics vary by index type:
+
+| Index type | Supported distance metrics |
+|---|---|
+| `IVF_FLAT` | `["l2", "cosine", "dot", "hamming"]` |
+| `IVF_PQ` | `["l2", "cosine", "dot"]` |
+| `IVF_SQ` | `["l2", "cosine", "dot"]` |
+| `IVF_RQ` | `["l2", "cosine", "dot"]` |
+| `IVF_HNSW_PQ` | `["l2", "cosine", "dot"]` |
+| `IVF_HNSW_SQ` | `["l2", "cosine", "dot"]` |
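
One way to use the table above is as a pre-flight check before building an index. The sketch below copies the table into a dictionary; `SUPPORTED_METRICS` and `is_metric_supported` are hypothetical helpers written for this example and are not part of the LanceDB API.

```python
# Assumption: this mapping mirrors the documentation table; it is not
# read from LanceDB itself and must be updated by hand if support changes.
SUPPORTED_METRICS = {
    "IVF_FLAT": ["l2", "cosine", "dot", "hamming"],
    "IVF_PQ": ["l2", "cosine", "dot"],
    "IVF_SQ": ["l2", "cosine", "dot"],
    "IVF_RQ": ["l2", "cosine", "dot"],
    "IVF_HNSW_PQ": ["l2", "cosine", "dot"],
    "IVF_HNSW_SQ": ["l2", "cosine", "dot"],
}


def is_metric_supported(index_type, metric):
    """Return True if `metric` is listed for `index_type` in the table."""
    supported = SUPPORTED_METRICS.get(index_type)
    if supported is None:
        raise ValueError(f"unknown index type: {index_type!r}")
    return metric in supported


hamming_on_flat = is_metric_supported("IVF_FLAT", "hamming")
hamming_on_pq = is_metric_supported("IVF_PQ", "hamming")
```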
### Configure Distance Metric
By default, `l2` will be used as metric type. You can specify the metric type as
-`cosine` or `dot` if required.
+`cosine` or `dot` if required (`hamming` is supported only for `IVF_FLAT` indexes).
-**Note:** You can configure the distance metric during search only if there’s no vector index. If a vector index exists, the distance metric will always be the one you specified when creating the index.
+**Note:** You can configure the distance metric during search only if there's no vector index. If a vector index exists, the distance metric will always be the one you specified when creating the index.
```python Python icon="python"