Conversation
| > Note: With text indexes generally availability (GA) starting from ClickHouse version 26.2, bloom filter–based indexes are not recommended anymore for full text search. | ||
| Although they are more compact, unfortunately they tend to produce false positives because they are probabilistic. | ||
| Furthermore, they offer limited configurability. | ||
| > Note: With `text` indexes generally availability (GA) starting from ClickHouse version 26.2, `ngrambf_v1` and `tokenbf_v1` indexes are NOT recommended anymore for full text search. |
| > Note: With `text` indexes generally availability (GA) starting from ClickHouse version 26.2, `ngrambf_v1` and `tokenbf_v1` indexes are NOT recommended anymore for full text search. | |
| :::note | |
| With general availability (GA) of the `text` index starting from ClickHouse version 26.2, `tokenbf_v1` and `ngrambf_v1` indexes are no longer recommended for full text search. | |
| See page ["Full-text search with text indexes"](/engines/table-engines/mergetree-family/textindexes.md) for details. | |
| ::: |
| > Although they are more compact, unfortunately they tend to produce false positives because they are probabilistic. | ||
| > Furthermore, they offer limited configurability. | ||
| > | ||
| > The `text` index provides a true inverted index with better search performance, more predictable behavior, and greater flexibility and performance compared with token-based Bloom filter indexes. |
what does "more predictable behavior" mean here? Do you mean the text index is deterministic, so no false-positives?
maybe just remove that part since it's not based on the probabilistic data structure, so there is no "predictable behavior"? But it's up to you.
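To make the determinism point concrete, here is a minimal Python sketch. It is not ClickHouse code and does not reproduce how `tokenbf_v1` or the `text` index are implemented. The `BloomFilter` class, the `build_inverted_index` helper, the filter sizes, and the sample rows are all hypothetical and chosen only for illustration. It shows that a Bloom filter can answer "maybe present" for tokens that were never inserted, while an exact token-to-rows map cannot.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter; sizes are deliberately tiny to provoke collisions."""

    def __init__(self, num_bits, num_hashes):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = [False] * num_bits

    def _positions(self, token):
        # Derive one bit position per hash function from a seeded SHA-256.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{token}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, token):
        for pos in self._positions(token):
            self.bits[pos] = True

    def might_contain(self, token):
        # True means "possibly present"; False means "definitely absent".
        return all(self.bits[pos] for pos in self._positions(token))


def build_inverted_index(rows):
    """Map each whitespace-separated token to the set of row ids containing it."""
    index = {}
    for row_id, text in enumerate(rows):
        for token in text.lower().split():
            index.setdefault(token, set()).add(row_id)
    return index


rows = ["error connecting to host", "query finished ok", "disk almost full"]
inverted = build_inverted_index(rows)

bloom = BloomFilter(num_bits=16, num_hashes=2)
for token in inverted:
    bloom.add(token)

# Tokens that do not occur in any row.
absent = [f"word{i}" for i in range(200)]
false_positives = [t for t in absent if bloom.might_contain(t)]
exact_hits = [t for t in absent if t in inverted]

print("bloom filter false positives:", len(false_positives))
print("inverted index false hits:", len(exact_hits))
```

The Bloom filter never misses a token that was inserted, but with so few bits some absent tokens pass the check, so matching granules would still have to be read and filtered. The exact map reports no hits for absent tokens, which is the sense in which an inverted index is deterministic.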
| > Note: With text indexes generally availability (GA) starting from ClickHouse version 26.2, bloom filter–based indexes are not recommended anymore for full text search. | ||
| Although they are more compact, unfortunately they tend to produce false positives because they are probabilistic. | ||
| :::note |
Also here, we should keep it short and sweet.
L. 113 is fine.
Instead of l. 115-121, we can just say
:::note
The usage of `ngrambf_v1` indexes for full-text search is deprecated in ClickHouse versions >= 26.2 in favor of `text` indexes (see here for further details).
:::
("here" is a link to the text index docs.)
(same below for tokenbf_v1)
Summary
Checklist
Add a few pending notes recommending the `text` index over `ngrambf_v1` and `tokenbf_v1`.