feat: Combine regexes targeting same fields#68

Merged
Darkheir merged 5 commits into sekoia from feat/combine_regex_same_field on Jun 22, 2026

@Darkheir

Collaborator

Description

Combine regexes that target the same field, and let regex creation fail if the limit is reached.
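
A minimal sketch of the "fail if the limit is reached" behavior, using only the standard library. Everything named here is hypothetical: `RegexBuildError`, `build_combined`, `warmup`, the byte-length `limit`, and the alternation used to join patterns are stand-ins, since this page shows neither the real builder nor the limit it enforces. The point is only that the build error is propagated to the caller rather than swallowed.

```rust
/// Hypothetical error type; the PR's actual error is not shown on this page.
#[derive(Debug, PartialEq, Eq)]
struct RegexBuildError {
    combined_len: usize,
    limit: usize,
}

/// Hypothetical builder: joins the patterns into one alternation and refuses
/// to build when the combined source exceeds `limit` bytes. The byte-length
/// limit is an assumption standing in for the real builder's limit.
fn build_combined(patterns: &[String], limit: usize) -> Result<String, RegexBuildError> {
    let combined = patterns
        .iter()
        .map(|p| format!("(?:{p})"))
        .collect::<Vec<_>>()
        .join("|");
    if combined.len() > limit {
        return Err(RegexBuildError { combined_len: combined.len(), limit });
    }
    Ok(combined)
}

/// Hypothetical warmup step: `?` propagates the build failure to the caller
/// instead of skipping the regex.
fn warmup(patterns: &[String], limit: usize) -> Result<(), RegexBuildError> {
    let _combined = build_combined(patterns, limit)?;
    // A real warmup would traverse the term dictionary here.
    Ok(())
}
```

With two patterns and a generous limit, `warmup` returns `Ok(())`; with a limit smaller than the combined source, it returns the build error unchanged.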

How was this PR tested?

Describe how you tested this PR.

Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
Copilot AI review requested due to automatic review settings June 22, 2026 08:57
@Darkheir Darkheir requested a review from rdettai-sk June 22, 2026 09:02

Copilot AI left a comment

Pull request overview

This PR optimizes search warmup for regex-based queries by coalescing multiple regex patterns that target the same (field, json path) into a single multi-pattern automaton, which reduces redundant term-dictionary traversals. Building the combined regex also becomes a strict failure point: if a limit is hit, warmup fails.

Changes:

  • Coalesce regex automatons per field+path into a single Automaton::Regex(..., Vec<String>) and deduplicate/sort patterns before warmup.
  • Update warmup logic to build a combined regex automaton via tantivy_fst::Regex::from_patterns(...) and fail warmup if the combined regex cannot be built.
  • Add tests covering regex coalescing behavior and warmup failure on unbuildable combined regex; bump patched tantivy-fst revision.
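
The coalescing step described above can be sketched with the standard library alone. `RegexAutomaton` and `coalesce` are hypothetical names: the struct stands in for the PR's `Automaton::Regex(..., Vec<String>)` variant, whose other fields are not shown on this page, and the input tuple shape is an assumption made for illustration.

```rust
use std::collections::BTreeMap;

/// Hypothetical stand-in for `Automaton::Regex(..., Vec<String>)`.
#[derive(Debug, Clone, PartialEq, Eq)]
struct RegexAutomaton {
    field: String,
    json_path: Option<String>,
    patterns: Vec<String>,
}

/// Groups single-pattern regexes by (field, json path), then sorts and
/// deduplicates the patterns in each group so that every (field, path)
/// pair yields exactly one multi-pattern automaton.
fn coalesce(regexes: Vec<(String, Option<String>, String)>) -> Vec<RegexAutomaton> {
    let mut groups: BTreeMap<(String, Option<String>), Vec<String>> = BTreeMap::new();
    for (field, json_path, pattern) in regexes {
        groups.entry((field, json_path)).or_default().push(pattern);
    }
    groups
        .into_iter()
        .map(|((field, json_path), mut patterns)| {
            patterns.sort();
            patterns.dedup(); // removes all duplicates because the list is sorted
            RegexAutomaton { field, json_path, patterns }
        })
        .collect()
}
```

Using a `BTreeMap` keeps the output order deterministic, which makes the result easy to compare in tests; a `HashMap` would work equally well if ordering does not matter.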

Reviewed changes

Copilot reviewed 4 out of 5 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| quickwit/quickwit-search/src/leaf.rs | Build a combined regex automaton from multiple patterns during warmup and add a regression test for failure behavior. |
| quickwit/quickwit-doc-mapper/src/query_builder.rs | Coalesce regex automatons by (field, path) and add tests for field-limit behavior and coalescing semantics. |
| quickwit/quickwit-doc-mapper/src/doc_mapper/mod.rs | Update Automaton::Regex to hold multiple patterns (Vec<String>) and adjust tests accordingly. |
| quickwit/Cargo.toml | Update the patched tantivy-fst git revision. |
| quickwit/Cargo.lock | Lockfile update corresponding to the patched tantivy-fst revision bump. |

Comment thread quickwit/quickwit-doc-mapper/src/query_builder.rs
Comment thread quickwit/quickwit-doc-mapper/src/query_builder.rs
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
@Darkheir Darkheir force-pushed the feat/combine_regex_same_field branch from e62f207 to 37bb03c June 22, 2026 09:36
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
Comment thread quickwit/quickwit-search/src/leaf.rs Outdated
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
Comment thread quickwit/quickwit-doc-mapper/src/query_builder.rs
Signed-off-by: Darkheir <raphael.cohen@sekoia.io>
@Darkheir Darkheir force-pushed the feat/combine_regex_same_field branch from 2d10a0a to 8884e1d June 22, 2026 11:46
@Darkheir Darkheir merged commit d3cdc94 into sekoia Jun 22, 2026
3 checks passed
3 participants