[MOD-14930] refactor expire handling without full reindexing #9356

Merged
JoanFM merged 30 commits into master from joan-expire-not-fully-reindex
May 14, 2026

@JoanFM
Collaborator

@JoanFM JoanFM commented Apr 29, 2026

Describe the changes in the pull request

  1. Current: When Redis fires an EXPIRE or PERSIST keyspace notification on an indexed key, RediSearch handles it through Indexes_UpdateMatchingWithSchemaRules, which is the same code path used for content changes. This re-runs schema-rule filter evaluation, re-opens and re-tokenizes the document, and rewrites inverted-index, sortable and tag entries — even though none of that data has actually changed.
  2. Change: Adds a dedicated fast path Indexes_UpdateMatchingDocExpiration in src/spec.c that only refreshes the document-level TTL stored in the matching RSDocumentMetadata entries. It re-reads the key's absolute expiration once via the existing getDocExpirationTime helper, then for each matching spec acquires the write lock, looks up the DMD via DocTable_BorrowByKeyR, and applies the canonical DocTable_UpdateExpiration setter (passing NULL for the field-TTL array so HEXPIRE state in the per-spec TTL table is preserved). The notification handler in src/notifications.c routes expire_cmd/persist_cmd to this fast path when SearchDisk is disabled; disk-backed indexes continue to use the full reindex path. restore_cmd/copy_to_cmd are unaffected and still take the full path because they really do change document content.
  3. Outcome: EXPIRE and PERSIST against indexed documents become substantially cheaper — no schema-rule filter pass, no key re-tokenization, no inverted-index churn. Observable behavior is unchanged: the result-processor's expiration check still reads from the same expirationTimeNs field, written through the same canonical setter the full reindex path uses.
  4. The fast path also returns immediately when the monitorExpiration configuration is FALSE, which allows users to avoid the cost if they do not mind some missing results when searches and expirations run concurrently.
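
The routing described in the list above can be illustrated with a toy model. This is a minimal sketch, not RediSearch code: `ToyDoc`, `ToyIndex`, `ToyEvent` and every `toy_*` function are hypothetical stand-ins, and locking, schema rules and per-field TTLs are left out. It only shows the shape of the idea: EXPIRE/PERSIST on an in-memory index touch the stored TTL, while every other event (and every disk-backed index) still takes the full reindex path.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-ins; these are NOT the RediSearch types. */
typedef struct {
    const char *key;
    int64_t expiration_ns;   /* 0 means "no TTL" in this sketch (assumption) */
    unsigned reindex_count;  /* counts how often the full path ran */
} ToyDoc;

typedef struct {
    ToyDoc *docs;
    size_t num_docs;
    bool monitor_expiration; /* models the monitorExpiration setting */
} ToyIndex;

typedef enum { EV_EXPIRE, EV_PERSIST, EV_RESTORE, EV_COPY_TO } ToyEvent;

/* Linear lookup by key; returns NULL when the key is not indexed. */
ToyDoc *toy_find(ToyIndex *idx, const char *key) {
    for (size_t i = 0; i < idx->num_docs; i++) {
        if (strcmp(idx->docs[i].key, key) == 0) return &idx->docs[i];
    }
    return NULL;
}

/* Fast path: refresh only the TTL metadata. Returns true if a doc changed. */
bool toy_update_expiration(ToyIndex *idx, const char *key, int64_t expiration_ns) {
    if (!idx->monitor_expiration) return false; /* early return, nothing to pay */
    ToyDoc *doc = toy_find(idx, key);
    if (doc == NULL) return false;
    doc->expiration_ns = expiration_ns;
    return true;
}

/* Full path stand-in: a real implementation would re-tokenize the document. */
void toy_full_reindex(ToyIndex *idx, const char *key, int64_t expiration_ns) {
    ToyDoc *doc = toy_find(idx, key);
    if (doc == NULL) return;
    doc->reindex_count++;
    doc->expiration_ns = expiration_ns;
}

/* Router: TTL-only events on in-memory indexes skip the full reindex. */
void toy_on_keyspace_event(ToyIndex *idx, ToyEvent ev, const char *key,
                           int64_t expiration_ns, bool disk_backed) {
    bool ttl_only = (ev == EV_EXPIRE || ev == EV_PERSIST);
    if (ttl_only && !disk_backed) {
        toy_update_expiration(idx, key, expiration_ns);
    } else {
        toy_full_reindex(idx, key, expiration_ns);
    }
}
```

In this model an `EV_EXPIRE` on an in-memory index changes `expiration_ns` and leaves `reindex_count` untouched, which is the property the description claims for the real fast path.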

Which additional issues this PR fixes

  1. MOD-...
  2. #...

Main objects this PR modified

  1. Indexes_UpdateMatchingDocExpiration (new): src/spec.c, src/spec.h
  2. OnKeySpaceNotification: src/notifications.c (routes expire_cmd/persist_cmd through the fast path in the in-memory flow)
  3. getDocExpirationTime: src/document_basic.c, src/document.h (promoted from static inline and exposed so the new fast path shares the ms→timespec conversion with the indexer)
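
The ms→timespec conversion mentioned for getDocExpirationTime could look roughly like the sketch below. The function name `toy_expire_ms_to_timespec` and the `TOY_NO_EXPIRE` sentinel are hypothetical; the real helper's signature is not shown in this PR description. The sketch assumes the input is an absolute expiration in milliseconds since the epoch and that a negative value means the key has no TTL.

```c
#include <stdint.h>
#include <time.h>

/* Assumption for this sketch: a negative value means "key has no TTL". */
#define TOY_NO_EXPIRE (-1)

/*
 * Converts an absolute expiration time in milliseconds since the epoch
 * into a struct timespec. Returns 0 on success, or -1 (with *out zeroed)
 * when the key has no expiration.
 */
int toy_expire_ms_to_timespec(int64_t abs_ms, struct timespec *out) {
    if (abs_ms < 0) {
        out->tv_sec = 0;
        out->tv_nsec = 0;
        return -1;
    }
    out->tv_sec = (time_t)(abs_ms / 1000);             /* whole seconds */
    out->tv_nsec = (long)((abs_ms % 1000) * 1000000L); /* leftover ms as ns */
    return 0;
}
```

Keeping this in one shared helper means the indexer and the fast path cannot drift apart on rounding or on how "no TTL" is represented.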

Mark if applicable

  • This PR introduces API changes
  • This PR introduces serialization changes

Release Notes

  • This PR requires release notes
  • This PR does not require release notes

Release note candidate:

Optimize handling of EXPIRE and PERSIST on indexed documents: the index now updates only the document's TTL metadata instead of fully reindexing the document, reducing CPU and memory churn for TTL-driven workloads. Applies to in-memory indexes.


Note

Medium Risk
Changes keyspace-notification handling for EXPIRE/PERSIST to mutate in-memory index metadata instead of reindexing, and introduces relaxed atomic access to expirationTimeNs, which could affect expiration correctness under concurrency if mishandled.

Overview
Optimizes handling of EXPIRE/PEXPIRE/PERSIST keyspace events for in-memory indexes by adding Indexes_UpdateMatchingDocExpiration, which refreshes only the document-level TTL on existing RSDocumentMetadata entries (no filter re-evaluation and no document reindexing). OnKeySpaceNotification now routes expire/persist to this fast path when SearchDisk is off, while disk-backed indexes keep the full reindex path.

Promotes TTL extraction into a shared helper GetKeyExpirationTime, and updates DocTable expiration reads/writes to use relaxed atomics so TTL updates can occur under a spec read lock. Adds targeted pytests for non-matching indexes and persist/expire behavior, loosens a couple of order-sensitive assertions, and introduces several new benchmark scenarios to track the TTL fast-path performance/regressions.
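
The relaxed atomic access mentioned above can be sketched with C11 `<stdatomic.h>`. The struct and function names here are hypothetical and the `0 = no TTL` sentinel is an assumption; only the field's role (a single expiration timestamp in nanoseconds) comes from the PR text. Relaxed ordering guarantees that a concurrent reader never sees a torn value, but it gives no ordering relative to other fields, so this pattern fits only a value that is meaningful on its own.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for per-document metadata holding one TTL value. */
typedef struct {
    _Atomic int64_t expiration_ns; /* 0 = no TTL in this sketch (assumption) */
} ToyDocMeta;

/* Writer side: publish a new expiration without requiring exclusive access. */
void toy_meta_set_expiration(ToyDocMeta *m, int64_t ns) {
    atomic_store_explicit(&m->expiration_ns, ns, memory_order_relaxed);
}

/* Reader side: a query thread checks whether the document has expired. */
bool toy_meta_is_expired(ToyDocMeta *m, int64_t now_ns) {
    int64_t exp = atomic_load_explicit(&m->expiration_ns, memory_order_relaxed);
    return exp != 0 && exp <= now_ns;
}
```

A reader that races with the writer sees either the old or the new timestamp, never a mix of the two, which is usually what an expiration check needs.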

Reviewed by Cursor Bugbot for commit a90c12d.

@jit-ci

jit-ci Bot commented Apr 29, 2026

🛡️ Jit Security Scan Results

CRITICAL HIGH MEDIUM

✅ No security findings were detected in this PR


Security scan by Jit

@JoanFM JoanFM requested a review from kei-nan April 29, 2026 10:52
@JoanFM JoanFM changed the title refactor expire handling without full reindexing [MOD-14930] refactor expire handling without full reindexing Apr 29, 2026
@JoanFM JoanFM requested a review from GuyAv46 April 29, 2026 10:56
Comment thread src/spec.c Outdated
Comment thread src/document.h Outdated
Collaborator

@kei-nan kei-nan left a comment

Looks good overall, left two comments

@codecov

codecov Bot commented Apr 29, 2026

Codecov Report

❌ Patch coverage is 92.10526% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.84%. Comparing base (a90c3cd) to head (a90c12d).
⚠️ Report is 2 commits behind head on master.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| src/spec.c | 91.66% | 2 Missing ⚠️ |
| src/notifications.c | 75.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9356      +/-   ##
==========================================
+ Coverage   81.80%   81.84%   +0.03%     
==========================================
  Files         502      503       +1     
  Lines       68799    69298     +499     
  Branches    25091    25316     +225     
==========================================
+ Hits        56279    56714     +435     
- Misses      12282    12345      +63     
- Partials      238      239       +1     
| Flag | Coverage Δ |
|---|---|
| flow | 83.71% <92.10%> (+0.06%) ⬆️ |
| unit | 51.50% <10.52%> (+0.17%) ⬆️ |


@JoanFM JoanFM requested a review from kei-nan April 29, 2026 11:46
Comment thread src/spec.c Outdated
Comment thread src/spec.c
Comment thread src/spec.c Outdated
Comment thread tests/pytests/test_expire.py Outdated
@fcostaoliveira
Contributor

Automated performance analysis summary

This comment was automatically generated given there is performance data available.

Environment:

  • Triggering env: circleci

Architecture: x86_64 — branch-over-branch

Deployment: oss-standalone

In summary:

  • Detected a total of 2 stable tests between versions.
  • Detected a total of 4 highly unstable benchmarks (4 baseline).
  • Latency analysis confirmed regressions in 1 of the unstable tests:
  • Detected a total of 2 regressions below the regression water line of 8.0%.

You can check a comparison in detail via the grafana link

Performance Regressions and Issues - Comparison between master and joan-expire-not-fully-reindex.

Time Period from a month ago. (environment used: oss-standalone)

| Test Case | Baseline master (median obs. +- std.dev) | Comparison joan-expire-not-fully-reindex (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| search-high-cardinality-negation-term-comparison_union_all_other_terms | 208 +- 17.8% UNSTABLE (20 datapoints) | 159 | -23.4% | UNSTABLE (baseline high variance); server: FT.SEARCH p50 increased 39.6% (baseline CV=23.9%); client: client latency stable; only server side confirms regression (client side stable) - insufficient evidence |
| search-high-cardinality-negation-term-baseline | 395 +- 17.7% UNSTABLE (20 datapoints) | 310 | -21.5% | UNSTABLE (baseline high variance); server: p50 latency stable; client: Latency increased 27.2% (baseline CV=17.5%); only client side confirms regression (server side stable) - insufficient evidence |
| search-filtering-tag-numeric-filter-pipeline | 11643 +- 5.0% (20 datapoints) | 9150 | -21.4% | REGRESSION |
| hybrid-arxiv-titles-384-angular-linear-numeric-vector | 1695 +- 32.8% UNSTABLE (8 datapoints) | 1483 | -12.5% | UNSTABLE (baseline high variance); server: FT.HYBRID p50 increased 28.4% (baseline CV=35.7%); client: Latency increased 16.7% (baseline CV=26.8%) |
| search-ftsb-1700K-docs-union-iterators-q3 | 36 +- 5.5% (20 datapoints) | 32 | -11.0% | REGRESSION |
| search-filtering-tag-numeric | 3942 +- 11.1% UNSTABLE (20 datapoints) | 3829 | -2.9% | UNSTABLE (baseline high variance); server: p50 latency stable; client: client latency stable; neither server nor client side confirms regression |
Tests with No Significant Changes (2 tests)

| Test Case | Baseline master (median obs. +- std.dev) | Comparison joan-expire-not-fully-reindex (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| vecsim-arxiv-titles-384-angular-filters-m16-ef-128-numeric-filter | 2622 +- 4.5% (20 datapoints) | 2752 | 5.0% | potential IMPROVEMENT |
| vecsim-arxiv-titles-384-angular-filters-m16-ef-128-tag-filter | 15805 +- 6.1% (20 datapoints) | 15080 | -4.6% | potential REGRESSION |
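
The "% change" column in the tables above is consistent with a plain relative difference between the comparison and baseline medians. The sketch below reproduces that arithmetic and a simplified water-line check. The function and enum names are hypothetical, and the real report also weighs variance and latency confirmation (the UNSTABLE and "potential" labels), which this sketch deliberately ignores.

```c
/* Relative change of the comparison median versus the baseline, in percent. */
double toy_pct_change(double baseline, double comparison) {
    return (comparison - baseline) / baseline * 100.0;
}

typedef enum { TOY_NO_CHANGE, TOY_REGRESSION, TOY_IMPROVEMENT } ToyVerdict;

/*
 * Simplified classification, assuming higher values are better: a drop of
 * at least waterline_pct is a regression, a gain of at least waterline_pct
 * is an improvement, anything in between counts as no change.
 */
ToyVerdict toy_classify(double pct, double waterline_pct) {
    if (pct <= -waterline_pct) return TOY_REGRESSION;
    if (pct >= waterline_pct) return TOY_IMPROVEMENT;
    return TOY_NO_CHANGE;
}
```

For the search-filtering-tag-numeric-filter-pipeline row, (9150 - 11643) / 11643 is about -21.4%, which falls past the 8.0% water line and is therefore reported as a regression.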

Architecture: aarch64 — branch-over-branch

Deployment: oss-standalone

In summary:

  • Detected a total of 4 stable tests between versions.
  • Detected a total of 1 improvement above the improvement water line.

You can check a comparison in detail via the grafana link

Performance Improvements - Comparison between master and joan-expire-not-fully-reindex.

Time Period from a month ago. (environment used: oss-standalone)

| Test Case | Baseline master (median obs. +- std.dev) | Comparison joan-expire-not-fully-reindex (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| search-filtering-tag-numeric | 3501 +- 9.8% (13 datapoints) | 3872 | 10.6% | waterline=9.8%. IMPROVEMENT |
Tests with No Significant Changes (4 tests)

| Test Case | Baseline master (median obs. +- std.dev) | Comparison joan-expire-not-fully-reindex (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| search-filtering-tag-numeric-filter-pipeline | 9294 +- 1.1% (13 datapoints) | 9150 | -1.5% | No Change |
| search-ftsb-1700K-docs-union-iterators-q3 | 33 +- 1.2% (13 datapoints) | 32 | -1.5% | No Change |
| search-high-cardinality-negation-term-baseline | 309 +- 1.1% (13 datapoints) | 310 | 0.1% | No Change |
| search-high-cardinality-negation-term-comparison_union_all_other_terms | 159 +- 1.8% (13 datapoints) | 159 | 0.2% | No Change |

Cross-arch delta on joan-expire-not-fully-reindex (x86_64 → aarch64)

Same commit (joan-expire-not-fully-reindex) compared across architectures. Positive deltas = aarch64 outperforms x86_64.

In summary:

  • Detected a total of 4 stable tests between versions.
  • Detected a total of 1 regression below the regression water line of 8.0%.

You can check a comparison in detail via the grafana link

Performance Regressions and Issues - Comparison between joan-expire-not-fully-reindex and joan-expire-not-fully-reindex.

Time Period from a month ago. (environment used: oss-standalone)

| Test Case | Baseline joan-expire-not-fully-reindex (median obs. +- std.dev) | Comparison joan-expire-not-fully-reindex (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| search-ftsb-1700K-docs-union-iterators-q3 | 36 | 32 | -10.4% | REGRESSION |
Tests with No Significant Changes (4 tests)

| Test Case | Baseline joan-expire-not-fully-reindex (median obs. +- std.dev) | Comparison joan-expire-not-fully-reindex (median obs. +- std.dev) | % change (higher-better) | Note |
|---|---|---|---|---|
| search-filtering-tag-numeric | 3829 | 3872 | 1.1% | No Change |
| search-filtering-tag-numeric-filter-pipeline | 9150 | 9150 | 0.0% | |
| search-high-cardinality-negation-term-baseline | 310 | 310 | 0.0% | |
| search-high-cardinality-negation-term-comparison_union_all_other_terms | 159 | 159 | 0.0% | |

@JoanFM JoanFM enabled auto-merge May 13, 2026 14:17
@JoanFM JoanFM closed this May 13, 2026
auto-merge was automatically disabled May 13, 2026 16:19

Pull request was closed

@JoanFM JoanFM reopened this May 13, 2026
@JoanFM JoanFM enabled auto-merge May 13, 2026 16:45
@JoanFM JoanFM closed this May 13, 2026
auto-merge was automatically disabled May 13, 2026 19:28

Pull request was closed

@JoanFM JoanFM reopened this May 13, 2026
@JoanFM JoanFM enabled auto-merge May 13, 2026 19:28
@sonarqubecloud

@JoanFM JoanFM added this pull request to the merge queue May 13, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 13, 2026
@JoanFM JoanFM added this pull request to the merge queue May 14, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 14, 2026
@JoanFM JoanFM added this pull request to the merge queue May 14, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 14, 2026
@JoanFM JoanFM added this pull request to the merge queue May 14, 2026
@github-merge-queue github-merge-queue Bot removed this pull request from the merge queue due to failed status checks May 14, 2026
@JoanFM JoanFM added this pull request to the merge queue May 14, 2026
@redisearch-backport-pull-request
Contributor

Backport failed for 8.2, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.2
git worktree add -d .worktree/backport-9356-to-8.2 origin/8.2
cd .worktree/backport-9356-to-8.2
git switch --create backport-9356-to-8.2
git cherry-pick -x f6a44e3781ad381c3b7405b144663fd81faa7a95

@redisearch-backport-pull-request
Contributor

Backport failed for 8.4, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.4
git worktree add -d .worktree/backport-9356-to-8.4 origin/8.4
cd .worktree/backport-9356-to-8.4
git switch --create backport-9356-to-8.4
git cherry-pick -x f6a44e3781ad381c3b7405b144663fd81faa7a95

@redisearch-backport-pull-request
Contributor

Backport failed for 8.6, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.6
git worktree add -d .worktree/backport-9356-to-8.6 origin/8.6
cd .worktree/backport-9356-to-8.6
git switch --create backport-9356-to-8.6
git cherry-pick -x f6a44e3781ad381c3b7405b144663fd81faa7a95

@redisearch-backport-pull-request
Contributor

Backport failed for 8.6-rse, because it was unable to cherry-pick the commit(s).

Please cherry-pick the changes locally and resolve any conflicts.

git fetch origin 8.6-rse
git worktree add -d .worktree/backport-9356-to-8.6-rse origin/8.6-rse
cd .worktree/backport-9356-to-8.6-rse
git switch --create backport-9356-to-8.6-rse
git cherry-pick -x f6a44e3781ad381c3b7405b144663fd81faa7a95

@redisearch-backport-pull-request
Contributor

Successfully created backport PR for 8.8:

7 participants