feat(scalars): add eql_v3.text encrypted-domain family (eq / match / ord)#260

Merged: tobyhede merged 13 commits into eql_v3 from v3-domain-type-text on Jun 10, 2026

@tobyhede tobyhede commented Jun 5, 2026

Adds the eql_v3.text scalar encrypted-domain family at parity with EQL v2 encrypted text: equality (HMAC), match (a new self-contained eql_v3.bloom_filter SEM index term), and ORE ordering. First scalar to add a new index Term (Bloom) and the first non-integer, unbounded ordered kind.

Supported text domains

| Domain | Operators | Index term |
| --- | --- | --- |
| eql_v3.text | (none) | (storage only) |
| eql_v3.text_eq | = <> | HMAC (hm) |
| eql_v3.text_match | @> <@ | Bloom filter (bf) |
| eql_v3.text_ord | = <> < <= > >=, min/max | ORE block (ob) |
| eql_v3.text_ord_ore | same as text_ord | ORE block (ob) |

A real encrypted text payload carries hm + bf + ob; callers cast per predicate. Match is bloom-filter containment on text_match — deliberately not SQL LIKE — and never backs equality (that always routes through Hm).

Notes

  • @>/<@ flip from blocker → inlinable wrappers only on Bloom domains, so the int4 golden is byte-identical (codegen:parity green).
  • New eql_v3.bloom_filter SEM type is self-contained (no eql_v2 dependency; test:self_contained_v3 green).
  • SQLx matrix generalised for non-Copy (String) plaintext (Copy → Clone, to_sql_literal(&Self)); [text] harness marker mirrors [temporal].

Stacked on eql_v3.

Summary by CodeRabbit

  • New Features
    • Added support for encrypted text domain (eql_v3.text) with equality comparison, bloom-filter-based text matching via containment operators (@>/<@), and ordering operations (<, <=, >, >=, min, max)

@coderabbitai

coderabbitai Bot commented Jun 5, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.


tobyhede added a commit that referenced this pull request Jun 5, 2026
tobyhede added a commit that referenced this pull request Jun 5, 2026
…t wired-kinds comments

Addresses CodeRabbit review on #260:
- Fixture::Zero now resolves to None for non-integer kinds, matching Min/Max and
  the function's documented contract (test-guaranteed never hit, but consistent).
- cast_for_kind/plaintext_sql_type_for_kind doc comments now list Text as wired.
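The `Fixture::Zero` change above can be illustrated with a minimal sketch. The enum and method names below mirror the ones mentioned in the commit message, but the shapes (`resolve`, the `i64` return type, the three listed kinds) are assumptions for illustration, not the crate's actual definitions.

```rust
/// Scalar kinds named in this PR; the real catalog may carry more.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum ScalarKind {
    Int4,
    Date,
    Text,
}

/// Symbolic pivot sentinels used by fixtures.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Fixture {
    Zero,
    Min,
    Max,
}

impl Fixture {
    /// Resolve a sentinel to a concrete integer for `kind`.
    /// Sentinels only have a numeric meaning for integer kinds, so every
    /// non-integer kind resolves to `None` (hypothetical signature).
    pub fn resolve(self, kind: ScalarKind) -> Option<i64> {
        match (self, kind) {
            (Fixture::Zero, ScalarKind::Int4) => Some(0),
            (Fixture::Min, ScalarKind::Int4) => Some(i64::from(i32::MIN)),
            (Fixture::Max, ScalarKind::Int4) => Some(i64::from(i32::MAX)),
            // Date and Text: no sentinel resolves, including Zero.
            _ => None,
        }
    }
}
```

With this shape, `Fixture::Zero` behaves like `Min` and `Max` on `Date` and `Text`, which is the consistency the commit describes.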
tobyhede added a commit that referenced this pull request Jun 5, 2026
…ven test, match coverage

- bloom_filter(jsonb): LANGUAGE sql (inlinable), drop the redundant RAISE. The
  match capability is tied to the text_match domain CHECK (which guarantees bf),
  so a missing key can only occur on raw jsonb — return NULL there, mirroring
  hmac_256. Add the pin_search_path inline-critical clause + splinter allowlist
  row so it stays unpinned/inlinable. has_bloom_filter unchanged (matches
  has_hmac_256). SEM test now asserts NULL instead of a raise.
- context.rs: table-driven operator-metadata test — a new term's metadata is one
  table row, not another hand-rolled assertion block.
- eql-tests-macros: document the is_temporal -> is_hand_written rename (the
  predicate gates "diverges from the generated integer path", not "is a date").
- text_match: add a bare-operator (`col @> needle`) GIN index-engagement test;
  document the probabilistic / ngram-disjoint basis of the disjoint assertion.
- text_smoke: add empty-bloom set-semantics test (everything contains the empty
  filter; the empty filter contains nothing).
- rename string_to_plaintext_is_utf8 -> string_to_plaintext_is_text.
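The empty-bloom set semantics mentioned above follow from ordinary array containment. As a rough sketch (the function name `contains` is hypothetical, and the filter is modelled as a plain slice of bit positions rather than the `eql_v3.bloom_filter` domain):

```rust
/// Returns true when every element of `needle` also occurs in `haystack`.
/// This models array containment (`haystack @> needle`) over bit positions.
pub fn contains(haystack: &[i16], needle: &[i16]) -> bool {
    // `all` over an empty iterator is true, so an empty needle is
    // contained by every haystack, including an empty one.
    needle.iter().all(|bit| haystack.contains(bit))
}
```

Under this model every filter contains the empty filter, while the empty filter contains no non-empty filter.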
tobyhede added a commit that referenced this pull request Jun 5, 2026
…x text "" pivot (#262)

The scalar matrix's third pivot was hardwired to `Default::default()` — `0` for
int, the epoch for date, but `""` for text, which encrypts to an empty ORE term
and broke ordering/aggregates/counts (83 CI failures on #260).

Introduce the real taxonomy as traits:
- ScalarType (base) — identity, fixtures, literal rendering.
- OrderedScalar: ScalarType — min_pivot/max_pivot + an overridable interior
  mid_pivot (default Self::default()). int/date inherit (0/epoch); text overrides
  to a real median ("frank"), never the degenerate "".
- SignedScalar: OrderedScalar — origin() (numeric zero / sign boundary). int and
  date only; text is NOT SignedScalar (lexicographic order has no origin).

The pivot SWEEP stays uniform (min/mid/max) across every ordered type, so the
single canonical matrix snapshot is preserved — only a `_pivot_zero_` ->
`_pivot_mid_` rename. The signed-only sign-boundary test (asserts ORE ordering is
monotonic across the origin) is generic over `SignedScalar` and lives outside the
`scalars::` namespace (like text_match), so a `text` instantiation is a compile
error and it never enters the inventory snapshot — no per-capability snapshot,
inventory, or macro branching.

Also: drop "" from TEXT_FIXTURES (text has no numeric origin — #262); the proc
macro emits OrderedScalar+SignedScalar for the generated integer impls; harden
text_match::match_uses_functional_index to force enable_seqscan=off.

Verified: 640 text/int4/date/signed/text_match tests pass (prior 83 "" failures
gone); matrix inventory matches the regenerated snapshot (5 types, no signed
leak); codegen:parity unchanged. Empty-string behavioural decision tracked in #262.
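The trait taxonomy described in this commit message might look roughly like the following. This is a simplified sketch: the trait and method names come from the message, but the supertrait bounds, the `name` method, and the text `min_pivot`/`max_pivot` values ("alice", "zoe") are assumptions made up for illustration. Only the "frank" median is taken from the source.

```rust
/// Base trait: identity only in this sketch (fixtures and literal
/// rendering are omitted).
pub trait ScalarType: Clone + Default + PartialOrd {
    fn name() -> &'static str;
}

/// Ordered scalars expose pivots for the matrix sweep.
pub trait OrderedScalar: ScalarType {
    fn min_pivot() -> Self;
    fn max_pivot() -> Self;
    /// Interior pivot; defaults to `Default::default()` (0 for integers).
    fn mid_pivot() -> Self {
        Self::default()
    }
}

/// Scalars with a numeric zero / sign boundary.
pub trait SignedScalar: OrderedScalar {
    fn origin() -> Self;
}

impl ScalarType for i32 {
    fn name() -> &'static str { "int4" }
}
impl OrderedScalar for i32 {
    fn min_pivot() -> Self { i32::MIN }
    fn max_pivot() -> Self { i32::MAX }
}
impl SignedScalar for i32 {
    fn origin() -> Self { 0 }
}

impl ScalarType for String {
    fn name() -> &'static str { "text" }
}
impl OrderedScalar for String {
    // Hypothetical lexicographic bounds, chosen only for this sketch.
    fn min_pivot() -> Self { "alice".to_string() }
    fn max_pivot() -> Self { "zoe".to_string() }
    // Override the default so the pivot is never the degenerate "".
    fn mid_pivot() -> Self { "frank".to_string() }
}

/// The uniform min/mid/max sweep, available for every ordered type.
pub fn pivots<T: OrderedScalar>() -> [T; 3] {
    [T::min_pivot(), T::mid_pivot(), T::max_pivot()]
}

/// Signed-only check: `origin_is_interior::<String>()` does not compile
/// because `String` does not implement `SignedScalar`.
pub fn origin_is_interior<T: SignedScalar>() -> bool {
    T::min_pivot() < T::origin() && T::origin() < T::max_pivot()
}
```

Because the sign-boundary check is bounded on `SignedScalar`, excluding text needs no runtime branching; the type system rejects the instantiation.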
@tobyhede tobyhede force-pushed the v3-domain-type-text branch from d27940e to 4831578 Compare June 9, 2026 00:01
@tobyhede tobyhede force-pushed the v3-domain-type-text branch from 1ec0f7c to 0cb0e2b Compare June 9, 2026 02:13
tobyhede added a commit that referenced this pull request Jun 9, 2026
Documents the `eql_v3.text` family (eq / match / ord) and the `Bloom` index term
in the scalar-encrypted-domain reference guide — including the
`OrderedScalar`/`SignedScalar` pivot-trait section and the catalog-derived
(marker-free) text dispatch — and adds the `[Unreleased]` changelog entry
(#260).
@tobyhede tobyhede force-pushed the v3-domain-type-text branch from 0cb0e2b to f4bde1a Compare June 9, 2026 02:47

@auxesis auxesis left a comment

Claude caught four gaps in test coverage of new or changed code.

Comment thread src/v3/sem/bloom_filter/functions.sql
Comment thread src/v3/sem/bloom_filter/functions.sql
Comment thread tests/sqlx/src/scalar_domains.rs
Comment thread crates/eql-scalars/src/lib.rs Outdated

@auxesis auxesis left a comment

Nice work @tobyhede.

Claude has left comments on four test coverage issues that need to be addressed.

tobyhede added a commit that referenced this pull request Jun 9, 2026
Adds the four characterization tests @auxesis requested on #260, each
pinning a branch the existing suite never reached:

- has_bloom_filter(jsonb) presence predicate (present/absent/{"bf":null}
  -> false) — the IS NOT NULL half of its guard was untested, and it is
  not reached transitively by the extractor or domain CHECK.
- bloom_filter(jsonb) empty-array branch: {"bf":[]} -> empty smallint[],
  not NULL (the extractor basis for empty-set containment semantics).
- String::to_sql_literal single-quote escaping (O'Brien -> 'O''Brien');
  all TEXT fixtures are quote-free so no DB test hit the .replace.
- Fixture::Zero/Min/Max -> None on non-integer kinds (Date/Text), the arm
  changed from unconditional Some(0); previously only guarded indirectly
  by the pivot_sentinels_only_appear_with_integer_kinds catalog invariant.

All four pass; behaviour was already correct, these are regression nets.
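The single-quote escaping pinned by the third test can be sketched as a free function. The real code is a `to_sql_literal(&Self)` method on the harness trait; the standalone signature below is an assumption, and the sketch assumes standard SQL string literals where only the single quote needs doubling.

```rust
/// Render a plaintext string as a single-quoted SQL literal, doubling any
/// embedded single quote.
pub fn to_sql_literal(value: &str) -> String {
    format!("'{}'", value.replace('\'', "''"))
}
```

Quote-free fixtures pass through unchanged apart from the surrounding quotes, which is why a dedicated test was needed to reach the `.replace` call.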
@tobyhede tobyhede force-pushed the v3-domain-type-text branch from 71b3c8b to 72c6fee Compare June 9, 2026 10:11
@tobyhede tobyhede requested a review from auxesis June 9, 2026 10:12

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
src/v3/sem/bloom_filter/functions.sql (1)

23-31: ⚡ Quick win

Use LANGUAGE SQL for has_bloom_filter (single-expression predicate).

This function is a simple expression and can be inlined as SQL, which matches the SQL-file guideline.

Suggested patch
 CREATE FUNCTION eql_v3.has_bloom_filter(val jsonb)
   RETURNS boolean
   IMMUTABLE STRICT PARALLEL SAFE
   SET search_path = pg_catalog, extensions, public
 AS $$
-  BEGIN
-    RETURN val ? 'bf' AND val ->> 'bf' IS NOT NULL;
-  END;
-$$ LANGUAGE plpgsql;
+  SELECT val ? 'bf' AND val ->> 'bf' IS NOT NULL;
+$$ LANGUAGE sql;

As per coding guidelines src/**/*.sql: “Prefer LANGUAGE SQL over LANGUAGE plpgsql for simple functions unless procedural features are needed.”

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/v3/sem/bloom_filter/functions.sql` around lines 23 - 31, Replace the
PL/pgSQL function eql_v3.has_bloom_filter with a single-expression LANGUAGE SQL
function: remove the BEGIN/END block and RETURN, and instead define the function
as an SQL expression that returns the boolean predicate which checks that the
JSONB contains the 'bf' key and that the text extraction of 'bf' is not null;
keep the same RETURNS boolean, IMMUTABLE STRICT PARALLEL SAFE and SET
search_path attributes.

Source: Coding guidelines

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/v3/sem/bloom_filter/types.sql`:
- Around line 6-14: The Doxygen-style doc block for the domain
eql_v3.bloom_filter (CREATE DOMAIN eql_v3.bloom_filter AS smallint[]) is missing
the required `@param` and `@return` tags; update the leading --! comment above the
CREATE DOMAIN to include an `@param` tag (use "`@param` none" or brief note that
there are no constructor parameters for this domain) and an `@return` tag
describing what the domain represents/returns (e.g., "bit array stored as
smallint[] used for bloom-filter matching"), keeping the existing `@brief` and
other notes intact.


📥 Commits

Reviewing files that changed from the base of the PR and between 8cf3303 and 72c6fee.

📒 Files selected for processing (28)
  • CHANGELOG.md
  • crates/eql-codegen/src/context.rs
  • crates/eql-codegen/src/operator_surface.rs
  • crates/eql-scalars/src/fixture.rs
  • crates/eql-scalars/src/lib.rs
  • crates/eql-scalars/src/term.rs
  • crates/eql-scalars/src/tests.rs
  • crates/eql-tests-macros/src/lib.rs
  • docs/reference/adding-a-scalar-encrypted-domain-type.md
  • src/v3/sem/bloom_filter/functions.sql
  • src/v3/sem/bloom_filter/types.sql
  • src/v3/sem/ore_block_u64_8_256/functions.sql
  • tasks/pin_search_path.sql
  • tasks/test/splinter.sh
  • tests/sqlx/snapshots/README.md
  • tests/sqlx/snapshots/matrix_tests.txt
  • tests/sqlx/src/fixtures/driver.rs
  • tests/sqlx/src/fixtures/eql_plaintext.rs
  • tests/sqlx/src/fixtures/scalar_fixture.rs
  • tests/sqlx/src/matrix.rs
  • tests/sqlx/src/scalar_domains.rs
  • tests/sqlx/src/scalar_types.rs
  • tests/sqlx/tests/encrypted_domain.rs
  • tests/sqlx/tests/encrypted_domain/family/sem.rs
  • tests/sqlx/tests/encrypted_domain/scalars/mod.rs
  • tests/sqlx/tests/encrypted_domain/signed.rs
  • tests/sqlx/tests/encrypted_domain/text/text_match.rs
  • tests/sqlx/tests/encrypted_domain/text/text_smoke.rs

Comment thread src/v3/sem/bloom_filter/types.sql
tobyhede added 3 commits June 10, 2026 09:50
Adds the `Bloom` index `Term` (json key `bf`, extractor `match_term`, ctor
`bloom_filter`, role `match`, operators `@>`/`<@`) and the `text` row to the
scalar `CATALOG`: `ScalarKind::Text`, a `_match` (Bloom) domain on top of the
ordered shape, and the `TEXT_FIXTURES` / `TEXT_VALUES` plaintext list
(materialised by a `text_values!` macro alongside `int_values!`). `Fixture::Zero`
is gated to the integer kinds. Covered by `term_tests` and catalog `#[test]`s.

The searchable-encrypted-metadata `match` term for text. Adds the
`eql_v3.bloom_filter` domain (`smallint[]`) and the inlinable
`eql_v3.bloom_filter(jsonb)` extractor + `has_bloom_filter` predicate, mirroring
`eql_v3.hmac_256`: no RAISE, no pinned search_path, so the functional GIN index
on `match_term(col)` engages structurally. The extractor gates on
`jsonb_typeof(val -> 'bf') = 'array'`, returning NULL (not erroring) for absent
or malformed `bf` outside the domain CHECK. Adds the inline-critical clause to
`pin_search_path.sql` and allowlists `match_term`/containment in the splinter
lint so the unpinned extractor stays inlinable. Covered by family/sem.rs.
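The extractor and predicate behaviour described above can be modelled outside the database. The sketch below uses a tiny hand-rolled `Json` enum (an assumption, to stay dependency-free) and mirrors only the branches named in this PR: the array gate, NULL for an absent or non-array `bf`, and the key-present-and-not-null predicate. Returning `None` for elements that do not fit `i16` is a choice made for this sketch; the SQL function's behaviour on such input is not stated in the source.

```rust
use std::convert::TryFrom;

/// Minimal JSON model, just enough for this sketch.
#[derive(Clone, Debug, PartialEq)]
pub enum Json {
    Null,
    Number(i64),
    Text(String),
    Array(Vec<Json>),
    Object(Vec<(String, Json)>),
}

impl Json {
    /// Look up a key on an object; any other JSON type has no keys.
    fn get(&self, key: &str) -> Option<&Json> {
        match self {
            Json::Object(fields) => fields
                .iter()
                .find(|(k, _)| k.as_str() == key)
                .map(|(_, v)| v),
            _ => None,
        }
    }
}

/// Models `has_bloom_filter`: the key exists and its value is not JSON null.
pub fn has_bloom_filter(val: &Json) -> bool {
    !matches!(val.get("bf"), None | Some(Json::Null))
}

/// Models the `bloom_filter(jsonb)` extractor: `Some` only when `bf` is an
/// array, `None` (SQL NULL) when it is absent or has another type.
pub fn bloom_filter(val: &Json) -> Option<Vec<i16>> {
    match val.get("bf") {
        Some(Json::Array(items)) => items
            .iter()
            .map(|item| match item {
                Json::Number(n) => i16::try_from(*n).ok(),
                _ => None, // sketch-only choice for non-numeric elements
            })
            .collect(),
        _ => None,
    }
}
```

An empty array yields an empty vector rather than `None`, which is the distinction the empty-set containment semantics rely on.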

Teaches the codegen operator surface about the `@>`/`<@` containment operators
so they are generated only on domains carrying the `Bloom` term (the `text_match`
domain), and blocked elsewhere via the usual domain-fallback blockers. The
operator-metadata test is table-driven, so a new term's surface is one table row.
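A table-driven view of the operator surface could be sketched as follows. The term names, JSON keys and operator lists come from this PR; the `Surface` enum, `surface_for`, and the `EXPECTED` table are hypothetical names introduced for illustration and are not the codegen crate's API.

```rust
/// Index terms named in this PR.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Term {
    Hmac,
    Bloom,
    OreBlock,
}

impl Term {
    /// JSON key carrying the term inside the encrypted payload.
    pub fn json_key(self) -> &'static str {
        match self {
            Term::Hmac => "hm",
            Term::Bloom => "bf",
            Term::OreBlock => "ob",
        }
    }

    /// Operators this term can back.
    pub fn operators(self) -> &'static [&'static str] {
        match self {
            Term::Hmac => &["=", "<>"],
            Term::Bloom => &["@>", "<@"],
            Term::OreBlock => &["=", "<>", "<", "<=", ">", ">="],
        }
    }
}

/// What gets generated for an operator on a domain (hypothetical type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum Surface {
    Wrapper,
    Blocker,
}

/// An operator is a real wrapper only if some term on the domain backs it;
/// otherwise it falls back to a blocker.
pub fn surface_for(domain_terms: &[Term], op: &str) -> Surface {
    let backed = domain_terms
        .iter()
        .any(|term| term.operators().iter().any(|candidate| *candidate == op));
    if backed { Surface::Wrapper } else { Surface::Blocker }
}

/// One row per expectation: adding a term means adding rows, not code.
pub const EXPECTED: &[(Term, &str, Surface)] = &[
    (Term::Hmac, "=", Surface::Wrapper),
    (Term::Hmac, "@>", Surface::Blocker),
    (Term::Bloom, "@>", Surface::Wrapper),
    (Term::Bloom, "<@", Surface::Wrapper),
    (Term::Bloom, "=", Surface::Blocker),
    (Term::OreBlock, "<", Surface::Wrapper),
];

/// Checks every table row against `surface_for`.
pub fn table_holds() -> bool {
    EXPECTED
        .iter()
        .all(|(term, op, want)| surface_for(&[*term], *op) == *want)
}
```

The blocked `=` on a Bloom-only domain matches the `text_smoke` coverage described elsewhere in this PR.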
tobyhede added 8 commits June 10, 2026 09:55
Registers `text => String` in the `scalar_types!` list and teaches the harness
about an owned, non-Copy scalar: `ScalarType`/fixtures go `Copy` -> `Clone`,
`to_sql_literal` takes `&Self`, and `String` gets a hand-written `impl ScalarType`
(lexicographic pivots, single-quote SQL literal). The `eql-tests-macros` dispatch
is catalog-derived (`is_int_token`/`is_temporal_token`/`is_text_token` read from
`eql_scalars::CATALOG`) rather than a dispatch-list marker, and the
`scalar_fixture!` macro gains a `text` arm stamping the `Match` index. Adds the
sealed `EqlPlaintext` impl for `String` (text cast + `Plaintext::Text`).
…x text "" pivot (#262)

The scalar matrix's third pivot was hardwired to `Default::default()` — `0` for
int, the epoch for date, but `""` for text, which encrypts to an empty ORE term
and broke ordering/aggregates. Introduce the taxonomy as traits:

- `ScalarType` (base) — identity, fixtures, literal rendering.
- `OrderedScalar: ScalarType` — `min_pivot`/`max_pivot` + an overridable interior
  `mid_pivot` (default `Self::default()`). int/date inherit (0/epoch); text
  overrides to a real median ("frank"), never the degenerate "".
- `SignedScalar: OrderedScalar` — `origin()` (numeric zero / sign boundary). int
  and date only; text is NOT `SignedScalar`.

The proc-macro and the `temporal_values!` macro emit the `OrderedScalar` (+
`SignedScalar`) impls; the unified `scalar_matrix!` sweeps `min`/`mid`/`max` from
`OrderedScalar` (the `_pivot_zero_` -> `_pivot_mid_` snapshot rename). The
signed-only sign-boundary test lives in `encrypted_domain/signed.rs`, generic
over `SignedScalar`, so a `text` instantiation is a compile error. Drops `""`
from `TEXT_FIXTURES`.

SQLx coverage for the text family beyond the generated matrix: `text_smoke`
exercises `eql_v3.text_match @> match` and the blocked `=` plus empty-bloom
set semantics; `text_match` is the dedicated containment suite (self / substring
/ disjoint / bare-operator GIN index engagement). Both live under
`encrypted_domain/text/` (outside the `scalars::` namespace) so the matrix
inventory snapshot stays the uniform per-type set, and are registered alongside
the signed-only suite in `encrypted_domain.rs`.

Documents the `eql_v3.text` family (eq / match / ord) and the `Bloom` index term
in the scalar-encrypted-domain reference guide — including the
`OrderedScalar`/`SignedScalar` pivot-trait section and the catalog-derived
(marker-free) text dispatch — and adds the `[Unreleased]` changelog entry
(#260).
…(STRICT)

The `eql_v3.ore_block_u64_8_256(jsonb)` extractor is `STRICT`, so PostgreSQL
already short-circuits to NULL on a NULL argument — the explicit
`IF val IS NULL` guard is dead code. Adjacent cleanup to the SEM extractors;
no behaviour change.

The bloom containment surface (eql_v3.text_match @>/<@) replaces deprecated
LIKE/ILIKE but is semantically different (probabilistic, ngram-based, no
wildcards/anchoring), which confuses users. Close the coverage gaps:

- <@ (contained-by) was implemented in SQL but completely untested: add
  positive, negative, and commutator (a @> b == b <@ a) assertions, plus a
  literal-payload <@ engage and empty-set test.
- match_null_propagates: @>/<@ are STRICT, so a NULL operand yields NULL.
- text_match_containment_requires_all_elements: pins set-containment
  semantics (every needle ngram must be present) — the property that makes
  @> not LIKE.
- text_match_like_ilike_absent: ~~/~~* resolve to 'operator does not exist'
  on text_match, the domain a LIKE user would reach for.
- text_match_payload_check_rejects_missing_bf: the domain CHECK requires bf.

Hand-written suites only (bloom is text-only, outside the cross-type matrix);
no SQL/fixture changes — reuses existing eql_v2_text fixtures.
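To make the difference from LIKE concrete, here is a minimal sketch of ngram set containment. It assumes lowercased character trigrams and skips the hashing step entirely, so it models the set semantics only. The real tokenizer, ngram length and bloom hashing are not described in this PR, and all function names are hypothetical.

```rust
use std::collections::BTreeSet;

/// Lowercased character trigrams of `text` (an assumed tokenizer).
pub fn trigrams(text: &str) -> BTreeSet<String> {
    let chars: Vec<char> = text.to_lowercase().chars().collect();
    chars
        .windows(3)
        .map(|window| window.iter().collect::<String>())
        .collect()
}

/// Models `haystack @> needle` with STRICT semantics: a NULL (`None`)
/// operand yields NULL, otherwise every needle ngram must be present.
pub fn contains(
    haystack: Option<&BTreeSet<String>>,
    needle: Option<&BTreeSet<String>>,
) -> Option<bool> {
    Some(needle?.is_subset(haystack?))
}

/// Models `needle <@ haystack`, the commutator of `@>`.
pub fn contained_by(
    needle: Option<&BTreeSet<String>>,
    haystack: Option<&BTreeSet<String>>,
) -> Option<bool> {
    contains(haystack, needle)
}
```

For the haystack "abc xbcd", the needle "abcd" matches even though it is not a substring, because both of its trigrams ("abc", "bcd") occur somewhere in the haystack. A needle such as "%bc%" does not match, since `%` is tokenized like any other character rather than treated as a wildcard.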

Adds the four characterization tests @auxesis requested on #260, each
pinning a branch the existing suite never reached:

- has_bloom_filter(jsonb) presence predicate (present/absent/{"bf":null}
  -> false) — the IS NOT NULL half of its guard was untested, and it is
  not reached transitively by the extractor or domain CHECK.
- bloom_filter(jsonb) empty-array branch: {"bf":[]} -> empty smallint[],
  not NULL (the extractor basis for empty-set containment semantics).
- String::to_sql_literal single-quote escaping (O'Brien -> 'O''Brien');
  all TEXT fixtures are quote-free so no DB test hit the .replace.
- Fixture::Zero/Min/Max -> None on non-integer kinds (Date/Text), the arm
  changed from unconditional Some(0); previously only guarded indirectly
  by the pivot_sentinels_only_appear_with_integer_kinds catalog invariant.

All four pass; behaviour was already correct, these are regression nets.
@tobyhede tobyhede force-pushed the v3-domain-type-text branch from 72c6fee to 125e592 Compare June 10, 2026 00:03
tobyhede added 2 commits June 10, 2026 10:27
Reconcile the adding-a-scalar guide with the post-timestamptz/text catalog:
correct the claim that timestamptz is ordered (it is equality-only via
EQ_ONLY_DOMAINS), add the missing Timestamptz/Date enum variants, and note
that @>/<@ back onto Bloom containment wrappers rather than blockers. Document
the previously-undocumented mechanics a follower needs: eq-only is selected by
the catalog domain slice (caps auto-derived), a new-capability domain like
_match needs hand-written #[path]-registered suites, non-integer types need the
third scalar_domains.rs registration, and the Bloom splinter allowlist names.

The extractor doc and its sibling test comment claimed the text_match domain
CHECK guarantees `bf` is an array, so a non-array `bf` could only occur outside
the domain. The CHECK only asserts key presence (`VALUE ? 'bf'`), so a typed
value like {"bf": null} reaches the extractor with a non-array `bf` — which is
exactly why the array-gate and its test exist. Also fix a stale macro name
(`ordered_numeric_matrix!` -> `scalar_matrix!`) and trim a thrice-repeated
rationale in the text_values wiring.
@tobyhede tobyhede merged commit 3eca6c5 into eql_v3 Jun 10, 2026
11 checks passed
@tobyhede tobyhede deleted the v3-domain-type-text branch June 10, 2026 01:14
tobyhede added a commit that referenced this pull request Jun 20, 2026
feat(scalars): add eql_v3.text encrypted-domain family (eq / match / ord)
@tobyhede tobyhede mentioned this pull request Jun 22, 2026