Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
…, fix blocker doc path Hoist the src/v3 path strings into named constants in generate.rs (byte-identical output — parity holds). Remove the now-dead Term::returns() (codegen builds the extractor return type from CORE_SCHEMA + ctor) and its legacy eql_v2 test assertions. Repoint the shared blocker's doc comment to src/v3/scalars/<T>/.
The eql_v3 SEM index-term functions are a hand-port of the eql_v2 originals. The scalar matrix already exercises the array comparator's happy path end-to-end against real ciphertext fixtures, but several branches are structurally unreachable there and were tested only on the eql_v2 copies. Add a sibling family test module covering them: - differential v2<->v3 parity on real `ob` fixtures (both sides routed through extractor -> composite -> compare_..._terms, so the schema prefix is the only variable) — the strongest guard against a faithful-port slip; - the 'Ciphertexts are different lengths' RAISE; - NULL-term ordering branches the STRICT wrappers bypass; - array NULL + empty/cardinality recursion base cases; - has_* presence checks, the missing-`ob` RAISE, and the NULL-jsonb short-circuit. Verified non-vacuous: a deliberately broken comparator fails T1/T2/T3 while the independent T4/T5 stay green.
…entity names Addresses CodeRabbit review: the artifact/test self-containment assertions only matched schema-qualified eql_v2. references; extend them to also reject eql_v2_<entity> names (eql_v2_encrypted, eql_v2_configuration) while still allowing prose mentions of eql_v2 in doc comments. Verified the v3 artifact and src/v3 contain zero eql_v2_ occurrences.
…e SCHEMA
With eql_v3 fully self-contained, the encrypted-domain families and the SEM
index-term types they call live in one schema, so there is no second schema to
point the core types at. Replace the CORE_SCHEMA/DOMAIN_SCHEMA pair with one
SCHEMA = "eql_v3" constant; the templates read it via the {{ schema }} global.
Output is byte-identical (golden parity holds).
…ardening - Cargo.toml: update stale fixture command to fixture:generate:all - eql-functions.md: index examples use distinct _eq/_ord columns (a column carries a single domain, so eq_term/ord_term apply to different columns) - scalars/mod.rs: header describes the scalar_types!(matrix_suites) layout - codegen-parity.sh: ls *.sql -> portable find (no set -e abort on empty dir) - inlinability.rs: narrow arity-1 hmac_256 match to the jsonb overload
…ators The six comparison operator backing functions (_eq, _neq, _lt, _lte, _gt, _gte) were missing required @param and @return Doxygen tags, failing docs:validate:required-tags with 6 errors and 6 warnings. Tags follow the eql_v2 sibling src/ore_block_u64_8_256/operators.sql convention.
cargo fmt --check was failing CI (test:lint and test:crates jobs) on this file; apply rustfmt with no logic changes.
…t consts Make the SQLx scalar-matrix harness type-agnostic ahead of the first non-integer scalar, without adding one. Three integer assumptions are lifted: - `ScalarType::FIXTURE_VALUES` (a `const`) becomes `fn fixture_values()`, so a scalar whose values can't be const-constructed can return a borrow of a lazily-built `Vec` instead. Integer impls still hand back their `eql_scalars::<T>_VALUES` const. - New `min_pivot()` / `max_pivot()` trait methods replace the matrix's direct `<T>::MIN` / `<T>::MAX` pivot references, so a scalar without an inherent `::MIN`/`::MAX` const can supply an explicit sentinel. - The ORDER BY arms build their `WHERE` clause from `to_sql_literal(zero)` instead of a hardcoded `> 0`, so a non-integer plaintext column typechecks. Behaviour-preserving for the existing `int4` / `int2` types: the integer `min_pivot`/`max_pivot` resolve to `Self::MIN`/`Self::MAX`, `to_sql_literal(0)` renders `0`, and the generated test names are unchanged. The int4 cargo-expand snapshot is regenerated to track the method-based bodies.
Co-authored-by: coderabbitai[bot] <136622811+coderabbitai[bot]@users.noreply.github.com>
- scalar_domains.rs: document fixture_values() stable-order contract (callers compare element-wise and index positionally without sorting) - eql-tests-macros: assert emitted min_pivot/max_pivot bodies instead of loose MIN/MAX substrings that also match the doc comment 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
feat(int8): add eql_v3.int8 encrypted-domain type family
…test hardening Collates and resolves code-review feedback on the self-contained eql_v3 PR. Inlinable SEM helpers (coderdan threads): - Convert eql_v3.jsonb_array_to_bytea_array and eql_v3.jsonb_array_to_ore_block_u64_8_256 from LANGUAGE plpgsql to inlinable LANGUAGE sql IMMUTABLE (no SET), using a CASE-scalar-subquery form so JSON-null/empty-array inputs still return NULL rather than raising (a naive FROM-SRF + WHERE rewrite would regress null to error). - These take a bare jsonb (not a domain), so pin_search_path.sql's structural skip does not cover them; opt them in via its documented 'eql-inline-critical' COMMENT marker so they install unpinned and the planner can inline them. Add matching splinter allowlist rows. The eql_v2 copies stay plpgsql by design. Direct SEM tests added (sem.rs). Stale references: - eql-functions.md / sql-support.md: v3 extractors now document eql_v3 SEM return types, not eql_v2. - pin_search_path.sql header, scalar_types.rs and mutations.rs comments. Test/build hardening: - writer.rs validates the AUTO-GENERATED ownership header before writing. - eql-scalars int_values! fails at compile time on narrowed-fixture overflow. - parity.rs asserts the generated file set, not just per-file contents. - mise.toml: shared test:sqlx:prep so test:sqlx:watch gets the same prep. - build.sh v3-only build rejects REQUIRE edges outside src/v3. - fixtures: single-source PAYLOAD_COLUMN const; enforce 63-byte identifier limit. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <noreply@anthropic.com>
Self-contained eql_v3 schema + standalone v3 installer
refactor(tests): generalise scalar matrix harness off integer-inherent consts
Add the `eql_v3.date` encrypted-domain scalar — the first non-integer ordered type — on top of the integer-agnostic harness refactor (parent PR). The SQL codegen needs no change (domains are jsonb-backed and token-driven); the work is one catalog row plus the temporal wiring the refactor enabled. Catalog (`eql-scalars`): `ScalarKind::Date` (`chrono::NaiveDate`), `Fixture::Date(&str)` (zero-dep ISO strings), `DATE_FIXTURES` (16 dates incl. the three matrix pivots), and `pub const DATE` appended to `CATALOG`, with mirrored panic / pivot-presence / token-order tests. Harness: an explicit `[temporal]` marker in the `scalar_types!` dispatch list drives the divergences from the integer path — the `impl ScalarType` for a temporal scalar is hand-written (chrono values can't be a `const` slice; pivots are explicit sentinels), and `scalar_fixture!` stamps a pivot-presence assert instead of the integer signed-extreme asserts. Adds `impl ScalarType for NaiveDate` (LazyLock-parsed values, `min_pivot`/`max_pivot` sentinels, quoted `to_sql_literal`), `EqlPlaintext for NaiveDate` (Cast::DATE), the sqlx `chrono` feature + direct `chrono` dep, the CHANGELOG entry, and a temporal-kinds note in the adding-a-scalar reference.
feat(scalars): add eql_v3.date encrypted-domain type + temporal wiring
… reference CATALOG now includes the non-integer ordered scalar (date) alongside the integers, so the 'only the integer scalars today' wording was stale. Addresses CodeRabbit feedback on PR #256.
ScalarKind loses the five partial accessors that panicked on non-integer kinds; bounds now live on the total BoundedIntKind, reached via ScalarKind::as_bounded_int(). Date::min_symbol() is now a compile error. A CATALOG invariant test replaces the deleted #[should_panic] tests.
Update the adding-a-scalar reference's `kind` bullet to reflect that the bounded-numeric accessors moved to the total BoundedIntKind sub-enum, and add the implementation plan under docs/superpowers/plans/.
refactor(eql-scalars): total BoundedIntKind sub-enum (replaces panicking ScalarKind accessors)
…p [temporal] marker
…a is_eq_only_token The ordered_numeric_matrix! suite exercises </>/min/max; an equality-only scalar (no _ord domain in eql-scalars::CATALOG) does not support those. Route the matrix emitter through matrix_suite_for_entry, which reads eq-only-ness from the catalog (is_eq_only_token) and emits a compile_error! for an eq-only token instead of silently generating ordering tests it cannot pass. Makes the catalog-derived is_eq_only accessor load-bearing and leaves a clean seam for an equality-only matrix path. Both arms unit-tested.
Derived IEEE PartialEq (NaN != NaN, +0.0 == -0.0) was inconsistent with the manual Eq/Ord impls built on total_cmp (NaN == NaN, +0.0 != -0.0), breaking Eq's reflexivity contract and Ord/PartialEq equality agreement.
…eal ciphertext) Adds v3_jsonb_to_ste_vec_query_gin_is_cost_chosen: replicates one real v3_ste_vec document to 5000 rows + 1 distinct pivot, builds the to_ste_vec_query GIN index, and with enable_seqscan left ON asserts the planner CHOOSES the index for a single-row-selective $.hello oc containment needle (matched == 1). Complements the sibling *_gin_engages usability arm (which forces seqscan off). Verified PG17: passes.
…t:sqlx:prep) bench.sh hand-rolled build+cp+migrate but never generated the gitignored per-type fixtures. #[sqlx::test(fixtures(...))] include_str!'s those .sql files at COMPILE time, so once fixtures became generated/gitignored the bench binary stopped compiling: 'couldn't read tests/sqlx/fixtures/eql_v2_numeric.sql'. bench-eql has failed every nightly on main for 8+ days as a result — pre-existing, unrelated to the scale tests, but it blocks them from ever running. Reuse test:sqlx:prep (build + cp + migrate + fixture:generate:all) so bench stays in lockstep with test:sqlx.
With bench.sh now running test:sqlx:prep, fixture:generate:all encrypts via cipherstash-client and needs ZeroKMS auth (CS_CLIENT_ACCESS_KEY + CS_WORKSPACE_CRN) plus a client key (CS_CLIENT_ID + CS_CLIENT_KEY); without them it fails with 'Auth strategy error: Not authenticated'. Add the four secrets to the bench job env, mirroring test-eql.yml. (bench-eql triggers are push:main/schedule/dispatch, all main-repo, so no fork-PR creds exposure.)
feat(v3): float4/float8 encrypted-domain types
Job-scoped secrets are exposed to every step, including third-party actions (actions/checkout, jdx/mise-action, Swatinem/rust-cache) referenced by mutable tags — a compromised tag could exfiltrate ZeroKMS/client creds before the bench script runs. Move the four CS_* vars onto the 'Run bench tests' step that actually needs them (fixture:generate:all). Least privilege; addresses PR review.
Adds the eql_v3 encrypted-domain property suite over the committed, curated real-ciphertext fixtures: - shared all-pairs operator oracle (property.rs): = / <> on _eq and the ordered comparisons + ord_term sort order, checked against a plaintext oracle over every ordered pair of fixture rows; - function-double oracles: the generated eql_v3.eq/neq/lt/lte/gt/gte functions across all three overloads (domain-domain, domain-jsonb, jsonb-domain) plus term-extractor identity (eq_term==hm, ord_term==ob); - bloom match smoke for the text _match domain; NULL/blocker/CHECK edge cases; - the e2e suite (gated behind proptest-e2e) over fresh ZeroKMS encryption. Fixtures are generated from the curated catalog values via FixtureSpec::run().
…P-3141) Adds a dedicated test:sqlx:e2e mise task and CI job for the proptest-e2e suite (needs ZeroKMS creds; the credential-free shards run the fixture suite), and runs source doc validation unconditionally.
Documents the three property-test suites (catalog / fixture / e2e) over the committed curated fixtures, the function-double oracles, and term-extractor identity. CHANGELOG entry under [Unreleased].
test(v3): scaled, cost-chosen index-engagement tests for encrypted-domain surface
…le (CIP-3141) The float4/float8 shared-index-term e2e test pulled both `hm` and `ob` via `as_str()`, but the ORE `ob` term is a JSON array of block strings, not a scalar string (only `hm` is a string). `as_str()` returned None on the array, raising "payload missing string `ob`". Split the helper: `hm` stays a string, `ob` extracts the array and compares directly, matching the canonical extractor in property.rs. Latent until the gated e2e suite started running in CI.
…or (CIP-3141) `float4_and_float8_share_index_terms_for_the_same_value` asserted byte-equality of the raw `ob` ORE arrays of two independently-encrypted payloads. That can never hold: a BlockORE term is `Left (deterministic) ++ Right (16-byte random per-ciphertext nonce + nonce-masked truth tables)`, so two encodings of the SAME value — same width, same cast — are byte-UNEQUAL by construction. Ordering is decided by the ORE compare function, not raw bytes. (The cast is irrelevant: `real`/`double` collapse to one f64 `ColumnType::Float` in cipherstash-client; the deterministic Left halves are byte-identical, which is what proves the two widths share an encoding.) The bug stayed latent because the e2e suite is feature/creds-gated and, when it did run, an earlier `ob`-as-string extraction errored out before the assertion; fixing that extraction (fe52d42) unmasked the wrong assertion. Compare the extracted `ord_term`s through the SQL `eql_v3.ore_block_256` `=` operator (the only correct ORE check) and keep the deterministic `hm` equality term as a direct byte comparison. Also correct the CHANGELOG claim of a "byte-identical ORE term" to "equal under the ORE comparator". Verified: the test now passes against fresh ZeroKMS encryption.
…t_expand [codex] assert scalar scale predicates are selective
…, and STRICT NULL (G3 4a)
test(sqlx): eql_v3 SQL-function property tests (CIP-3141)
docs-static (source-only SQL doxygen coverage + required-tags) ran on every PR, unlike the other heavy jobs. Its inputs are a strict subset of the changes-job relevant filter — src/**, the crates/** codegen build it depends on, and the tasks/docs/** validator scripts — so a PR that leaves relevant false cannot change its outcome. Gate it on the same flag as the other jobs: consistent, and drops a redundant codegen build on markdown-only PRs without losing coverage. A narrower src/**-only filter was rejected — it would risk a silent false-green (ci-required counts skipped as pass) by skipping on a real input change. Sync .github/workflows/README.md (coverage map, known gaps, recently- closed, operator-setup verification) and the stale validate-job comment.
Engineered fixture pair (NEEDLE's ngrams ⊆ HAY's ngram set, yet NEEDLE is not a contiguous substring of HAY) plus a creds-free catalog contiguity guard that pins the plaintext invariant. Strengthen the divergence test with a raw-bf subset assertion: assert needle-bf ⊆ haystack-bf directly on the stored `bf` arrays via native jsonb containment, independent of the `eql_v3` domain `@>` operator and the `match_term` extractor. Previously the subset was only inferred from the domain `@>` it was meant to validate (circular); the direct check localizes a future tokenizer change to a precise 'bf arrays no longer a subset' failure instead of an opaque @>-false.
The cross-ciphertext oracle include_str!s eql_v2_<token>_doubles.sql at compile time, but stub-fixtures.sh only stubbed eql_v2_<token>.sql and the literal .gitignore *.sql entries — neither matches the _doubles glob. So the no-creds matrix-coverage / inventory jobs could not compile the encrypted_domain binary to --list it (rustc: couldn't read eql_v2_int4_doubles.sql, ...). Stub eql_v2_<token>_doubles.sql alongside eql_v2_<token>.sql for every catalog token, consistent with the helper's stub-the-complete-set policy (harmless extras for storage-only tokens with no real doubles fixture).
test(v3): text-search direct-function coverage + bloom-vs-LIKE divergence lock-in
test(v3): cross-ciphertext equality via per-type doubles fixtures (CIP-3141)
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
THIS IS THE EQL_V3 STACK.
VERY WIP.