feat: eql-types — canonical EQL v3 payload types (Rust) #236
Merged
12 commits
3d3f4a7 (coderdan) feat: eql-types canonical types crate (prototype)
01dd8fe (coderdan) fix(eql-types): correct bf signedness and accept k-less scalar payloads
af715c0 (coderdan) feat(eql-types): v3 domain payload types, parity-gated against the ca…
57c0b54 (coderdan) refactor(eql-types): unroll eql_v3_domain! macro into explicit structs
21bd887 (coderdan) refactor(eql-types)!: drop the v2.3 tier and split codegen into stack…
aa46959 (coderdan) fix(eql-types): pin the envelope version, reject unknown keys, derive…
b3338ff (coderdan) refactor(eql-types): reshape DomainType as an object-safe trait
7777853 (coderdan) refactor(eql-types): drop roundtrip from DomainType; slim the parity …
b810fad (coderdan) refactor(eql-types): implement DomainType per type; drop the V3Domain…
fd5e626 (coderdan) refactor(eql-types): implement DomainType on the payload types themse…
035952e (coderdan) fix(eql-types): drop Default from payload types; power handles via a …
b3c0cef (coderdan) test(eql-types): cover wire shape of every non-int4 v3 domain
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,2 @@ | ||
| /target | ||
| /Cargo.lock |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,15 @@ | ||
| [package] | ||
| name = "eql-types" | ||
| version = "0.1.0" | ||
| edition = "2021" | ||
| description = "Canonical wire types for EQL payloads — the single Rust source of truth (TypeScript bindings and JSON Schemas are generated from these types in stacked changes)." | ||
|
|
||
| [dependencies] | ||
| serde = { version = "1", features = ["derive"] } | ||
|
|
||
| [dev-dependencies] | ||
| # Parity oracle: tests/catalog_parity.rs asserts the v3 domain inventory | ||
| # exactly covers eql_scalars::CATALOG, so the types here cannot drift from | ||
| # the generated SQL surface. | ||
| eql-scalars = { path = "../eql-scalars" } | ||
| serde_json = "1" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,82 @@ | ||
| # eql-types | ||
|
|
||
| Canonical wire types for EQL payloads — **one Rust definition per payload | ||
| shape**, the single source of truth for every tool that produces or consumes | ||
| EQL payloads (`cipherstash-client`, `protect-ffi`, CipherStash Proxy). | ||
|
|
||
| TypeScript bindings (via [`ts-rs`]) and JSON Schemas (via [`schemars`]) are | ||
| generated from these definitions in stacked changes; this crate is the | ||
| Rust contract only. | ||
|
|
||
| ## Why | ||
|
|
||
| Type information is lost at every hop of `EQL → cipherstash-client → | ||
| protect-ffi → stack`. protect-ffi hand-writes its TypeScript types; they drift | ||
| from the Rust they describe; stack widens them further. The result is bugs | ||
| like the `protect-dynamodb` search-term check that validates a payload shape | ||
| EQL never actually defined. A generated, single-source crate removes the | ||
| hand-copying. | ||
|
|
||
| ## Capability-encoded types | ||
|
|
||
| The [`src/v3/`](src/v3/) module has one type per **SQL domain** in the | ||
| `eql_v3` schema — `Int4` / `Int4Eq` / `Int4Ord` / `Int4OrdOre`, and likewise | ||
| for `int2`, `int8`, `date`, `timestamptz` (eq-only), and `text` (which adds | ||
| `TextMatch`) — each carrying its index terms as **required** fields. The | ||
| capability is the type identity; `Option` never appears. A payload missing | ||
| its term key fails to deserialize: the Rust analogue of the SQL domain's | ||
| CHECK constraint. | ||
|
|
||
| Shared wire fields are reusable newtypes in | ||
| [`src/v3/terms.rs`](src/v3/terms.rs): | ||
|
|
||
| | Newtype | Wire key | Inner | Backs | | ||
| |---------|----------|-------|-------| | ||
| | `Ciphertext` | `c` | `String` | every domain (envelope) | | ||
| | `Hmac256` | `hm` | `String` | `_eq` domains | | ||
| | `OreBlockU64_8_256` | `ob` | `Vec<String>` | `_ord` / `_ord_ore` domains | | ||
| | `BloomFilter` | `bf` | `Vec<i16>` (signed!) | `_match` domains | | ||
|
|
||
| Note "v3" names the SQL schema generation (`eql_v3.*`); the JSON envelope | ||
| version is still `v: 2` — the generated domain CHECKs assert it, and the wire | ||
| field names are unchanged from v2 (the purpose-named rename in | ||
| `docs/plans/eql-payload-scheme-discipline-rfc.md` is deferred). | ||
|
|
||
| ## Drift protection | ||
|
|
||
| `tests/catalog_parity.rs` asserts the domain inventory — | ||
| [`v3::all()`](src/v3/mod.rs), a `Vec<Box<dyn DomainType>>` of zero-sized | ||
| type-level handles — exactly covers `eql-scalars::CATALOG` (the same catalog | ||
| that generates the `eql_v3` SQL surface): every domain, in order. Adding a | ||
| scalar to the catalog without adding its types here fails the build. | ||
| Wire-key strictness (required term keys, unknown-key rejection, envelope | ||
| version) is covered per-type in `tests/v3_conformance.rs` and pinned against | ||
| the catalog by the JSON Schema parity test in the stacked schemars change. | ||
|
|
||
| ## Develop | ||
|
|
||
| ```sh | ||
| cargo test -p eql-types | ||
| ``` | ||
|
|
||
| The crate is also part of the lean `mise run test:crates` set (fmt, clippy, | ||
| test — no database). | ||
|
|
||
| ## Future direction: self-describing payloads | ||
|
|
||
| On the wire, a v3 payload is discriminated only by *which key is present* | ||
| (`hm` vs `ob` vs `bf`) — the SQL domain name carries the rest. Once the JSON | ||
| leaves SQL (into protect-ffi, into TypeScript, into a log line) that | ||
| information is gone, and a consumer is back to sniffing keys: the untagged | ||
| failure mode that produced the original protect-dynamodb bug. An earlier | ||
| prototype here carried an `Int4Tagged` enum with a one-field capability tag | ||
| (`"x": "int4_eq"`), which generates a clean TypeScript discriminated union | ||
| and a JSON Schema `oneOf` with per-branch `const`s. It was removed because | ||
| the tag is not part of the v3 wire contract (the generated domain CHECKs | ||
| know no `x` key) — but it remains the recommended shape if a future payload | ||
| revision adds a discriminator. See | ||
| `docs/plans/eql-payload-scheme-discipline-rfc.md` for the wider payload | ||
| evolution plan. | ||
|
|
||
| [`ts-rs`]: https://github.com/Aleph-Alpha/ts-rs | ||
| [`schemars`]: https://graham.cool/schemars/ |
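
The ordered parity check described under "Drift protection" can be sketched without the real crates. This is a minimal illustration under stated assumptions, not the code in `tests/catalog_parity.rs`: `CATALOG` below is a hypothetical stand-in for `eql_scalars::CATALOG`, the domain order shown is assumed, and `first_drift` is an invented helper name.

```rust
/// Hypothetical stand-in for `eql_scalars::CATALOG`: SQL domain names in
/// catalog order. The real catalog's shape and order are not shown in this PR
/// excerpt, so both are assumptions.
pub const CATALOG: &[&str] = &[
    "eql_v3.date",
    "eql_v3.date_eq",
    "eql_v3.date_ord_ore",
    "eql_v3.date_ord",
];

/// Returns the first index at which `inventory` and `catalog` disagree, or
/// `None` when they match exactly (same names, same order, same length).
pub fn first_drift(inventory: &[&str], catalog: &[&str]) -> Option<usize> {
    let common = inventory.len().min(catalog.len());
    (0..common)
        // A differing name inside the shared prefix is drift.
        .find(|&i| inventory[i] != catalog[i])
        // Otherwise a length mismatch means one side has extra domains.
        .or_else(|| (inventory.len() != catalog.len()).then_some(common))
}
```

A parity test in this style would assert `first_drift(&inventory, CATALOG) == None`, so a scalar added to the catalog without a matching type shows up as a concrete failing index rather than a silent mismatch.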
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,78 @@ | ||
| //! # eql-types — canonical EQL payload types | ||
| //! | ||
| //! One Rust definition per EQL payload shape — the single source of truth | ||
| //! for every tool that produces or consumes EQL payloads | ||
| //! (`cipherstash-client`, `protect-ffi`, CipherStash Proxy). TypeScript | ||
| //! bindings and JSON Schemas are generated from these definitions in | ||
| //! stacked changes; the Rust types are the contract. | ||
| //! | ||
| //! The [`v3`] module holds the `eql_v3` encrypted-domain types: one struct | ||
| //! per SQL domain (`eql_v3.int4_eq`, `eql_v3.text_match`, …), | ||
| //! *capability-encoded* — index terms are required fields, never `Option`. | ||
| //! It mirrors `eql-scalars::CATALOG` 1:1, enforced by | ||
| //! `tests/catalog_parity.rs`. | ||
| //! | ||
| //! Wire rule: **field names ARE wire names** — no `#[serde(rename)]` | ||
| //! anywhere. The struct definition reads exactly like the JSON payload. | ||
|
|
||
| use serde::{Deserialize, Serialize}; | ||
|
|
||
| pub mod v3; | ||
|
|
||
| /// EQL wire-format version. Hard-coded to `2` for every payload — including | ||
| /// the [`v3`] tier, whose generated domain CHECKs assert `VALUE->>'v' = '2'`. | ||
| pub const EQL_SCHEMA_VERSION: u16 = 2; | ||
|
|
||
| /// The envelope version field (`v`) — always exactly [`EQL_SCHEMA_VERSION`] | ||
| /// on the wire. | ||
| /// | ||
| /// Deserialization rejects any other value: the Rust analogue of the domain | ||
| /// CHECK's `VALUE->>'v' = '2'`, so a wrong-version payload fails at the type | ||
| /// boundary instead of at INSERT. The inner value is private; the only | ||
| /// constructible instance is the current version. | ||
| #[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Serialize)] | ||
| pub struct SchemaVersion(u16); | ||
|
|
||
| impl SchemaVersion { | ||
| /// The current (only) wire version, `2`. | ||
| pub const CURRENT: Self = Self(EQL_SCHEMA_VERSION); | ||
|
|
||
| /// The wire value. | ||
| pub const fn get(self) -> u16 { | ||
| self.0 | ||
| } | ||
| } | ||
|
|
||
| impl Default for SchemaVersion { | ||
| fn default() -> Self { | ||
| Self::CURRENT | ||
| } | ||
| } | ||
|
|
||
| impl<'de> Deserialize<'de> for SchemaVersion { | ||
| fn deserialize<D>(deserializer: D) -> Result<Self, D::Error> | ||
| where | ||
| D: serde::Deserializer<'de>, | ||
| { | ||
| let v = u16::deserialize(deserializer)?; | ||
| if v == EQL_SCHEMA_VERSION { | ||
| Ok(Self(v)) | ||
| } else { | ||
| Err(serde::de::Error::custom(format!( | ||
| "unsupported EQL schema version {v} (expected {EQL_SCHEMA_VERSION})" | ||
| ))) | ||
| } | ||
| } | ||
| } | ||
|
|
||
| /// Table + column identifier — wire shape `{"t": "...", "c": "..."}`. | ||
| /// | ||
| /// Shared by every payload. | ||
| #[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)] | ||
| #[serde(deny_unknown_fields)] | ||
| pub struct Identifier { | ||
| /// Table name. | ||
| pub t: String, | ||
| /// Column name. | ||
| pub c: String, | ||
| } |
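
The version gate in `SchemaVersion` can be exercised without serde. The sketch below is std-only and assumes a `TryFrom<u16>` constructor plus an `UnsupportedVersion` error type; neither appears in the diff, which performs the same comparison inside its `Deserialize` impl instead.

```rust
use std::convert::TryFrom;
use std::fmt;

/// Wire-format version, as in the diff.
pub const EQL_SCHEMA_VERSION: u16 = 2;

/// Private inner value: the only way to obtain an instance is `CURRENT` or a
/// successful `try_from`, so every live value equals `EQL_SCHEMA_VERSION`.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct SchemaVersion(u16);

/// Hypothetical error type for this sketch (the diff uses a serde custom error).
#[derive(Debug, PartialEq, Eq)]
pub struct UnsupportedVersion(pub u16);

impl fmt::Display for UnsupportedVersion {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(
            f,
            "unsupported EQL schema version {} (expected {})",
            self.0, EQL_SCHEMA_VERSION
        )
    }
}

impl SchemaVersion {
    /// The current (only) wire version.
    pub const CURRENT: Self = Self(EQL_SCHEMA_VERSION);

    /// The wire value.
    pub const fn get(self) -> u16 {
        self.0
    }
}

impl TryFrom<u16> for SchemaVersion {
    type Error = UnsupportedVersion;

    /// Accepts exactly `EQL_SCHEMA_VERSION`; every other value is rejected.
    fn try_from(v: u16) -> Result<Self, Self::Error> {
        if v == EQL_SCHEMA_VERSION {
            Ok(Self(v))
        } else {
            Err(UnsupportedVersion(v))
        }
    }
}
```

With this shape, `SchemaVersion::try_from(2u16)` yields `Ok(SchemaVersion::CURRENT)` and `SchemaVersion::try_from(3u16)` yields `Err(UnsupportedVersion(3))`, mirroring the domain CHECK's `VALUE->>'v' = '2'` at the type boundary.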
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,107 @@ | ||
| //! The `date` encrypted-domain family — an ordered, non-integer scalar. | ||
| //! Same four-domain ordered shape as [`crate::v3::int4`] (ORE compares | ||
| //! ciphertext, so dates order like integers); see that module for the | ||
| //! capability table. | ||
|
|
||
| use crate::v3::terms::{Ciphertext, Hmac256, OreBlockU64_8_256}; | ||
| use crate::v3::DomainType; | ||
| use crate::{Identifier, SchemaVersion}; | ||
| use serde::{Deserialize, Serialize}; | ||
|
|
||
| /// `eql_v3.date` — storage only; every operator is blocked. | ||
| #[derive(Clone, Debug, PartialEq, Serialize, Deserialize)] | ||
| #[serde(deny_unknown_fields)] | ||
| pub struct Date { | ||
| /// Envelope version — always `2` (`EQL_SCHEMA_VERSION`); any other | ||
| /// value fails deserialization. | ||
| pub v: SchemaVersion, | ||
| /// Table/column identifier. Required by the domain CHECK. | ||
| pub i: Identifier, | ||
| /// mp_base85 source ciphertext. Required by the domain CHECK. | ||
| pub c: Ciphertext, | ||
| } | ||
|
|
||
| impl DomainType for Date { | ||
| fn sql_domain_static() -> &'static str { | ||
| "eql_v3.date" | ||
| } | ||
|
|
||
| fn sql_domain(&self) -> &'static str { | ||
| Self::sql_domain_static() | ||
| } | ||
| } | ||
|
|
||
| /// `eql_v3.date_eq` — HMAC equality (`=`, `<>`). | ||
| #[derive(Clone, Debug, PartialEq, Serialize, Deserialize)] | ||
| #[serde(deny_unknown_fields)] | ||
| pub struct DateEq { | ||
| /// Envelope version — always `2` (`EQL_SCHEMA_VERSION`); any other | ||
| /// value fails deserialization. | ||
| pub v: SchemaVersion, | ||
| /// Table/column identifier. Required by the domain CHECK. | ||
| pub i: Identifier, | ||
| /// mp_base85 source ciphertext. Required by the domain CHECK. | ||
| pub c: Ciphertext, | ||
| /// HMAC-SHA-256 equality term. | ||
| pub hm: Hmac256, | ||
| } | ||
|
|
||
| impl DomainType for DateEq { | ||
| fn sql_domain_static() -> &'static str { | ||
| "eql_v3.date_eq" | ||
| } | ||
|
|
||
| fn sql_domain(&self) -> &'static str { | ||
| Self::sql_domain_static() | ||
| } | ||
| } | ||
|
|
||
| /// `eql_v3.date_ord_ore` — full comparison, scheme-explicit name. | ||
| #[derive(Clone, Debug, PartialEq, Serialize, Deserialize)] | ||
| #[serde(deny_unknown_fields)] | ||
| pub struct DateOrdOre { | ||
| /// Envelope version — always `2` (`EQL_SCHEMA_VERSION`); any other | ||
| /// value fails deserialization. | ||
| pub v: SchemaVersion, | ||
| /// Table/column identifier. Required by the domain CHECK. | ||
| pub i: Identifier, | ||
| /// mp_base85 source ciphertext. Required by the domain CHECK. | ||
| pub c: Ciphertext, | ||
| /// Block-ORE order term. Serves equality too. | ||
| pub ob: OreBlockU64_8_256, | ||
| } | ||
|
|
||
| impl DomainType for DateOrdOre { | ||
| fn sql_domain_static() -> &'static str { | ||
| "eql_v3.date_ord_ore" | ||
| } | ||
|
|
||
| fn sql_domain(&self) -> &'static str { | ||
| Self::sql_domain_static() | ||
| } | ||
| } | ||
|
|
||
| /// `eql_v3.date_ord` — full comparison (`=` `<>` `<` `<=` `>` `>=`). | ||
| #[derive(Clone, Debug, PartialEq, Serialize, Deserialize)] | ||
| #[serde(deny_unknown_fields)] | ||
| pub struct DateOrd { | ||
| /// Envelope version — always `2` (`EQL_SCHEMA_VERSION`); any other | ||
| /// value fails deserialization. | ||
| pub v: SchemaVersion, | ||
| /// Table/column identifier. Required by the domain CHECK. | ||
| pub i: Identifier, | ||
| /// mp_base85 source ciphertext. Required by the domain CHECK. | ||
| pub c: Ciphertext, | ||
| /// Block-ORE order term. Serves equality too. | ||
| pub ob: OreBlockU64_8_256, | ||
| } | ||
|
|
||
| impl DomainType for DateOrd { | ||
| fn sql_domain_static() -> &'static str { | ||
| "eql_v3.date_ord" | ||
| } | ||
|
|
||
| fn sql_domain(&self) -> &'static str { | ||
| Self::sql_domain_static() | ||
| } | ||
| } | ||
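
Each impl above pairs a receiver-less `sql_domain_static()` with a `&self` method `sql_domain()`. The trait definition is not part of this excerpt, so the following is only a sketch of how such a trait can stay object-safe: it assumes a `where Self: Sized` bound on the static method, and it uses unit structs as hypothetical stand-ins for the real payload types and for the handles returned by `v3::all()`.

```rust
/// Sketch of an object-safe trait with the two methods seen in the diff.
pub trait DomainType {
    /// No receiver, so it is excluded from the vtable via `Self: Sized`;
    /// without that bound the trait could not be used as `dyn DomainType`.
    fn sql_domain_static() -> &'static str
    where
        Self: Sized;

    /// Callable through `dyn DomainType`.
    fn sql_domain(&self) -> &'static str;
}

/// Hypothetical zero-sized stand-ins; the real types carry `v`, `i`, `c`, etc.
pub struct Date;
pub struct DateEq;

impl DomainType for Date {
    fn sql_domain_static() -> &'static str {
        "eql_v3.date"
    }
    fn sql_domain(&self) -> &'static str {
        Self::sql_domain_static()
    }
}

impl DomainType for DateEq {
    fn sql_domain_static() -> &'static str {
        "eql_v3.date_eq"
    }
    fn sql_domain(&self) -> &'static str {
        Self::sql_domain_static()
    }
}

/// A two-entry inventory in the spirit of `v3::all()`.
pub fn all() -> Vec<Box<dyn DomainType>> {
    vec![Box::new(Date), Box::new(DateEq)]
}
```

Collecting `all().iter().map(|d| d.sql_domain())` then gives the domain names in inventory order, which is the list a catalog parity test would compare against.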
@tobyhede should this be `3` now?!