Merged
9 changes: 9 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default.

5 changes: 5 additions & 0 deletions Cargo.toml
@@ -7,6 +7,10 @@
# crates/eql-codegen — the SQL generator binary (stub here; Plan 2 fills it in).
# crates/eql-tests-macros — proc-macros expanding the single scalar-harness
# list into the per-type SQLx-matrix wiring.
# crates/eql-types — canonical Rust wire types for EQL payloads, parity-
# tested against the eql-scalars catalog. (TypeScript
# bindings and JSON Schemas are generated from these
# types in stacked changes.)
# tests/sqlx — the existing `eql_tests` SQLx integration crate.
#
# resolver = "2" keeps the heavy test-crate feature set (sqlx/tokio/cipherstash-
@@ -20,6 +24,7 @@ members = [
    "crates/eql-scalars",
    "crates/eql-codegen",
    "crates/eql-tests-macros",
    "crates/eql-types",
    "tests/sqlx",
]
default-members = ["tests/sqlx"]
9 changes: 5 additions & 4 deletions crates/eql-codegen/src/consts.rs
@@ -11,10 +11,11 @@ pub(crate) const AUTO_GENERATED_HEADER: &str = "-- AUTOMATICALLY GENERATED FILE.
/// the core types at.
pub(crate) const SCHEMA: &str = "eql_v3";

-/// Always-present payload keys checked for presence in every domain CHECK, in
-/// order: envelope version (`v`), ident (`i`), ciphertext (`c`). Term-specific
-/// keys are appended after these by `context::domain_block`.
-pub(crate) const ENVELOPE_KEYS: &[&str] = &["v", "i", "c"];
+/// Always-present payload keys checked for presence in every domain CHECK.
+/// Term-specific keys are appended after these by `context::domain_block`.
+/// Defined in the catalog (`eql_scalars::ENVELOPE_KEYS`) so the CHECKs and
+/// the `eql-types` payload structs share one envelope definition.
+pub(crate) const ENVELOPE_KEYS: &[&str] = eql_scalars::ENVELOPE_KEYS;

/// Escape a string for use inside a single-quoted SQL literal by doubling
/// embedded single quotes.
11 changes: 11 additions & 0 deletions crates/eql-scalars/src/lib.rs
@@ -73,6 +73,17 @@ pub enum ScalarKind {
    Timestamptz,
}

/// Always-present payload keys required by every generated domain CHECK,
/// before the domain's term keys, in order: envelope version (`v`), ident
/// (`i`), ciphertext (`c`).
///
/// Lives here — in the catalog — because it is cross-schema contract data
/// consumed on both sides of the generated surface: `eql-codegen` builds
/// every domain CHECK from it, and `eql-types` builds its payload structs
/// and parity tests against it. One definition, so the envelope cannot
/// drift between the SQL and the canonical types.
pub const ENVELOPE_KEYS: &[&str] = &["v", "i", "c"];

/// A fixed index term known to the scalar materializer.
///
/// `Hm` provides equality; `Ore` provides equality plus ordering. The
2 changes: 2 additions & 0 deletions crates/eql-types/.gitignore
@@ -0,0 +1,2 @@
/target
/Cargo.lock
15 changes: 15 additions & 0 deletions crates/eql-types/Cargo.toml
@@ -0,0 +1,15 @@
[package]
name = "eql-types"
version = "0.1.0"
edition = "2021"
description = "Canonical wire types for EQL payloads — the single Rust source of truth (TypeScript bindings and JSON Schemas are generated from these types in stacked changes)."

[dependencies]
serde = { version = "1", features = ["derive"] }

[dev-dependencies]
# Parity oracle: tests/catalog_parity.rs asserts the v3 domain inventory
# exactly covers eql_scalars::CATALOG, so the types here cannot drift from
# the generated SQL surface.
eql-scalars = { path = "../eql-scalars" }
serde_json = "1"
82 changes: 82 additions & 0 deletions crates/eql-types/README.md
@@ -0,0 +1,82 @@
# eql-types

Canonical wire types for EQL payloads — **one Rust definition per payload
shape**, the single source of truth for every tool that produces or consumes
EQL payloads (`cipherstash-client`, `protect-ffi`, CipherStash Proxy).

TypeScript bindings (via [`ts-rs`]) and JSON Schemas (via [`schemars`]) are
generated from these definitions in stacked changes; this crate is the
Rust contract only.

## Why

Type information is lost at every hop of `EQL → cipherstash-client →
protect-ffi → stack`. protect-ffi hand-writes its TypeScript types; they drift
from the Rust they describe; stack widens them further. The result is bugs
like the `protect-dynamodb` search-term check that validates a payload shape
EQL never actually defined. A generated, single-source crate removes the
hand-copying.

## Capability-encoded types

The [`src/v3/`](src/v3/) module has one type per **SQL domain** in the
`eql_v3` schema — `Int4` / `Int4Eq` / `Int4Ord` / `Int4OrdOre`, and likewise
for `int2`, `int8`, `date`, `timestamptz` (eq-only), and `text` (which adds
`TextMatch`) — each carrying its index terms as **required** fields. The
capability is the type identity; `Option` never appears. A payload missing
its term key fails to deserialize: the Rust analogue of the SQL domain's
CHECK constraint.
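
As a rough illustration of "the capability is the type identity", the sketch below uses only the standard library. The names `Int4EqSketch`, `required`, and `parse_int4_eq` are hypothetical, and the hand-written parser is a stand-in for what a serde derive would do; it is not the crate's actual implementation, and the envelope fields `v` and `i` are left out for brevity.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified stand-in for an `eql_v3.int4_eq` payload.
/// The `hm` term is a plain required field, never an `Option`.
#[derive(Debug, PartialEq)]
pub struct Int4EqSketch {
    pub c: String,
    pub hm: String,
}

/// Look up a key that must be present, or report which one is missing.
fn required(fields: &HashMap<&str, &str>, key: &str) -> Result<String, String> {
    fields
        .get(key)
        .map(|value| value.to_string())
        .ok_or_else(|| format!("missing required key `{key}`"))
}

/// Hand-rolled parser (an assumption standing in for a serde derive):
/// a payload without its term key is rejected at the type boundary.
pub fn parse_int4_eq(fields: &HashMap<&str, &str>) -> Result<Int4EqSketch, String> {
    Ok(Int4EqSketch {
        c: required(fields, "c")?,
        hm: required(fields, "hm")?,
    })
}
```

Under these assumptions, a map holding only `c` yields an error naming `hm`, which mirrors a domain CHECK rejecting a row that lacks its term key.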

Shared wire fields are reusable newtypes in
[`src/v3/terms.rs`](src/v3/terms.rs):

| Newtype | Wire key | Inner | Backs |
|---------|----------|-------|-------|
| `Ciphertext` | `c` | `String` | every domain (envelope) |
| `Hmac256` | `hm` | `String` | `_eq` domains |
| `OreBlockU64_8_256` | `ob` | `Vec<String>` | `_ord` / `_ord_ore` domains |
| `BloomFilter` | `bf` | `Vec<i16>` (signed!) | `_match` domains |

Note "v3" names the SQL schema generation (`eql_v3.*`); the JSON envelope
version is still `v: 2` — the generated domain CHECKs assert it, and the wire
field names are unchanged from v2 (the purpose-named rename in
`docs/plans/eql-payload-scheme-discipline-rfc.md` is deferred).

## Drift protection

`tests/catalog_parity.rs` asserts that the domain inventory
([`v3::all()`](src/v3/mod.rs), a `Vec<Box<dyn DomainType>>` of zero-sized
type-level handles) exactly covers `eql_scalars::CATALOG` (the same catalog
that generates the `eql_v3` SQL surface): every domain, in order. Adding a
scalar to the catalog without adding its types here fails the test suite.
Wire-key strictness (required term keys, unknown-key rejection, envelope
version) is covered per-type in `tests/v3_conformance.rs` and pinned against
the catalog by the JSON Schema parity test in the stacked schemars change.
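
The shape of such a parity check can be sketched with the standard library alone. Everything here is hypothetical: `ScalarSketch`, `CATALOG_SKETCH`, and `first_drift` are illustrative names, and the suffix order simply follows the order in which the `date` types appear in this change, not the real `eql_scalars::CATALOG` layout.

```rust
/// Hypothetical catalog entry: a scalar name plus the domain suffixes it generates.
pub struct ScalarSketch {
    pub name: &'static str,
    pub suffixes: &'static [&'static str],
}

/// Hypothetical stand-in for the real catalog.
pub const CATALOG_SKETCH: &[ScalarSketch] = &[
    ScalarSketch { name: "int4", suffixes: &["", "_eq", "_ord_ore", "_ord"] },
    ScalarSketch { name: "date", suffixes: &["", "_eq", "_ord_ore", "_ord"] },
];

/// Domain names implied by the catalog, in catalog order.
pub fn expected_domains(catalog: &[ScalarSketch]) -> Vec<String> {
    catalog
        .iter()
        .flat_map(|scalar| {
            scalar
                .suffixes
                .iter()
                .map(move |suffix| format!("eql_v3.{}{}", scalar.name, suffix))
        })
        .collect()
}

/// Describe the first mismatch between the type inventory and the catalog,
/// or return `None` when the inventory covers the catalog exactly, in order.
pub fn first_drift(inventory: &[&str], catalog: &[ScalarSketch]) -> Option<String> {
    let expected = expected_domains(catalog);
    if inventory.len() != expected.len() {
        return Some(format!(
            "inventory has {} domains, catalog implies {}",
            inventory.len(),
            expected.len()
        ));
    }
    inventory
        .iter()
        .zip(&expected)
        .find(|(have, want)| **have != want.as_str())
        .map(|(have, want)| format!("found `{have}`, expected `{want}`"))
}
```

Comparing in order, rather than as sets, is what lets a check like this catch a reordered inventory as well as a missing domain.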

## Develop

```sh
cargo test -p eql-types
```

The crate is also part of the lean `mise run test:crates` set (fmt, clippy,
test — no database).

## Future direction: self-describing payloads

On the wire, a v3 payload is discriminated only by *which key is present*
(`hm` vs `ob` vs `bf`) — the SQL domain name carries the rest. Once the JSON
leaves SQL (into protect-ffi, into TypeScript, into a log line) that
information is gone, and a consumer is back to sniffing keys: the untagged
failure mode that produced the original protect-dynamodb bug. An earlier
prototype here carried an `Int4Tagged` enum with a one-field capability tag
(`"x": "int4_eq"`), which generates a clean TypeScript discriminated union
and a JSON Schema `oneOf` with per-branch `const`s. It was removed because
the tag is not part of the v3 wire contract (the generated domain CHECKs
know no `x` key) — but it remains the recommended shape if a future payload
revision adds a discriminator. See
`docs/plans/eql-payload-scheme-discipline-rfc.md` for the wider payload
evolution plan.
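
To make the contrast concrete, here is a small standard-library sketch of key sniffing next to a tagged shape. `sniff_capability` and `Int4TaggedSketch` are hypothetical names for illustration; the removed `Int4Tagged` prototype is not shown in this change, so this is not a reconstruction of it.

```rust
/// Untagged mode: guess the capability from which term key is present.
/// Note that this cannot tell `int4_ord` from `date_ord`, or `_ord` from
/// `_ord_ore`, because all of them carry the same `ob` key.
pub fn sniff_capability(keys: &[&str]) -> &'static str {
    if keys.contains(&"hm") {
        "eq"
    } else if keys.contains(&"ob") {
        "ord"
    } else if keys.contains(&"bf") {
        "match"
    } else {
        "storage"
    }
}

/// Hypothetical tagged shape: the variant itself names the capability,
/// so a consumer never has to inspect keys.
pub enum Int4TaggedSketch {
    Int4 { c: String },
    Int4Eq { c: String, hm: String },
    Int4Ord { c: String, ob: Vec<String> },
}

impl Int4TaggedSketch {
    /// The value a one-field discriminator such as `"x"` would carry.
    pub fn tag(&self) -> &'static str {
        match self {
            Self::Int4 { .. } => "int4",
            Self::Int4Eq { .. } => "int4_eq",
            Self::Int4Ord { .. } => "int4_ord",
        }
    }
}
```

The sniffing function only recovers the term family, while the tagged enum recovers the full domain name, which is the information the section says is lost once the JSON leaves SQL.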

[`ts-rs`]: https://github.com/Aleph-Alpha/ts-rs
[`schemars`]: https://graham.cool/schemars/
78 changes: 78 additions & 0 deletions crates/eql-types/src/lib.rs
@@ -0,0 +1,78 @@
//! # eql-types — canonical EQL payload types
//!
//! One Rust definition per EQL payload shape — the single source of truth
//! for every tool that produces or consumes EQL payloads
//! (`cipherstash-client`, `protect-ffi`, CipherStash Proxy). TypeScript
//! bindings and JSON Schemas are generated from these definitions in
//! stacked changes; the Rust types are the contract.
//!
//! The [`v3`] module holds the `eql_v3` encrypted-domain types: one struct
//! per SQL domain (`eql_v3.int4_eq`, `eql_v3.text_match`, …),
//! *capability-encoded* — index terms are required fields, never `Option`.
//! It mirrors `eql_scalars::CATALOG` 1:1, enforced by
//! `tests/catalog_parity.rs`.
//!
//! Wire rule: **field names ARE wire names** — no `#[serde(rename)]`
//! anywhere. The struct definition reads exactly like the JSON payload.

use serde::{Deserialize, Serialize};

pub mod v3;

/// EQL wire-format version. Hard-coded to `2` for every payload — including
/// the [`v3`] tier, whose generated domain CHECKs assert `VALUE->>'v' = '2'`.
pub const EQL_SCHEMA_VERSION: u16 = 2;

/// The envelope version field (`v`) — always exactly [`EQL_SCHEMA_VERSION`]
/// on the wire.
///
/// Deserialization rejects any other value: the Rust analogue of the domain
/// CHECK's `VALUE->>'v' = '2'`, so a wrong-version payload fails at the type
/// boundary instead of at INSERT. The inner value is private; the only
/// constructible instance is the current version.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash, Serialize)]
pub struct SchemaVersion(u16);

impl SchemaVersion {
    /// The current (only) wire version, `2`.
    pub const CURRENT: Self = Self(EQL_SCHEMA_VERSION);

    /// The wire value.
    pub const fn get(self) -> u16 {
        self.0
    }
}

impl Default for SchemaVersion {
    fn default() -> Self {
        Self::CURRENT
    }
}

impl<'de> Deserialize<'de> for SchemaVersion {
    fn deserialize<D>(deserializer: D) -> Result<Self, D::Error>
    where
        D: serde::Deserializer<'de>,
    {
        let v = u16::deserialize(deserializer)?;
        if v == EQL_SCHEMA_VERSION {
            Ok(Self(v))
        } else {
            Err(serde::de::Error::custom(format!(
                "unsupported EQL schema version {v} (expected {EQL_SCHEMA_VERSION})"
            )))
        }
    }
}

/// Table + column identifier — wire shape `{"t": "...", "c": "..."}`.
///
/// Shared by every payload.
#[derive(Clone, Debug, PartialEq, Eq, Hash, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct Identifier {
    /// Table name.
    pub t: String,
    /// Column name.
    pub c: String,
}
107 changes: 107 additions & 0 deletions crates/eql-types/src/v3/date.rs
@@ -0,0 +1,107 @@
//! The `date` encrypted-domain family — an ordered, non-integer scalar.
//! Same four-domain ordered shape as [`crate::v3::int4`] (ORE compares
//! ciphertext, so dates order like integers); see that module for the
//! capability table.

use crate::v3::terms::{Ciphertext, Hmac256, OreBlockU64_8_256};
use crate::v3::DomainType;
use crate::{Identifier, SchemaVersion};
use serde::{Deserialize, Serialize};

/// `eql_v3.date` — storage only; every operator is blocked.
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct Date {
    /// Envelope version, always `2` (`EQL_SCHEMA_VERSION`); any other
    /// value fails deserialization.
    pub v: SchemaVersion,
Review comment on lines +15 to +17 (Contributor Author): @tobyhede should this be 3 now?!
    /// Table/column identifier. Required by the domain CHECK.
    pub i: Identifier,
    /// mp_base85 source ciphertext. Required by the domain CHECK.
    pub c: Ciphertext,
}

impl DomainType for Date {
    fn sql_domain_static() -> &'static str {
        "eql_v3.date"
    }

    fn sql_domain(&self) -> &'static str {
        Self::sql_domain_static()
    }
}

/// `eql_v3.date_eq` — HMAC equality (`=`, `<>`).
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct DateEq {
    /// Envelope version, always `2` (`EQL_SCHEMA_VERSION`); any other
    /// value fails deserialization.
    pub v: SchemaVersion,
    /// Table/column identifier. Required by the domain CHECK.
    pub i: Identifier,
    /// mp_base85 source ciphertext. Required by the domain CHECK.
    pub c: Ciphertext,
    /// HMAC-SHA-256 equality term.
    pub hm: Hmac256,
}

impl DomainType for DateEq {
    fn sql_domain_static() -> &'static str {
        "eql_v3.date_eq"
    }

    fn sql_domain(&self) -> &'static str {
        Self::sql_domain_static()
    }
}

/// `eql_v3.date_ord_ore` — full comparison, scheme-explicit name.
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct DateOrdOre {
    /// Envelope version, always `2` (`EQL_SCHEMA_VERSION`); any other
    /// value fails deserialization.
    pub v: SchemaVersion,
    /// Table/column identifier. Required by the domain CHECK.
    pub i: Identifier,
    /// mp_base85 source ciphertext. Required by the domain CHECK.
    pub c: Ciphertext,
    /// Block-ORE order term. Serves equality too.
    pub ob: OreBlockU64_8_256,
}

impl DomainType for DateOrdOre {
    fn sql_domain_static() -> &'static str {
        "eql_v3.date_ord_ore"
    }

    fn sql_domain(&self) -> &'static str {
        Self::sql_domain_static()
    }
}

/// `eql_v3.date_ord` — full comparison (`=` `<>` `<` `<=` `>` `>=`).
#[derive(Clone, Debug, PartialEq, Serialize, Deserialize)]
#[serde(deny_unknown_fields)]
pub struct DateOrd {
    /// Envelope version, always `2` (`EQL_SCHEMA_VERSION`); any other
    /// value fails deserialization.
    pub v: SchemaVersion,
    /// Table/column identifier. Required by the domain CHECK.
    pub i: Identifier,
    /// mp_base85 source ciphertext. Required by the domain CHECK.
    pub c: Ciphertext,
    /// Block-ORE order term. Serves equality too.
    pub ob: OreBlockU64_8_256,
}

impl DomainType for DateOrd {
    fn sql_domain_static() -> &'static str {
        "eql_v3.date_ord"
    }

    fn sql_domain(&self) -> &'static str {
        Self::sql_domain_static()
    }
}