feat: eql-types — JSON Schemas (stacked on #268)#269
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
944fabd to
5f30374
Compare
71d6836 to
6fbb949
Compare
5f30374 to
025fb41
Compare
6fbb949 to
d8bb343
Compare
025fb41 to
cfdb58c
Compare
d8bb343 to
303aff7
Compare
cfdb58c to
d2b8ae0
Compare
303aff7 to
1697912
Compare
d2b8ae0 to
404c053
Compare
1697912 to
bc45122
Compare
404c053 to
29418a0
Compare
bc45122 to
8a03896
Compare
29418a0 to
16ce202
Compare
8a03896 to
0680247
Compare
16ce202 to
73acb8e
Compare
0680247 to
f28b83d
Compare
0250c37 to
db988b3
Compare
f28b83d to
d4a3e01
Compare
freshtonic
left a comment
There was a problem hiding this comment.
Approving. The schemars layer mirrors the ts-rs layer from #268 cleanly, coverage is complete (every domain type + term newtype + Identifier derives JsonSchema; DomainType::schema() is implemented across the board), and I verified the generated schemas against the new tests — all four JSON pointers in schemas_are_strict match the actual output.
Standouts worth calling out: the manual schemas tighten exactly where the derive is too loose (SchemaVersion → const: 2; BloomFilter → minimum/maximum i16 bounds rather than the non-validating draft-07 format: "int16"), and the schemas_are_strict test correctly catches the cases the required-keys parity test alone would miss (a struct silently losing deny_unknown_fields, or v swapped for a bare integer). The types:generate temp-dir swap is correctly extended to two output trees, and dropping the explicit trap - EXIT is fine now since both mvs leave an empty $tmp for the EXIT trap to clean.
Minor items posted inline:
- Description drift risk in the two manual
JsonSchemaimpls (lib.rs,terms.rs). - The
$idinjection is untested (tests/export.rs). - A question about the on-wire version value (
lib.rs).
One trivial nit not tied to a line: the PR description says "23 files" but there are 24 schema files under schema/v3/ (one per v3 domain). The asymmetry vs the 30 TS bindings is correct and intentional — schemas inline the terms / Identifier / SchemaVersion as definitions rather than as separate files.
| const_value: Some(serde_json::json!(EQL_SCHEMA_VERSION)), | ||
| metadata: Some(Box::new(schemars::schema::Metadata { | ||
| description: Some( | ||
| "The envelope version field (`v`) — always exactly `2` on the wire.".to_owned(), |
There was a problem hiding this comment.
Question: should the on-wire version be 3, not 2, in the eql_v3 tier? It looks deliberate — the EQL_SCHEMA_VERSION doc comment states it's hard-coded to 2 for every payload including v3, mirroring the generated eql_v3 domain CHECK VALUE->>'v' = '2', so the wire envelope version is intentionally decoupled from the eql_v3 SQL schema namespace (consistent with the project's stance that the schema name is independent of the release version). Assuming that's right, no code change needed — but it's a genuine footgun for schema consumers: someone validating against v3/<domain>.json sees a v3 path and a const: 2 and will reasonably suspect a mistake. The Rust doc comment explains the decoupling, but schema consumers never see it. Suggest baking the why into this published description (e.g. "envelope format version 2; independent of the eql_v3 SQL schema namespace") so the artifact is self-explaining.
There was a problem hiding this comment.
Waiting on @tobyhede to give the go-ahead to move to v3.
Stacks on the TypeScript bindings change. Every v3 domain type, the term newtypes, and Identifier gain a JsonSchema derive; the registry regains its schema fn; tests/export.rs writes one schema per SQL domain to crates/eql-types/schema/v3/ (23 files, checked in) with a canonical $id (https://schemas.cipherstash.com/eql/v3/<domain>.json) injected at write time (schemars 0.8 emits none). Term newtypes appear as named definitions that every domain schema $refs. catalog_parity.rs gains a third gate: each domain's published schema 'required' list must equal envelope + catalog term keys, pinning the artifact schema consumers validate against (the behavioural serde test already pins the wire contract). types:generate / types:check and the CI freshness step now cover schema/ as well as bindings/.
…keys Review finding: the schema parity gate checked only `required`, so a future token file losing #[serde(deny_unknown_fields)] or declaring v as a bare integer would regenerate a permissive schema that types:check commits as the new baseline — with behavioural spot checks existing for only 5 of 23 domains. schemas_are_strict now asserts, for every domain: additionalProperties: false at the root and on the nested Identifier, v $ref'ing the SchemaVersion definition, and that definition pinning const: EQL_SCHEMA_VERSION. Drilled: stripping deny_unknown_fields from DateEq fails the gate with the targeted message.
d4a3e01 to
f49e65c
Compare
…ion drift
- Factor the published schema `$id` into `DomainType::schema_id`
(`{SCHEMA_ID_BASE}{domain}.json`); `tests/export.rs` now injects that
single source of truth, and a new `schema_id_is_canonical` parity test pins
the URL shape with independent literals (wrong host / dropped `v3/` / wrong
domain now fails a test, not just the freshness diff).
- Mark the `BloomFilter` and `SchemaVersion` doc comments as the canonical
source for the descriptions their manual `JsonSchema` impls hand-copy, so
the two can't silently drift (the derive copies doc comments automatically;
these manual impls can't).
Schema artifacts regenerate byte-identically (schema_id matches the old inline
format) — verified via `mise run types:check`.
…ion drift
- Factor the published schema `$id` into `DomainType::schema_id`
(`{SCHEMA_ID_BASE}{domain}.json`); `tests/export.rs` now injects that
single source of truth, and a new `schema_id_is_canonical` parity test pins
the URL shape with independent literals (wrong host / dropped `v3/` / wrong
domain now fails a test, not just the freshness diff).
- Mark the `BloomFilter` and `SchemaVersion` doc comments as the canonical
source for the descriptions their manual `JsonSchema` impls hand-copy, so
the two can't silently drift (the derive copies doc comments automatically;
these manual impls can't).
Schema artifacts regenerate byte-identically (schema_id matches the old inline
format) — verified via `mise run types:check`.
feat: eql-types — JSON Schemas (stacked on #268)
…ion drift
- Factor the published schema `$id` into `DomainType::schema_id`
(`{SCHEMA_ID_BASE}{domain}.json`); `tests/export.rs` now injects that
single source of truth, and a new `schema_id_is_canonical` parity test pins
the URL shape with independent literals (wrong host / dropped `v3/` / wrong
domain now fails a test, not just the freshness diff).
- Mark the `BloomFilter` and `SchemaVersion` doc comments as the canonical
source for the descriptions their manual `JsonSchema` impls hand-copy, so
the two can't silently drift (the derive copies doc comments automatically;
these manual impls can't).
Schema artifacts regenerate byte-identically (schema_id matches the old inline
format) — verified via `mise run types:check`.
Adds the schemars layer to the eql-types crate. Builds on #236 and #268 (both merged); rebased onto
eql_v3.JsonSchemaderive on every v3 domain type, the term newtypes, andIdentifier; the registry regains itsschemafncrates/eql-types/schema/v3/<domain>.json(24 files, named after the SQL domain — the protocol identity), each with a canonical$id(https://schemas.cipherstash.com/eql/v3/<domain>.json) injected bytests/export.rs. Term newtypes appear as named definitions$ref'd by every domain schemacatalog_parity.rs: each published schema'srequiredlist must equal envelope + catalog term keys (plus a strictness gate)types:generate/types:checkand the CI freshness step extended to coverschema/(footgun-safe temp-dir swap, honoringEQL_TYPES_SCHEMA_DIR)Closes #288.