Skip to content

feat(sdk): Zod-first param schemas that validate, type, and document request params from one source#2

Open
mikec-ai wants to merge 16 commits into
blencorp:mainfrom
mikec-ai:feat/zod-params-schemas
Open

feat(sdk): Zod-first param schemas that validate, type, and document request params from one source#2
mikec-ai wants to merge 16 commits into
blencorp:mainfrom
mikec-ai:feat/zod-params-schemas

Conversation

@mikec-ai

Copy link
Copy Markdown

What

Closes #1.

Makes a Zod schema the single source of truth for every object-param endpoint's request contract. Each schema simultaneously validates inputs in the SDK client method (a one-line, path-qualified ValidationError is thrown before the HTTP call), defines the TS types (z.infer, replacing the hand-written interfaces; same exported names, no call-shape changes), and renders the agent-facing params contract now returned by describe_schema and search_api and indexed by BM25. This closes the discoverability gap described in #1 (agents reverse-engineering conditions shapes by trial-and-error) and turns the significant: true silent-wrong-count footgun into an instant, explanatory local error.

Measured on the built artifact (Claude Opus agents, 3 tasks, pre/post, 3 reps each, live APIs): mean execute calls down 56%, failed calls down 100%, discoverability 2.67 to 5.0 out of 5, 18/18 correct vs gold. Consistent with the original injected-docs A/B in #1 (58% / 80% / 2.6 to 5.0).

How

Notes for a reviewer, in reading order:

  • src/sdk/validate.ts: the one seam every client method uses. Formats ZodError into a single line (up to 3 issues) here, because the sandbox RPC boundary (dispatch in src/sdk/runtime.ts) copies only {name, message, status}; an unformatted ZodError.message is a JSON dump.
  • src/sdk/fr-client.ts / ecfr-client.ts: interfaces become Zod schemas. Non-obvious decisions:
    • .strict() everywhere, calibrated against the live APIs. Both FR and eCFR return HTTP 400 on unknown keys (verified live), so strictness converts a guaranteed remote 400 into a local error; it relaxes nothing. The allow-lists were swept against the APIs' full documented param sets, which added honored keys the old interfaces omitted: sections, regulation_id_number, agency_ids, correction, near. Never default (strip) mode: stripping silently drops typo'd keys and runs the query unfiltered, the same silent-wrong-answer class as the footgun.
    • significant / correction: 0 | 1 with a custom error map rejecting booleans (live-verified: 551/6 correct vs 2393/12 silently wrong).
    • type is union([array(enum), enum]) with no .transform(): both forms are accepted by the API as-is, and skipping normalization keeps z.infer honest (input type = output type).
    • All eCFR search sub-endpoints share EcfrSearchParamsSchema: results, counts, and suggestions accept the same param vocabulary (live-verified; a narrower pick rejected honored date / last_modified_* filters).
    • ancestry / full stay open (z.record): genuinely free-form node selectors.
    • facets now validates the bucket enum and encodes it into the path (it was interpolated raw before).
  • src/sdk/renderParams.ts: Zod to readable contract text. Unwraps Optional/Default/Effects while keeping the nearest .describe() (a .describe().optional() description must survive), parenthesizes array-of-union, and renders unknown node types as a loud unsupported<T> token that a coverage test asserts never ships.
  • src/sdk/paramSchemas.ts: the endpoint-id to schema registry; getCorpus() renders each registered schema into the entry's params at load time. No committed generated artifact, no build step, nothing to drift.
  • schema/field-dictionary.json: the ecfr.versions example moves from the pre-flattened 'issue_date[gte]' form to the nested form the typed SDK signature always specified; two missing corrections entries added; quoting-semantics note on eCFR search.
  • Tests (62 total): calibration tests assert every live-verified accepted form parses and the footgun rejects; wire-level tests assert the parse-then-flatten integration produces the exact bracketed query strings; a drift guard asserts every object-param endpoint has a registered schema, renders a non-empty contract, and mechanically parses its own dictionary example (acorn-extracted, so examples can't silently diverge from schemas).

Test plan

  • pnpm typecheck
  • pnpm lint
  • pnpm test
  • pnpm build
  • If HTTP-visible: added a case in test/http-integration.spec.ts asserting the params contract on describe_schema and search_api results survives the streamable-HTTP transport
  • If sandbox-visible: validation errors cross dispatch as structured {name: 'ValidationError', message} (covered in test/sdk.spec.ts; the sandbox surface itself is unchanged)
  • If new SDK method: no new methods; two missing dictionary entries added (ecfr.admin.corrections, ecfr.admin.corrections_for_title)

Risk

  • Back-compat (the real one): call shapes are unchanged, but the stable fr.* / ecfr.* globals get stricter at runtime: significant: true, misspelled or unknown keys, and the pre-flattened versions key form now throw locally instead of reaching the API. In practice those inputs were already broken (silent wrong data or an upstream 400), with one exception: the old documented example for ecfr.versions used the flattened form, which worked by accident. CHANGELOG documents all of it with migration guidance.
  • Over-strict rejection of honored params: the named risk of .strict(). Mitigated by calibrating the allow-lists against the APIs' full documented param sets (not the old TS interfaces) and encoding every verified form in calibration tests. If an honored key was still missed, the failure is loud (a clear local error naming the key), not silent.
  • Security: none; validation is host-side and the sandbox boundary is untouched. facets path interpolation is now enum-validated and encoded (slight hardening).
  • Perf: per-call .parse() is microseconds against a network round-trip it can prevent; corpus rendering happens once at load.

mikec-ai and others added 16 commits June 11, 2026 10:55
Adds the mechanism for surfacing request-param contracts to agents, with no
behavior change yet — the PARAM_SCHEMAS registry is empty until Phase 1, so
getCorpus() renders nothing and all outputs are unchanged.

- renderParams(): Zod -> compact human-readable param contract; propagates
  descriptions through Optional/Default/Effects wrappers (the JSON-Schema
  converter drops these).
- validate(): safeParse + throw a one-line, path-qualified ValidationError,
  formatted before it crosses the sandbox RPC boundary (dispatch copies only
  name/message/status; ZodError.message is a JSON dump).
- PARAM_SCHEMAS registry + CorpusEntry.params; getCorpus() renders registered
  schemas and indexes the text for BM25.
- Surface params additively through describe_schema and search_api outputs.

Tests: renderParams (types/optionality/description-through-wrapper) and
validate (footgun + typo guards).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Convert the DocumentSearch*, Facets, and EcfrSearch param interfaces to Zod
schemas and derive the types via z.infer. The types only widen (type accepts a
scalar or array; cfr.title accepts number|string; new optional fields), so the
public Sdk type stays backward-compatible. No runtime enforcement yet — client
methods are unchanged; validate() is wired in Phase 2.

- Register the schemas in PARAM_SCHEMAS, so describe_schema and search_api now
  surface a rendered param contract (cfr, significant with the boolean-trap note,
  eCFR hierarchy nesting) — closing the discoverability gap the A/B measured.
- Calibrated key sets vs the live API: add 'sections' and 'regulation_id_number'
  (verified honored, absent from the old interface); note docket_id is singular.
- significant typed 0|1 (rejects boolean true, the silent footgun); type is a
  union with no .transform() so z.infer stays == z.input (R3).
- Document the eCFR phrase-quoting semantics on the search endpoint.

Tests: schemas parse their own field-dictionary examples and every
calibration-verified form, and reject the footgun + unknown keys; params
surfacing verified end-to-end through describe_schema and search_api.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
An array whose element type is a union rendered as `A | B[]`, where the [] binds
to the last member. Wrap in parens -> `(A | B)[]`, and dedupe identical union
members. Surfaces on the documents.search `type` field (array-or-scalar enum).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Wire validate() into fr.documents.search, fr.documents.facets, and
ecfr.search.results. Invalid params now throw a one-line, path-qualified
ValidationError at the call site (surfaced cleanly through dispatch) instead of
being flattened into a query string and sent upstream.

Behavior-affecting (see CHANGELOG [Unreleased] > Changed):
- conditions.significant: true now errors ('must be the integer 0 or 1, not a
  boolean ...') — previously sent upstream and silently returned the wrong count.
- Unknown/misspelled keys error locally with the offending key named. Both the FR
  and eCFR APIs already 400 on unknown keys, so .strict() upgrades that remote 400
  into an actionable local message rather than relaxing anything.
- facet is validated against its enum and URL-encoded into the path (closes an
  unencoded path interpolation).

Tests: SDK-level rejection of the footgun, unknown keys, bad facet, and missing
required query, all asserted to throw before any HTTP call.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
fr.publicInspection.search takes a DIFFERENT conditions shape from
documents.search (available_on, special_filing, ...). Convert its interface to a
Zod schema, register it so describe_schema/search_api surface the contract, and
validate before the HTTP call. Calibration confirmed the public-inspection API
also 400s on unknown condition keys, so .strict() applies.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…unts/corrections

Complete Bucket B for the eCFR binding:
- counts_daily/titles/hierarchy and suggestions validate against picks of the
  search schema (kept .strict()).
- versions: nested issue_date schema; rewrite the field-dictionary example from
  the flattened 'issue_date[gte]' form to the nested form it documents (the two
  had drifted apart).
- ancestry/full: validate the open node-selector query as a string|number record
  (preserving undefined-when-omitted so positional-only calls are unchanged).
- admin.corrections / corrections_for_title: add schemas AND the field-dictionary
  entries they were missing (CONTRIBUTING: SDK methods must be paired with
  dictionary entries so search_api/describe_schema surface them).

Tests: nested-vs-flattened versions guard, pick-stays-strict, corrections typo
guard.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Asserts every Bucket B endpoint has a registered schema, every registry id
resolves to a real field-dictionary entry (catches typos/renames), and every
registered endpoint surfaces a non-empty params contract via describe_schema.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Review caught that .strict() rejected conditions.agency_ids, a real
Federal-Register-honored filter (numeric agency id, distinct from the agencies
slug filter) verified against the live API — both documents.search and
publicInspection.search. The pre-branch interface had no runtime validation, so
this previously reached the API and worked; .strict() turned it into a local
throw. A calibration gap, not intended behavior. Add it to both condition
allow-lists (number-or-string, matching cfr.title) with parse tests.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…iew)

- renderParams: unhandled Zod nodes now render unsupported<Type> instead of a
  silent 'unknown', and a new coverage drift-guard test fails if any registered
  endpoint ever emits one.
- renderParams: ZodRecord uses the schema's real key type (falls back to string),
  so a future non-string-keyed record renders correctly; ecfr ancestry/full still
  render Record<string, string | number>.
- significant errorMap message no longer asserts every rejected value is a boolean
  (it fires for any invalid input); reworded with the boolean caveat parenthetical.
- Comment the array-of-union parenthesization heuristic.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
- Tighten two ValidationError assertions that only matched the context prefix
  (present on every error from the method) to match the discriminating
  'Unrecognized key ...' substring, so they can't pass for an unrelated failure.
- Add interception tests for previously wire-untested, newly-touched paths:
  documents.facets bucket path-interpolation, ecfr.versions nested issue_date ->
  issue_date[gte] flattening (the eCFR flatten() nested branch had no wire test),
  and ecfr.full node-selector query serialization.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…nge, and migration

Review noted the CHANGELOG named only 3 of the ~13 methods that now validate and
throw, omitted the ecfr.versions example shape change, and did not flag the
SemVer impact of a behavior change to the stable fr.*/ecfr.* globals. Enumerate
the full method set, document the versions issue_date nested-form requirement,
and add an explicit Migration/SemVer note (leaving the bump decision to
maintainers).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…counts filters)

Implementation review (F1/F2) caught more honored keys the .strict() lists
rejected — same class as the agency_ids regression. Rather than patch one key at
a time, ran a deliberate full sweep of the documented param sets against the live
APIs:

- FR conditions: add 'correction' (0|1, boolean rejected like significant) and
  'near' ({ location, within }) — both honored by /documents and /facets. The
  whole documented /documents conditions list is now covered.
- eCFR search: counts_daily/titles/hierarchy and suggestions accept the SAME full
  param vocabulary as results (date, last_modified_*, hierarchy, agency_slugs,
  pagination — all verified live; unknown -> 400). Drop the narrow picks and share
  EcfrSearchParamsSchema across all four, so they no longer reject honored filters.

Calibration tests extended (correction, near, eCFR date/last_modified accepted;
unknown rejected).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ti-error messages

Review fast-follows (F3/F4/F5 + rec6):
- coverage: mechanically extract each registered endpoint's documented example
  (acorn) and parse it against its schema, so an example can't drift from its
  schema for any endpoint, present or future.
- validate(): surface up to the first 3 issues on one line, so a multi-mistake
  input is fixable in one round-trip.
- search: ranking guards that the now-indexed params text keeps the right
  endpoints in the top results.
- zodToJsonSchema: note the Zod-3 _def coupling.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
describe_schema and search_api results now carry a rendered params field;
per CONTRIBUTING, new HTTP-visible behavior gets a case in the canonical
end-to-end test.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Drop the version-bump question; versioning is a release-time decision.
Also remove em-dashes from the Unreleased entry per style preference.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Parameter contracts are invisible to agents, and significant: true silently returns wrong counts

1 participant