v0.7.0: Protocol schemas, STM seed, compact JSON + version-pinned $schema #40

Merged
titusz merged 29 commits into main from v0.7.0
Jun 1, 2026

Conversation

titusz commented May 30, 2026


Summary

v0.7.0 adds the Protocol schema category (ISCC Discovery Protocol wire records) alongside the existing Metadata, Seed, and Service categories, introduces an STM seed schema for scholarly works, makes compact self-describing JSON the default for seed and protocol records, and version-pins the $schema and @context of every standalone schema. The versioned $schema reference makes compact data recoverable to JSON-LD on demand, conformant with IEP-0002 (which accepts both application/json and application/ld+json).

IsccMeta is unaffected — it still defaults to ld=True and emits full JSON-LD exactly as before. The breaking changes apply only to the standalone Seed/Service/Protocol models.

New schemas

  • IsccNote (iscc_schema.IsccNote) — first member of the new Protocol category; the permanent ISCC Declaration log record for the HUB. Compact JSON by default; $schema is required and part of the signed jcs() bytes.
  • STM (iscc_schema.STM) — Scientific/Technical/Medical seed metadata for DOI-identified scholarly works. resource_type carries 17 DataCite-style work-type tokens mapped to resolvable schema.org/FaBiO class IRIs.

Breaking changes (standalone models only)

  • Seed models (ISBN, ISRC, STM) now default to compact JSON (ld=False) — $schema only, no @context/@type. Pass ld=True to restore JSON-LD.
  • Standalone $schema URLs are now version-pinned (e.g. …/isbn-0.7.0.json). recover_context() resolves both forms, so v0.6.0 records still recover.
  • $schema is now required on ISBN, ISRC, STM, and IsccNote (auto-populated from its const default).

Serialization

  • Added ld parameter to .dict(), .json(), .jcs().
  • Seed → ld=False; Service (TDM, GenAI) → ld=True; Protocol (IsccNote) → ld=False.
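
As a rough illustration of what the ld flag controls, the following stdlib-only sketch drops @context and @type from a plain dict when ld is false. It is not the iscc_schema implementation: the serialize helper, the example.org context URL, the @type value, and the title field are placeholders chosen for this example.

```python
import json

# Keys that only appear in the full JSON-LD form, per the PR description.
JSONLD_KEYS = ("@context", "@type")


def serialize(record: dict, ld: bool) -> str:
    """Serialize a record, keeping the JSON-LD keys only when ld is True."""
    data = record if ld else {k: v for k, v in record.items() if k not in JSONLD_KEYS}
    return json.dumps(data, separators=(",", ":"), ensure_ascii=False)


# Hypothetical record: the context URL, @type, and title are placeholders.
record = {
    "@context": "http://example.org/context/0.7.0.jsonld",
    "@type": "ExampleType",
    "$schema": "http://purl.org/iscc/schema/tdm-0.7.0.json",
    "title": "Example",
}

compact = serialize(record, ld=False)  # $schema only, no @context/@type
full = serialize(record, ld=True)      # @context and @type included
```

In the compact form the versioned $schema is the only self-describing key left, which is why the PR makes it required.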

Verification

  • uv run poe all is idempotent — committed artifacts match YAML sources (clean tree after rebuild).
  • 219 passed, 1 skipped.

Full details in CHANGELOG.md.

titusz added 29 commits May 28, 2026 15:02
Seed metadata examples (ISBN, ISRC) now use only $schema instead of
full JSON-LD (@context + @type + $schema). The JSON-LD context can be
recovered from the $schema reference on demand, which is conformant
with IEP-0002 and reduces verbosity.

Explain that compact JSON with $schema is the recommended format for
Meta-Code generation, conformant with IEP-0002. Links to context
recovery docs for upgrading to JSON-LD on demand.

Add ld kwarg to .dict(), .json(), .jcs() on BaseModel controlling
whether JSON-LD fields (@context, @type) are included. Standalone
models (ISBN, ISRC, TDM, GenAI) default to ld=False for compact
self-describing JSON with $schema only. IsccMeta defaults to ld=True
preserving backward compatibility.

Also require $schema in ISBN and ISRC seed schemas, with code-gen
post-processing to use the const value as Pydantic default.
…erop

Seed metadata is input for Meta-Code generation where compact JSON
suffices. Service metadata is served by registries and discovered
through gateways where full JSON-LD semantics aid interoperability.

Update guide, examples, and agent reference with ld parameter usage,
serialization defaults table (seed=False, service/IsccMeta=True), and
compact vs full JSON-LD examples for each model category.
…rialization

Introduce "Protocol Schemas" as a fourth standalone schema category (alongside
core IsccMeta, Seed, and Service) for ISCC Discovery Protocol records. IsccNote,
the permanent signed ISCC Declaration log record, is the first member.

Protocol schemas differ from service schemas in two ways that matter for a
permanent, signed record:

- Compact JSON by default (_default_ld = False): @context and @type are dropped.
- Version-specific, required $schema (e.g. iscc-note-0.7.0.json): with @context
  gone, $schema is the sole version anchor, and it is part of the JCS bytes the
  signature is computed over, so the schema version is pinned into the signed
  record itself.
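
To see why a $schema inside the signed bytes pins the version, consider this minimal sketch. It approximates JCS (RFC 8785) with sorted-key compact JSON, which only coincides with JCS for simple records with ASCII keys and string values. The record fields and example.org URLs are placeholders, and a SHA-256 digest stands in for the actual signature.

```python
import hashlib
import json


def canonical_bytes(record: dict) -> bytes:
    # Approximation of JCS: sorted keys, no insignificant whitespace, UTF-8.
    # Real JCS also fixes number formatting and key ordering by UTF-16 units.
    text = json.dumps(record, sort_keys=True, separators=(",", ":"), ensure_ascii=False)
    return text.encode("utf-8")


def digest(record: dict) -> str:
    # Stand-in for signing: any change to the canonical bytes changes the digest.
    return hashlib.sha256(canonical_bytes(record)).hexdigest()


# Hypothetical records that differ only in the schema version they declare.
note_a = {"$schema": "http://example.org/schema/iscc-note-0.7.0.json", "nonce": "00ff"}
note_b = {**note_a, "$schema": "http://example.org/schema/iscc-note-0.8.0.json"}

same_payload_different_version = digest(note_a) != digest(note_b)
```

Because the two records canonicalize to different bytes, a signature made over one cannot be replayed for the other.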

Changes:
- New PROTOCOL_SCHEMAS/PROTOCOL_SCHEMATA build groups across all six build tools
- build_code: _patch_versioned_schema pins the model $schema to the versioned URL
- build_json_schema: writes a versioned archive (iscc-note-0.7.0.json) and a
  versioned-$schema const, and marks $schema required
- recovery: strips the -X.Y.Z suffix to resolve versioned $schema URLs to context
- Rename generated module service_iscc_note.py -> protocol_iscc_note.py
- Docs: four-category schema index, Protocol vocabulary + terms-protocol.md,
  versioning.md protocol section, nested zensical nav (Seed/Service/Protocol)
- Rewrite IsccNote tests for compact default + versioned $schema (161 pass)

The HUB will require this compact, version-specific-$schema form for declarations.
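
The suffix-stripping step in recovery can be sketched with a single regular expression. This is an illustration only: VERSION_SUFFIX and unversioned_schema_url are names invented here, and the sketch assumes the unversioned document sits at the same URL minus the -X.Y.Z part.

```python
import re

# Matches a "-X.Y.Z" version that sits directly before the ".json" extension.
VERSION_SUFFIX = re.compile(r"-\d+\.\d+\.\d+(?=\.json$)")


def unversioned_schema_url(url: str) -> str:
    """Map a version-pinned $schema URL to its unversioned form."""
    return VERSION_SUFFIX.sub("", url)


pinned = "http://purl.org/iscc/schema/tdm-0.7.0.json"
latest = unversioned_schema_url(pinned)
```

URLs without a version suffix pass through unchanged, so pinned and unpinned $schema values take the same lookup path.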
…jsonld

Compact records (seed + protocol) drop @context/@type, so the recipe for
reconstructing JSON-LD must live in the schema, not the record. build_seed_schema()
now emits a top-level x-iscc-jsonld extension into every standalone schema JSON
(seed, service, protocol):

- context: the versioned context URL, which equals the @context the generated
  model emits with ld=True
- type: the schema's @type const
- upgrade: a human-readable reconstruction recipe

The $schema property description gains a one-liner pointer to the extension,
appended in the build tool (not the YAML) since x-iscc-jsonld exists only in the
generated JSON Schema artifact. Validators ignore the unknown keyword.
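
A consumer could follow the recipe roughly as sketched below, assuming the schema document has already been fetched and parsed. The upgrade_to_jsonld helper, the example.org URLs, and the @type value are hypothetical; only the x-iscc-jsonld keyword and its context and type members come from the commit message.

```python
def upgrade_to_jsonld(record: dict, schema: dict) -> dict:
    """Return a new JSON-LD record built from a compact record and its schema."""
    ext = schema["x-iscc-jsonld"]
    upgraded = {"@context": ext["context"], "@type": ext["type"]}
    upgraded.update(record)  # the input record itself is left untouched
    return upgraded


# Hypothetical schema document and compact record (placeholder values).
schema = {
    "x-iscc-jsonld": {
        "context": "http://example.org/context/0.7.0.jsonld",
        "type": "ExampleType",
        "upgrade": "Add @context and @type from this extension to the record.",
    }
}
compact = {"$schema": "http://example.org/schema/example-0.7.0.json", "title": "Example"}

full = upgrade_to_jsonld(compact, schema)
```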

Regenerated isbn/isrc/tdm/genai/iscc-note schemas + the iscc-note-0.7.0 archive.
Adds 8 tests (incl. a gate asserting the documented context matches the model's
ld=True output, and an actionable-recipe test). 169 tests pass.

Implements handoff item B.
… archives

The Pydantic models already emit a versioned @context (e.g. .../context/0.7.0.jsonld),
but the published standalone JSON Schemas had no @context default and no versioned
archive copies. build_seed_schema() now treats all standalone categories (seed,
service, protocol) uniformly:

- Versioned @context default: the YAML's unversioned @context const is replaced with a
  versioned default before _patch_context_property, so the JSON Schema matches the
  @context the model emits with ld=True.
- Versioned example @context: JSON-LD examples (TDM, GenAI) get their @context pinned to
  the versioned URL; compact seed/protocol examples carry $schema instead and are left
  untouched.
- Versioned archive for every standalone schema (isbn-0.7.0.json, isrc-0.7.0.json,
  tdm-0.7.0.json, genai-0.7.0.json), byte-identical to the latest file except its $id.
  The versioned_schema flag now controls only the $schema const versioning (protocol),
  decoupled from archive writing.

$schema stays unversioned for seed/service by design (it identifies which schema; the
versioned @context identifies which version). recover_context() already resolved
versioned archive URLs via _VERSION_SUFFIX, verified for seed/service.
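
The archive rule (same document, different $id) can be illustrated with a small sketch. The make_archive and differing_keys helpers and the sample schema are invented for this example, and the unversioned tdm.json $id is an assumption.

```python
import copy


def make_archive(latest: dict, version: str) -> dict:
    """Copy the latest schema and give the copy a version-pinned $id."""
    archive = copy.deepcopy(latest)
    stem = latest["$id"][: -len(".json")]
    archive["$id"] = f"{stem}-{version}.json"
    return archive


def differing_keys(a: dict, b: dict) -> list:
    """List the top-level keys whose values differ between two documents."""
    return sorted(k for k in a.keys() | b.keys() if a.get(k) != b.get(k))


# Hypothetical "latest" schema document with an assumed unversioned $id.
latest = {"$id": "http://purl.org/iscc/schema/tdm.json", "title": "TDM", "type": "object"}
archive = make_archive(latest, "0.7.0")
```

A check like differing_keys(latest, archive) == ["$id"] is one way to express the byte-identity tests mentioned in this commit.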

Docs: build_docs._version_standalone_examples versions @context in standalone schema
examples; versioning.md documents the @context/archive story. The main IsccMeta docs are
intentionally left untouched (its $schema is versioned by design too, so versioning
@context alone there would leave a half-versioned example - out of scope).

Regenerated isbn/isrc/tdm/genai/iscc-note schemas + 4 new seed/service archives. Adds 9
tests (169 -> 178): versioned-default + archive byte-identity across all five standalone
schemas, recovery from versioned seed/service archive URLs, plus per-schema assertions in
test_seed_isbn.py and test_service_tdm.py.

Implements handoff Step 3 / P1-1.
…tegories)

Following the @context/archive work in ee41a60, seed and service schemas (ISBN,
ISRC, TDM, GenAI) now also version-pin their $schema, so all three standalone
categories - and the main schema - uniformly carry a versioned @context AND a
versioned $schema. This removes the inconsistency where a service record showed
a versioned @context next to an unversioned $schema.

The mechanism already existed for protocol schemas. Since every standalone schema
now versions its $schema, the version_schema/versioned_schema flag is dead and is
removed from build_code, build_json_schema, and build_docs - _patch_versioned_schema
and the const/example patching are now unconditional.

Wire form: a seed/service record's $schema is now e.g.
http://purl.org/iscc/schema/tdm-0.7.0.json. The latest schema document keeps its
unversioned $id (served at the "latest" URL), but its $schema const points at the
versioned archive, so records always carry the version. recover_context() resolves
both forms, so existing v0.6.0 records with an unversioned $schema still recover.

Regenerated seed_*/service_* models + all standalone JSON (latest + archives) + docs.
Rewrote the versioning.md "Standalone Schemas" section (previously: $schema
unversioned). Tests updated to expect the versioned $schema; +2 (178 -> 180) for the
latest-file const/$id split on ISBN and TDM.

Pin both @context and $schema to the versioned whole-schema URLs in
iscc.md - in the four examples (minimal/basic/extended/technical) and the
@context/$schema field-table "Default" column - so the page matches the
model output and the published iscc.json (both default to versioned URLs).
iscc.md was the last spot still rendering unversioned base URLs.

- build_docs.py: add VERSIONED_CONTEXT/VERSIONED_SCHEMA consts (whole-schema
  URLs, not per-schema names); refactor a shared _pin_example_urls primitive
  out of _version_standalone_examples; add _version_main_schema called
  per-file in _render_schema_sections (patches example URLs + the
  @context/$schema property defaults).
- test_unified_schema.py: add test_iscc_md_pins_versioned_context_and_schema.

180 -> 181 tests pass.

Add the STM (Scientific/Technical/Medical) seed metadata schema for
DOI-identified scholarly works, for a scholarly publishing pilot and the
bio-codes.io project. Standalone seed schema like ISBN/ISRC: compact
JSON by default with a version-pinned $schema and @context.

The work-type field `resource_type` uses readable DataCite-style tokens
mapped to resolvable schema.org (primary) and FaBiO (gap-fill) class
IRIs, via a new reusable `x-iscc-enum-context` build mechanism. The
NISO JAV `version_type` field carries the orthogonal version-stage axis,
so a Version-of-Record and an Accepted-Manuscript of one work produce
distinct Meta-Codes while a shared work-level `doi` links them.

Required: $schema, doi, resource_type, title, publisher, pubyear.
Optional: version_type, version_doi, container_title, issn, creator.

- New: models/stm.yaml, seed_stm.py, docs/schema/stm.{json,md} plus the
  stm-0.7.0.json archive, tests/test_seed_stm.py (33 tests)
- Register STM across the build pipeline and export it from __init__
- Add x-iscc-enum-context support to build_json_schema.py and
  build_json_ld_context.py (token -> class IRI with @type:@id coercion)
- 181 -> 214 tests pass; poe all clean
…ard (P0-3, P1-2)

P0-3: Promote 5 load-bearing fields to x-iscc-status stable - $schema
(iscc-jsonld.yaml), nonce + signature (iscc-crypto.yaml), generator
(iscc-technical.yaml), and tdm_reservation (tdm.yaml, was unannotated).

P1-2: Add _check_version_not_released() guard to both build_json_schema.py
and build_json_ld_context.py - aborts the build if a git tag for the current
version exists, preventing silent mutation of released archives. Fail-open
when git is unavailable. New tests/test_versioning.py verifies version-source
consistency and archive immutability against tagged releases.

Document the field-stability contract (stable/draft, one-way promotion) and
the bump-first release workflow in docs/versioning.md.
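
A guard of this kind might look like the sketch below. It is not the project's _check_version_not_released: the "v" tag prefix, the git_tags helper, and the injectable list_tags parameter (added so the logic can be exercised without a repository) are assumptions.

```python
import subprocess
from typing import Callable, List, Optional


def git_tags() -> Optional[List[str]]:
    """Return all git tags, or None when git cannot be queried."""
    try:
        result = subprocess.run(
            ["git", "tag", "--list"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None  # git missing or not a repository
    return result.stdout.split()


def check_version_not_released(
    version: str, list_tags: Callable[[], Optional[List[str]]] = git_tags
) -> None:
    """Abort if a tag for this version exists; do nothing if git is unavailable."""
    tags = list_tags()
    if tags is None:
        return  # fail open, as described in the commit message
    if f"v{version}" in tags:  # assumed tag naming scheme: "v0.7.0"
        raise SystemExit(f"version {version} is already tagged; bump the version first")
```

Calling the guard at the start of a build step means a released archive is never regenerated under the same version number.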
version_doi carried no functional load: version differentiation comes
from version_type (NISO JAV) while the shared work-level doi links
versions, so it did neither job. It also had no clean CSL-JSON mapping,
quietly breaking the populatable-from-any-DOI promise.

creator removed for PII reasons.

Edit is YAML-only on stm.yaml; all derived artifacts regenerated via
poe all. creator remains in the shared context via the main IsccMeta
schema; version_doi left no frozen archive (v0.7.0 unreleased).

Avoid naming commercial entities in committed files. The STM seed
example now uses an invented publisher, the reserved 10.5555 DOI test
prefix, and fictional journal/title/ISSN values. All derived artifacts
regenerated via poe all; 215 tests pass.

The page showed a single example carrying the optional timestamp, implying
clients set it - but the ISCC-HUB assigns the authoritative timestamp on
receipt. Split into a minimal submission example (no timestamp) and a complete
example exercising every optional field (timestamp, units, metahash, gateway,
signature.keyid), so the docs show both the typical wire form and the full
record. Use {iscc_id} in the gateway URI template, matching the HUB example.

Drive both examples from the schema YAML so their versioned $schema stays in
sync, render all examples on standalone schema pages via a new docs-only
x-iscc-example-titles extension, and add tests gating example field coverage.

Cover the work landed since the initial 0.7.0 serialization changes:
the new IsccNote protocol and STM seed schemas, uniform $schema/@context
version-pinning across standalone schemas, field-status promotions, and
the release-archive build guard. Lead with breaking changes scoped to the
standalone Seed/Service/Protocol models, noting that IsccMeta is
unaffected for downstream consumers.

The prose docs predated the STM seed schema, the IsccNote protocol
category, and the uniform $schema/@context version-pinning that shipped
during the 0.7.0 cycle. Bring them current:

- add STM (seed) and IsccNote (protocol) to every schema inventory:
  README, guide, examples, and the coding-agent reference
- fix ISBN examples that omitted required fields and raised ValidationError
- version the $schema/@context URLs in example records and output comments
  to match what the models emit (e.g. .../schema/isbn-0.7.0.json)
- correct two behavioral claims: TDM uses extra="allow" (not forbid), and
  recover_context is pure (does not mutate its input)
- resolve a versioning.md contradiction about compact vs JSON-LD defaults
- point contributors at main instead of the retired develop branch
- complete the 'add a standalone schema' recipe (nav + llms PAGES steps)

All fixed code snippets were executed to confirm they run.
…g page

docs/context/index.md was a second, unlinked copy of the /terms/
vocabulary. Term IRIs resolve into the /terms/ namespace, so /context/
only needs to document the machine-readable @context documents it serves.
Generate a slim landing page linking the canonical and version-pinned
context URLs and pointing to the Vocabulary page for definitions, drop the
now-dead _render_context_terms helper, and add the page to the nav.

Hand-written docs (guide, examples, versioning) now reference the release
version via Zensical's macros extension as {{ version }}, supplied by
tools/docs_macros.py from iscc_schema.__version__. Version-pinned example
URLs no longer hardcode 0.7.0 and update automatically on a release bump.

- Add tools/docs_macros.py (define_env) + macros config in zensical.toml
- Replace hardcoded 0.7.0 URLs with {{ version }} in guide/examples/versioning
- Make README version-free (macros do not run on GitHub/PyPI); re-sync index.md
- Resolve the {{ version }} token in tools/gen_llms_full.py for the llms bundle
- Add tests/test_docs_version.py guarding against re-hardcoding the version
- Document the change in the CHANGELOG (Documentation subsection)
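
The two text-level pieces of this change, resolving the {{ version }} token outside the docs build and guarding against re-hardcoded versions, could be sketched as follows. The function names and regular expressions are illustrative and are not taken from tools/gen_llms_full.py or tests/test_docs_version.py.

```python
import re

# The macro token, tolerating optional whitespace inside the braces.
VERSION_TOKEN = re.compile(r"\{\{\s*version\s*\}\}")
# A version-pinned file name such as "isbn-0.7.0.json" written out literally.
HARDCODED_VERSION = re.compile(r"-\d+\.\d+\.\d+\.json")


def resolve_version(text: str, version: str) -> str:
    """Replace every {{ version }} token with the given version string."""
    return VERSION_TOKEN.sub(lambda _match: version, text)


def find_hardcoded_versions(text: str) -> list:
    """Return literal version-pinned file suffixes found in hand-written docs."""
    return HARDCODED_VERSION.findall(text)


source = "See http://purl.org/iscc/schema/isbn-{{ version }}.json for the schema."
rendered = resolve_version(source, "0.7.0")
```

Running find_hardcoded_versions over the unrendered sources returns an empty list as long as authors use the token instead of a literal version.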
ISCC identity IRIs (@context, @type, term IRIs, $schema values) use
http:// and are frozen — the scheme is part of the RDF identifier.
Resolution/hosting uses https. This distinction resolves the recurring
confusion about whether to use http or https in schema URLs.

Add missing x-iscc-* extensions (enum-context, category, example-titles,
jsonld), from_attributes config flag, and standalone-schema recipe gotchas
(hardcoded @type IRI, STANDALONE_META) to the for-coding-agents reference.

Add an `indent` argument and `exclude_unset` passthrough to `.json()` so the
compact serialization path offers the same formatting controls as the JSON-LD
path. End-anchor the IsccNote `gateway` URL pattern so values with embedded
whitespace or trailing characters are rejected from permanent declaration
records, and correct the `@type` description ("service" -> "protocol").
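
The effect of end-anchoring can be shown with a pair of stand-in patterns. The real IsccNote `gateway` pattern is not quoted in this PR, so both patterns below are hypothetical. The sketch relies on the fact that JSON Schema `pattern` matching is not implicitly anchored, which `re.search` mirrors.

```python
import re

# Hypothetical patterns for illustration; not the schema's actual pattern.
UNANCHORED = r"^https?://[^\s]+"
ANCHORED = r"^https?://[^\s]+$"


def matches(pattern: str, value: str) -> bool:
    # JSON Schema `pattern` succeeds if the regex matches anywhere in the value.
    return re.search(pattern, value) is not None


good = "https://hub.example.org/{iscc_id}"
bad = "https://hub.example.org/{iscc_id} trailing"

unanchored_accepts_bad = matches(UNANCHORED, bad)  # prefix matches, rest ignored
anchored_accepts_bad = matches(ANCHORED, bad)      # whole value must conform
```

Without the end anchor the valid prefix is enough for a match, so the trailing text would be stored in the record unchecked.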
Drop `pubkey` from the nested signature's required fields and type it as optional. The ISCC-SIG spec requires only `version` and `proof`; PROOF_ONLY signatures omit the public key, leaving the verifier to obtain it out-of-band. Document the optionality in the field description and cover the mode with a dedicated test.
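
A verifier handling both modes might be structured like this sketch. The Signature dataclass and resolve_pubkey helper are hypothetical, field values are placeholders, and looking the key up by keyid is only one assumed way of obtaining it out-of-band.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Signature:
    version: str                  # required by ISCC-SIG per the commit message
    proof: str                    # required by ISCC-SIG per the commit message
    pubkey: Optional[str] = None  # omitted in PROOF_ONLY signatures
    keyid: Optional[str] = None   # optional key identifier


def resolve_pubkey(sig: Signature, lookup: Callable[[Optional[str]], str]) -> str:
    """Use the embedded public key if present, else fetch it out-of-band."""
    if sig.pubkey is not None:
        return sig.pubkey
    return lookup(sig.keyid)  # assumed out-of-band mechanism, e.g. a key registry


# Placeholder values; real proofs and keys have their own encodings.
proof_only = Signature(version="placeholder-version", proof="placeholder-proof", keyid="key-1")
embedded = Signature(version="placeholder-version", proof="placeholder-proof", pubkey="embedded-key")
```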
Add a Seed/Protocol example to the README and index page using STM and recover_context, illustrating compact JSON by default ($schema only), opt-in JSON-LD via ld=True, and on-demand context recovery.

Replace the 0.7.0 "Unreleased" heading with the release date and regenerate the docs changelog. Cuts the v0.7.0 release.
@titusz titusz merged commit e9e0486 into main Jun 1, 2026
15 checks passed