SP-DIFFER uses multiple oracles to reduce false positives and produce actionable reports.
One oracle is differential: it compares outputs from independent implementations byte-for-byte. Any mismatch is reported with a minimal reproduction case.
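As a rough illustration of a byte-for-byte comparison, the sketch below reports the first differing offset between outputs. The function names and the report string are hypothetical and are not taken from the SP-DIFFER sources; the real runner is a compiled binary and may report mismatches differently.

```python
from typing import Optional


def first_mismatch(a: bytes, b: bytes) -> Optional[int]:
    """Return the offset of the first differing byte, or None if equal."""
    for offset, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return offset
    # One output is a strict prefix of the other: they diverge where the shorter ends.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


def compare_outputs(outputs: dict[str, bytes]) -> list[str]:
    """Compare every implementation's output against the first one listed."""
    names = list(outputs)
    baseline = names[0]
    report = []
    for name in names[1:]:
        offset = first_mismatch(outputs[baseline], outputs[name])
        if offset is not None:
            report.append(f"{baseline} vs {name}: first difference at byte {offset}")
    return report
```

Reporting the first differing offset rather than a plain boolean gives a reviewer a starting point when reducing the case to a minimal reproduction.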
Another oracle validates outputs against BIP 352 rules, such as valid taproot outputs, valid scalars, and valid points on the curve.
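The scalar and point checks can be sketched with plain integer arithmetic over secp256k1. This is a minimal illustration, assuming 32-byte big-endian scalars and 32-byte x-only public keys as used by taproot outputs; the function names are hypothetical, and the repository's actual validator may apply further rules.

```python
# secp256k1 field prime and group order (standard published constants).
P = 2**256 - 2**32 - 977
N = 0xFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFEBAAEDCE6AF48A03BBFD25E8CD0364141


def is_valid_scalar(raw: bytes) -> bool:
    """A scalar is valid when it is 32 bytes and lies in the range 1..N-1."""
    if len(raw) != 32:
        return False
    return 0 < int.from_bytes(raw, "big") < N


def is_valid_xonly_point(raw: bytes) -> bool:
    """An x-only key is valid when some y satisfies y^2 = x^3 + 7 (mod P)."""
    if len(raw) != 32:
        return False
    x = int.from_bytes(raw, "big")
    if x >= P:
        return False
    rhs = (pow(x, 3, P) + 7) % P
    # Euler's criterion: rhs is a nonzero square mod P iff rhs^((P-1)/2) == 1.
    return rhs == 0 or pow(rhs, (P - 1) // 2, P) == 1
```

For example, the x-coordinate of the secp256k1 generator passes `is_valid_xonly_point`, while an all-zero scalar fails `is_valid_scalar`.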
The repo now vendors the upstream BIP 352 reference implementation snapshot and can run it offline against the pinned official vectors. That vendored upstream oracle is the current semantic baseline used by this repository.
On top of that oracle, the repo now defines a normalized semantic comparison contract for sender and receiver expectations. That contract is paired with a semantic adapter request contract so real implementations can be driven against the official v2 corpus without each one re-implementing the repository's binary case format.
The repo also ships parser, validator, and semantic comparison helpers that check case/output structure and normalized semantic results.
- Official vectors for baseline correctness.
- Deterministic edge cases for boundary conditions.
- Fuzzed cases generated from seeds and mutated corpora.
- Regression cases derived from real mismatches.
Seeded workflows record seeds, input cases, and worker versions. Replayable artifacts are written for semantic adapter failures, fuzz failures, and packaged release checks.
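The idea behind recording seeds, input cases, and worker versions can be sketched as follows: if the mutation depends only on the seed and the input case, the stored record is enough to regenerate the failing input. The bit-flip mutator, the JSON field names, and the version string are assumptions made for illustration, not the repository's artifact format.

```python
import json
import random


def mutate(case: bytes, seed: int, flips: int = 4) -> bytes:
    """Flip a few bits chosen by a PRNG that is seeded only from `seed`."""
    rng = random.Random(seed)
    data = bytearray(case)
    for _ in range(flips):
        if not data:
            break
        index = rng.randrange(len(data))
        data[index] ^= 1 << rng.randrange(8)
    return bytes(data)


def replay_record(case: bytes, seed: int, worker_version: str) -> str:
    """Serialize what is needed to regenerate the mutated case later."""
    return json.dumps(
        {"seed": seed, "input_case": case.hex(), "worker_version": worker_version},
        sort_keys=True,
    )


def replay(record: str) -> bytes:
    """Rebuild the mutated case from a stored record."""
    fields = json.loads(record)
    return mutate(bytes.fromhex(fields["input_case"]), fields["seed"])
```

Because `random.Random(seed)` is deterministic for a given integer seed, `replay(replay_record(case, seed, version))` returns the same bytes as the original `mutate(case, seed)` call.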
- `make lint` runs the static discipline lane: strict C++ compile warnings, `rustfmt --check`, `cargo clippy -D warnings`, `gofmt`, `go vet`, public-claim checks, source-comment checks, and workflow hardening checks.
- `make check` runs core I/O, case parser, and header validation smoke tests.
- `make check-rust-clippy` runs the Rust warning lane across the in-tree worker and Rust adapters.
- `make check-go-vet` runs `go vet` against the in-tree Go adapter with readonly module resolution.
- `make check-claims` verifies that public docs and templates avoid unsupported hype and unsupported future-tense release wording.
- `make check-comments` verifies that repo-owned source comments avoid deferred-note markers and hype wording.
- `make check-workflows` verifies that GitHub Actions workflows keep top-level concurrency blocks, least-privilege workflow permissions, and SHA-pinned external actions.
- `make check-abi-symbols` verifies that the compiled worker and semantic-worker shared libraries still export the documented stable ABI entrypoints.
- `make check-clang-tidy` runs the curated `clang-tidy` profile across the compiled C++ translation units when the tool is installed.
- `make check-compile-warnings` rebuilds the compiled surfaces under `-Wall -Wextra -Wpedantic -Werror` in an isolated build directory.
- `make sanitize-smoke SANITIZE_CXX=clang++` runs the C++ core, runner, compare, and semantic smoke surfaces under `asan`/`ubsan` in an isolated build directory.
- `make cli-smoke` exercises the public CLI release-readiness aggregator against a synthetic build tree.
- `make release-report` writes a combined release-readiness summary from the current local evidence.
- `make smoke` runs the compiled runner against the canonical example case.
- `make smoke-rust` runs the compiled runner against the Rust byte-worker library.
- `make semantic-smoke` runs the compiled runner and compare binaries against synthetic semantic worker fixtures for send and receive v2 cases, expectation-approved alternative sender outputs, and the explicit `BOTH_ORACLE_MISMATCH` path.
- `make compare` builds the differential runner.
- `make diff` runs the differential runner against the C++ and Rust byte-worker libraries.
- `make oracle` verifies the vendored upstream reference bundle and runs it against the pinned official BIP352 snapshot.
- `make vectors-v2` verifies the full derived v2 semantic corpus against the vendored oracle.
- `make adapters` verifies the in-tree semantic adapters against the same derived v2 corpus.
- `make adapter-spdk-ffi` verifies the SPDK-backed semantic worker shared library against the same derived v2 corpus.
- `make adapter-silent-payments` runs a second independent Rust implementation against the same derived v2 corpus.
- `make adapter-silent-payments-ffi` exercises that same implementation through the semantic worker ABI.
- `make adapter-bip352` runs a third independent Rust implementation against the same derived v2 corpus.
- `make adapter-bip352-ffi` exercises that same implementation through the semantic worker ABI.
- `make adapter-go-bip352` runs a fourth independent implementation backed by the public Go `go-bip352` module against the same derived v2 corpus.
- `make adapter-go-bip352-ffi` exercises that same Go implementation through the semantic worker ABI.
- `make adapter-bdk-sp` runs a fifth independent implementation surface backed by `bdk-sp` against the same derived v2 corpus.
- `make adapter-bitcoin-core-exp BITCOIN_CORE_ROOT=/path/to/bitcoin` runs the opt-in experimental Bitcoin Core adapter against a local Silent Payments branch checkout. It is intentionally excluded from default `make adapters`, CI, and release-readiness gates because upstream branches are still moving.
- `make regressions-bitcoin-core-exp BITCOIN_CORE_ROOT=/path/to/bitcoin` replays the tracked regression suite through that same experimental adapter. The repeated-key unique-outpoint send case is now part of the normal green regression story, so new experimental adapters inherit that edge case automatically.
- `make fuzz-semantic-bitcoin-core-exp-adapter BITCOIN_CORE_ROOT=/path/to/bitcoin` and `make bench-bitcoin-core-exp BITCOIN_CORE_ROOT=/path/to/bitcoin` provide the same opt-in maintainer hooks for adapter fuzzing and benchmark work.
- `make regressions` replays the tracked semantic regression suite against all current known-good adapters.
- `make bench-reference` and `make bench-adapters` measure harness-level adapter latency and throughput on the same pinned derived v2 corpus, while still failing the run if semantic correctness breaks.
- `make release-evidence` hashes the materialized readiness and benchmark outputs into an explicit release-evidence manifest.
- `make verify-release-evidence` re-checks that manifest against the current files and can also be paired with `git tag -v` during release review.
- `make verify-release-attestation` verifies a downloaded release archive against the GitHub-hosted provenance attestation emitted by `.github/workflows/release.yml`.
- `make maturity-signoff` is the most complete local maturity lane: live readiness, benchmark matrix, refreshed local report, and release-evidence hashing.
- `make fuzz-corpus` verifies the checked-in semantic worker fuzz corpus.
- `make fuzz-minimizer-smoke` exercises the semantic fuzz reducer against synthetic structured and raw failures.
- `make fuzz-introspect` writes a heuristic semantic-path coverage report for the checked-in corpus.
- `make semantic-error-surfaces` validates the reserved semantic-status fixtures through the shared contract/compare path and checks deterministic byte-worker defensive cases for `invalid_input`, `invalid_pubkey`, and `tweak_out_of_range`.
- `make fuzz-semantic-spdk`, `make fuzz-semantic-silent-payments`, `make fuzz-semantic-bip352`, and `make fuzz-semantic-go-bip352` run deterministic semantic-worker fuzzing with replayable and auto-minimized artifacts under `build/`.
- `make fuzz-semantic-adapters` runs the deterministic semantic-adapter matrix and now auto-minimizes structured mismatches into intake-ready regression bundles under `minimized/`.
- `make fuzz-semantic-workers FUZZ_STRUCTURED_ITERATIONS=64 FUZZ_RAW_ITERATIONS=64` runs the longer deterministic local matrix across all semantic workers.
- `.github/workflows/ci.yml` currently runs the regular Ubuntu `Build, Test, and Smoke` lane on pushes and pull requests targeting `main`.
- `.github/workflows/ci.yml` also runs a separate lint-and-warnings lane plus a sanitizer smoke lane so warning regressions and undefined-behavior regressions surface before the longer test matrix finishes.
- `.github/workflows/ci.yml` also runs a dedicated static-analysis lane with `clang-tidy` on the compiled C++ surfaces.
- `.github/workflows/ci.yml` also runs a macOS build-and-smoke lane so platform drift is exercised on the same operating-system family used by the release workflow.
- `.github/workflows/nightly-fuzz.yml` runs the longer scheduled semantic-worker and semantic-adapter fuzz jobs, uploads minimized replay bundles as tarred artifacts, and publishes the semantic fuzz introspection report for corpus blind-spot review.
- `.github/workflows/maturity.yml` runs scheduled live release verification, benchmark collection, release-evidence generation, and artifact upload.
- `.github/workflows/release.yml` now attests each packaged release tarball with GitHub artifact attestations in addition to the existing signed checksum flow.
- `sp-differ status --profile release --require-green` now provides a single readiness check over the current oracle, adapter, regression, and fuzz evidence, and it also incorporates `build/bip352_external_probe.json` automatically when present so stale or failed integrated external-version evidence is reflected in the status report.
- `make verify-release-live` is the stricter networked sign-off path: it runs the release-profile verification suite and, when external-probe candidate metadata is present, refreshes the live upstream probe before writing `build/sp_differ_release_readiness_live.json`. Without that metadata it still writes the live readiness report and notes that upstream freshness was not evaluated.
- `make vectors` runs the upstream oracle, validates the full derived v2 semantic corpus, checks the generated v1-compatible derived subset, and runs that subset through both byte-worker libraries.
Benchmark summaries are intentionally separate from release-readiness verdicts. They are useful for lab comparison and regression detection, but they should only be compared when the corpus selection, timeout, and iteration signature match exactly. Release evidence manifests are intentionally separate from both of those reports: they record exactly which files backed a candidate release so later reviewers can hash and verify the same material.
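One way to enforce the comparability rule is to refuse any comparison whose signatures differ. The sketch below is illustrative only: the `BenchSignature` fields mirror the three properties named above, but the type, field names, and units are hypothetical and do not describe the repository's benchmark summary format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BenchSignature:
    """The settings that must match before two benchmark runs are compared."""
    corpus_selection: str
    timeout_ms: int
    iterations: int


def latency_delta(sig_a: BenchSignature, mean_a: float,
                  sig_b: BenchSignature, mean_b: float) -> float:
    """Return mean_b - mean_a, refusing to compare mismatched runs."""
    if sig_a != sig_b:
        raise ValueError("benchmark signatures differ; results are not comparable")
    return mean_b - mean_a
```

Raising instead of returning a number keeps a mismatched comparison from silently appearing in a regression report.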
The authoritative upstream BIP352 send-and-receive vectors are vendored under tests/vectors/bip352/official/ with a pinned manifest. The matching upstream reference bundle is vendored alongside them and checked by SHA256 before oracle execution. A derived sender-side subset that fits the current SP-DIFFER v1 case format lives under tests/vectors/bip352/derived/v1/. The full official send/receive surface encoded as SP-DIFFER v2 cases lives under tests/vectors/bip352/derived/v2/, and ../spec/SEMANTIC_ADAPTER.md defines the stable request/response bridge used to drive real implementations against that corpus.
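A SHA256 pin check of this kind can be sketched as below. The manifest is modeled as a plain mapping from relative path to hex digest, which is an assumption for illustration; the layout of the repository's pinned manifest is not described here, and `verify_pins` is a hypothetical name.

```python
import hashlib
from pathlib import Path


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large bundles do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_pins(root: Path, pins: dict[str, str]) -> list[str]:
    """Return the relative paths that are missing or whose digest differs from the pin."""
    failures = []
    for relative, expected in pins.items():
        target = root / relative
        if not target.is_file() or sha256_file(target) != expected.lower():
            failures.append(relative)
    return failures
```

An empty list from `verify_pins` means every pinned file is present and unmodified, so a caller can gate oracle execution on that result.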
The vendored current semantic corpus is executed through both the semantic adapter layer and the compiled semantic worker ABI. The compiled runner and compare binaries preserve the original v1 byte-worker ABI and also dispatch v2 cases through the semantic bridge and semantic worker ABI. The compare path is expectation-aware: official sender cases with multiple accepted output sets and count-only receive cases do not produce false VALID_MISMATCH reports, while shared semantic failures surface as BOTH_ORACLE_MISMATCH. The repo also tracks promoted semantic regressions under tests/regressions/semantic/; that suite now supports exact request-backed reproducers, adapter-scoped retained cases, explicit observed_actual entries for known upstream divergences that should flip red once the affected adapter changes behavior, and general oracle-expected edge cases that coexist with adapter-scoped retained divergences. The repo also ships a deterministic semantic-worker fuzz corpus, replayable local fuzz runners for compiled workers and command adapters, automatic reducers that shrink structured failures before promotion, a heuristic semantic-path introspection report for corpus blind spots, a separate tracked semantic error-surface suite for reserved statuses that do not belong in the valid corpus, CI workflows that preserve replay bundles as tarred artifacts, a longer deterministic local fuzz matrix across the SPDK, silent-payments, bip352, and go-bip352 worker surfaces plus the current command-adapter set, an opt-in experimental Bitcoin Core command adapter backed by a local helper build, and the local sp-differ readiness/reporting CLI. The canonical release gate still excludes the experimental Bitcoin Core evidence and includes the bdk-sp semantic adapter plus its regression replay surface. See ./CASE_FORMAT_V2.md, ./SEMANTIC_WORKER_INTERFACE.md, ../spec/SEMANTIC_ADAPTER.md, and ../spec/SEMANTIC_CONTRACT.md for the current state.
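The expectation-aware idea can be sketched as follows: each worker's result is judged against the expectation rather than against the other worker, so two different but accepted output sets do not count as a mismatch. Only the `VALID_MISMATCH` and `BOTH_ORACLE_MISMATCH` names come from the text above. The `Expectation` type, the `MATCH` label, and the rule that `VALID_MISMATCH` means exactly one side deviates are assumptions made for this example; ../spec/SEMANTIC_CONTRACT.md is the authoritative definition.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Expectation:
    # Each entry is one complete set of outputs that counts as correct.
    accepted_output_sets: tuple[frozenset[str], ...] = ()
    # Used for count-only cases where the exact outputs are not pinned.
    expected_count: Optional[int] = None


def satisfies(outputs: list[str], expectation: Expectation) -> bool:
    """Check one worker's outputs against the expectation."""
    if expectation.accepted_output_sets:
        # Note: converting to a set ignores ordering and duplicate outputs.
        return frozenset(outputs) in expectation.accepted_output_sets
    if expectation.expected_count is not None:
        return len(outputs) == expectation.expected_count
    return False


def classify(left: list[str], right: list[str], expectation: Expectation) -> str:
    """Classify a pair of worker results relative to the expectation."""
    left_ok = satisfies(left, expectation)
    right_ok = satisfies(right, expectation)
    if left_ok and right_ok:
        return "MATCH"  # hypothetical label for the passing case
    if not left_ok and not right_ok:
        return "BOTH_ORACLE_MISMATCH"
    return "VALID_MISMATCH"
```

Under these assumptions, two workers returning `["aa"]` and `["bb"]` are both accepted when the expectation lists `{"aa"}` and `{"bb"}` as alternative output sets.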
A mismatch is a failure unless explicitly classified as a serialization or normalization difference that does not change semantics. The classification is recorded in the report. Semantic adapter failures can emit replayable per-case artifacts under build/, and semantic adapter and semantic worker fuzz failures now emit reduced replay bundles under minimized/. CI packages those outputs as tar archives for upload, and structured bundles can be promoted into the tracked regression suite with scripts/intake_semantic_regressions.py.