docs: cross-link multi-family admissibility benchmark in README #138
Code Review
This pull request updates the README.md file to include a new section for the 'Multi-Family Operational Admissibility Benchmark,' outlining its validation criteria and providing a link to the methodology documentation. Feedback suggests refining the validation description to align with the existing 'whether...' or 'how...' phrasing used in other sections and adding a link to the corresponding JSON artifact for consistency.
- **Validates:** Deterministic multi-family operational admissibility benchmark with manifest-driven fixture selection, exact scoring, reproducible JSON artifacts, and progression-regression checks.
- **Method:** [`docs/benchmarks/multi_family_admissibility_benchmark.md`](docs/benchmarks/multi_family_admissibility_benchmark.md).
To maintain consistency with the other benchmark entries in the 'Benchmark family' section (e.g., Paper Replay and Agent Trace Replay), please include a link to the generated JSON artifact. Additionally, consider rephrasing the validation description to follow the established 'whether...' or 'how...' pattern used in adjacent sections to improve readability and alignment with the existing documentation style.
Current:
- **Validates:** Deterministic multi-family operational admissibility benchmark with manifest-driven fixture selection, exact scoring, reproducible JSON artifacts, and progression-regression checks.
- **Method:** [`docs/benchmarks/multi_family_admissibility_benchmark.md`](docs/benchmarks/multi_family_admissibility_benchmark.md).
Suggested:
- **Validates:** whether multi-family operational state remains admissible across manifest-driven fixtures using exact scoring, reproducible JSON artifacts, and progression-regression checks.
- **Artifact:** [`artifacts/multi_family_admissibility_results.json`](artifacts/multi_family_admissibility_results.json).
- **Method:** [`docs/benchmarks/multi_family_admissibility_benchmark.md`](docs/benchmarks/multi_family_admissibility_benchmark.md).
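As a rough illustration of the four properties the **Validates** line names (manifest-driven fixture selection, exact scoring, reproducible JSON artifacts, and progression-regression checks), a minimal Python sketch might look like the following. Everything in it is hypothetical: the manifest layout, the fixture names, the `expected`/`observed` fields, and the reading of "progression-regression" as a per-family pass-count comparison against a baseline are assumptions made for illustration, not the repository's actual implementation.

```python
import json

# Hypothetical manifest: it alone decides which fixtures run, per family.
MANIFEST = {
    "families": {
        "family_a": ["fixture_1", "fixture_2"],
        "family_b": ["fixture_3"],
    }
}

# Hypothetical fixtures: the expected admissibility verdict and the verdict
# produced by the system under test.
FIXTURES = {
    "fixture_1": {"expected": "admissible", "observed": "admissible"},
    "fixture_2": {"expected": "inadmissible", "observed": "inadmissible"},
    "fixture_3": {"expected": "admissible", "observed": "inadmissible"},
}


def run_benchmark(manifest, fixtures):
    """Score each family by exact match, iterating in sorted order."""
    results = {}
    for family in sorted(manifest["families"]):
        names = sorted(manifest["families"][family])
        passed = sum(
            1 for name in names
            if fixtures[name]["observed"] == fixtures[name]["expected"]
        )
        results[family] = {"passed": passed, "total": len(names)}
    return results


def to_artifact(results):
    """Serialize with sorted keys so identical results give identical bytes."""
    return json.dumps(results, sort_keys=True, indent=2) + "\n"


def regressions(baseline, current):
    """Return families whose pass count dropped relative to the baseline."""
    return sorted(
        family for family, score in baseline.items()
        if current.get(family, {"passed": 0})["passed"] < score["passed"]
    )
```

Sorting families and fixtures before scoring, and serializing with `sort_keys=True`, is one simple way to keep the artifact byte-for-byte stable across runs, which is what makes a committed JSON file such as the one linked in the suggested **Artifact** line diffable in review.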
Motivation
Description
`README.md` that links to `docs/benchmarks/multi_family_admissibility_benchmark.md` and includes the requested one-line description about manifest-driven fixture selection, exact scoring, reproducible JSON artifacts, and progression-regression checks.
Testing
`npm run check` (which performs layout, typecheck, validate, build, and `pytest`) and all checks passed; Changed files: `README.md`; Risks: Low (docs-only change); Next: optionally add the same cross-link to additional docs index pages in a follow-up docs-only PR.
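Because the change only adds cross-links, one cheap safeguard is a check that every relative Markdown link in the README resolves to a file in the repository. The sketch below is an assumption-laden illustration rather than part of this PR or of `npm run check`: the helper name `broken_relative_links` is hypothetical, and the regular expression handles only simple inline links of the form `[label](target)`.

```python
import re
from pathlib import Path

# Matches simple inline Markdown links: [label](target)
LINK_RE = re.compile(r"\[[^\]]*\]\(([^)\s]+)\)")
# Matches targets that start with a URL scheme such as https: or mailto:
SCHEME_RE = re.compile(r"^[a-zA-Z][a-zA-Z0-9+.-]*:")


def broken_relative_links(markdown_text, repo_root):
    """Return relative link targets that do not exist under repo_root."""
    root = Path(repo_root)
    broken = []
    for target in LINK_RE.findall(markdown_text):
        # Skip external URLs and pure in-page anchors.
        if SCHEME_RE.match(target) or target.startswith("#"):
            continue
        # Drop any trailing #fragment before checking the filesystem.
        path = target.split("#", 1)[0]
        if not (root / path).exists():
            broken.append(target)
    return broken
```

Called as `broken_relative_links(Path("README.md").read_text(), ".")` from the repository root, it would flag, for example, an **Artifact** link whose JSON file has not been generated or committed yet.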