docs: add deterministic multi-family admissibility benchmark documentation by ProfRandom92 · Pull Request #137 · ProfRandom92/Comptextv7

ProfRandom92 · 2026-05-19T16:39:24Z

Motivation

Provide a focused, contributor-facing guide that explains the deterministic multi-family admissibility benchmark's purpose, pipeline, invariants, regeneration, validation, and regression protections.
Summary: Document deterministic multi-family admissibility benchmark behavior, pipeline, determinism guarantees, regeneration commands, validation commands, and regression protections.

Description

Added a new benchmark doc at docs/benchmarks/multi_family_admissibility_benchmark.md. It includes a Mermaid pipeline diagram, the current fixture families, the four standard degradation levels, determinism guarantees, regeneration commands, validation commands, regression protections, and non-goals. No implementation, fixture, artifact, CI, or package changes were made.
Changed files: docs/benchmarks/multi_family_admissibility_benchmark.md.

Testing

Ran npm run check, which executes layout, typecheck, validation, build, and the full Python test suite; the command completed successfully.
The run included pytest, where all 213 tests passed, and the repository checks are green.
Risks: Low; this is a docs-only change with no code or artifact modifications.
Next: optionally cross-link this benchmark doc from the broader benchmark index docs in a follow-up docs-only PR.

Codex Task

gemini-code-assist

Code Review

This pull request introduces documentation for the Deterministic Multi-Family Admissibility Benchmark, outlining its purpose, pipeline, and regression protections. The review identified an inconsistency in the regression protection logic where the stated requirement for distinct behavior between 'mild' and 'moderate' levels conflicted with the use of a non-strict inequality operator in the documentation. A correction was suggested to enforce strict inequality across all degradation levels.

gemini-code-assist · 2026-05-19T16:42:02Z

+- baseline and severe behavior is explicitly checked
+- mild and moderate behavior must be distinct
+- degradation must be progressive:
+  - `baseline > mild >= moderate > severe`


The progression formula should use a strict inequality (>) between mild and moderate to be consistent with the requirement stated on line 86 that these behaviors must be distinct. Using >= allows for identical scores, which contradicts the stated goal of ensuring distinct behavior across these levels.

Suggested change

- `baseline > mild >= moderate > severe`

- `baseline > mild > moderate > severe`
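The difference between the two formulas can be illustrated with a minimal Python sketch. The function names, the `LEVELS` tuple, and the numeric scores are hypothetical and are not taken from the repository; the sketch only assumes that each degradation level yields one comparable score.

```python
from typing import Mapping

# Hypothetical level names, in the order used by the progression formula.
LEVELS = ("baseline", "mild", "moderate", "severe")


def is_progressive_original(scores: Mapping[str, float]) -> bool:
    """Documented formula: baseline > mild >= moderate > severe."""
    baseline, mild, moderate, severe = (scores[level] for level in LEVELS)
    return baseline > mild >= moderate > severe


def is_progressive_strict(scores: Mapping[str, float]) -> bool:
    """Suggested formula: baseline > mild > moderate > severe."""
    ordered = [scores[level] for level in LEVELS]
    # Every level must score strictly higher than the next, more degraded one.
    return all(a > b for a, b in zip(ordered, ordered[1:]))


# Illustrative scores only: mild and moderate are tied.
tied = {"baseline": 1.0, "mild": 0.6, "moderate": 0.6, "severe": 0.1}
distinct = {"baseline": 1.0, "mild": 0.7, "moderate": 0.4, "severe": 0.1}
```

With the tied scores, the original formula accepts the result while the strict one rejects it, which is the gap the review comment points at. Both formulas accept the distinct scores.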

docs: add multi-family admissibility benchmark guide

7d42119

ProfRandom92 added the codex label May 19, 2026 — with ChatGPT Codex Connector

ProfRandom92 merged commit 9ea36c3 into main May 19, 2026
4 checks passed

gemini-code-assist Bot reviewed May 19, 2026

ProfRandom92 merged 1 commit into main from codex/add-documentation-for-multi-family-benchmark
