Add deterministic reproducibility test for layered admissibility artifact#132
Conversation
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
There was a problem hiding this comment.
Code Review
This pull request introduces a new test suite to verify the reproducibility of committed artifacts, specifically for layered admissibility results. The feedback focuses on improving the robustness and maintainability of the tests by suggesting the use of a repository root constant for path resolution and refactoring the test logic to utilize pytest's parametrization for better isolation and reporting.
| from collections.abc import Callable | ||
| from dataclasses import dataclass | ||
| from pathlib import Path | ||
|
|
| ARTIFACT_SPECS: tuple[DeterministicArtifactSpec, ...] = ( | ||
| DeterministicArtifactSpec( | ||
| name="layered_admissibility_results", | ||
| committed_path=Path("artifacts/layered_admissibility_results.json"), |
There was a problem hiding this comment.
Use REPO_ROOT to anchor the artifact path. This prevents the test from failing if executed from a directory other than the repository root.
| committed_path=Path("artifacts/layered_admissibility_results.json"), | |
| committed_path=REPO_ROOT / "artifacts/layered_admissibility_results.json", |
| def test_committed_deterministic_artifacts_are_reproducible(tmp_path: Path) -> None: | ||
| for spec in ARTIFACT_SPECS: | ||
| regenerated_path = tmp_path / spec.committed_path.name | ||
| spec.regenerate(regenerated_path) | ||
|
|
||
| regenerated_payload = _load_json(regenerated_path) | ||
| committed_payload = _load_json(spec.committed_path) | ||
|
|
||
| assert regenerated_payload == committed_payload, ( | ||
| f"Deterministic artifact drift detected for '{spec.name}'. " | ||
| f"Regenerate with: python scripts/generate_layered_admissibility_artifact.py" | ||
| ) |
There was a problem hiding this comment.
Refactor the test to use pytest.mark.parametrize. This improves test isolation and provides clearer reporting by treating each artifact as a separate test case, which aligns with the extensible design mentioned in the PR description.
| def test_committed_deterministic_artifacts_are_reproducible(tmp_path: Path) -> None: | |
| for spec in ARTIFACT_SPECS: | |
| regenerated_path = tmp_path / spec.committed_path.name | |
| spec.regenerate(regenerated_path) | |
| regenerated_payload = _load_json(regenerated_path) | |
| committed_payload = _load_json(spec.committed_path) | |
| assert regenerated_payload == committed_payload, ( | |
| f"Deterministic artifact drift detected for '{spec.name}'. " | |
| f"Regenerate with: python scripts/generate_layered_admissibility_artifact.py" | |
| ) | |
| @pytest.mark.parametrize("spec", ARTIFACT_SPECS, ids=lambda s: s.name) | |
| def test_committed_deterministic_artifacts_are_reproducible(spec: DeterministicArtifactSpec, tmp_path: Path) -> None: | |
| regenerated_path = tmp_path / spec.committed_path.name | |
| spec.regenerate(regenerated_path) | |
| regenerated_payload = _load_json(regenerated_path) | |
| committed_payload = _load_json(spec.committed_path) | |
| assert regenerated_payload == committed_payload, ( | |
| f"Deterministic artifact drift detected for '{spec.name}'. " | |
| f"Regenerate with: python scripts/generate_layered_admissibility_artifact.py" | |
| ) |
Summary:
Add a focused deterministic reproducibility test that regenerates committed benchmark artifacts via the existing layered admissibility generator and fails on any JSON drift; the check is implemented as an explicit, extensible artifact spec list so future artifacts can be added deliberately.
Changed files:
Testing:
python scripts/generate_layered_admissibility_artifact.py— producedartifacts/layered_admissibility_results.jsonsuccessfully.pytest tests/test_artifact_reproducibility.py -q— 1 passed.pytest tests/test_degradation_curve_generator.py -q— 13 passed.pytest tests/test_fixture_manifest.py -q— 8 passed.npm run check— completed; full Python test suite ran and reported197 passed.Risks:
Next:
ARTIFACT_SPECSintests/test_artifact_reproducibility.pyto include them in the reproducibility gate.Codex Task