feat(upstream): M1 upstream loader — read-only loader, eligibility gate, and revision pin #20
Merged
…chy, UpstreamPin, helpers

- Replace the FORBIDDEN_VERIFICATION_STATUSES blocklist with an ALLOWED_VERIFICATION_STATUSES allowlist: new upstream statuses now fail closed instead of silently passing the gate.
- Add an UpstreamError base class and derive all three error classes from it, so callers can catch any module error with a single type. Raise UpstreamError (not ValueError/RuntimeError) from _normalize_remote_url and _run_git.
- Add the UpstreamPin frozen dataclass; upstream_pin_from_checkout now returns UpstreamPin(repo, revision) instead of a plain tuple[str, str].
- Split _run_git into _run_git_raw (always returns CompletedProcess) and _run_git (raises on non-zero exit). upstream_pin_from_checkout now uses _run_git_raw for the symbolic-ref detached-HEAD check, so all subprocess calls go through the same internal layer.
- Implement is_eligible independently with short-circuit boolean evaluation; it no longer delegates to explain_ineligible, which eliminates the list allocation on every filter-loop call.
- Add docstrings to all dataclasses documenting which fields are required vs. nullable and what values are expected.
- load_entries now raises UpstreamError immediately when the path does not exist, rather than letting FileNotFoundError propagate unwrapped.
- Tests: add a parametrized test_normalize_remote_url that exercises the regex directly (no subprocess overhead); add test_load_entries_raises_on_missing_file, test_load_error_is_upstream_error, test_checkout_errors_are_upstream_errors, and test_is_eligible_and_explain_ineligible_agree; fix test_pin_normalizes_remote_url to use tmp_path / 'upstream' (each parametrize case gets its own tmp_path, so the hash() dir names were unnecessary and non-deterministic); rename test_pin_returns_repo_and_revision to test_pin_returns_upstream_pin and assert against UpstreamPin.
- Update docs/upstream_integration.md: UpstreamPin return type, allowlist language for verification_status.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
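For readers skimming the commit message, here is a minimal sketch of the fail-closed allowlist gate and the shared error base class it describes. It is not the code from `upstream.py`: the `Entry` dataclass, its field names, the subclass name `UpstreamLoadError`, and the status values in the allowlist are all hypothetical placeholders chosen for illustration.

```python
from __future__ import annotations

from dataclasses import dataclass

# Assumption: placeholder status values; the real allowlist lives in upstream.py.
ALLOWED_VERIFICATION_STATUSES = frozenset({"verified", "reviewed"})


class UpstreamError(Exception):
    """Base class so callers can catch any module error with one type."""


class UpstreamLoadError(UpstreamError):  # hypothetical subclass name
    """Raised when entries cannot be loaded."""


@dataclass(frozen=True)
class Entry:  # hypothetical stand-in for the real entry dataclass
    entry_id: str
    verification_status: str
    public_domain: bool


def is_eligible(entry: Entry) -> bool:
    # Short-circuit boolean evaluation: no list of reasons is allocated.
    # A status missing from the allowlist fails closed.
    return (
        entry.public_domain
        and entry.verification_status in ALLOWED_VERIFICATION_STATUSES
    )


def explain_ineligible(entry: Entry) -> list[str]:
    # Collects every failing condition; an empty list means "eligible".
    reasons: list[str] = []
    if not entry.public_domain:
        reasons.append("not public domain")
    if entry.verification_status not in ALLOWED_VERIFICATION_STATUSES:
        reasons.append(
            f"verification_status {entry.verification_status!r} is not allowlisted"
        )
    return reasons
```

Because the two functions are implemented separately in this sketch, a test along the lines of test_is_eligible_and_explain_ineligible_agree would be what keeps them from drifting apart.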
63f010f to 522882d
Introduces `hletterscriptgen.upstream`, the M1 milestone for reading upstream public-domain Hebrew scan entries.

What's in this PR

- `src/hletterscriptgen/upstream.py`: typed loader for `entries.jsonl` from the upstream `public-domain-hand-written-hebrew-scans` dataset; includes the rights/quality eligibility gate declared in `LICENSE-POLICY.md`
- `upstream_pin_from_checkout()`: helper that populates `letter_set.v1.upstream` from a clean local checkout; refuses detached HEAD and dirty trees so the pinned revision fully describes the bytes read
- `UpstreamFile.role`: completes the dataclass at definition time
- `tests/test_upstream.py` (192 lines) + `tests/fixtures/upstream/entries.jsonl`: full test coverage
- `docs/upstream_integration.md`: updated integration notes

Commits

- b3dd39e feat(upstream): add read-only loader, eligibility gate, and revision pin
- c60d535 fix(upstream): refuse detached HEAD in upstream_pin_from_checkout
- 97c1d45 feat(upstream): add UpstreamFile.role to complete the dataclass at definition time

🤖 Generated with Claude Code
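To make the pinning behaviour described in this PR concrete, the following is a rough sketch of how a helper like `upstream_pin_from_checkout()` could refuse detached HEAD and dirty trees. Only the function names, `UpstreamPin(repo, revision)`, and the use of `symbolic-ref` for the detached-HEAD check come from the PR text. The initial `rev-parse --is-inside-work-tree` check, the `git status --porcelain` dirty check, the remote name `origin`, and the exact error messages are assumptions, and remote URL normalization is omitted.

```python
from __future__ import annotations

import subprocess
from dataclasses import dataclass
from pathlib import Path


class UpstreamError(Exception):
    """Base class for errors raised by this sketch."""


@dataclass(frozen=True)
class UpstreamPin:
    repo: str
    revision: str


def _run_git_raw(checkout: Path, *args: str) -> subprocess.CompletedProcess[str]:
    """Run git in `checkout` and always return the CompletedProcess."""
    try:
        return subprocess.run(
            ["git", "-C", str(checkout), *args],
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError as exc:  # e.g. the git executable is not installed
        raise UpstreamError(f"could not run git: {exc}") from exc


def _run_git(checkout: Path, *args: str) -> str:
    """Run git and raise UpstreamError on a non-zero exit status."""
    proc = _run_git_raw(checkout, *args)
    if proc.returncode != 0:
        raise UpstreamError(f"git {' '.join(args)} failed: {proc.stderr.strip()}")
    return proc.stdout.strip()


def upstream_pin_from_checkout(checkout: Path) -> UpstreamPin:
    # Assumption: fail early if `checkout` is missing or not a git work tree.
    _run_git(checkout, "rev-parse", "--is-inside-work-tree")
    # `git symbolic-ref -q HEAD` exits non-zero when HEAD is detached.
    if _run_git_raw(checkout, "symbolic-ref", "-q", "HEAD").returncode != 0:
        raise UpstreamError("refusing to pin: HEAD is detached")
    # Assumption: `git status --porcelain` prints nothing for a clean tree.
    if _run_git(checkout, "status", "--porcelain"):
        raise UpstreamError("refusing to pin: working tree is dirty")
    revision = _run_git(checkout, "rev-parse", "HEAD")
    # Assumption: the remote is named "origin"; URL normalization is omitted.
    repo = _run_git(checkout, "remote", "get-url", "origin")
    return UpstreamPin(repo=repo, revision=revision)
```

Routing the detached-HEAD probe through `_run_git_raw` lets the caller inspect the exit status without treating it as an error, while every other call goes through `_run_git` and surfaces failures as `UpstreamError`.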