feat(data): add Bialik Safed 1927 letter crops (C02)#11

Merged
shaypal5 merged 1 commit into main from data/bialik-safed-1927
May 13, 2026
@shaypal5

Summary

  • Expands chaim_nachman_bialik writer set with 18 new entries cropped from the 1927 Safed letter (commons__bialik_letter_safed_1927, page p0003, 403×453 px)
  • 14 new letter forms added to corpus: alef, ayin, he, het, mem_final, nun, nun_final, pe, pe_final, qof, samekh, shin, tsadi, tsadi, vav, zayin
  • 3 new variants of existing letters: bet v0002, lamed v0002, tav v0002
  • Crops sourced from two registers of Bialik's hand on p0003: poem title "לבנות שפיה" + author attribution "ח.נ. ביאליק" (block_ashkenazi, medium legibility); poem body cursive rows (cursive_ashkenazi, low legibility — very small glyphs at native scan resolution)

Closes #5

Validation

  • python3 scripts/validate_indexes.py --upstream-path ../public-domain-hand-written-hebrew-scans → ok: 1 writers, 25 entries, 25 files verified, 25 upstream-cross-checked
  • python3 -m pytest → 62 passed, 1 skipped
  • git diff --check → clean

Test plan

  • CI runs validate_indexes with upstream cross-check (all 25 entries, bbox within 403×453 for p0003 crops)
  • pytest passes (62 tests)
  • LFS objects pushed successfully (18 PNG crops)
  • Release artefacts (NOTICE.md, CITATION.cff, datapackage.json) committed and regenerate cleanly
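
The bbox condition in the test plan (each p0003 crop must lie within the 403×453 px page) can be illustrated with a minimal sketch. This is not the code in scripts/validate_indexes.py: the `BBox` type, its field names, and the top-left pixel-coordinate convention are assumptions made for illustration only.

```python
from dataclasses import dataclass

# Page dimensions for p0003 as stated in this PR.
PAGE_WIDTH, PAGE_HEIGHT = 403, 453


@dataclass(frozen=True)
class BBox:
    """Hypothetical crop box in pixels; (x, y) is the top-left corner."""
    x: int
    y: int
    width: int
    height: int


def bbox_within_page(
    bbox: BBox,
    page_width: int = PAGE_WIDTH,
    page_height: int = PAGE_HEIGHT,
) -> bool:
    """Return True if the box has positive area and lies fully inside the page."""
    if bbox.width <= 0 or bbox.height <= 0:
        return False
    if bbox.x < 0 or bbox.y < 0:
        return False
    # The right and bottom edges may touch the page border but not cross it.
    return (
        bbox.x + bbox.width <= page_width
        and bbox.y + bbox.height <= page_height
    )
```

A box that overruns the right edge, for example `BBox(400, 10, 10, 10)`, fails the check, while a box covering the whole page passes.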

🤖 Generated with Claude Code

Expands the chaim_nachman_bialik writer set by cropping 18 Hebrew
letter images from the 3-page letter Bialik sent from Safed in 1927
(commons__bialik_letter_safed_1927, pages p0001–p0003).

All 18 crops are from page p0003 (403×453 px, LicenseRef-Public-Domain-Israel):
- 14 new letter forms: alef, ayin, he, het, mem_final, nun, nun_final,
  pe, pe_final, samekh, shin, tsadi, vav, zayin
- 3 new variants: bet v0002, lamed v0002, tav v0002
- Sources: poem title "לבנות שפיה" and author attribution "ח.נ. ביאליק"
  (block_ashkenazi); poem body cursive rows (cursive_ashkenazi)

All entries pass validate_indexes.py --upstream-path with the upstream
cross-check, and all 62 pytest tests pass.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@shaypal5 merged commit e6ca166 into main on May 13, 2026
3 checks passed
@shaypal5 deleted the data/bialik-safed-1927 branch on May 13, 2026 at 20:12
Development

Successfully merging this pull request may close these issues.

C02: Expand Bialik corpus — Safed 1927 letter (3 pages)
