Merged
2 changes: 1 addition & 1 deletion CITATION.cff
@@ -3,7 +3,7 @@ cff-version: 1.2.0
message: Please cite this dataset using the metadata below.
type: dataset
title: Hebrew Handwritten Per-Letter Image Dataset
-abstract: Per-letter image crops of handwritten Hebrew letters, grouped into sets by writer. Each crop is a derivative of a permissively-licensed upstream scan in HeOCR/public-domain-hand-written-hebrew-scans, with per-image rights inherited and attribution recorded. The index is line-oriented JSON (JSONL). Release 0.0.0-rc contains 7 per-letter image entries drawn from 1 verified writers (7 PDM-1.0).
+abstract: Per-letter image crops of handwritten Hebrew letters, grouped into sets by writer. Each crop is a derivative of a permissively-licensed upstream scan in HeOCR/public-domain-hand-written-hebrew-scans, with per-image rights inherited and attribution recorded. The index is line-oriented JSON (JSONL). Release 0.0.0-rc contains 25 per-letter image entries drawn from 1 verified writers (18 LicenseRef-Public-Domain-Israel, 7 PDM-1.0).
authors:
- name: Shay Palachy-Affek
version: 0.0.0-rc
2 changes: 1 addition & 1 deletion NOTICE.md
@@ -7,7 +7,7 @@ Repository-authored metadata is dedicated to the public domain under CC0 1.0 Uni
Per-letter image crops are derivatives of upstream scans in [HeOCR/public-domain-hand-written-hebrew-scans](https://github.com/HeOCR/public-domain-hand-written-hebrew-scans) and carry per-entry rights inherited from the source page. The entries listed below carry a license that requires attribution (currently CC-BY-4.0, CC-BY-SA-4.0). Anyone redistributing or reusing these crops must keep the listed credit and link to the source page on which the rights claim was verified.

 - Corpus release: `0.0.0-rc`
-- Released at (corpus state): `2026-05-12T22:30:00Z`
+- Released at (corpus state): `2026-05-13T10:00:00Z`

## Attribution-required entries

18 changes: 18 additions & 0 deletions data/index/entries.jsonl

Large diffs are not rendered by default.

41 changes: 31 additions & 10 deletions datapackage.json
@@ -24,6 +24,11 @@
"scope": "metadata",
"title": "Creative Commons Zero v1.0 Universal"
},
+{
+"name": "LicenseRef-Public-Domain-Israel",
+"scope": "images",
+"title": "Public Domain (Israel; life + 70)"
+},
{
"name": "PDM-1.0",
"path": "https://creativecommons.org/publicdomain/mark/1.0/",
@@ -33,18 +38,18 @@
],
"name": "hletterscript",
"profile": "data-package",
-"released_at": "2026-05-12T22:30:00Z",
+"released_at": "2026-05-13T10:00:00Z",
"resources": [
{
-"bytes": 14820,
+"bytes": 48555,
"description": "Per-letter image index. One JSON object per cropped letter image, with upstream provenance, extraction provenance, file checksums, and inherited rights.",
"encoding": "utf-8",
"format": "jsonl",
"mediatype": "application/x-ndjson",
"name": "entries",
"path": "data/index/entries.jsonl",
"profile": "data-resource",
-"record_count": 7
+"record_count": 25
},
{
"bytes": 1506,
@@ -65,22 +70,38 @@
"stats": {
"attribution_required_count": 0,
"entry_writer_count": 1,
-"image_byte_count": 22227,
+"image_byte_count": 25613,
"letter_breakdown": {
-"bet": 1,
+"alef": 1,
+"ayin": 1,
+"bet": 2,
+"he": 1,
+"het": 1,
"kaf": 1,
-"lamed": 1,
+"lamed": 2,
"mem": 1,
+"mem_final": 1,
+"nun": 1,
+"nun_final": 1,
+"pe": 1,
+"pe_final": 1,
+"qof": 1,
"resh": 1,
-"tav": 1,
-"yod": 1
+"samekh": 1,
+"shin": 1,
+"tav": 2,
+"tsadi": 1,
+"vav": 1,
+"yod": 1,
+"zayin": 1
},
"license_breakdown": {
+"LicenseRef-Public-Domain-Israel": 18,
"PDM-1.0": 7
},
-"record_count": 7,
+"record_count": 25,
"writer_breakdown": {
-"chaim_nachman_bialik": 7
+"chaim_nachman_bialik": 25
},
"writer_record_count": 1,
"writer_status_breakdown": {
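Review note: the `stats` values in `datapackage.json` can be checked against each other without the unrendered `entries.jsonl` diff. The following is a minimal sketch, assuming each breakdown map is meant to sum to `record_count`. The function name `check_stats` and the `new_stats` literal are hypothetical; only the keys and numbers are copied from the post-change side of this diff.

```python
from typing import Any


def check_stats(stats: dict[str, Any]) -> list[str]:
    """Return a list of inconsistencies; an empty list means the totals agree.

    Assumption: every breakdown map should sum to stats["record_count"].
    """
    problems: list[str] = []
    total = stats["record_count"]
    for key in ("letter_breakdown", "license_breakdown", "writer_breakdown"):
        subtotal = sum(stats[key].values())
        if subtotal != total:
            problems.append(f"{key} sums to {subtotal}, expected {total}")
    return problems


# Values copied from the "+" side of the datapackage.json hunk above.
new_stats = {
    "record_count": 25,
    "letter_breakdown": {
        "alef": 1, "ayin": 1, "bet": 2, "he": 1, "het": 1, "kaf": 1,
        "lamed": 2, "mem": 1, "mem_final": 1, "nun": 1, "nun_final": 1,
        "pe": 1, "pe_final": 1, "qof": 1, "resh": 1, "samekh": 1,
        "shin": 1, "tav": 2, "tsadi": 1, "vav": 1, "yod": 1, "zayin": 1,
    },
    "license_breakdown": {
        "LicenseRef-Public-Domain-Israel": 18,
        "PDM-1.0": 7,
    },
    "writer_breakdown": {"chaim_nachman_bialik": 25},
}

print(check_stats(new_stats))  # prints []
```

With these values the three breakdowns each total 25, which matches the new `record_count` and the count quoted in the updated `CITATION.cff` abstract.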