Build Worldreader-style EPUBs from PDFs, matching the Magic Garage: Meet Pax standard.
- Python 3.8+
- A PDF source file and a cover image (PNG or JPG)

Clone the repository:

    git clone https://github.com/your-org/WorldreaderEpubPrep.git
    cd WorldreaderEpubPrep

Or download and extract the ZIP, then open a terminal in the project folder.

Create a virtual environment:

    python -m venv venv

macOS / Linux:

    source venv/bin/activate

Windows (Command Prompt):

    venv\Scripts\activate.bat

Windows (PowerShell):

    venv\Scripts\Activate.ps1

Then install the dependencies:

    pip install -r requirements.txt

This installs PyMuPDF (`fitz`), Pillow, and lxml.

    python build_worldreader_epub.py --help

You should see the usage and available options.
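
If `--help` fails with an import error, a quick check like the one below can show which package is missing. This is a minimal sketch and not part of the project: the helper name `missing_dependencies` is hypothetical, and it relies only on the fact that PyMuPDF is imported as `fitz` and Pillow as `PIL`.

```python
import importlib.util

# pip package name -> module name used in "import ..." statements.
# PyMuPDF is imported as "fitz" and Pillow as "PIL".
REQUIRED_MODULES = {"PyMuPDF": "fitz", "Pillow": "PIL", "lxml": "lxml"}


def missing_dependencies(modules=REQUIRED_MODULES):
    """Return the pip package names whose module cannot be found (hypothetical helper)."""
    return [
        package
        for package, module in modules.items()
        if importlib.util.find_spec(module) is None
    ]


if __name__ == "__main__":
    missing = missing_dependencies()
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All dependencies are importable.")
```

Run it inside the activated virtual environment so that it inspects the same interpreter the build script will use.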

Basic usage:

    python build_worldreader_epub.py --pdf "input.pdf" --cover "cover.png" --title "Book Title" --creator "Worldreader" --out "output.epub"

| Argument | Required | Description |
|---|---|---|
| `--pdf` | Yes | Path to the input PDF file |
| `--cover` | Yes | Path to the cover image (PNG or JPG) |
| `--title` | Yes | Book title (used in metadata and TOC) |
| `--creator` | No | Author/creator (default: Worldreader) |
| `--out` | Yes | Path for the output EPUB file |
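
For readers who want to wrap or extend the command line, the argument table could be expressed with `argparse` roughly as follows. This is an illustrative sketch only: the function name `build_parser` is an assumption, and the real `build_worldreader_epub.py` may declare its options differently.

```python
import argparse


def build_parser():
    """Declare the five options from the argument table (illustrative, not the script's code)."""
    parser = argparse.ArgumentParser(
        description="Build a Worldreader-style EPUB from a PDF (sketch)."
    )
    parser.add_argument("--pdf", required=True, help="Path to the input PDF file")
    parser.add_argument("--cover", required=True, help="Path to the cover image (PNG or JPG)")
    parser.add_argument("--title", required=True, help="Book title (used in metadata and TOC)")
    # --creator is the only optional argument; it falls back to "Worldreader".
    parser.add_argument("--creator", default="Worldreader", help="Author/creator")
    parser.add_argument("--out", required=True, help="Path for the output EPUB file")
    return parser


# Parse an explicit argument list instead of sys.argv so the example runs anywhere.
args = build_parser().parse_args(
    ["--pdf", "input.pdf", "--cover", "cover.png", "--title", "Book Title", "--out", "output.epub"]
)
print(args.creator)  # Worldreader
```

Omitting any of the required options makes `argparse` print a usage message and exit, which matches the "Required" column above.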

Example with all arguments:

    python build_worldreader_epub.py \
      --pdf "books/My Story.pdf" \
      --cover "covers/My Story.png" \
      --title "My Story" \
      --creator "Jane Author" \
      --out "output/My Story.epub"

Batch conversion on macOS / Linux (Bash):

    mkdir -p epub-output
    for f in my_books/*.pdf; do
      base=$(basename "$f" .pdf)
      python build_worldreader_epub.py \
        --pdf "$f" \
        --cover "covers/${base}.png" \
        --title "$base" \
        --creator "Worldreader" \
        --out "epub-output/${base}.epub"
    done

Batch conversion on Windows (PowerShell):

    New-Item -ItemType Directory -Force epub-output
    Get-ChildItem my_books\*.pdf | ForEach-Object {
      $base = [System.IO.Path]::GetFileNameWithoutExtension($_.Name)
      python build_worldreader_epub.py `
        --pdf $_.FullName `
        --cover "covers\$base.png" `
        --title $base `
        --creator "Worldreader" `
        --out "epub-output\$base.epub"
    }

- The script writes the EPUB to the path you specify with `--out`.
- A temporary `*_build` folder is created during processing and removed after packaging (unless an error occurs).
- The output EPUB is a valid EPUB 3 package with OEBPS layout, images, and NCX fallback.

The generated EPUB follows these conventions:

- OEBPS layout: `content.opf`, `cover.xhtml`, `chap01.xhtml`, `copy.xhtml`, `nav.xhtml`, `toc.ncx`, `stylesheet.css`, `images/`
- Images: Cover image + sequential page images extracted from the PDF
- ALT text: Bracketed items like `[11. Alt Text: ...]` in the PDF are stripped from visible text and applied to the next image's `alt` attribute
- EPUB 3 with NCX fallback: `spine toc="ncx"` for broad reader compatibility

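
The ALT text convention can be illustrated with a small sketch. Everything here is an assumption made for illustration: the regular expression, the `apply_alt_text` helper, the `("text", ...)` / `("image", ...)` block format, and the sample sentence and file name are hypothetical, and the actual script may parse the PDF differently.

```python
import re
from html import escape

# Matches markers such as "[11. Alt Text: A robot waving]" (assumed pattern).
ALT_MARKER = re.compile(r"\[\s*\d+\.\s*Alt Text:\s*(.*?)\s*\]", re.IGNORECASE | re.DOTALL)


def apply_alt_text(blocks):
    """Turn ("text", str) / ("image", filename) pairs, in reading order, into XHTML fragments.

    ALT markers are removed from the visible text and attached to the next image.
    """
    pending_alt = ""
    fragments = []
    for kind, value in blocks:
        if kind == "text":
            found = ALT_MARKER.findall(value)
            if found:
                pending_alt = found[-1]  # the latest marker wins
            visible = ALT_MARKER.sub("", value).strip()
            if visible:
                fragments.append(f"<p>{escape(visible)}</p>")
        else:
            fragments.append(
                f'<img src="images/{escape(value)}" alt="{escape(pending_alt)}"/>'
            )
            pending_alt = ""  # each marker is used for one image only
    return fragments


# Made-up sample content, for demonstration only.
sample = [
    ("text", "Pax waves hello. [11. Alt Text: A robot waving at a child]"),
    ("image", "page_011.png"),
    ("text", "The end."),
]
for fragment in apply_alt_text(sample):
    print(fragment)
```

In this sketch an image with no preceding marker gets an empty `alt` attribute rather than inheriting text from an earlier image.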

`Worldreader_EPUB_Template/Magic Garage_ Meet Pax.epub` is the reference EPUB. The Python script outputs the same structure (OEBPS, NCX, images, etc.). See `Worldreader_EPUB_Template/README.md` for ALT text conventions and other details.
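
One way to compare a generated file with the reference is to list the entries inside the EPUB, which is a ZIP archive. The sketch below is hypothetical and not part of the project: it assumes the listed files sit directly under an `OEBPS/` folder, and it also looks for `mimetype` and `META-INF/container.xml`, which the EPUB container format requires.

```python
import zipfile

# Files named in this README, assumed to live under "OEBPS/".
OEBPS_FILES = (
    "content.opf",
    "cover.xhtml",
    "chap01.xhtml",
    "copy.xhtml",
    "nav.xhtml",
    "toc.ncx",
    "stylesheet.css",
)
EXPECTED_ENTRIES = ["mimetype", "META-INF/container.xml"] + [
    f"OEBPS/{name}" for name in OEBPS_FILES
]


def missing_entries(epub_path, expected=EXPECTED_ENTRIES):
    """Return the expected archive entries that are absent from the EPUB (hypothetical helper)."""
    with zipfile.ZipFile(epub_path) as archive:
        names = set(archive.namelist())
    return [name for name in expected if name not in names]
```

Calling `missing_entries("output/My Story.epub")` on a build that follows the layout above should return an empty list. This only checks that the files exist; a full validator such as EPUBCheck is still needed to confirm that the package is valid.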

A browser-based PWA lives in `web-app/`. It uses the File System Access API (Chrome/Edge) to pick a folder and write EPUBs. It produces a simpler EPUB than the Python Magic Garage standard and may be extended later.

    cd web-app
    npm install
    npm run dev

License: MIT