slacey/WorldreaderEpubPrep

Worldreader EPUB Prep

Build Worldreader-style EPUBs from PDFs, matching the structure of the reference EPUB, Magic Garage: Meet Pax.

Prerequisites

  • Python 3.8+
  • A PDF source file and a cover image (PNG or JPG)

Setup (local machine)

1. Clone or download the repo

git clone https://github.com/your-org/WorldreaderEpubPrep.git
cd WorldreaderEpubPrep

Or download and extract the ZIP, then open a terminal in the project folder.

2. Create a virtual environment (recommended)

python -m venv venv

macOS / Linux:

source venv/bin/activate

Windows (Command Prompt):

venv\Scripts\activate.bat

Windows (PowerShell):

venv\Scripts\Activate.ps1

3. Install dependencies

pip install -r requirements.txt

This installs PyMuPDF (imported in Python as fitz), Pillow, and lxml.

4. Verify the setup

python build_worldreader_epub.py --help

You should see the usage and available options.
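
If --help fails with an import error, it can help to check which dependency is missing. The following is a minimal sketch, not part of the repository: it assumes the usual import names for the three packages (fitz for PyMuPDF, PIL for Pillow, lxml for lxml), and the helper name missing_modules is made up for illustration.

```python
import importlib.util

# Import names assumed for the packages listed in requirements.txt:
# PyMuPDF is imported as "fitz", Pillow as "PIL", lxml as "lxml".
REQUIRED_MODULES = ("fitz", "PIL", "lxml")


def missing_modules(names=REQUIRED_MODULES):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_modules()
if missing:
    print("Missing modules:", ", ".join(missing))
else:
    print("All dependencies are importable.")
```

Run it with the virtual environment activated; any name it reports should be fixable by re-running pip install -r requirements.txt.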

Usage

Basic command

python build_worldreader_epub.py --pdf "input.pdf" --cover "cover.png" --title "Book Title" --creator "Worldreader" --out "output.epub"

Arguments

Argument    Required   Description
--pdf       Yes        Path to the input PDF file
--cover     Yes        Path to the cover image (PNG or JPG)
--title     Yes        Book title (used in metadata and TOC)
--creator   No         Author/creator (default: Worldreader)
--out       Yes        Path for the output EPUB file
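
For readers who want to see how the table maps onto code, here is one way these options could be declared with argparse. This is an illustrative sketch only: the function name build_parser and the description string are assumptions, and the actual build_worldreader_epub.py may define its options differently.

```python
import argparse


def build_parser():
    """Declare the five options from the table above (hypothetical reconstruction)."""
    parser = argparse.ArgumentParser(
        description="Build a Worldreader-style EPUB from a PDF."
    )
    parser.add_argument("--pdf", required=True, help="Path to the input PDF file")
    parser.add_argument("--cover", required=True, help="Path to the cover image (PNG or JPG)")
    parser.add_argument("--title", required=True, help="Book title (used in metadata and TOC)")
    # The only optional argument; the default matches the table.
    parser.add_argument("--creator", default="Worldreader", help="Author/creator")
    parser.add_argument("--out", required=True, help="Path for the output EPUB file")
    return parser
```

With a parser like this, omitting --creator yields "Worldreader", while omitting any of the other four options makes argparse exit with a usage error.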

Example: single book

python build_worldreader_epub.py \
  --pdf "books/My Story.pdf" \
  --cover "covers/My Story.png" \
  --title "My Story" \
  --creator "Jane Author" \
  --out "output/My Story.epub"

Example: batch from a folder (macOS / Linux)

mkdir -p epub-output
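for f in my_books/*.pdf; do
  # Skip the unexpanded pattern when the folder contains no PDFs.
  [ -e "$f" ] || continue
  base=$(basename "$f" .pdf)
  # Expects a PNG cover named after the PDF, e.g. covers/My Story.png.
  python build_worldreader_epub.py \
    --pdf "$f" \
    --cover "covers/${base}.png" \
    --title "$base" \
    --creator "Worldreader" \
    --out "epub-output/${base}.epub"
done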

Example: batch (Windows PowerShell)

New-Item -ItemType Directory -Force epub-output
Get-ChildItem my_books\*.pdf | ForEach-Object {
  $base = [System.IO.Path]::GetFileNameWithoutExtension($_.Name)
  python build_worldreader_epub.py `
    --pdf $_.FullName `
    --cover "covers\$base.png" `
    --title $base `
    --creator "Worldreader" `
    --out "epub-output\$base.epub"
}
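
The two shell loops above can also be written once in Python so the same batch runs on any platform. This is a sketch under the same assumptions as the shell examples (PDFs in my_books, PNG covers named after each PDF in covers, output in epub-output); the function names build_commands and run_batch are made up for illustration, and only the command-line options documented above are used.

```python
import subprocess
import sys
from pathlib import Path


def build_commands(pdf_dir, cover_dir, out_dir, creator="Worldreader"):
    """Return one argument list per PDF that has a matching PNG cover."""
    commands = []
    for pdf in sorted(Path(pdf_dir).glob("*.pdf")):
        cover = Path(cover_dir) / f"{pdf.stem}.png"
        if not cover.is_file():
            # --cover is required, so PDFs without a cover are skipped.
            continue
        commands.append([
            sys.executable, "build_worldreader_epub.py",
            "--pdf", str(pdf),
            "--cover", str(cover),
            "--title", pdf.stem,
            "--creator", creator,
            "--out", str(Path(out_dir) / f"{pdf.stem}.epub"),
        ])
    return commands


def run_batch(pdf_dir="my_books", cover_dir="covers", out_dir="epub-output"):
    """Create the output folder and build each book, stopping at the first failure."""
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(pdf_dir, cover_dir, out_dir):
        subprocess.run(cmd, check=True)
```

Calling run_batch() from the project folder mirrors the shell loops. Passing the arguments as a list avoids quoting problems with file names that contain spaces.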

Output

  • The script writes the EPUB to the path you specify with --out.
  • A temporary *_build folder is created during processing and removed after packaging. If an error occurs, the folder is left in place.
  • The output EPUB is a valid EPUB 3 package with OEBPS layout, images, and NCX fallback.
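
Because an EPUB is a ZIP archive, the output can be spot-checked with Python's zipfile module. The sketch below is not part of the repository. The mimetype and META-INF/container.xml entries are required in any EPUB container; the OEBPS/ paths are assumptions based on the layout this README describes, and the helper name missing_entries is made up for illustration.

```python
import zipfile

# "mimetype" and "META-INF/container.xml" are required by the EPUB container
# format. The OEBPS/ paths are assumed from this README and may differ in
# a real output file.
EXPECTED_ENTRIES = (
    "mimetype",
    "META-INF/container.xml",
    "OEBPS/content.opf",
    "OEBPS/nav.xhtml",
    "OEBPS/toc.ncx",
)


def missing_entries(epub_path, expected=EXPECTED_ENTRIES):
    """Return the expected archive entries that are absent from the EPUB."""
    with zipfile.ZipFile(epub_path) as archive:
        names = set(archive.namelist())
    return [name for name in expected if name not in names]
```

An empty list from missing_entries("output/My Story.epub") means every listed entry is present. It is only a quick sanity check and does not replace a full validator.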

What the script produces

  • OEBPS layout: content.opf, cover.xhtml, chap01.xhtml, copy.xhtml, nav.xhtml, toc.ncx, stylesheet.css, images/
  • Images: Cover image + sequential page images extracted from the PDF
  • ALT text: Bracketed items like [11. Alt Text: ...] in the PDF are stripped from visible text and applied to the next image’s alt attribute
  • EPUB 3 with NCX fallback: spine toc="ncx" for broad reader compatibility
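
To make the ALT text behavior concrete, here is a minimal sketch of how bracketed markers could be separated from the visible text. It is an illustration and not the script's actual code: the marker pattern is assumed from the single example [11. Alt Text: ...] above, and the name split_alt_text is hypothetical. Descriptions that themselves contain a closing bracket are not handled.

```python
import re

# Assumed marker shape: "[", an optional number and dot, "Alt Text:",
# the description, then "]".
ALT_MARKER = re.compile(
    r"\[\s*(?:\d+\.\s*)?Alt Text:\s*(.*?)\s*\]",
    re.IGNORECASE | re.DOTALL,
)


def split_alt_text(page_text):
    """Return (visible_text, alt_texts) for one page of extracted text."""
    alt_texts = [match.group(1) for match in ALT_MARKER.finditer(page_text)]
    visible = ALT_MARKER.sub("", page_text)
    # Collapse the double spaces left where a marker was removed.
    visible = re.sub(r"[ \t]{2,}", " ", visible).strip()
    return visible, alt_texts
```

Each returned description would then be assigned, in order, to the alt attribute of the next image on the page.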

Reference standard

Worldreader_EPUB_Template/Magic Garage_ Meet Pax.epub is the reference EPUB. The Python script outputs the same structure (OEBPS, NCX, images, etc.). See Worldreader_EPUB_Template/README.md for ALT text conventions and other details.

Web app (secondary / future)

A browser-based PWA lives in web-app/. It uses the File System Access API (Chrome/Edge) to pick a folder and write EPUBs. Its output is simpler than the Magic Garage-standard EPUB that the Python script produces, and it may be extended later.

cd web-app
npm install
npm run dev

License

MIT

About

EPUB preparation and QA for Worldreader
