Releases · AryanBV/pdf-edit-engine

Bugfix release — fixes three classes of CIDFont/Identity-H text-replacement failures discovered on real-world Chrome and Word PDFs. Fully backwards-compatible: public API unchanged.

[0.1.1] — 2026-04-15

Fixed

ARY-276: Identity-H CIDFont replacement on large-font titles with per-glyph Tm+Tj emission (Word and Chrome generators) no longer garbles spacing. The operator merge logic now has an all-narrow anchor fallback that collapses chains of narrow Tm+Tj operators into a single anchor, so replacement text flows past the original operator boundaries as the PDF spec allows (surgeon.py F0 fallback, commit f2b4aad).
ARY-278: Narrow Identity-H subsets (e.g., Chrome's 179-glyph ArialMT) now extend via in-place glyph injection. Missing glyphs are appended to the embedded font at fresh GIDs, preserving every pre-existing CID→GID mapping. The previous Tier 2 subset-and-replace approach, which renumbered CIDs and corrupted unrelated content-stream text (the 1ova ,ndustries Mode 2 symptom), has been replaced entirely (fonts.py _extend_tier2, commits 4c262d4..77d3912).
Cross-font resolver pollution in replace_all: _apply_single_replacement now always re-fetches the resolver from match.characters[0].font_name, discarding any stale resolver passed in by the caller. Previously, replace_all's per-page loop reused one pre-fetched resolver across every match on the page. When matches used different fonts, the stale resolver validated can_encode against the wrong font, extension was skipped, and content-stream operators were encoded with the wrong font's CIDs. The symptom on real Chrome PDFs with multiple Identity-H fonts per page was text extracting as "ova ndustries", because the emitted CIDs mapped to N/I only in the other font's ToUnicode CMap. This was a pre-existing bug, surfaced during 0.1.1 real-PDF validation.
FontResolverCache: now evicts by font-dict object generation number, so pages that share a font via indirect reference are invalidated together after font mutation (encoding.py, commit 8acbd49).
/W and /ToUnicode: entries are now deduplicated on repeated extend_subset calls to prevent bloat (fonts.py, commit 60a1697).
mypy strict: resolved 15 pre-existing strict-mode errors in structural.py and reflow.py. The CI mypy step is now blocking (previously had || true).
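
The cross-font resolver fix above can be illustrated with a minimal sketch. This is not the engine's code: Resolver, Match, Char, and the resolvers dict are hypothetical stand-ins, and only the idea of keying the lookup on match.characters[0].font_name comes from the notes above.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class Char:
    font_name: str  # hypothetical: name of the font that drew this character


@dataclass
class Match:
    characters: List[Char]  # hypothetical: characters covered by one search hit


class Resolver:
    """Hypothetical stand-in for a per-font encoder."""

    def __init__(self, font_name: str, encodable: str) -> None:
        self.font_name = font_name
        self._encodable = set(encodable)

    def can_encode(self, text: str) -> bool:
        return all(ch in self._encodable for ch in text)


def apply_single_replacement(
    resolvers: Dict[str, Resolver],
    match: Match,
    new_text: str,
    stale_resolver: Optional[Resolver] = None,
) -> Tuple[str, bool]:
    # Deliberately ignore stale_resolver: always look the resolver up again
    # from the font of the match's first character.
    resolver = resolvers[match.characters[0].font_name]
    return resolver.font_name, resolver.can_encode(new_text)
```

With two fonts on a page, passing the first font's resolver alongside a match drawn in the second font still validates against the second font, which is the behaviour the fix describes.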

Verified

Tested against real-world Chrome (Skia/PDF m147) and Microsoft Word PDFs that reproduced the original ARY-276 garble. Both round-trip cleanly with no Mode-1 or Mode-2 garble tokens in extracted text and no silent font substitutions.
636 tests passing (up from 628), mypy strict clean on all 16 source files, ruff clean.

Known scope limits

CFF / Type1 embedded fonts still raise FontNotFoundError with a clear message when the engine needs to inject glyphs into them. Tier 1.5 handles TrueType only; CFF support is tracked in ARY-279 for 0.2.0.
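
As a rough illustration of the TrueType-only limit, the sketch below classifies an embedded font program by which key its font descriptor carries. The key names follow the PDF specification (/FontFile for Type 1, /FontFile2 for TrueType, /FontFile3 for CFF and OpenType programs); the function names and the plain-dict input are assumptions for illustration, not the engine's API.

```python
from typing import Mapping


def embedded_font_kind(font_descriptor: Mapping[str, object]) -> str:
    """Classify the embedded font program from font-descriptor keys."""
    if "/FontFile2" in font_descriptor:
        return "truetype"
    if "/FontFile3" in font_descriptor:
        # Covers Type1C, CIDFontType0C, and OpenType programs.
        return "cff-or-opentype"
    if "/FontFile" in font_descriptor:
        return "type1"
    return "not-embedded"


def can_inject_glyphs(font_descriptor: Mapping[str, object]) -> bool:
    # Hypothetical helper mirroring the stated limit: TrueType only.
    return embedded_font_kind(font_descriptor) == "truetype"
```

A caller could use a check like this to decide up front whether a replacement that needs new glyphs is likely to hit the error described above.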

[0.1.0] — 2026-04-07

Initial release — format-preserving PDF text editing.

Text search, replacement, and batch editing at the content stream operator level

Two-tier font subset extension (CMap-only fast path + full re-embed)

FidelityReport on every edit — programmatic quality verification

15 PDF wrapper operations (merge, split, rotate, encrypt, etc.)

Paragraph detection and greedy line-breaking reflow

628 tests, 85% coverage

Zero external binaries, zero API keys, zero network calls
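
For readers unfamiliar with greedy line breaking as mentioned in the feature list above, here is a minimal sketch of the general technique. It is not the engine's reflow code; the function name, the measure callback, and the parameters are assumptions chosen for illustration.

```python
from typing import Callable, List


def greedy_break(
    words: List[str],
    max_width: float,
    measure: Callable[[str], float],
    space: float,
) -> List[List[str]]:
    """Place each word on the current line if it fits, else start a new line."""
    lines: List[List[str]] = []
    current: List[str] = []
    current_width = 0.0
    for word in words:
        width = measure(word)
        # Width of the line if this word were appended to it.
        needed = width if not current else current_width + space + width
        if current and needed > max_width:
            lines.append(current)
            current = [word]
            current_width = width
        else:
            # An over-long word on an empty line is kept rather than split.
            current.append(word)
            current_width = needed
    if current:
        lines.append(current)
    return lines
```

In a PDF setting, measure would typically return the advance width of the word in the current font and size; using len with space=1 is enough to try the function out on plain strings.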
