fast-pdf-extract

Rust-backed PDF text extraction library for Python.

Features

Detect and remove headers and footers
Clean bilingual PDFs
Mark headings in bold (basic Markdown)
High accuracy
Performance
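
To illustrate what header and footer removal involves, here is a minimal pure-Python sketch of one common heuristic: treat a page's first or last line as boilerplate when the same text recurs on most pages. This is not the library's Rust implementation or its API; the function name `strip_repeated_edges` and the `min_share` parameter are hypothetical, and pages are assumed to be already split into lists of lines.

```python
from collections import Counter


def strip_repeated_edges(pages, min_share=0.6):
    """Drop first/last lines that repeat on at least `min_share` of pages.

    Hypothetical helper for illustration only; `pages` is a list of pages,
    each page being a list of text lines.
    """
    first = Counter(page[0].strip() for page in pages if page)
    last = Counter(page[-1].strip() for page in pages if page)
    # Require at least two occurrences so a single page is never stripped.
    threshold = max(2, int(min_share * len(pages)))
    headers = {line for line, count in first.items() if count >= threshold}
    footers = {line for line, count in last.items() if count >= threshold}

    cleaned = []
    for page in pages:
        lines = list(page)
        if lines and lines[0].strip() in headers:
            lines = lines[1:]
        if lines and lines[-1].strip() in footers:
            lines = lines[:-1]
        cleaned.append(lines)
    return cleaned


pages = [
    ["ACME Annual Report", "Revenue grew.", "Confidential"],
    ["ACME Annual Report", "Costs fell.", "Confidential"],
    ["ACME Annual Report", "Outlook is stable.", "Confidential"],
]
print(strip_repeated_edges(pages))
```

Footers that carry a changing page number would not match exactly under this heuristic; a real implementation would need to normalize digits or compare line positions as well.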

Development

uv sync --only-dev

# run tests (the extension is rebuilt automatically)
uv run python -m unittest

# update dependencies
cargo update
uv lock --upgrade

Publishing a new version

Check the latest published version.

python - <<'PY'
import json
import urllib.request

with urllib.request.urlopen("https://pypi.org/pypi/fast-pdf-extract/json") as response:
    data = json.load(response)

print(data["info"]["version"])
PY

Bump the version in Cargo.toml.

[package]
version = "0.6.1"

Refresh lockfiles and run checks.

cargo check
uv lock
just test

Build the release artifacts.

rm -rf target/wheels dist
uv run maturin build --release

Publish to PyPI.

# MATURIN_PYPI_TOKEN must be set in the environment.
uv run maturin publish --skip-existing
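
Because the publish step depends on `MATURIN_PYPI_TOKEN` being present, a small preflight check can fail early with a clear message. This is an optional sketch; the helper name `missing_env` is hypothetical and the check is not part of the repository's tooling.

```python
import os


def missing_env(names, env=None):
    """Return the names from `names` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]


required = ["MATURIN_PYPI_TOKEN"]
missing = missing_env(required)
if missing:
    print("Set before publishing: " + ", ".join(missing))
else:
    print("Publish environment looks complete.")
```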

Verify PyPI shows the new version.

python - <<'PY'
import json
import urllib.request

with urllib.request.urlopen("https://pypi.org/pypi/fast-pdf-extract/json") as response:
    data = json.load(response)

print(data["info"]["version"])
PY

Commit the version bump.

git add Cargo.toml Cargo.lock
git commit -m "Bump version to <version>"

Troubleshooting

If cargo build complains about a missing Python version, clean the build directory and rebuild:

cargo clean
cargo build
