Bump packages, clean up uv commands #564

Merged
PastelStorm merged 3 commits into main from evoss/bump-packages
Apr 3, 2026

@PastelStorm PastelStorm commented Apr 3, 2026

Contributor

Note

Medium Risk
Moderate risk because this updates the runtime dependency set and changes build/CI/Docker provisioning (uv lock enforcement and spaCy model preloading), which can cause install or runtime regressions if the new unstructured/model behavior differs.

Overview
Bumps the release to 0.1.2 and refreshes dependencies (including adding python-multipart), aligning with an unstructured update that replaces NLTK usage with spaCy.

Updates Dockerfile, Makefile, and GitHub workflows to pre-download spaCy models (replacing download_nltk_packages) and standardizes dependency installs by switching uv sync from --frozen to --locked across CI, Docker, and local install targets.

Written by Cursor Bugbot for commit d9d6362. This will update automatically on new commits. Configure here.

@cursor cursor Bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.


@@ -1 +1 @@
-__version__ = "0.1.1" # pragma: no cover
+__version__ = "0.1.2" # pragma: no cover


Version bump without matching CHANGELOG entry breaks CI

High Severity

__version__ was bumped to 0.1.2 but CHANGELOG.md still has 0.1.1 as its latest entry. The make check-version CI step (which runs scripts/version-sync.sh -c) extracts the latest version from CHANGELOG.md and verifies it matches __version__.py. This mismatch will cause the check-version step to fail, blocking CI.
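The check described in this comment can be pictured with a minimal Python sketch. The real check lives in `scripts/version-sync.sh`, whose contents are not shown on this page, so the `## X.Y.Z` changelog heading format and all function names below are assumptions made for illustration only.

```python
import re
from typing import Optional


def latest_changelog_version(changelog_text: str) -> Optional[str]:
    """Return the first version heading in a changelog.

    Assumes headings look like '## 0.1.2' (hypothetical format).
    """
    match = re.search(r"^##\s*v?(\d+\.\d+\.\d+\S*)", changelog_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def module_version(source_text: str) -> Optional[str]:
    """Extract the string assigned to __version__ in Python source text."""
    match = re.search(r'^__version__\s*=\s*"([^"]+)"', source_text, flags=re.MULTILINE)
    return match.group(1) if match else None


def versions_in_sync(changelog_text: str, source_text: str) -> bool:
    """True only when both versions are found and are identical."""
    changelog = latest_changelog_version(changelog_text)
    module = module_version(source_text)
    return changelog is not None and changelog == module
```

Under these assumptions, a changelog whose newest heading is `## 0.1.1` paired with `__version__ = "0.1.2"` would be reported as out of sync, which is the failure this comment predicts.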


Contributor Author


It's already updated you doofus

@PastelStorm PastelStorm merged commit d57d709 into main Apr 3, 2026
14 checks passed
@PastelStorm PastelStorm deleted the evoss/bump-packages branch April 3, 2026 20:07
tylorbayer added a commit to SchoolAI/unstructured-api that referenced this pull request Jun 22, 2026
* Bump packages, clean up uv commands (Unstructured-IO#564)

* fix(deps): upgrade vulnerable transitive dependencies [security] (Unstructured-IO#566)

## Summary

Automated scan found CVEs in transitive dependencies locked in `uv.lock`
files.
These packages were upgraded to patched versions.

### Remediated vulnerabilities

| Package | From | To | Severity | CVE |
|---|---|---|---|---|
| cryptography | 46.0.6 | 46.0.7 | Medium | CVE-2026-39892 |
| pypdf | 6.9.2 | 6.10.0 | Medium | CVE-2026-40260 |
| starlette | 0.41.2 | 0.47.2 | Medium | CVE-2025-54121 |
| starlette | 0.41.2 | 0.49.1 | High | CVE-2025-62727 |

### What this PR does
1. Scans all `uv.lock` files with
[grype](https://github.com/anchore/grype) for known CVEs
2. Runs `uv lock --upgrade-package <pkg>` for each fixable vulnerability
(skips major bumps)
3. Bumps component versions (patch) and updates CHANGELOGs via
`version-bump`
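As a rough illustration of step 2, the sketch below turns scan findings into `uv lock --upgrade-package` invocations and skips major bumps. How the actual workflow defines a "major bump" is not stated here; this sketch assumes it means a change in the first dotted component, assumes purely numeric dotted versions, and uses hypothetical function names.

```python
from typing import Iterable, List, Tuple


def parse_version(version: str) -> Tuple[int, ...]:
    """Parse a purely numeric dotted version such as '46.0.7' (assumption)."""
    return tuple(int(part) for part in version.split("."))


def is_major_bump(current: str, candidate: str) -> bool:
    """Assumed definition: the first version component increases."""
    return parse_version(candidate)[0] > parse_version(current)[0]


def upgrade_commands(findings: Iterable[Tuple[str, str, str]]) -> List[List[str]]:
    """Build one `uv lock --upgrade-package <pkg>` command per fixable package.

    Each finding is (package, locked_version, fixed_version).
    Major bumps are skipped and each package is upgraded only once.
    """
    commands: List[List[str]] = []
    seen = set()
    for package, current, fixed in findings:
        if is_major_bump(current, fixed) or package in seen:
            continue
        seen.add(package)
        commands.append(["uv", "lock", "--upgrade-package", package])
    return commands
```

Fed the rows from the table above, this would emit one command each for `cryptography`, `pypdf`, and `starlette`; a hypothetical finding that required moving from `0.x` to `1.x` would be left for a human to handle.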

> Created by
[lockfile-security-scan](https://github.com/Unstructured-IO/infra/actions/workflows/lockfile-security-scan.yml).
> Targets **transitive dependencies** that Renovate cannot reach.

Co-authored-by: utic-renovate[bot] <utic-renovate[bot]@users.noreply.github.com>

* fix(docker): replace PyPI opencv wheel with ffmpeg-free build [security] (Unstructured-IO#569)

## Summary
Mirrors Unstructured-IO/unstructured#4336 in this repo so the
`quay.io/unstructured-io/unstructured-api` image no longer ships the 14
ffmpeg 5.1.x CVEs bundled in PyPI `opencv-python` wheels.

After `uv sync`, the Dockerfile now:
- Downloads the architecture-specific `opencv-contrib-python-headless`
wheel (built with `WITH_FFMPEG=OFF` + `ENABLE_CONTRIB=1` +
`ENABLE_HEADLESS=1`) from the upstream `Unstructured-IO/unstructured`
GitHub release (`opencv-4.12.0.88`)
- SHA-256-verifies against the hashes published by the upstream
`build-opencv-wheels.yml` workflow
- Uninstalls any installed PyPI opencv variants and installs the
verified wheel with `--no-deps`
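The SHA-256 verification step can be sketched in Python as follows. This is not the Dockerfile's implementation, which is not shown on this page; the function names are hypothetical, and the expected digest would come from the hashes published by the upstream workflow.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large wheels are not loaded into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_wheel(path: Path, expected_hex: str) -> None:
    """Raise ValueError if the file's digest differs from the published one."""
    actual = sha256_of(path)
    if actual != expected_hex.strip().lower():
        raise ValueError(f"SHA-256 mismatch for {path.name}: got {actual}")
```

Failing loudly on a mismatch matters here because the wheel is fetched from a release asset at build time rather than resolved through the lockfile.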

The contrib-headless variant is a strict superset of the `cv2` API
exposed by `opencv-python`, `opencv-python-headless`, and
`opencv-contrib-python`, so a single wheel transparently replaces
whichever variant is present.

## One deviation from upstream
Upstream uninstalls all four opencv variants in a single `uv pip
uninstall …` call because their image pulls all four transitively (via
`unstructured-paddleocr`). Our `uv.lock` currently only resolves
`opencv-python`, so a single combined uninstall would fail on the three
that aren't installed. Replaced with a per-package loop using `|| true`
— same end state, robust if transitive deps change.
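The Dockerfile does this with a shell loop and `|| true`; the same idea, expressed as a Python sketch, is to attempt each uninstall separately and ignore a non-zero exit code. The function name and the injectable `run` parameter are hypothetical and exist only to make the sketch testable.

```python
import subprocess
from typing import Callable, Dict, Iterable

# The four PyPI OpenCV variants named in the summary above.
OPENCV_VARIANTS = (
    "opencv-python",
    "opencv-python-headless",
    "opencv-contrib-python",
    "opencv-contrib-python-headless",
)


def uninstall_each(
    packages: Iterable[str] = OPENCV_VARIANTS,
    run: Callable[..., subprocess.CompletedProcess] = subprocess.run,
) -> Dict[str, int]:
    """Uninstall packages one at a time, tolerating failures.

    check=False plays the role of `|| true`: a package that is not
    installed yields a non-zero return code instead of aborting the loop.
    """
    results: Dict[str, int] = {}
    for package in packages:
        completed = run(["uv", "pip", "uninstall", package], check=False)
        results[package] = completed.returncode
    return results
```

Returning the exit codes keeps the loop tolerant while still letting a caller log which variants were actually present.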

## Version / Changelog
- Bumps service version `0.1.3` → `0.1.4`
- `CHANGELOG.md` entry under `0.1.4` → Security
- No `uv lock` changes needed; the lockfile still resolves
`opencv-python 4.13.0.92`, and we overlay the 4.12.0.88 contrib-headless
wheel only at image build time (upstream 4.13.0.92 has no sdist on PyPI,
which is why the build-from-source workflow is pinned to 4.12.0.88).

## Test plan
- [ ] `make docker-build` succeeds on `amd64` and `arm64`; the opencv
replacement step resolves the architecture-specific wheel and the
SHA-256 check passes
- [ ] `docker run … python -c "import cv2; print(cv2.__version__)"`
prints `4.12.0.88` inside the built image
- [ ] `make docker-test` passes against the rebuilt image
- [ ] Container scan of the rebuilt image no longer flags the 14 ffmpeg
CVEs called out by upstream PR #4336

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Medium risk because it changes a core binary dependency (`opencv`) at
image build time via an external wheel download and forced
uninstall/reinstall, which could impact image build reliability or
runtime CV2 behavior across architectures.
> 
> **Overview**
> Updates the Docker build to **remove vulnerable ffmpeg-bundled PyPI
OpenCV wheels** by downloading an arch-specific, SHA-256-verified
`opencv-contrib-python-headless` wheel built with `WITH_FFMPEG=OFF`,
uninstalling any installed OpenCV variants, and reinstalling the
verified wheel.
> 
> Bumps the service version to `0.1.4` and adds a `CHANGELOG.md`
security entry documenting the OpenCV/ffmpeg CVE mitigation.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
7e23afc. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix(docker): purge uv wheel cache after opencv swap [security] (Unstructured-IO#570)

## Summary
Follow-up to Unstructured-IO#569 (v0.1.4). That PR replaced the PyPI `opencv-python`
wheel with an ffmpeg-free build, but image scanners were still flagging
the 14 ffmpeg CVEs against v0.1.4. Root cause is scanner scope, not a
broken replacement.

## Root cause
`uv pip uninstall` only drops a package from `site-packages`. The
extracted wheel archive stays in the uv cache. Inspecting the pushed
v0.1.4 image:

- ✅ `cv2.__version__` reports `4.12.0` (our replacement wheel)
- ✅ `site-packages/cv2/` has no `.libs/` directory
- ❌
`/home/notebook-user/.cache/uv/archive-v0/<hash>/opencv_python.libs/`
still contains the full extracted old wheel:
  - `libavcodec-*.so.59.37.100`
  - `libavformat-*.so.59.27.100`
  - `libavutil-*.so.57.28.100`
  - plus `libavfilter`, `libavdevice`, `libswscale`, `libswresample`

SO-version suffixes (avcodec 59.37 / avformat 59.27 / avutil 57.28) are
ffmpeg 5.1.x — matching the CVE set the upstream PR called out. Scanners
walk the whole filesystem and flag these even though nothing links
against them at runtime. `UV_LINK_MODE=copy` (set globally in this
Dockerfile) compounds it — the cache keeps its own copy independent of
`site-packages`.
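To see what a scanner that walks the whole filesystem would pick up, a minimal Python sketch like the one below can list leftover ffmpeg shared objects by filename. The patterns are taken from the library names listed above; the function name is hypothetical, and real scanners match on more than filenames.

```python
import fnmatch
import os
from typing import Iterable, List

# Filename patterns for the ffmpeg libraries named in the root-cause list.
FFMPEG_PATTERNS = (
    "libavcodec*",
    "libavformat*",
    "libavutil*",
    "libavfilter*",
    "libavdevice*",
    "libswscale*",
    "libswresample*",
)


def find_ffmpeg_libs(root: str, patterns: Iterable[str] = FFMPEG_PATTERNS) -> List[str]:
    """Walk `root` recursively and return sorted paths whose filename matches any pattern."""
    patterns = tuple(patterns)
    hits: List[str] = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if any(fnmatch.fnmatch(name, pattern) for pattern in patterns):
                hits.append(os.path.join(dirpath, name))
    return sorted(hits)
```

Run against the uv cache directory of an image, an empty result would indicate that no extracted ffmpeg libraries remain there.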

## Fix
Add `uv cache clean` to the end of the opencv replacement `RUN` to wipe
the cache (including the old opencv wheel archive) from the final image
layer. Single minimal change — scoped to the opencv-fix RUN, not a
broader image-slimming pass.

Safe because `UV_LINK_MODE=copy` means the live venv copies files out of
cache — wiping the cache doesn't affect the installed packages.

## False positives ignored (not fixed here)
Two other `libav*` filenames in the image that are **not** ffmpeg and
don't trigger these CVEs:
- `/usr/lib/libreoffice/program/libavmedia{gst,lo}.so`: LibreOffice's
"avmedia" framework shim
- `pillow.libs/libavif-*.so.16` — AV1 image codec

## Version / Changelog
- Bumps service version `0.1.4` → `0.1.5`
- `CHANGELOG.md` entry under `0.1.5` → Security
- No `uv lock` changes

## Test plan
- [ ] `make docker-build` succeeds on `amd64` and `arm64`
- [ ] In the rebuilt image, `find / -name "libavcodec*" -o -name
"libavformat*" -o -name "libswscale*"` returns nothing under
`/home/notebook-user/.cache/uv/` and nothing under
`site-packages/cv2/.libs/`
- [ ] `cv2.__version__` still reports `4.12.0.88` and `import cv2;
cv2.imdecode(...)` smoke check works
- [ ] Container scan of the rebuilt image no longer flags the 14 ffmpeg
CVEs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Low Risk**
> Low risk: a single Docker build-step cleanup (`uv cache clean`) plus
version/changelog bumps; main risk is unintended impact on Docker layer
caching or build time, not runtime behavior.
> 
> **Overview**
> Removes leftover ffmpeg `.so` files from the built image by adding `uv
cache clean` after uninstalling/reinstalling OpenCV wheels in the
Dockerfile, preventing scanners from flagging CVEs from cached wheel
contents.
> 
> Bumps the service version to `0.1.5` and adds a matching
`CHANGELOG.md` security entry describing the cache purge.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
f73143d. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* fix: remediate CVEs for unstructured-api (Unstructured-IO#571)

## Summary

- **starlette** 0.41.2 → 1.0.0: remediates CVE-2025-54121 (MEDIUM) and
CVE-2025-62727 (HIGH). Removes the `starlette==0.41.2` constraint pin
from `[tool.uv]` — the only middleware in this repo is FastAPI's
built-in CORS middleware, which is compatible with starlette 1.0.0.
- **python-multipart** 0.0.22 → 0.0.27: remediates CVE-2026-40347
(MEDIUM).
- Bumps service version from 0.1.5 → 0.1.6.
- Does **not** touch lxml (handled by PR Unstructured-IO#525).

## Test plan

- [x] `uv sync --locked` succeeds (lockfile is consistent)
- [x] `make check-src` passes (ruff format, ruff check, mypy)
- [ ] CI lint + unit tests pass
- [ ] Docker smoke tests pass

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- CURSOR_SUMMARY -->
---

> [!NOTE]
> **Medium Risk**
> Primarily dependency/version changes, but removing the
`starlette==0.41.2` constraint can introduce runtime incompatibilities
due to a major Starlette upgrade affecting FastAPI/middleware behavior.
> 
> **Overview**
> Updates the service to `0.1.6` and documents a new security release in
`CHANGELOG.md`.
> 
> Removes the `starlette==0.41.2` constraint from `pyproject.toml`
(allowing Starlette to upgrade to remediate CVE-2025-54121 and
CVE-2025-62727) and bumps `python-multipart` to a non-vulnerable release
to address CVE-2026-40347.
> 
> <sup>Reviewed by [Cursor Bugbot](https://cursor.com/bugbot) for commit
ddaeefc. Bugbot is set up for automated
code reviews on this repo. Configure
[here](https://www.cursor.com/dashboard/bugbot).</sup>
<!-- /CURSOR_SUMMARY -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix: remediate starlette, lxml, and python-multipart CVEs for unstructured-api (Unstructured-IO#573)

## Summary

- **Bump starlette** 1.0.0 → 1.1.0 (transitive via fastapi) — fixes
CVE-2025-62727 (HIGH, SLA breach +36d) and CVE-2025-54121 (MEDIUM, SLA
breach +20d)
- **Bump lxml** 6.1.0 → 6.1.1 (transitive via unstructured) — fixes
CVE-2026-41066 (HIGH, SLA breach +20d)
- **Bump python-multipart** 0.0.27 → 0.0.29 (direct dep) — fixes
CVE-2026-40347 (MEDIUM, SLA breach +4d)
- **Rebuild** picks up latest python-3.12 apk — resolves CVE-2025-12781
(MEDIUM)

All 5 CVEs are in SLA breach.

### Changes

- `pyproject.toml`: bumped `python-multipart` minimum from `>=0.0.18` to
`>=0.0.29`; added `starlette>=1.1.0` and `lxml>=6.1.1` to `[tool.uv]
constraint-dependencies` to pin transitive dep floors
- `uv.lock`: regenerated with upgraded packages
- `prepline_general/api/__version__.py`: patch bump 0.1.6 → 0.1.7
- `CHANGELOG.md`: added 0.1.7 security entry

## Test plan

- [x] `make install-test` succeeds with `--locked`
- [x] `make test` — 133 passed, 0 failures
- [ ] CI passes on this PR
- [ ] Image build + scan confirms CVEs resolved

🤖 Generated with [Claude Code](https://claude.com/claude-code)

<!-- This is an auto-generated description by cubic. -->
---
## Summary by cubic
Remediates five SLA-breached CVEs by updating `starlette`, `lxml`, and
`python-multipart`, rebuilding for the latest Python 3.12 APK, and
tightening transitive floors. Also fixes CI by regenerating `uv.lock`
with the real `uv` binary.

- **Dependencies**
  - `starlette` 1.0.0 → 1.1.0 — fixes CVE-2025-62727, CVE-2025-54121.
  - `lxml` 6.1.0 → 6.1.1 — fixes CVE-2026-41066.
  - `python-multipart` 0.0.27 → 0.0.29 — fixes CVE-2026-40347.
  - Rebuild image to include latest Python 3.12 APK — fixes
CVE-2025-12781.
  - Add `[tool.uv]` constraints (`starlette>=1.1.0`, `lxml>=6.1.1`);
regenerate `uv.lock` with real `uv` to fix `uv sync --locked` in CI.
  - Bump version to 0.1.7 and update `CHANGELOG.md`.

<sup>Written for commit 73f74ba.
Summary will update on new commits. <a
href="https://cubic.dev/pr/Unstructured-IO/unstructured-api/pull/573?utm_source=github">Review
in cubic</a></sup>

<!-- End of auto-generated description by cubic. -->

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* remove cicd

---------

Co-authored-by: Emily Voss <github@emilyvoss.dev>
Co-authored-by: utic-github-cicd-token-generator[bot] <258069197+utic-github-cicd-token-generator[bot]@users.noreply.github.com>
Co-authored-by: utic-renovate[bot] <utic-renovate[bot]@users.noreply.github.com>
Co-authored-by: Lawrence Elitzer (LoLo) <lawrence@unstructured.io>
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Co-authored-by: Cursor <cursoragent@cursor.com>