Skip to content

fix: bump idna + starlette to patched versions#103

Merged
constk merged 1 commit into
developfrom
fix/cve-bumps-idna-starlette
May 26, 2026
Merged

fix: bump idna + starlette to patched versions#103
constk merged 1 commit into
developfrom
fix/cve-bumps-idna-starlette

Conversation

@constk
Copy link
Copy Markdown
Owner

@constk constk commented May 25, 2026

What & why

pip-audit on develop flags two transitive-dep CVEs (surfaced via fastapi / httpx):

Package Current Fixed in Advisory
idna 3.13 3.15+ CVE-2026-45409
starlette 1.0.0 1.0.1+ PYSEC-2026-161

Bumped via:

uv lock --upgrade-package idna --upgrade-package starlette

Resolves to idna 3.16 (3.15 was the listed fix; 3.16 is a further patch on the same line) and starlette 1.1.0 (minor bump). The currently pinned fastapi 0.136.1 accepts Starlette 1.1.x — confirmed locally by running the unit suite. All 192 unit tests pass on the upgraded lock; pip-audit then returns "No known vulnerabilities found".

Also bumps the project self-version 0.2.10 → 0.2.11 per docs/DEVELOPMENT.md.

Why this is its own PR. Code review on #99 (the README opener PR) recommended landing the CVE bumps separately so the dep-bump risk doesn't pile onto a docs-only change. PRs #99, #100, #101, #102/#105 all currently inherit these CVEs from develop and cannot pass pip-audit until this PR lands — so this is the unblocking commit for the whole release-blockers stack.

Test plan

  • uv lock --upgrade-package idna --upgrade-package starlette regenerates the lock cleanly
  • uv sync --frozen --extra dev succeeds with the new lock
  • uv run pytest tests/ -q → 192 passed (fastapi 0.136.1 + starlette 1.1.0 compatible)
  • uv run pip-audit → "No known vulnerabilities found"
  • CI pip-audit gate passes

Invariants affected

None.

New deps / actions / external surface

No new direct deps; two transitive deps bumped. starlette jumps minor (1.0.0 → 1.1.0) — verified compatible with the pinned fastapi 0.136.1.

Linked issue

None — surfaced by the #99 code review.

pip-audit on develop is flagging two transitive-dep CVEs:

- idna 3.13            CVE-2026-45409   (fix in 3.15+)
- starlette 1.0.0      PYSEC-2026-161   (fix in 1.0.1+)

Both are surfaced via fastapi/httpx. Bumps via:

    uv lock --upgrade-package idna --upgrade-package starlette

Resolves to idna 3.16 (3.15 was the listed fix; 3.16 is a further
patch with the same fix) and starlette 1.1.0 (minor bump; FastAPI is
compatible with it). All 192 unit tests pass on the upgraded lock.

Bumps the project self-version 0.2.10 -> 0.2.11 per
docs/DEVELOPMENT.md.

Unblocks the pip-audit CI gate on #99, #100, #101, #102 (and any
other PRs currently sitting on develop), all of which inherit the
flagged transitive CVEs from develop and cannot pass that gate until
this lands.
@constk constk merged commit d256e32 into develop May 26, 2026
22 checks passed
constk added a commit that referenced this pull request May 26, 2026
* feat: eval pattern examples calling Azure OpenAI (#94)

The eval slice previously shipped one toy case (echo-hello) and a
disabled-by-default nightly. A reader expecting an LLM-eval story
found the infrastructure without conviction.

Adds four worked-pattern cases that exercise the existing three
tolerance modes against a real Azure OpenAI deployment. These are
not benchmarks — they demonstrate what an eval case *looks like* for
the four LLM-eval patterns you most often need to write:

  - factual-http-200             exact_match       format-constrained recall
  - numeric-seconds-per-day      numeric_close     numeric reasoning + tolerance
  - definitional-fastapi-depends semantic_similar  free-form judge-scored prose
  - structured-json-status       exact_match       structured-output adherence

When the template is forked for a real project, replace these four
with cases that exercise the project's own prompts; the patterns
transfer regardless of what product is bolted on.

Provider choice — Azure OpenAI via the openai SDK with AzureOpenAI
client — is intentionally distinct from the rest of the harness
(which uses Claude via Claude Code). Demonstrates that the LLMClient
Protocol in src/eval/judge.py does its job: the eval core never
imports openai, vendor lock-in lives only in the adapter.

Changes:

  - src/eval/adapters/azure_openai.py — implements LLMClient via the
    openai.AzureOpenAI SDK. Reads endpoint/key/deployment/api-version
    from env. Lazy-imports the SDK so the module is importable without
    the optional extra installed; the adapter raises a clear
    AzureOpenAIConfigError if the env or SDK is missing.

  - eval/golden_patterns.json — the four cases with notes explaining
    which pattern each demonstrates.

  - eval/test_golden_patterns.py — separate test file gated on the
    Azure env vars via pytestmark. Skipped on a stock checkout, so
    `uv run pytest eval/` always exits 0. The toy test_golden_qa.py
    keeps running as before.

  - pyproject.toml — new optional [project.optional-dependencies] eval
    extra (just `openai>=1.40.0`), mypy override for openai.* matching
    the existing opentelemetry.* pattern, and a 0.2.10 -> 0.2.11
    self-version bump.

  - .github/workflows/eval-nightly.yml — env vars renamed from the
    placeholder LLM_* set to AZURE_OPENAI_*. Header comment updated
    with the Azure setup recipe. uv sync now passes --extra eval.

  - docs/EVAL_HARNESS.md — new "Worked patterns" section with the
    table mapping case -> tolerance -> pattern, the local setup
    recipe, and a "Swapping providers" note documenting the
    Protocol-based extension path.

Local gates: mypy --strict clean on 42 source files (was 31), ruff
clean, ruff format clean, import-linter both contracts kept, 192
unit tests pass, eval/ runs 1 passed + 4 skipped without LLM env.

Closes #94

* test: add adapter unit tests + adapters README (#94 review fixes)

Addresses two gate failures on #104 surfaced by code review:

1. "Tests required" gate — feat: prefix declared a behaviour change
   but tests/ had no test for the new adapter (the eval/-side test
   only runs with live Azure credentials). Adds
   tests/test_eval_azure_openai_adapter.py: 13 fully-offline cases
   covering _resolve_config (defaults, override, empty-string
   fallback, missing-env error listing), the constructor (env
   wiring, explicit API version, missing-env, missing-SDK), and the
   two SDK call paths (complete_json structured-output mode,
   complete user-message dispatch, null-content returns "" / "{}").

   The SDK is mocked at sys.modules level so the test never hits the
   network and never requires the openai extra to be installed.

2. "src/ README audit" gate — every src/ package needs a README.md
   per CLAUDE.md. Adds src/eval/adapters/README.md documenting the
   layer's purpose, the current adapter, a 7-step "adding a new
   adapter" recipe, and why the layer lives at the top of the import
   order.

Also applies the reviewer's non-blocking sentinel-string suggestion:
the magic "azure-deployment" string passed as judge_model in
eval/test_golden_patterns.py is now the named constant
_AZURE_DEPLOYMENT_SENTINEL with a comment explaining why the runner
threads it through but the Azure adapter discards it.

Local gates: 205 unit tests pass (was 192, +13 new), mypy clean on
43 source files, ruff/format/import-linter all green.

Refs #94

* docs: add Key interfaces section to adapters README (#94 review)

src/ README audit gate looks for a `## Key interfaces` (or `## Public
surface`) anchor — the existing README had purpose / table /
extension recipe / layering rationale, but no exported-names section.

Adds a `## Key interfaces` section listing the two exported names:

  - AzureOpenAIClient — the LLMClient implementation with notes on
    complete() vs complete_json() and the discarded `model` arg
    (Azure dispatches by deployment, not model).
  - AzureOpenAIConfigError — the construction-time error type,
    noting that it batches every missing env var into a single
    message instead of failing-and-retrying.

Both already documented in the adapter docstrings; this section
hoists them to the README anchor the audit gate enforces.

Refs #94

* chore: bump version to 0.2.12 (rebase onto develop after #103)
constk added a commit that referenced this pull request May 26, 2026
…sed post-#103/#104)

main moved ahead of develop on 2026-05-25 when PR #86 was merged
directly to main rather than via develop -> release flow. The
divergence is one squash commit (eff5b1c) carrying:

  - docs/BEADS.md (optional Beads issue-queue guidance)
  - .github/pull_request_template.md (Beads PR-template block)
  - .github/scripts/check_aspirational_tickets.py (PEP 758 reformat)
  - .github/scripts/check_pin_freshness.py / check_tests_present.py /
    check_version_bump.py (touch-ups)
  - .gitattributes / .gitignore (.beads/ ignore, Windows renormalise)
  - CONTRIBUTING.md (line-ending normalisation)
  - tests/test_scripts_compile.py (new CI-script compile gate)
  - docs/DEVELOPMENT.md / docs/HARNESS.md / docs/HARNESS_PRIMER.md
    cross-refs
  - pyproject.toml + uv.lock self-version 0.2.10 -> 0.2.11

This PR was rebased after #103 (CVE fix, develop -> 0.2.11) and
#104 (eval pattern examples, develop -> 0.2.12) merged. The version
on main (0.2.11) is now behind develop (0.2.12); the conflict is
resolved by bumping develop -> 0.2.13.

After this lands, develop is at 0.2.13 and contains everything main
has. Remaining in-flight PRs (#99, #100, #101, #105) need to rebase
to bump 0.2.13 -> 0.2.14 (and onward sequentially as they merge).

No behaviour change beyond what #86 already added to main.

# Conflicts:
#	pyproject.toml
#	uv.lock
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant