feat: add optional Headroom compression proxy support #22

Open: mydisha wants to merge 1 commit into main from feature/headroom-integration

mydisha (Owner) commented Jun 24, 2026

What

Adds opt-in integration with the Headroom compression proxy. When enabled, KeiRouter sends requests through Headroom for token compression and tracks exact savings (tokens before/after, compression ratio, transforms) alongside existing optimization modes.

Why

Headroom reports exact token counts from its compression proxy (unlike RTK's byte/4 estimate), giving more accurate savings analytics. Integration is fully opt-in so the default make dev / make setup flow remains unchanged.
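
To make the difference between the two measurements concrete, here is a minimal sketch. It is not the code in this PR: `EstimateTokens`, `ExactSavings`, and its fields are hypothetical names, rounding the byte/4 estimate up is an assumption, and so is defining the compression ratio as saved/before.

```go
package main

import "fmt"

// EstimateTokens approximates a token count as bytes/4, the heuristic the
// description attributes to RTK. Rounding up is an assumption of this sketch.
func EstimateTokens(text string) int {
	return (len(text) + 3) / 4
}

// ExactSavings holds token counts as a compression proxy might report them.
// The type and field names are hypothetical.
type ExactSavings struct {
	TokensBefore int
	TokensAfter  int
}

// Saved returns the number of tokens removed, never negative.
func (s ExactSavings) Saved() int {
	if s.TokensAfter >= s.TokensBefore {
		return 0
	}
	return s.TokensBefore - s.TokensAfter
}

// Ratio returns the fraction of tokens removed, or 0 when there was nothing
// to compress. Defining the ratio as saved/before is an assumption.
func (s ExactSavings) Ratio() float64 {
	if s.TokensBefore <= 0 {
		return 0
	}
	return float64(s.Saved()) / float64(s.TokensBefore)
}

func main() {
	prompt := "Summarize the following log output in two sentences."
	fmt.Println("estimated tokens:", EstimateTokens(prompt))

	exact := ExactSavings{TokensBefore: 1200, TokensAfter: 780}
	fmt.Printf("saved=%d ratio=%.2f\n", exact.Saved(), exact.Ratio())
}
```

The estimate depends only on byte length, so it cannot reflect how a real tokenizer splits the text; counts reported by the proxy do not have that limitation.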

How

  • New package backend/internal/headroom/: an HTTP client that sends OpenAI-format messages to Headroom, applies the compressed results back to the canonical core.ChatRequest, and returns Stats (tokens before/after/saved, compression ratio, transforms, CCR hashes). Unsupported content parts are skipped gracefully.
  • Pipeline integration (pipeline.go): Headroom runs as a compression step in applyTokenSaving. When Headroom.Enabled is set, it takes precedence over Slimmer; otherwise Slimmer/Terse/Caveman run as before. Failures are non-fatal: the pipeline logs a warning and continues uncompressed.
  • Persistence: Migration 0021_headroom_savings.sql adds headroom_* columns to usage_records; store/models.go and repo_usage.go persist and aggregate Headroom stats. These are kept separate from RTK estimates so both systems can be observed independently.
  • Observability: meter and observ/metrics track Headroom activity and savings; gateway/insights.go and gateway/settings.go expose Headroom config and savings in the admin API.
  • Dev experience: New make headroom target starts the proxy via Docker or the native CLI. KEIROUTER_HEADROOM_AUTO=1 make dev auto-starts it alongside backend + dashboard. compose.yaml adds a headroom service with a healthcheck and wires KEIROUTER_HEADROOM__BASE_URL.
  • Frontend: Settings page gains Headroom controls; Usage page and SavingsBreakdown component surface Headroom savings separately from RTK.
  • Docs: README and scripts/quickstart.sh updated with Headroom setup instructions.
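
The precedence and failure handling described for the pipeline can be sketched roughly as follows. This is an illustration, not the code in pipeline.go: `Request`, `CompressFunc`, `Config`, and the helper steps are hypothetical, and treating a Headroom failure as "forward the original request without trying Slimmer" is one reading of "continues uncompressed".

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Request stands in for the canonical chat request (hypothetical type).
type Request struct {
	Messages []string
}

// CompressFunc is a hypothetical signature for one compression step.
type CompressFunc func(Request) (Request, error)

// Config holds the two switches this sketch cares about; names are assumptions.
type Config struct {
	HeadroomEnabled bool
	SlimmerEnabled  bool
}

// applyTokenSaving returns the request to forward and the step that produced it.
// Headroom wins when enabled; any failure is logged and the original request
// is forwarded unchanged.
func applyTokenSaving(cfg Config, req Request, headroom, slimmer CompressFunc) (Request, string) {
	if cfg.HeadroomEnabled {
		out, err := headroom(req)
		if err != nil {
			log.Printf("warning: headroom compression failed, sending uncompressed: %v", err)
			return req, "none"
		}
		return out, "headroom"
	}
	if cfg.SlimmerEnabled {
		out, err := slimmer(req)
		if err != nil {
			log.Printf("warning: slimmer failed, sending uncompressed: %v", err)
			return req, "none"
		}
		return out, "slimmer"
	}
	return req, "none"
}

// fixed returns a step that replaces all messages with a single marker message.
func fixed(marker string) CompressFunc {
	return func(Request) (Request, error) {
		return Request{Messages: []string{marker}}, nil
	}
}

// failing simulates an unreachable proxy.
func failing(Request) (Request, error) {
	return Request{}, errors.New("proxy unreachable")
}

func main() {
	req := Request{Messages: []string{"a long prompt"}}
	both := Config{HeadroomEnabled: true, SlimmerEnabled: true}

	_, step := applyTokenSaving(both, req, fixed("h"), fixed("s"))
	fmt.Println("both enabled:", step)

	out, step := applyTokenSaving(both, req, failing, fixed("s"))
	fmt.Println("headroom down:", step, out.Messages[0])
}
```

Returning the name of the step alongside the request makes it straightforward to record which optimizer actually ran for a given request.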

Changed files (21 files, +967/-61)

| Area | Files |
|---|---|
| Backend core | headroom/headroom.go, headroom/headroom_test.go, pipeline/pipeline.go |
| Storage | store/migrations/0021_headroom_savings.sql, store/models.go, store/repo_usage.go |
| Observability | meter/meter.go, observ/metrics.go |
| Gateway | gateway/insights.go, gateway/settings.go, gateway/gemini.go, gateway/handlers.go, app/app.go |
| Frontend | pages/Settings.tsx, pages/Usage.tsx, components/SavingsBreakdown.tsx, lib/api.ts |
| Infra/Docs | Makefile, compose.yaml, README.md, scripts/quickstart.sh |

Checklist

  • make test passes
  • make vet passes
  • npm run typecheck passes (frontend changed)
  • npm run lint — N/A (no lint script configured in frontend/package.json)
  • Documentation updated (README, quickstart)
  • Config example updated (compose.yaml, Makefile defaults)

Add opt-in Headroom support for local development via a new make target and
KEIROUTER_HEADROOM_AUTO flag, keeping the default dev/setup flow unchanged.

Track Headroom activity, token savings, and transforms in usage records so
compression behavior can be observed alongside existing optimization modes.
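
As a rough illustration of how per-request savings in usage records could be rolled up while keeping Headroom and RTK figures apart, consider the sketch below. The `UsageRecord` and `SavingsSummary` types and their fields are hypothetical; the actual headroom_* columns and the aggregation in repo_usage.go may differ.

```go
package main

import "fmt"

// UsageRecord is a hypothetical subset of one usage row.
type UsageRecord struct {
	HeadroomActive       bool
	HeadroomTokensBefore int
	HeadroomTokensAfter  int
	RTKEstimatedSaved    int // byte/4-style estimate, tracked separately
}

// SavingsSummary keeps exact Headroom totals apart from RTK estimates.
type SavingsSummary struct {
	HeadroomRequests  int
	HeadroomSaved     int
	RTKEstimatedSaved int
}

// Summarize adds up savings across records. Records where Headroom was not
// active, or where compression did not shrink the request, add no Headroom savings.
func Summarize(records []UsageRecord) SavingsSummary {
	var sum SavingsSummary
	for _, r := range records {
		sum.RTKEstimatedSaved += r.RTKEstimatedSaved
		if !r.HeadroomActive {
			continue
		}
		sum.HeadroomRequests++
		if saved := r.HeadroomTokensBefore - r.HeadroomTokensAfter; saved > 0 {
			sum.HeadroomSaved += saved
		}
	}
	return sum
}

func main() {
	records := []UsageRecord{
		{HeadroomActive: true, HeadroomTokensBefore: 1000, HeadroomTokensAfter: 700},
		{HeadroomActive: false, RTKEstimatedSaved: 120},
		{HeadroomActive: true, HeadroomTokensBefore: 400, HeadroomTokensAfter: 400},
	}
	fmt.Printf("%+v\n", Summarize(records))
}
```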