feat: per-session usage/cost aggregation helper #4

Open
alexanderkreidich wants to merge 1 commit into appx-org:main from alexanderkreidich:feat/session-usage-metrics

@alexanderkreidich

Closes #2.

Adds a pure aggregateSessionUsage(messages, { contextWindow, costRates }) fold over the transcript returned by getSessionMessages (or kept current from message_end wire events), plus emptySessionUsageMetrics for the pre-history state.
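
As a rough illustration of what "a pure fold over the transcript" means here, the sketch below sums token counts across assistant turns. The `SketchMessage` and `TokenTotals` types, the `sumTokens` and `emptyTotals` names, and the choice to include cache tokens in `total` are all assumptions made for this example; they are not the contract types or the helper added by this PR.

```typescript
// Hypothetical, simplified shapes; the real AgentMessage contract may differ.
interface TokenCounts {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

type SketchMessage =
  | { role: "assistant"; usage: TokenCounts }
  | { role: "user" | "toolResult"; text: string };

interface TokenTotals extends TokenCounts {
  total: number;
}

// Zero-valued totals, standing in for the "pre-history" state.
function emptyTotals(): TokenTotals {
  return { input: 0, output: 0, cacheRead: 0, cacheWrite: 0, total: 0 };
}

// Pure fold: no I/O and no mutation of the input array.
function sumTokens(messages: readonly SketchMessage[]): TokenTotals {
  return messages.reduce<TokenTotals>((acc, m) => {
    // Only assistant turns carry usage; everything else passes through.
    if (m.role !== "assistant") return acc;
    const u = m.usage;
    return {
      input: acc.input + u.input,
      output: acc.output + u.output,
      cacheRead: acc.cacheRead + u.cacheRead,
      cacheWrite: acc.cacheWrite + u.cacheWrite,
      // Assumption: total counts all four buckets.
      total: acc.total + u.input + u.output + u.cacheRead + u.cacheWrite,
    };
  }, emptyTotals());
}
```

Because the fold is pure, it can be re-run over the whole transcript whenever a new message arrives, or applied once to the result of a history fetch, with the same outcome.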

What it computes

  • Token totals — input / output / cacheRead / cacheWrite / total across all assistant turns.
  • Cost — taken from the wire (AssistantMessage.usage.cost); when the wire reports zero (custom LiteLLM-routed models), recalculated from a consumer-supplied per-million-token costRates map. Wire cost wins whenever non-zero.
  • Context utilization — anchored on the last clean assistant turn (aborted/errored turns still count toward spend, but not context), plus a chars/4 estimate for trailing messages. A compactionSummary after the anchor marks context unknown (null) until the next assistant turn re-measures it.
  • Cache-hit rate, message/tool-call counts, latest provider/model ref.
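
The cost and cache-hit rules above can be sketched as follows. This is a minimal illustration under stated assumptions: the wire cost is modeled as a single number (the real `AssistantMessage.usage.cost` may be structured differently), `SketchRates` is a flat set of per-million-token prices rather than the PR's actual `costRates` shape, and the cache-hit formula `cacheRead / (input + cacheRead)` is one plausible definition, not one confirmed by the PR.

```typescript
// Hypothetical per-turn usage; `cost` is the wire-reported cost (0 when absent).
interface SketchUsage {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
  cost: number;
}

// Hypothetical rates, expressed in currency units per million tokens.
interface SketchRates {
  input: number;
  output: number;
  cacheRead: number;
  cacheWrite: number;
}

const TOKENS_PER_MILLION = 1_000_000;

function turnCost(usage: SketchUsage, rates?: SketchRates): number {
  // Wire cost wins whenever it is non-zero.
  if (usage.cost > 0) return usage.cost;
  // No wire cost and no rates supplied: nothing to compute from.
  if (!rates) return 0;
  const weighted =
    usage.input * rates.input +
    usage.output * rates.output +
    usage.cacheRead * rates.cacheRead +
    usage.cacheWrite * rates.cacheWrite;
  return weighted / TOKENS_PER_MILLION;
}

// Assumed definition: share of prompt tokens served from cache.
// Returns null when there were no prompt tokens at all.
function cacheHitRate(usage: SketchUsage): number | null {
  const promptTokens = usage.input + usage.cacheRead;
  return promptTokens === 0 ? null : usage.cacheRead / promptTokens;
}
```

For example, a turn with 1,000,000 input tokens and 500,000 output tokens, a wire cost of zero, and rates of 3 and 15 per million would be costed at 3 + 7.5 = 10.5.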

This is item (1) from #2 — the optional costRates map keeps the rates question decoupled; rates-in-AgentModelRow will be filed against agent-server separately.

The behavior mirrors the reference implementation in create-appx-app (src/lib/pi-metrics.ts), adapted from JSONL session entries to the contract AgentMessage[] (history is already the active branch, so no parentId walking; compaction arrives as a compactionSummary message).

Verification

  • 12 new tests in src/core/__tests__/usage.test.ts (aggregation, rates fallback, wire-cost precedence, aborted-turn handling, compaction, model ref).
  • npm run typecheck and the full npm test suite (77 tests) pass.

🤖 Generated with Claude Code

Consumers migrating from bespoke chat UIs used to show per-session
usage (tokens, cache, cost, context %) by reading Pi session JSONL files
from disk. That path died with the containerized agent-server: sessions
now live inside the container, so every consumer renders zeros.

The data was already on the wire — AssistantMessage.usage arrives via
both getSessionMessages and message_end SSE events — but agent-client
had no notion of usage at all.

Add a pure aggregateSessionUsage(messages, {contextWindow, costRates})
fold over the transcript: token totals, cost (recalculated from
consumer-supplied per-million rates when LiteLLM-routed models report
zero cost), cache-hit rate, and context-window utilization anchored on
the last clean assistant turn (aborted/errored turns still count toward
spend but not context; compaction marks context unknown until the next
assistant turn re-measures it).
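
The context-utilization rule described above might look roughly like the sketch below. The `CtxMessage` shape, the `stopReason` values, and the precomputed `contextTokens` field are hypothetical stand-ins for whatever the real contract exposes. The sketch also assumes that only a clean assistant turn re-measures context after a compaction, and it ignores any text carried by aborted or errored turns.

```typescript
// Hypothetical, simplified message shapes for illustration only.
type CtxMessage =
  | { kind: "assistant"; stopReason: "stop" | "aborted" | "error"; contextTokens: number }
  | { kind: "compactionSummary" }
  | { kind: "other"; text: string };

// Returns the fraction of the context window in use, or null when unknown.
function contextUtilization(
  messages: readonly CtxMessage[],
  contextWindow: number,
): number | null {
  let anchoredTokens = 0; // tokens measured at the last clean assistant turn
  let trailingChars = 0;  // characters in messages after that turn
  let unknown = false;    // set by a compaction that follows the anchor

  for (const m of messages) {
    if (m.kind === "assistant") {
      if (m.stopReason === "stop") {
        // A clean turn re-measures context and becomes the new anchor.
        anchoredTokens = m.contextTokens;
        trailingChars = 0;
        unknown = false;
      }
      // Aborted/errored turns count toward spend elsewhere, not context.
    } else if (m.kind === "compactionSummary") {
      unknown = true;
    } else {
      trailingChars += m.text.length;
    }
  }

  if (unknown || contextWindow <= 0) return null;
  // chars/4 is a coarse token estimate for the trailing messages.
  return (anchoredTokens + Math.ceil(trailingChars / 4)) / contextWindow;
}
```

A single forward pass keeps the logic simple: each clean assistant turn resets the trailing estimate, so only messages after the final anchor contribute the chars/4 approximation.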

Closes appx-org#2.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Development

Successfully merging this pull request may close these issues.

Per-session usage/cost metrics (tokens, cache, cost, context %)