feat: capture provider cost details by harrisony · Pull Request #520 · mcowger/plexus

harrisony · 2026-05-22T06:35:35Z

No description provided.

…r total cost The previous implementation used a single safeCost() call wrapping a ?? chain: safeCost(details.total_cost ?? usage?.cost ?? usage?.estimated_cost). This had two problems: - upstream_inference_cost was not considered at all - The entire ?? chain was validated as one, so safeCost() could not distinguish which source provided the value Restructure into an explicit three-step fallback, each validated independently: 1. cost_details.total_cost (LLM gateway, most detailed) 2. usage.cost (direct cost for OpenRouter-style providers) 3. cost_details.upstream_inference_cost (upstream OpenRouter-style cost) Steps 2→3 use || (falsy coalescing) to handle the OpenRouter quirk where usage.cost and/or upstream_inference_cost may be populated depending on the upstream provider and key type used (e.g. BYOK keys report 0 for usage.cost, but upstream_inference_cost carries the actual provider cost).

… from standard fields Previously upstream_inference_prompt_cost was aliased directly into input_cost. However, these fields have different semantics: upstream_inference_prompt_cost = input_cost + cached_input_cost (i.e., the combined prompt cost including cached tokens) Aliasing it into input_cost caused applyUsageCostDetails to zero out costCached, silently merging the cached portion into costInput. Changes: - Stop aliasing upstream_inference_prompt_cost → input_cost and upstream_inference_completions_cost → output_cost - Add same-tier aliasing: upstream_inference_input_cost → upstream_inference_prompt_cost and upstream_inference_output_cost → upstream_inference_completions_cost (same semantics, different provider naming conventions) - Update extraction tests to assert upstream fields stay separate - Add tests for BYOK fallback, Responses API input/output variants, and LLM Gateway field priority

Previously applyUsageCostDetails only had two branches: superset (per-bucket breakdown) and minimal (proportional distribution). Now that extractUsageCostDetails no longer aliases normal-tier upstream_inference_prompt_cost into input_cost, the normal-tier case needs explicit handling. Three tiers: 1. Superset: input_cost/cached_input_cost/cache_write_input_cost present → use per-bucket breakdown directly 2. Normal: upstream_inference_prompt_cost/completions_cost present but no input-side superset fields → use upstream prompt/completions split, then distribute the prompt portion by Plexus's own cache ratio 3. Minimal: no breakdown at all → proportional distribution from previously calculated costs

Add tests for: zero-cost non-BYOK requests, OpenRouter markup (cost >> upstream sum), zero prompt tokens, heavy-cache-hit ratio split, end-to-end BYOK and non-BYOK extract+apply flows. Also normalise scientific notation literals and add missing upstream_inference_* fields to existing superset fixtures. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…che token shape Three fixes informed by real provider testdata: - applyUsageCostDetails: tighten Normal-tier guard from != null to > 0 on upstream_inference_prompt/completions_cost. Vercel AI Gateway emits these as 0 (not null) when it has no upstream breakdown, causing the Normal tier to fire and zero out all sub-costs despite total_cost being correct. Now falls through to Minimal tier (proportional distribution) as intended. - normalizeOpenAIChatUsage: add prompt_cache_hit_tokens as final fallback for cached token count. DeepSeek sends both prompt_tokens_details.cached_tokens and prompt_cache_hit_tokens (consistent values), so real responses already work; the fallback is defensive for stripped-details payloads. - response-handler / usage-logging: warn when both SSE :cost and usage.cost_details are present on the same response. SSE priority is preserved but the conflict is now visible in logs. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

…st_details block Handles two providers that report cost at the top level of usage instead of inside a cost_details block: - usage.cost without cost_details: OpenRouter-routed models (e.g. Kimi-k2.5) that surface cost at the top level but don't populate cost_details. These were previously invisible to extractUsageCostDetails which required the cost_details object to be present. - usage.cost_in_usd_ticks: xAI grok models. 1 USD = 10^10 ticks per xAI API docs. Converted to USD and returned as total_cost with all sub-cost fields null, so applyUsageCostDetails uses the Minimal tier (proportional distribution from Plexus-calculated token costs). Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

harrisony · 2026-05-22T06:37:40Z

sorry, fat fingers

harrisony and others added 7 commits May 22, 2026 16:34

feat(provider-cost): expand real-world cassette test coverage

5f2e2b7

harrisony closed this May 22, 2026

harrisony mentioned this pull request May 22, 2026

feat: capture provider cost details #521

Open

3 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

feat: capture provider cost details#520

feat: capture provider cost details#520
harrisony wants to merge 7 commits into
mcowger:mainfrom
harrisony:feature/capture-provider-cost-details

harrisony commented May 22, 2026

Uh oh!

harrisony commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

harrisony commented May 22, 2026

Uh oh!

harrisony commented May 22, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant