Refactor provider usage and especially oauth by billybonks · Pull Request #46 · Hyper-Unearthing/llm_gateway

billybonks · 2026-04-03T13:34:29Z

I reviewed the branch (main...HEAD) and updated this summary to include all public API changes.

Public API changes

1) Breaking: Claude Code custom path removed

Removed model.start_with?("claude_code/") routing in LlmGateway::Client.
Removed provider registration for anthropic_oauth_messages.
Removed Claude Code adapter path (LlmGateway::Adapters::ClaudeCode::MessagesAdapter and related mappers).
New usage pattern:
- Use LlmGateway::Clients::Claude directly with Claude Code OAuth access tokens (sk-ant-oat...).
- Claude now auto-detects Claude Code OAuth tokens and automatically:
  - switches to Claude Code OAuth headers,
  - prepends the Claude Code identity system prompt.
New helper on Claude client:
- LlmGateway::Clients::Claude#get_oauth_access_token(access_token:, refresh_token:, expires_at:) { |new_access, new_refresh, new_expires| ... }
- Consumers are still responsible for persisting refreshed credentials.
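The refresh-and-persist pattern described above can be sketched as follows. Only the method signature is taken from this description; `StubClaudeClient` is a hypothetical stand-in so the example runs without the gem, and its refresh logic, return value, and token values are assumptions.

```ruby
# Hypothetical stand-in for LlmGateway::Clients::Claude. Only the method
# signature comes from the PR description; the body is an assumption.
class StubClaudeClient
  def get_oauth_access_token(access_token:, refresh_token:, expires_at:)
    # Assumed behaviour: reuse the current token while it is still valid.
    return access_token if Time.now < expires_at

    # A real client would call the provider's OAuth token endpoint here.
    new_access = "sk-ant-oat-refreshed"
    new_refresh = "refresh-2"
    new_expires = Time.now + 3600
    yield new_access, new_refresh, new_expires if block_given?
    new_access
  end
end

# Consumer-owned credential store (in memory here; a file or database in practice).
credentials = {
  access_token: "sk-ant-oat-expired",
  refresh_token: "refresh-1",
  expires_at: Time.now - 60
}

client = StubClaudeClient.new
token = client.get_oauth_access_token(**credentials) do |new_access, new_refresh, new_expires|
  # Persisting the refreshed values is the consumer's job.
  credentials = {
    access_token: new_access,
    refresh_token: new_refresh,
    expires_at: new_expires
  }
end
```

The block is the only place the consumer learns about new credentials, so anything not saved there is lost on the next process start.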

2) Breaking: OpenAI Codex custom client removed

Deleted LlmGateway::Clients::OpenAiCodex custom client class path.
openai_oauth_codex provider now uses LlmGateway::Clients::OpenAi.
LlmGateway::Adapters::OpenAiCodex::Client alias now points to LlmGateway::Clients::OpenAi.
New usage pattern for Codex via OpenAI client:
- chat_codex(...)
- stream_codex(...)
- get_oauth_access_token(...)

3) New/expanded request options in public chat/stream APIs

Across LlmGateway::Client.chat, responses, and adapter-backed usage, these user-facing options are now normalized consistently:

cache_key
cache_retention (short, long, none)
reasoning (none, low, medium, high, xhigh)
max_completion_tokens
response_format (provider-specific mapping)

Provider behavior:

OpenAI Chat Completions:
- cache_key -> prompt_cache_key
- cache_retention -> prompt_cache_retention (short -> in_memory, long -> 24h, none removes cache key)
- reasoning -> reasoning_effort
OpenAI Responses:
- max_completion_tokens -> max_output_tokens
- same cache mapping as above
- reasoning -> { effort: ..., summary: "detailed" }
OpenAI Codex:
- inherits OpenAI Responses reasoning mapping
- removes token-limit params for Codex endpoint compatibility
- keeps prompt_cache_key but strips retention params from request body
Anthropic:
- max_completion_tokens -> max_tokens
- reasoning -> thinking (budget token mapping)
- response_format normalized to output_config
- forwards cache_retention
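As a rough illustration of the OpenAI mappings listed above, here is a minimal sketch. The method names and the `RETENTION` constant are hypothetical and this is not the gem's option mapper; passing `max_completion_tokens` through unchanged for Chat Completions is an assumption, since the description does not mention it.

```ruby
# Illustrative mapper, not the gem's implementation. Retention values
# follow the mapping listed in the PR description.
RETENTION = { "short" => "in_memory", "long" => "24h" }.freeze

def map_openai_chat_options(options)
  body = {}
  retention = options[:cache_retention]
  # cache_retention "none" removes the cache key entirely.
  unless retention == "none"
    body[:prompt_cache_key] = options[:cache_key] if options[:cache_key]
    body[:prompt_cache_retention] = RETENTION.fetch(retention) if retention
  end
  body[:reasoning_effort] = options[:reasoning] if options[:reasoning]
  # Assumption: Chat Completions receives the token limit unchanged.
  if options[:max_completion_tokens]
    body[:max_completion_tokens] = options[:max_completion_tokens]
  end
  body
end

def map_openai_responses_options(options)
  # Same cache mapping as Chat Completions, then rename the remaining keys.
  body = map_openai_chat_options(options)
  effort = body.delete(:reasoning_effort)
  body[:reasoning] = { effort: effort, summary: "detailed" } if effort
  limit = body.delete(:max_completion_tokens)
  body[:max_output_tokens] = limit if limit
  body
end

chat_body = map_openai_chat_options(
  cache_key: "session-1", cache_retention: "long", reasoning: "high"
)
responses_body = map_openai_responses_options(
  cache_key: "session-1", cache_retention: "short",
  reasoning: "low", max_completion_tokens: 100
)
uncached_body = map_openai_chat_options(cache_key: "session-1", cache_retention: "none")
```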

4) New Anthropic cache behavior (publicly observable)

When using Claude with cache_retention, the client now automatically applies cache control:

sets top-level cache_control, and
sets cache_control on the last system block and last tool block.

With cache_retention: "long" against the official Anthropic endpoint, ttl: "1h" is used.
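The cache placement described in this section could look roughly like the sketch below. `apply_cache_control` and its `official_endpoint` flag are hypothetical names, not the client's code; treating "none" as a no-op and omitting `ttl` for "short" are assumptions.

```ruby
# Illustrative only: returns a copy of the request body with cache_control
# set at the top level and on the last system and tool blocks.
def apply_cache_control(body, cache_retention:, official_endpoint: true)
  # Assumption: no retention (or "none") leaves the body untouched.
  return body if cache_retention.nil? || cache_retention == "none"

  control = { type: "ephemeral" }
  control[:ttl] = "1h" if cache_retention == "long" && official_endpoint

  result = body.merge(cache_control: control)
  %i[system tools].each do |key|
    blocks = result[key]
    # A plain-string system prompt has no blocks to tag, so skip it.
    next unless blocks.is_a?(Array) && !blocks.empty?

    result[key] = blocks[0...-1] + [blocks.last.merge(cache_control: control)]
  end
  result
end

request = {
  system: [{ type: "text", text: "You are terse." }, { type: "text", text: "Reply in English." }],
  tools: [{ name: "lookup" }],
  messages: []
}
long_cached = apply_cache_control(request, cache_retention: "long")
short_cached = apply_cache_control(request, cache_retention: "short")
```

Note that the merge on the last block replaces any `cache_control` already present, which matches the untested override suspicion mentioned in the caching commit.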

5) Streaming error behavior changed (publicly observable)

Streaming mappers now raise LlmGateway::Errors::PromptTooLong when a provider's stream error payload indicates context overflow (the same semantics as the non-streaming paths).

This applies to streaming through:

Claude
OpenAI Chat Completions
OpenAI Responses / Codex
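For callers, the practical effect is that a stream can now end with a typed error after some events have already been delivered. The sketch below shows that shape under stated assumptions: `PromptTooLong` is a local stand-in for LlmGateway::Errors::PromptTooLong, `each_stream_event` and `context_overflow?` are hypothetical, and the detection rules (an OpenAI-style `context_length_exceeded` code or an Anthropic-style "prompt is too long" message) are guesses at what the mappers check.

```ruby
# Stand-in for LlmGateway::Errors::PromptTooLong so the sketch runs without the gem.
class PromptTooLong < StandardError; end

# Assumed detection rules; the gem's mappers may inspect different fields.
def context_overflow?(error)
  code = error["code"].to_s
  message = error["message"].to_s.downcase
  code == "context_length_exceeded" || message.include?("prompt is too long")
end

# Yields each event, raising as soon as an overflow error event appears.
def each_stream_event(events)
  events.each do |event|
    if event["type"] == "error" && context_overflow?(event.fetch("error", {}))
      raise PromptTooLong, event.dig("error", "message")
    end
    yield event
  end
end

events = [
  { "type" => "content_block_delta", "delta" => { "text" => "Hel" } },
  { "type" => "error",
    "error" => { "type" => "invalid_request_error",
                 "message" => "prompt is too long: 250000 tokens > 200000 maximum" } }
]

received = []
outcome =
  begin
    each_stream_event(events) { |event| received << event }
    :completed
  rescue PromptTooLong
    # Same recovery path as non-streaming calls, e.g. trim history and retry.
    :prompt_too_long
  end
```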

6) OAuth credential script behavior changed

Credential scripts now default to writing/merging into a shared auth file:

~/.config/llm_gateway/auth.json (override with LLM_GATEWAY_AUTH_FILE)
Providers are persisted under separate keys (anthropic, openai) in one JSON document.
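A minimal sketch of the merge behaviour, assuming each provider's entry is replaced wholesale on write. The helper names and the credential field names inside each provider key are hypothetical; only the default path, the LLM_GATEWAY_AUTH_FILE override, and the provider keys come from this description.

```ruby
require "json"
require "fileutils"
require "tmpdir"

# Default location, overridable through the environment.
def auth_file_path
  ENV.fetch("LLM_GATEWAY_AUTH_FILE") do
    File.expand_path("~/.config/llm_gateway/auth.json")
  end
end

# Reads the shared file, replaces one provider's entry, and writes it back,
# leaving the other providers untouched.
def merge_provider_credentials(provider, credentials, path: auth_file_path)
  existing = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  existing[provider] = credentials
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, JSON.pretty_generate(existing))
  existing
end

# Usage against a temporary file so a real auth.json is never touched.
stored = Dir.mktmpdir do |dir|
  path = File.join(dir, "auth.json")
  # Field names below are illustrative, not the scripts' actual schema.
  merge_provider_credentials("anthropic", { "access_token" => "sk-ant-oat-example" }, path: path)
  merge_provider_credentials("openai", { "access_token" => "example-token" }, path: path)
  JSON.parse(File.read(path))
end
```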

7) Minor public-facing error detail improvement

OpenAI fallback API status errors now include response body text when no structured error payload is present.

billybonks added 2 commits April 3, 2026 23:32

breaking: delete claude code custom path

8a5ed44

The custom path tried to make things easy, but consumers must manage their own token details: when a token is refreshed, they must update the stored values. Deleting all these custom paths also makes things much simpler on our side.

breaking: delete codex custom path

3f0bcda

The custom path tried to make things easy, but consumers must manage their own token details: when a token is refreshed, they must update the stored values. Deleting all these custom paths also makes things much simpler on our side.

billybonks force-pushed the refactor/claude branch from c32b388 to fd7c22e April 3, 2026 15:32

billybonks and others added 6 commits April 6, 2026 13:53

update stream test

147bfc9

update oauth scripts

8241a2b

fixup! refactor: move option wrangling to option mapper

1812e4d

fixup! refactor: move option wrangling to option mapper

27636f7

fix: throw prompt too long errors when streaming as well

f5fb0e1

refactor: move all live tests to a shared helper

bb61205

billybonks force-pushed the refactor/claude branch from 76b4029 to bb61205 April 6, 2026 05:58

gruv added 6 commits April 6, 2026 14:46

test: fix test asserting wrong error type for prompt too long

4b9ee9c

feat: support prompt caching, with cache_retention cache_key options

b767fe7

For Anthropic, the cache tags are added automatically. I have not tested what happens when the user has already set them, but they will probably be overridden.

test: add tests for all option mappers

1135629

This just makes the behaviours easier to see; the tests are not exhaustive across all options that could be passed.

fix: bug in response format mapping anthropic

3a1b1bf

Also updates all the tests to cover the superset of all options we know of.

fixup! feat: support prompt caching, with cache_retention cache_key options

03c4336

refactor: claude supports automatic caching with cache-control option

479a93a

This adds caching at the last message, so we don't have to do it ourselves.

billybonks merged commit 5427256 into main Apr 6, 2026
1 check passed

billybonks deleted the refactor/claude branch April 6, 2026 11:12

billybonks merged 14 commits into main from refactor/claude
