Refactor provider usage and especially oauth #46
Merged
The custom path tried to make things easy, but consumers must manage their own token details: when a token is refreshed, they must update the stored values. Deleting all these custom paths also makes things much simpler on our side.
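To illustrate what "consumers manage their own token details" could look like, here is a minimal sketch. `StandInClaudeClient` is a hypothetical stand-in so the snippet runs without the gem or network access. Only the method signature is taken from the summary in this PR. The refresh logic, the return value, the integer `expires_at`, and the token strings are assumptions made for illustration.

```ruby
# Stand-in for LlmGateway::Clients::Claude (hypothetical). Only the keyword
# arguments and block parameters mirror the signature quoted in this PR;
# everything inside the method is invented for the example.
class StandInClaudeClient
  def get_oauth_access_token(access_token:, refresh_token:, expires_at:)
    # Assumption: expires_at is a Unix timestamp in seconds.
    return access_token if Time.now.to_i < expires_at

    # Pretend the provider issued fresh credentials.
    new_access = "new-access-token"
    new_refresh = "new-refresh-token"
    new_expires = Time.now.to_i + 3600
    yield new_access, new_refresh, new_expires if block_given?
    new_access
  end
end

# Hypothetical consumer-side storage; a real app might use a file or database.
stored = {
  access_token: "old-access-token",
  refresh_token: "old-refresh-token",
  expires_at: 0 # already expired, so the stand-in "refreshes"
}

client = StandInClaudeClient.new
token = client.get_oauth_access_token(**stored) do |new_access, new_refresh, new_expires|
  # The consumer is responsible for persisting the refreshed values.
  stored = { access_token: new_access, refresh_token: new_refresh, expires_at: new_expires }
end
```

The point of the block form is that the place where tokens are stored stays in the consumer's code, so the library does not need a custom path per storage mechanism.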
c32b388 to fd7c22e
76b4029 to bb61205
added 6 commits
April 6, 2026 14:46
for anthropic it will automatically add the cache tags; I have not tested what happens when the user has already set them, but it will probably override them
just makes it easier to see what the behaviours are; this test does not extensively cover all options that could be passed
also update all the tests to test the superset of all options we know of
this adds a cache tag at the last message, so we don't have to do it ourselves
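The commit notes above describe tagging the last message for caching. A minimal sketch of that idea follows. `add_cache_marker` is a hypothetical helper written for this example, not the gem's implementation, and the message hash shape is an assumption. The `cache_control` entry with `type: "ephemeral"` and an optional `ttl` follows Anthropic's prompt caching format.

```ruby
# Hypothetical helper, not the gem's code: returns a copy of the messages
# where the last content block of the last message carries cache_control.
def add_cache_marker(messages, ttl: nil)
  return messages if messages.empty?

  marker = { type: "ephemeral" }
  marker[:ttl] = ttl if ttl

  *head, last = messages
  content = last[:content]
  # Normalize a plain string into a single text block so it can hold the marker.
  blocks = content.is_a?(String) ? [{ type: "text", text: content }] : content.map(&:dup)
  # merge replaces any cache_control the caller already set on that block.
  blocks[-1] = blocks[-1].merge(cache_control: marker) unless blocks.empty?

  head + [last.merge(content: blocks)]
end

messages = [
  { role: "user", content: "First question" },
  { role: "assistant", content: "First answer" },
  { role: "user", content: "Follow-up question" }
]

marked = add_cache_marker(messages, ttl: "1h")
```

In this sketch an existing `cache_control` on the last block is overwritten, which matches the guess in the commit note but has not been checked against the library.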
I reviewed the branch (`main...HEAD`) and updated this summary to include all public API changes.

Public API changes
1) Breaking: Claude Code custom path removed
- `model.start_with?("claude_code/")` routing in `LlmGateway::Client`.
- `anthropic_oauth_messages`.
- `LlmGateway::Adapters::ClaudeCode::MessagesAdapter` and related mappers.
- `LlmGateway::Clients::Claude` directly with Claude Code OAuth access tokens (`sk-ant-oat...`).
- `LlmGateway::Clients::Claude#get_oauth_access_token(access_token:, refresh_token:, expires_at:) { |new_access, new_refresh, new_expires| ... }`

2) Breaking: OpenAI Codex custom client removed
- `LlmGateway::Clients::OpenAiCodex` custom client class path.
- `openai_oauth_codex` provider now uses `LlmGateway::Clients::OpenAi`.
- `LlmGateway::Adapters::OpenAiCodex::Client` alias now points to `LlmGateway::Clients::OpenAi`.
- `chat_codex(...)`
- `stream_codex(...)`
- `get_oauth_access_token(...)`

3) New/expanded request options in public chat/stream APIs
Across `LlmGateway::Client.chat`, `responses`, and adapter-backed usage, these user-facing options are now normalized consistently:
- `cache_key`
- `cache_retention` (`short`, `long`, `none`)
- `reasoning` (`none`, `low`, `medium`, `high`, `xhigh`)
- `max_completion_tokens`
- `response_format` (provider-specific mapping)

Provider behavior:
- `cache_key -> prompt_cache_key`
- `cache_retention -> prompt_cache_retention` (`short -> in_memory`, `long -> 24h`, `none` removes cache key)
- `reasoning -> reasoning_effort`
- `max_completion_tokens -> max_output_tokens`
- `reasoning -> { effort: ..., summary: "detailed" }`
- `prompt_cache_key` but strips retention params from request body
- `max_completion_tokens -> max_tokens`
- `reasoning -> thinking` (budget token mapping)
- `response_format` normalized to `output_config`
- `cache_retention`

4) New Anthropic cache behavior (publicly observable)
When using Claude with `cache_retention`, the client now automatically applies cache control:
- `cache_control`, and
- `cache_control` on the last `system` block and last `tool` block.

For `cache_retention: "long"` against official Anthropic endpoint, `ttl: "1h"` is used.

5) Streaming error behavior changed (publicly observable)
Streaming mappers now raise `LlmGateway::Errors::PromptTooLong` when provider stream error payloads indicate context overflow (same semantic as non-streaming paths).

This applies to streaming through:
6) OAuth credential script behavior changed
Credential scripts now default to writing/merging into a shared auth file:
- `~/.config/llm_gateway/auth.json` (override with `LLM_GATEWAY_AUTH_FILE`)
- (`anthropic`, `openai`) in one JSON document.

7) Minor public-facing error detail improvement
OpenAI fallback API status errors now include response body text when no structured error payload is present.
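As a rough illustration of that fallback, the sketch below builds an error message from a status code and a raw response body. `api_error_message` is a hypothetical helper and the `"API error ..."` wording is invented for this example; the gem's actual error classes and message format may differ. The `error.message` shape is the structured payload commonly returned by OpenAI-style APIs and is assumed here.

```ruby
require "json"

# Hypothetical helper, not the gem's code: prefers a structured error message
# and falls back to the raw response body text when none is present.
def api_error_message(status, body)
  parsed = begin
    JSON.parse(body.to_s)
  rescue JSON::ParserError
    nil # body was not JSON (for example a plain-text or HTML error page)
  end

  structured = parsed.is_a?(Hash) && parsed["error"].is_a?(Hash) ? parsed["error"]["message"] : nil
  return "API error #{status}: #{structured}" if structured

  text = body.to_s.strip
  text.empty? ? "API error #{status}" : "API error #{status}: #{text}"
end

puts api_error_message(400, '{"error":{"message":"bad request"}}')
puts api_error_message(502, "Bad Gateway")
```

Including the body text mainly helps with gateway or proxy failures, where the response is not JSON and the status code alone says little.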