Refactor provider usage and especially oauth #46

Merged: billybonks merged 14 commits into main from refactor/claude on Apr 6, 2026

@billybonks billybonks commented Apr 3, 2026

I reviewed the branch (main...HEAD) and updated this summary to include all public API changes.

Public API changes

1) Breaking: Claude Code custom path removed

  • Removed model.start_with?("claude_code/") routing in LlmGateway::Client.
  • Removed provider registration for anthropic_oauth_messages.
  • Removed Claude Code adapter path (LlmGateway::Adapters::ClaudeCode::MessagesAdapter and related mappers).
  • New usage pattern:
    • Use LlmGateway::Clients::Claude directly with Claude Code OAuth access tokens (sk-ant-oat...).
    • Claude now auto-detects Claude Code OAuth tokens and automatically:
      • switches to Claude Code OAuth headers,
      • prepends the Claude Code identity system prompt.
  • New helper on Claude client:
    • LlmGateway::Clients::Claude#get_oauth_access_token(access_token:, refresh_token:, expires_at:) { |new_access, new_refresh, new_expires| ... }
    • Consumers are still responsible for persisting refreshed credentials.
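A minimal sketch of the consumer-side persistence pattern described above. StubClaude is a stand-in written for this example so it runs without the gem; only the keyword arguments and block parameters are taken from the summary. When the block is yielded, what the method returns, and expires_at being epoch seconds are all assumptions.

```ruby
# Stand-in for LlmGateway::Clients::Claude (NOT the gem's implementation).
# The refresh logic and return value here are invented for illustration.
class StubClaude
  def get_oauth_access_token(access_token:, refresh_token:, expires_at:)
    # Assumed: a token that has not expired is returned unchanged.
    return access_token if Time.now.to_i < expires_at

    # Assumed: on refresh, the new credentials are yielded to the caller.
    new_access = "sk-ant-oat-refreshed"
    new_refresh = "refresh-2"
    new_expires = Time.now.to_i + 3600
    yield new_access, new_refresh, new_expires if block_given?
    new_access
  end
end

# Consumer-side store; in a real app this might be a database row or a file.
store = { access_token: "sk-ant-oat-old", refresh_token: "refresh-1", expires_at: 0 }

token = StubClaude.new.get_oauth_access_token(**store) do |new_access, new_refresh, new_expires|
  # The client does not persist anything, so the caller saves the new values.
  store = { access_token: new_access, refresh_token: new_refresh, expires_at: new_expires }
end
```

The point of the block is that the next request can start from the refreshed values in store rather than from stale ones.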

2) Breaking: OpenAI Codex custom client removed

  • Deleted LlmGateway::Clients::OpenAiCodex custom client class path.
  • openai_oauth_codex provider now uses LlmGateway::Clients::OpenAi.
  • LlmGateway::Adapters::OpenAiCodex::Client alias now points to LlmGateway::Clients::OpenAi.
  • New usage pattern for Codex via OpenAI client:
    • chat_codex(...)
    • stream_codex(...)
    • get_oauth_access_token(...)

3) New/expanded request options in public chat/stream APIs

Across LlmGateway::Client.chat, responses, and adapter-backed usage, these user-facing options are now normalized consistently:

  • cache_key
  • cache_retention (short, long, none)
  • reasoning (none, low, medium, high, xhigh)
  • max_completion_tokens
  • response_format (provider-specific mapping)

Provider behavior:

  • OpenAI Chat Completions:
    • cache_key -> prompt_cache_key
    • cache_retention -> prompt_cache_retention (short -> in_memory, long -> 24h, none removes cache key)
    • reasoning -> reasoning_effort
  • OpenAI Responses:
    • max_completion_tokens -> max_output_tokens
    • same cache mapping as above
    • reasoning -> { effort: ..., summary: "detailed" }
  • OpenAI Codex:
    • inherits OpenAI Responses reasoning mapping
    • removes token-limit params for Codex endpoint compatibility
    • keeps prompt_cache_key but strips retention params from request body
  • Anthropic:
    • max_completion_tokens -> max_tokens
    • reasoning -> thinking (budget token mapping)
    • response_format normalized to output_config
    • forwards cache_retention
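The OpenAI mappings listed above can be sketched as plain hash transformations. The helper names below are hypothetical and this is not the gem's code; only the key names and the short/long/none values come from the list. Passing reasoning values through unchanged is an assumption.

```ruby
# Retention values taken from the list above: short -> in_memory, long -> 24h.
OPENAI_RETENTION = { "short" => "in_memory", "long" => "24h" }.freeze

# Shared cache mapping (hypothetical helper). "none" removes the cache key.
def openai_cache_params(opts)
  retention = opts[:cache_retention]
  return {} if retention == "none"

  params = {}
  params[:prompt_cache_key] = opts[:cache_key] if opts[:cache_key]
  params[:prompt_cache_retention] = OPENAI_RETENTION.fetch(retention) if retention
  params
end

# Chat Completions: reasoning -> reasoning_effort.
def map_openai_chat_options(opts)
  out = openai_cache_params(opts)
  out[:reasoning_effort] = opts[:reasoning] if opts[:reasoning]
  out[:max_completion_tokens] = opts[:max_completion_tokens] if opts[:max_completion_tokens]
  out
end

# Responses: max_completion_tokens -> max_output_tokens,
# reasoning -> { effort: ..., summary: "detailed" }.
def map_openai_responses_options(opts)
  out = openai_cache_params(opts)
  out[:reasoning] = { effort: opts[:reasoning], summary: "detailed" } if opts[:reasoning]
  out[:max_output_tokens] = opts[:max_completion_tokens] if opts[:max_completion_tokens]
  out
end
```

For example, map_openai_chat_options(cache_key: "k", cache_retention: "none") yields an empty hash, because "none" drops the cache key along with the retention setting.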

4) New Anthropic cache behavior (publicly observable)

When using Claude with cache_retention, the client now automatically applies cache control:

  • sets top-level cache_control, and
  • sets cache_control on the last system block and last tool block.

For cache_retention: "long" against the official Anthropic endpoint, ttl: "1h" is used.
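A rough sketch of the placement rules above, assuming a request payload shaped as a Ruby hash with :system and :tools arrays. The helper names are hypothetical, the { type: "ephemeral" } shape is the form Anthropic's API uses for cache control, and treating "none" as "add nothing" is an assumption.

```ruby
# Build the cache_control value for a retention setting (hypothetical helper).
def cache_control_for(retention, official_endpoint:)
  return nil if retention.nil? || retention == "none" # assumed: no caching requested

  control = { type: "ephemeral" }
  # Per the summary: "long" against the official endpoint uses ttl "1h".
  control[:ttl] = "1h" if retention == "long" && official_endpoint
  control
end

# Return a copy of the payload with cache_control set at the top level and on
# the last system block and last tool block. The input is not mutated.
def apply_cache_control(payload, retention:, official_endpoint: true)
  control = cache_control_for(retention, official_endpoint: official_endpoint)
  return payload unless control

  result = payload.merge(cache_control: control)
  %i[system tools].each do |key|
    blocks = result[key]
    next unless blocks.is_a?(Array) && !blocks.empty?

    result[key] = blocks[0...-1] + [blocks.last.merge(cache_control: control)]
  end
  result
end
```

Only the last block of each list is tagged, so earlier system and tool blocks are left exactly as the caller wrote them.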

5) Streaming error behavior changed (publicly observable)

Streaming mappers now raise LlmGateway::Errors::PromptTooLong when provider stream error payloads indicate context overflow (the same semantics as the non-streaming paths).

This applies to streaming through:

  • Claude
  • OpenAI Chat Completions
  • OpenAI Responses / Codex
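Since streaming now raises the same error as the non-streaming paths, a caller can handle both in one place. The sketch below defines a stand-in error class so it runs without the gem; stream_with_trimming and its drop-the-oldest-message recovery are hypothetical, and the block stands in for whatever streaming call the caller makes.

```ruby
module LlmGateway
  module Errors
    # Stand-in for the gem's error class so this sketch is self-contained.
    class PromptTooLong < StandardError; end
  end
end

# Run the given block with the messages; on PromptTooLong, drop the oldest
# message and try again, up to max_attempts (hypothetical recovery strategy).
def stream_with_trimming(messages, max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield messages
  rescue LlmGateway::Errors::PromptTooLong
    raise if attempts >= max_attempts || messages.length <= 1

    messages = messages.drop(1)
    retry
  end
end
```

A caller would pass its own streaming call as the block, for example stream_with_trimming(history) { |msgs| ... }, and would still see PromptTooLong once the attempts are exhausted.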

6) OAuth credential script behavior changed

Credential scripts now default to writing/merging into a shared auth file:

  • ~/.config/llm_gateway/auth.json (override with LLM_GATEWAY_AUTH_FILE)
  • providers are persisted under separate keys (anthropic, openai) in one JSON document.
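A sketch of the merge behaviour described above, not the scripts' actual code. The helper names are hypothetical, and the fields stored under each provider key are an assumption; the default path, the LLM_GATEWAY_AUTH_FILE override, and the per-provider keys come from the description.

```ruby
require "json"
require "fileutils"

# Resolve the shared auth file, honouring the LLM_GATEWAY_AUTH_FILE override.
def auth_file_path
  ENV.fetch("LLM_GATEWAY_AUTH_FILE") { File.expand_path("~/.config/llm_gateway/auth.json") }
end

# Merge one provider's credentials into the shared JSON document, leaving
# any other provider's entry untouched (hypothetical helper).
def merge_credentials(provider, credentials, path: auth_file_path)
  existing = File.exist?(path) ? JSON.parse(File.read(path)) : {}
  existing[provider] = credentials
  FileUtils.mkdir_p(File.dirname(path))
  File.write(path, JSON.pretty_generate(existing))
  existing
end
```

Calling merge_credentials("anthropic", ...) and then merge_credentials("openai", ...) leaves both keys in the same file, which is the point of merging rather than overwriting.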

7) Minor public-facing error detail improvement

OpenAI fallback API status errors now include response body text when no structured error payload is present.

The custom path tried to make things easy, but consumers
must manage their own token details: when a token is refreshed,
they must update the stored values.

Deleting all these custom paths also makes things much simpler
on our side.
gruv added 6 commits April 6, 2026 14:46
For Anthropic it will automatically add the cache tags. I have not tested
what happens when the user has already set them, but it will probably override them.
This just makes it easier to see what the behaviours are; this test
is not exhaustive for all options that could be passed.
Also update all the tests to cover the superset of all options we know of.
This adds a cache at the last message, so we don't have to do it ourselves.
@billybonks billybonks merged commit 5427256 into main Apr 6, 2026
1 check passed
@billybonks billybonks deleted the refactor/claude branch April 6, 2026 11:12