fix: Anthropic proxy timeout - native SSE streaming + hardened HTTP client #485
Draft
worldofgeese wants to merge 4 commits into spacedriveapp:main
Increases default HTTP client timeout from 120s to 300s and adds connect_timeout, tcp_keepalive, and pool_idle_timeout settings to prevent corporate proxy idle-timeout kills during long-running LLM completions.
- timeout: 120s → 300s (overall request timeout)
- connect_timeout: 30s (connection establishment)
- tcp_keepalive: 30s (TCP keepalive probes)
- pool_idle_timeout: 90s (connection pool cleanup)
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Adds native Server-Sent Events (SSE) streaming support for Anthropic API
to prevent corporate proxy 504 Gateway Timeout errors during long-running
completions. Previously, non-streaming requests would idle and trigger
proxy timeouts; now both call_anthropic() and stream_anthropic() use SSE
with continuous data flow to keep the connection alive.
## Key changes
### SSE streaming infrastructure
- Custom SSE client with auto-decompression disabled (no_gzip/no_brotli/
no_deflate) to handle proxies that incorrectly advertise Content-Encoding
- build_anthropic_sse_request(): shared request builder with proper headers
(Accept: text/event-stream, accept-encoding: identity)
- parse_anthropic_sse_event(): unified event parser returning type-safe enum
- AnthropicSseEvent enum: structured representation of all Anthropic SSE events
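The parser and enum listed above might look roughly like the following. This is a minimal, std-only sketch, not the PR's actual `parse_anthropic_sse_event()`: the `SseEvent` enum, its variant set, the `parse_sse_frame` name, and the choice to leave each `data:` payload as an unparsed string are all assumptions made for illustration.

```rust
/// Hypothetical, simplified stand-in for the PR's `AnthropicSseEvent` enum.
/// Payloads are kept as raw JSON text; a real parser would deserialize them.
#[derive(Debug, PartialEq)]
enum SseEvent {
    MessageStart(String),
    ContentBlockStart(String),
    ContentBlockDelta(String),
    ContentBlockStop(String),
    MessageDelta(String),
    MessageStop,
    Ping,
    Error(String),
    Unknown { event: String, data: String },
}

/// Parse one SSE frame (the text between two blank lines).
/// Returns `None` when the frame has no `event:` field, e.g. a comment-only frame.
fn parse_sse_frame(frame: &str) -> Option<SseEvent> {
    let mut event: Option<String> = None;
    let mut data = String::new();
    for line in frame.lines() {
        if let Some(rest) = line.strip_prefix("event:") {
            event = Some(rest.trim().to_string());
        } else if let Some(rest) = line.strip_prefix("data:") {
            // Multiple `data:` lines in one frame are joined with '\n'.
            if !data.is_empty() {
                data.push('\n');
            }
            // A single leading space after the colon is not part of the value.
            data.push_str(rest.strip_prefix(' ').unwrap_or(rest));
        }
        // Anything else (including ':' comment lines) is ignored.
    }
    let event = event?;
    Some(match event.as_str() {
        "message_start" => SseEvent::MessageStart(data),
        "content_block_start" => SseEvent::ContentBlockStart(data),
        "content_block_delta" => SseEvent::ContentBlockDelta(data),
        "content_block_stop" => SseEvent::ContentBlockStop(data),
        "message_delta" => SseEvent::MessageDelta(data),
        "message_stop" => SseEvent::MessageStop,
        "ping" => SseEvent::Ping,
        "error" => SseEvent::Error(data),
        other => SseEvent::Unknown { event: other.to_string(), data },
    })
}

fn main() {
    // Two frames separated by a blank line. Splitting on "\n\n" is a
    // simplification: it does not handle "\r\n\r\n" frame separators.
    let body = "event: ping\ndata: {\"type\": \"ping\"}\n\nevent: message_stop\ndata: {\"type\": \"message_stop\"}\n\n";
    for frame in body.split("\n\n").filter(|f| !f.trim().is_empty()) {
        println!("{:?}", parse_sse_frame(frame));
    }
}
```

Returning an enum rather than raw strings lets both the streaming and non-streaming paths match on the same event types, which is the sharing described under Refactoring.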
### Error handling improvements
- Full error cause chain preservation: e.to_string() → format!("{e:#}")
- Better error messages for failed streams, JSON parsing, and API errors
- Graceful handling of malformed SSE chunks with tracing
### Refactoring
- Eliminated ~90% code duplication between call_anthropic() and stream_anthropic()
- Both methods now share the same request builder and event parser
- Removed debug response header logging (was added for proxy diagnosis)
### OAuth tool name mapping
- Preserved reverse-mapping for Claude Code canonical ↔ original tool names
- Handled in both streaming and non-streaming paths
## Compatibility
- All existing behavior preserved (tests pass, no API changes)
- SSE used internally for call_anthropic() but returns full CompletionResponse
- stream_anthropic() yields events as before
- Proper handling of thinking blocks, tool calls, and text deltas
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
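The compatibility point about `call_anthropic()` (SSE on the wire, a complete response returned to the caller) can be pictured as a fold over the event stream. A minimal sketch, assuming events have already been parsed into a hypothetical `Delta` enum; `Collected` is an illustrative stand-in, and the real `CompletionResponse` carries more than the two fields shown here.

```rust
/// Hypothetical, already-parsed stream events (names are illustrative only).
#[derive(Debug)]
enum Delta {
    Text(String),
    Thinking(String),
    Ping,
    Stop,
}

/// Illustrative stand-in for a completed response.
#[derive(Debug, Default, PartialEq)]
struct Collected {
    text: String,
    thinking: String,
}

/// Drain the event stream and accumulate deltas into one complete response.
/// Bytes keep flowing on the connection while this runs, which is what keeps
/// an idle-timeout proxy from closing it.
fn collect(events: impl IntoIterator<Item = Delta>) -> Collected {
    let mut out = Collected::default();
    for event in events {
        match event {
            Delta::Text(t) => out.text.push_str(&t),
            Delta::Thinking(t) => out.thinking.push_str(&t),
            Delta::Ping => {} // keep-alive only, carries no content
            Delta::Stop => break,
        }
    }
    out
}

fn main() {
    let events = vec![
        Delta::Thinking("plan".to_string()),
        Delta::Ping,
        Delta::Text("Hello, ".to_string()),
        Delta::Text("world".to_string()),
        Delta::Stop,
    ];
    let full = collect(events);
    println!("{full:?}");
}
```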
Adds tracing::warn! for content-type, content-encoding, and transfer-encoding from the SSE response, plus a hex+text dump of the first 128/200 bytes of the first SSE chunk. This will reveal exactly what the proxy is sending.
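A dump of this kind could be produced with two small helpers, sketched below using only the standard library. The function names are hypothetical, and pairing the 128-byte limit with the hex view and the 200-byte limit with the text view is an assumption; the commit message does not say which limit applies to which.

```rust
/// Render up to `limit` bytes as space-separated lowercase hex.
fn hex_preview(chunk: &[u8], limit: usize) -> String {
    chunk
        .iter()
        .take(limit)
        .map(|b| format!("{b:02x}"))
        .collect::<Vec<_>>()
        .join(" ")
}

/// Render up to `limit` bytes as text. Invalid UTF-8 (including a multi-byte
/// character cut at the limit) is replaced with U+FFFD.
fn text_preview(chunk: &[u8], limit: usize) -> String {
    let end = chunk.len().min(limit);
    String::from_utf8_lossy(&chunk[..end]).into_owned()
}

fn main() {
    // A plain SSE chunk is readable in the text view.
    let plain = b"event: message_start\ndata: {}\n\n";
    println!("hex:  {}", hex_preview(plain, 128)); // assumed hex limit
    println!("text: {}", text_preview(plain, 200)); // assumed text limit

    // A gzip-compressed body starts with the magic bytes 1f 8b, so the hex
    // view makes unexpected compression by a proxy easy to spot.
    let gzipped = [0x1f, 0x8b, 0x08, 0x00];
    println!("hex:  {}", hex_preview(&gzipped, 128));
}
```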
Adds std::error::Error::source() to the 'Anthropic stream read failed' error message. This will show the underlying hyper/h2/io error that causes 'error decoding response body', helping identify whether the failure is a proxy timeout, connection reset, or actual decode error.
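Walking `source()` to surface the underlying cause can be sketched as follows. The `error_chain` helper and the `DecodeError` type are hypothetical stand-ins used to show the technique with the standard library alone; they are not the PR's code or reqwest's error type.

```rust
use std::error::Error;
use std::fmt;

/// Join an error and all of its `source()` causes into one message.
fn error_chain(err: &dyn Error) -> String {
    let mut out = err.to_string();
    let mut source = err.source();
    while let Some(cause) = source {
        out.push_str(": ");
        out.push_str(&cause.to_string());
        source = cause.source();
    }
    out
}

/// Hypothetical top-level error that hides its real cause behind `source()`.
#[derive(Debug)]
struct DecodeError {
    cause: std::io::Error,
}

impl fmt::Display for DecodeError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "error decoding response body")
    }
}

impl Error for DecodeError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.cause)
    }
}

fn main() {
    let err = DecodeError {
        cause: std::io::Error::new(
            std::io::ErrorKind::ConnectionReset,
            "connection reset by proxy",
        ),
    };
    // `to_string()` alone shows only the top-level message.
    println!("{}", err);
    // Walking the chain adds the underlying cause.
    println!("{}", error_chain(&err));
}
```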
## Problem
When using the Anthropic API through a corporate proxy (e.g. `https://$URL/claude/v1/messages`), completions from Opus consistently fail:
- `CompletionError: ProviderError: error sending request for url`: the 120s base `reqwest` client timeout expired before Opus finished thinking.
- `504 Gateway Timeout`: the proxy killed the idle TCP connection because the non-streaming Anthropic path sends a single POST and waits for the entire response as one blob, with no intermediate data to keep the connection alive.
## Root Cause
Three issues compound:
- No Anthropic SSE streaming. The `stream()` method for `ApiType::Anthropic` calls `attempt_completion()` (non-streaming) then wraps the result in a fake stream via `stream_from_completion_response()`. Corporate proxies with gateway timeouts (typically 120–300s) kill the idle connection before Opus finishes generating.
- Base HTTP client timeout too low. Both `reqwest::Client::builder()` sites use a 120s flat timeout with no `connect_timeout` or `tcp_keepalive`.
- Error chain discarded. Every `.map_err(|e| CompletionError::ProviderError(e.to_string()))` only captures the top-level reqwest error. The actual cause (timeout, connection reset, proxy disconnect) is silently lost.
## The Fix
### Commit 1: Hardened HTTP client + error diagnostics
`src/llm/manager.rs`
- `connect_timeout(30s)`: fail fast on connection establishment
- `tcp_keepalive(30s)`: keeps proxy connections alive
- `pool_idle_timeout(90s)`: prevents stale pooled connections
`src/llm/model.rs`
- Added `.timeout(STREAM_REQUEST_TIMEOUT_SECS)` to `call_anthropic()` request builder (1800s, matching OpenAI)
- Replaced `e.to_string()` with `format!("{e:#}")` for full error cause chain preservation
### Commit 2: Native Anthropic SSE streaming (~200 lines)
Adds `stream_anthropic()` that:
- Sets `"stream": true` in the request body
- Handles `message_start`, `content_block_start/delta/stop`, `message_delta`, `message_stop`, `error`, `ping`
Supporting changes:
- `anthropic/params.rs`: Added `body` field to `AnthropicRequest` and `anthropic_messages_url()` helper
- `anthropic.rs`: Re-exported new symbols
### Commit 3: Compilation fixes + code review improvements
- `ReasoningContent::Text` variant (struct syntax with `signature` field)
- `message_id` to preserve `Option<String>` type
- `StreamingCompletionResponse` constructor call
- `tracing::warn!` for malformed tool JSON (was silently replaced with `{}`)
## Testing
- `cargo check --release`: ✅ clean
- `cargo test`: 623 passed, 2 failed (pre-existing on v0.3.3: `config::tests::test_llm_provider_tables_parse_with_env_and_lowercase_keys`, `config::tests::toml_round_trip_with_named_instances`)
- v0.3.3
## Impact