Skip to content

fix: harden proxy security fallbacks and passthrough guards#964

Open
ding113 wants to merge 1 commit intodevfrom
security-audit-hardening-20260321
Open

fix: harden proxy security fallbacks and passthrough guards#964
ding113 wants to merge 1 commit intodevfrom
security-audit-hardening-20260321

Conversation

@ding113
Copy link
Owner

@ding113 ding113 commented Mar 21, 2026

Summary

Security hardening for the proxy layer that addresses three areas: endpoint policy separation for passthrough endpoints, session ID generation fallback, and circuit breaker accounting accuracy.

Problem

This PR addresses several security and reliability concerns:

  1. Endpoint Policy Gap: The /v1/responses/compact endpoint was using raw_passthrough policy, which bypassed essential guards (rate limiting, session tracking, request filters). This endpoint needs passthrough forwarding but should still enforce normal security guards.

  2. Session ID Content-Hash Collision Risk: When clients did not provide a session ID, the system fell back to a content-hash based session ID, which could merge different users/keys with identical message content into the same session - a security and accounting risk.

  3. Circuit Breaker False Positives: All non-200 status codes were being counted toward circuit breaker failures, including client-driven errors (400, 404, 422, etc.). This could incorrectly trip circuit breakers on provider health.

  4. Billing Header Exposure: Billing header values were being logged in full during audit, potentially exposing sensitive information.

  5. SSE Parsing Gap: The SSE detection only recognized event: and data: prefixes, missing legal SSE fields like id: and retry:.

Solution

1. Endpoint Policy Separation

  • Created a new guarded_passthrough policy kind
  • /v1/responses/compact now uses guarded_passthrough instead of raw_passthrough
  • /v1/messages/count_tokens keeps strict raw_passthrough
  • guarded_passthrough enables: normal guard pipeline, rate limiting, session tracking, circuit breaker accounting, request filters
  • guarded_passthrough preserves: raw forwarding (bypasses forwarder preprocessing, special settings, response rectifier)

2. Session ID Fallback Removal

  • Removed content-hash based session ID derivation
  • When no client session ID is provided, system now always generates a fresh opaque session ID
  • Eliminates cross-user/session collision risk

3. Circuit Breaker Accounting Refinement

  • Added shouldRecordProviderCircuitFailure() helper
  • Only provider-side failures count toward circuit breaker: 401, 402, 403, 408, 429, 451, and all 5xx
  • Client-driven errors excluded: 400, 404, 409, 413, 415, 422

4. Billing Header Redaction

  • Billing header values in audit logs now show [REDACTED] instead of actual content

5. SSE Field Recognition

  • isSSEText() now accepts id: and retry: as valid SSE field prefixes

Changes

Core Changes

  • src/app/v1/_lib/proxy/endpoint-policy.ts - Add guarded_passthrough policy kind
  • src/app/v1/_lib/proxy/endpoint-family-catalog.ts - Change response-compact accountingTier to required_usage
  • src/app/v1/_lib/proxy/circuit-breaker-accounting.ts - New helper for circuit breaker status filtering
  • src/app/v1/_lib/proxy/response-handler.ts - Use refined circuit breaker accounting
  • src/lib/session-manager.ts - Remove content-hash fallback, always generate fresh session
  • src/lib/utils/sse.ts - Accept id: and retry: SSE prefixes
  • src/app/v1/_lib/proxy/billing-header-rectifier.ts - Redact billing header values

Supporting Changes

  • Test updates across multiple test files to reflect new behavior

Testing

Automated Tests

  • Unit tests added for circuit breaker accounting (circuit-breaker-accounting.test.ts)
  • Unit tests added for session ID fallback (session-manager-session-id-fallback.test.ts)
  • Updated endpoint policy tests for new guarded_passthrough policy
  • Updated SSE parsing tests for id: and retry: recognition
  • Updated billing header tests for redaction verification

Verification

  • npm run typecheck
  • npm run lint
  • npm run build
  • Targeted vitest suites covering passthrough, session fallback, SSE usage parsing, and circuit-breaker regressions

Description enhanced by Claude AI

Greptile Summary

This PR hardens proxy security and reliability across five distinct areas: it splits /v1/responses/compact into a new "guarded_passthrough" policy that restores normal guards/rate-limiting/session tracking while keeping raw forwarding semantics; stops reusing content-hash-derived session IDs when no client session ID is provided (eliminating cross-user session merging); redacts billing header values in audit payloads; fixes SSE detection to accept legal id: and retry: field prefixes; and narrows circuit-breaker accounting to only fire on genuine provider-side failures (excluding client-driven 4xx codes).

  • endpoint-policy.ts: New "guarded_passthrough" kind added; /v1/responses/compact moved from rawPassthroughEndpointPathSet to guardedPassthroughEndpointPathSet restoring "chat" guard preset, concurrent tracking, and circuit-breaker accounting while keeping bypassForwarderPreprocessing: true.
  • circuit-breaker-accounting.ts: New shouldRecordProviderCircuitFailure() helper applied consistently at all four recording sites in response-handler.ts; prevents 400/404/409/413/415/422 from polluting circuit-breaker state. HTTP 403 is included as a provider-side failure — this is debatable as 403 can also be client-permission-driven.
  • session-manager.ts: Content-hash fallback (Redis hash→session mapping) fully removed; fallback now always generates a fresh opaque session ID, closing the cross-user session merge vulnerability.
  • billing-header-rectifier.ts: Extracted values in audit results are now always [REDACTED] instead of the raw header content. The new redactBillingHeaderAuditValue helper contains a dead-code branch (if (!pattern.test(value))) since every call site already verified the pattern.
  • sse.ts: isSSEText extended to recognise id: and retry: prefixes; parseSSEData correctly ignores them per spec (connection-level fields), but this asymmetry is undocumented.
  • All changes are covered by targeted unit tests, including regression guards for the session fallback, circuit-breaker filtering, SSE usage extraction, and endpoint policy separation.

Confidence Score: 4/5

  • PR is safe to merge; all changes are well-tested and address real security/reliability gaps with no logic regressions observed.
  • The five changes are independent, well-scoped, and each backed by targeted tests. No regressions to the primary request path are introduced. The two minor concerns — dead-code in redactBillingHeaderAuditValue and the debatable inclusion of HTTP 403 in the circuit-breaker failure set — are non-blocking style/design notes that do not affect runtime correctness.
  • circuit-breaker-accounting.ts (403 classification) and billing-header-rectifier.ts (dead-code branch) warrant a quick look before merge, but neither blocks shipping.

Important Files Changed

Filename Overview
src/app/v1/_lib/proxy/billing-header-rectifier.ts Adds redaction of the extracted billing-header audit values; the new helper function contains a dead-code branch since all callers already verify the pattern before calling it.
src/app/v1/_lib/proxy/circuit-breaker-accounting.ts New helper that filters which HTTP status codes should count toward the circuit breaker; correctly excludes client-driven 4xx errors, though HTTP 403 inclusion is debatable.
src/app/v1/_lib/proxy/endpoint-family-catalog.ts Promotes response-compact accounting tier from "none" to "required_usage", enabling usage tracking for /v1/responses/compact.
src/app/v1/_lib/proxy/endpoint-policy.ts Introduces a new "guarded_passthrough" policy kind for /v1/responses/compact that restores normal guards, rate-limiting, session tracking, and circuit-breaker accounting while keeping passthrough forwarding semantics.
src/app/v1/_lib/proxy/response-handler.ts Consistently applies shouldRecordProviderCircuitFailure() across all four circuit-breaker recording sites, replacing the previous ad-hoc !== 404 guard.
src/lib/session-manager.ts Removes the content-hash-based session reuse fallback, replacing it with a simple fresh-session generation when no client session ID is provided; eliminates the cross-user session-merging vulnerability.
src/lib/utils/sse.ts Expands isSSEText to recognise id: and retry: as valid SSE field prefixes; parseSSEData intentionally does not parse these fields per SSE spec, but this asymmetry is undocumented.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    REQ[Incoming Request] --> RESOLVE[resolveEndpointPolicy]

    RESOLVE -->|/v1/messages/count_tokens| RAW[raw_passthrough policy\nguardPreset: raw_passthrough\nbypassRequestFilters: true\ntrackConcurrentRequests: false\nallowCircuitBreaker: false]
    RESOLVE -->|/v1/responses/compact| GUARDED[guarded_passthrough policy\nguardPreset: chat\nbypassForwarderPreprocessing: true\ntrackConcurrentRequests: true\nallowCircuitBreaker: true]
    RESOLVE -->|/v1/messages etc.| DEFAULT[default policy\nguardPreset: chat\nallowRetry: true\nallowProviderSwitch: true]

    GUARDED --> CB{shouldRecordProvider\nCircuitFailure?}
    DEFAULT --> CB

    CB -->|401,402,403,408,429,451,5xx| RECORD[recordFailure]
    CB -->|400,404,409,413,415,422| SKIP[skip circuit breaker]

    GUARDED --> SESSION{clientSessionId\nprovided?}
    SESSION -->|yes| REUSE[reuse / refresh TTL]
    SESSION -->|no| FRESH[generateSessionId\nfresh opaque session]
Loading
Prompt To Fix All With AI
This is a comment left during a code review.
Path: src/app/v1/_lib/proxy/billing-header-rectifier.ts
Line: 22-28

Comment:
**Dead-code branch in `redactBillingHeaderAuditValue`**

Every call site already guards with `BILLING_HEADER_PATTERN.test(...)` before invoking this function, so the `if (!BILLING_HEADER_PATTERN.test(value))` branch can never be reached in practice. The `return value.trim()` path is unreachable dead code. The function could be simplified to:

```suggestion
function redactBillingHeaderAuditValue(): string {
  return REDACTED_BILLING_HEADER_VALUE;
}
```

Or, alternatively, inline `REDACTED_BILLING_HEADER_VALUE` directly at the two call sites and remove the helper altogether. As-is, the function gives the misleading impression that it might return the original value, which could cause confusion during future refactors.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/app/v1/_lib/proxy/circuit-breaker-accounting.ts
Line: 1-9

Comment:
**HTTP 403 classification as a provider-side failure**

`403 Forbidden` is included in `PROVIDER_FAILURE_STATUSES`, but 403 is ambiguous in a proxy context: it is often client-driven (e.g., a client API key that lacks permission for the requested resource / model), not a signal that the upstream provider itself is unhealthy. Treating every 403 as a provider circuit failure could trigger unnecessary failovers when many clients send permission-constrained requests.

The PR already correctly excludes clear client-error codes (400, 404, 409, 413, 415, 422), and the same reasoning applies to 403. If the intent is specifically to trip the circuit on upstream key-revocation responses, consider adding a comment explaining the reasoning, or narrow the check to only treat 403 as a provider failure when there is additional context (e.g., an upstream error body) indicating a key-level access denial rather than a scope/permission mismatch.

How can I resolve this? If you propose a fix, please make it concise.

---

This is a comment left during a code review.
Path: src/lib/utils/sse.ts
Line: 73-85

Comment:
**`parseSSEData` silently drops `id:` and `retry:` fields**

`isSSEText` now correctly recognises responses whose first meaningful line starts with `id:` or `retry:`, which is good. However, `parseSSEData` does not handle these two fields — they fall through the if-else chain and are silently dropped. This matches the HTML SSE spec (these fields affect the connection-level reconnection state, not the event payload), but it is worth a comment to make the intent clear and to prevent a future contributor from treating the omission as a bug:

```suggestion
  const sseFieldPrefixes = ["event:", "data:", "id:", "retry:"];
  // Note: `id:` and `retry:` are valid SSE field prefixes and must be
  // recognised here so isSSEText returns true for responses that begin
  // with them. parseSSEData intentionally does NOT process these fields
  // because they control SSE reconnection state, not event data.
  let start = 0;
```

How can I resolve this? If you propose a fix, please make it concise.

Last reviewed commit: "fix(security): harde..."

Greptile also left 3 inline comments on this PR.

@coderabbitai
Copy link

coderabbitai bot commented Mar 21, 2026

📝 Walkthrough

概览

本 PR 引入了一系列针对代理端点处理的结构性变更,包括:计费头部脱敏、电路断路器会计逻辑集中化、将 /v1/responses/compact 端点从原始直传改为受保护直传策略、移除会话 ID 的内容哈希重用机制,以及对应的单元测试更新。

变更

内聚组 / 文件 摘要
计费头部脱敏
src/app/v1/_lib/proxy/billing-header-rectifier.ts, tests/unit/proxy/billing-header-rectifier.test.ts
添加了 redactBillingHeaderAuditValue() 函数和 REDACTED_BILLING_HEADER_VALUE 常量,使提取的审计值在匹配 x-anthropic-billing-header: 模式时返回脱敏占位符而非原始值。测试预期值已更新为脱敏形式。
电路断路器会计
src/app/v1/_lib/proxy/circuit-breaker-accounting.ts, src/app/v1/_lib/proxy/response-handler.ts, tests/unit/proxy/circuit-breaker-accounting.test.ts, tests/unit/proxy/response-handler-endpoint-circuit-isolation.test.ts
引入 shouldRecordProviderCircuitFailure() 函数,根据 HTTP 状态码(5xx 及 401/402/403/408/429/451)判断是否记录电路故障。响应处理器中多个位置已用此集中化谓词替代硬编码的 404 排除逻辑。
端点策略重构
src/app/v1/_lib/proxy/endpoint-policy.ts, src/app/v1/_lib/proxy/endpoint-family-catalog.ts, tests/unit/proxy/endpoint-policy.test.ts, tests/unit/proxy/endpoint-policy-parity.test.ts, tests/unit/proxy/session.test.ts, tests/unit/proxy/guard-pipeline-warmup.test.ts
EndpointPolicy.kind 扩展为包含新的 "guarded_passthrough" 选项。RESPONSES_COMPACT 端点现映射至此新策略(包含标准保护配置),而 MESSAGES_COUNT_TOKENS 保持原始直传行为。相应地,endpoint-family-catalogresponse-compactaccountingTier 已从 "none" 改为 "required_usage"
会话 ID 衍生简化
src/lib/session-manager.ts, tests/unit/lib/session-manager-session-id-fallback.test.ts
移除了基于消息内容哈希的会话重用机制及关联的 Redis 查询和 storeSessionMapping 助手。无客户端会话 ID 时,现在总是生成新的不透明会话 ID。新增测试验证相同内容的多次调用产生不同的会话 ID。
SSE 检测扩展
src/lib/utils/sse.ts, tests/unit/proxy/extract-usage-metrics.test.ts
isSSEText 函数扩展为识别 id:retry: 前缀(原仅识别 event:data:),增强 SSE 流检测的覆盖范围。新测试案例验证包含 id:retry: 行的 SSE 流解析。
代理处理器端点隔离
tests/unit/proxy/proxy-handler-session-id-error.test.ts
将参数化测试拆分为两个独立测试:count_tokens 端点(预期仅 ["startRequest"] 跟踪调用)和 responses/compact 端点(预期 ["inc", "startRequest", "dec"] 跟踪调用),反映两个端点的差异化处理逻辑。

代码审查工作量估计

🎯 4 (复杂) | ⏱️ ~60 分钟

可能相关的 PR

  • PR #801:同时修改代理端点策略表面(端点策略、会话端点解析、保护管道行为、直传端点的转发器/响应处理器门控),代码级别直接相关。
  • PR #784:修改同一 billing-header-rectifier 模块,主 PR 添加了提取值脱敏而该 PR 引入了脱敏器本身,直接相关。
  • PR #793:都改变会话 ID 衍生逻辑以停止使用确定性/回退会话 ID 并避免跨会话冲突,相关联。
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (2 passed)
Check name Status Explanation
Title check ✅ Passed 标题准确概括了拉取请求的主要变化:强化代理安全性和后传递保护。
Description check ✅ Passed 描述与变更集相关,列出了核心改进:恢复守卫、限流和会话跟踪、停止会话ID重用、审计日志脱敏、支持SSE前缀和电路断路器改进。

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch security-audit-hardening-20260321

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@gemini-code-assist
Copy link
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request focuses on enhancing the security and reliability of the proxy service. It addresses potential security vulnerabilities by redacting sensitive billing information and improves the accuracy of the circuit breaker by only considering provider-side errors. Additionally, it restores essential guards and tracking mechanisms for a specific endpoint while maintaining its passthrough functionality, and it hardens session ID generation.

Highlights

  • Security Hardening: Redacted billing headers in audit payloads to prevent sensitive information leakage.
  • Passthrough Guard Restoration: Re-enabled normal guards, rate-limiting, and session tracking for the /v1/responses/compact endpoint while preserving passthrough forwarding.
  • Session ID Handling: Stopped reusing content-hash session IDs when the client does not provide a session ID, ensuring better session isolation.
  • Circuit Breaker Improvement: Limited circuit breaker accounting to provider-side statuses, preventing client-induced errors from triggering the breaker.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for GitHub and other Google products, sign up here.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@github-actions github-actions bot added bug Something isn't working area:core area:session area:provider size/L Large PR (< 1000 lines) labels Mar 21, 2026
Copy link
Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces several security and reliability improvements. A new 'guarded passthrough' policy is implemented for the /v1/responses/compact endpoint, which restores standard guards while preserving passthrough forwarding. The fallback mechanism of using content hashes for session IDs has been removed to prevent potential session collisions. Other changes include refining circuit breaker logic to only count provider-side failures, redacting billing header payloads in audit logs, and improving SSE parsing to be more spec-compliant. The changes are consistent and well-supported by updated and new tests.

Comment on lines +22 to +28
function redactBillingHeaderAuditValue(value: string): string {
if (!BILLING_HEADER_PATTERN.test(value)) {
return value.trim();
}

return REDACTED_BILLING_HEADER_VALUE;
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Dead-code branch in redactBillingHeaderAuditValue

Every call site already guards with BILLING_HEADER_PATTERN.test(...) before invoking this function, so the if (!BILLING_HEADER_PATTERN.test(value)) branch can never be reached in practice. The return value.trim() path is unreachable dead code. The function could be simplified to:

Suggested change
function redactBillingHeaderAuditValue(value: string): string {
if (!BILLING_HEADER_PATTERN.test(value)) {
return value.trim();
}
return REDACTED_BILLING_HEADER_VALUE;
}
function redactBillingHeaderAuditValue(): string {
return REDACTED_BILLING_HEADER_VALUE;
}

Or, alternatively, inline REDACTED_BILLING_HEADER_VALUE directly at the two call sites and remove the helper altogether. As-is, the function gives the misleading impression that it might return the original value, which could cause confusion during future refactors.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/v1/_lib/proxy/billing-header-rectifier.ts
Line: 22-28

Comment:
**Dead-code branch in `redactBillingHeaderAuditValue`**

Every call site already guards with `BILLING_HEADER_PATTERN.test(...)` before invoking this function, so the `if (!BILLING_HEADER_PATTERN.test(value))` branch can never be reached in practice. The `return value.trim()` path is unreachable dead code. The function could be simplified to:

```suggestion
function redactBillingHeaderAuditValue(): string {
  return REDACTED_BILLING_HEADER_VALUE;
}
```

Or, alternatively, inline `REDACTED_BILLING_HEADER_VALUE` directly at the two call sites and remove the helper altogether. As-is, the function gives the misleading impression that it might return the original value, which could cause confusion during future refactors.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines +1 to +9
const PROVIDER_FAILURE_STATUSES = new Set([401, 402, 403, 408, 429, 451]);

export function shouldRecordProviderCircuitFailure(statusCode: number): boolean {
if (statusCode >= 500) {
return true;
}

return PROVIDER_FAILURE_STATUSES.has(statusCode);
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 HTTP 403 classification as a provider-side failure

403 Forbidden is included in PROVIDER_FAILURE_STATUSES, but 403 is ambiguous in a proxy context: it is often client-driven (e.g., a client API key that lacks permission for the requested resource / model), not a signal that the upstream provider itself is unhealthy. Treating every 403 as a provider circuit failure could trigger unnecessary failovers when many clients send permission-constrained requests.

The PR already correctly excludes clear client-error codes (400, 404, 409, 413, 415, 422), and the same reasoning applies to 403. If the intent is specifically to trip the circuit on upstream key-revocation responses, consider adding a comment explaining the reasoning, or narrow the check to only treat 403 as a provider failure when there is additional context (e.g., an upstream error body) indicating a key-level access denial rather than a scope/permission mismatch.

Prompt To Fix With AI
This is a comment left during a code review.
Path: src/app/v1/_lib/proxy/circuit-breaker-accounting.ts
Line: 1-9

Comment:
**HTTP 403 classification as a provider-side failure**

`403 Forbidden` is included in `PROVIDER_FAILURE_STATUSES`, but 403 is ambiguous in a proxy context: it is often client-driven (e.g., a client API key that lacks permission for the requested resource / model), not a signal that the upstream provider itself is unhealthy. Treating every 403 as a provider circuit failure could trigger unnecessary failovers when many clients send permission-constrained requests.

The PR already correctly excludes clear client-error codes (400, 404, 409, 413, 415, 422), and the same reasoning applies to 403. If the intent is specifically to trip the circuit on upstream key-revocation responses, consider adding a comment explaining the reasoning, or narrow the check to only treat 403 as a provider failure when there is additional context (e.g., an upstream error body) indicating a key-level access denial rather than a scope/permission mismatch.

How can I resolve this? If you propose a fix, please make it concise.

Comment on lines 73 to +85
@@ -81,7 +82,7 @@ export function isSSEText(text: string): boolean {
if (!line) continue;
if (line.startsWith(":")) continue;

return line.startsWith("event:") || line.startsWith("data:");
return sseFieldPrefixes.some((prefix) => line.startsWith(prefix));
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 parseSSEData silently drops id: and retry: fields

isSSEText now correctly recognises responses whose first meaningful line starts with id: or retry:, which is good. However, parseSSEData does not handle these two fields — they fall through the if-else chain and are silently dropped. This matches the HTML SSE spec (these fields affect the connection-level reconnection state, not the event payload), but it is worth a comment to make the intent clear and to prevent a future contributor from treating the omission as a bug:

Suggested change
const sseFieldPrefixes = ["event:", "data:", "id:", "retry:"];
// Note: `id:` and `retry:` are valid SSE field prefixes and must be
// recognised here so isSSEText returns true for responses that begin
// with them. parseSSEData intentionally does NOT process these fields
// because they control SSE reconnection state, not event data.
let start = 0;
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/utils/sse.ts
Line: 73-85

Comment:
**`parseSSEData` silently drops `id:` and `retry:` fields**

`isSSEText` now correctly recognises responses whose first meaningful line starts with `id:` or `retry:`, which is good. However, `parseSSEData` does not handle these two fields — they fall through the if-else chain and are silently dropped. This matches the HTML SSE spec (these fields affect the connection-level reconnection state, not the event payload), but it is worth a comment to make the intent clear and to prevent a future contributor from treating the omission as a bug:

```suggestion
  const sseFieldPrefixes = ["event:", "data:", "id:", "retry:"];
  // Note: `id:` and `retry:` are valid SSE field prefixes and must be
  // recognised here so isSSEText returns true for responses that begin
  // with them. parseSSEData intentionally does NOT process these fields
  // because they control SSE reconnection state, not event data.
  let start = 0;
```

How can I resolve this? If you propose a fix, please make it concise.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/lib/utils/sse.ts (1)

68-73: ⚠️ Potential issue | 🟡 Minor

文档注释与实现不一致。

第 70 行的注释说明 "只认行首的 event: / data:",但实现已扩展为包含 id:retry: 前缀。建议更新注释以保持一致性。

建议修复
 /**
  * 严格检测文本是否"看起来像" SSE。
  *
- * 只认行首的 `event:` / `data:`(或前置注释行 `:`),避免 JSON 里包含 "data:" 误判。
+ * 只认行首的 `event:` / `data:` / `id:` / `retry:`(或前置注释行 `:`),避免 JSON 里包含 "data:" 误判。
  */
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/utils/sse.ts` around lines 68 - 73, The doc comment above is out of
sync with the implementation: update the comment for isSSEText to reflect that
it recognizes line-start SSE fields beyond just `event:`/`data:` (specifically
`id:` and `retry:` as listed in the sseFieldPrefixes constant) or alternatively
narrow sseFieldPrefixes to match the original comment; modify the prose around
the function (the block comment above isSSEText) to explicitly list `event:`,
`data:`, `id:`, and `retry:` so the documentation and the sseFieldPrefixes array
stay consistent.
src/lib/session-manager.ts (1)

262-327: ⚠️ Potential issue | 🟡 Minor

calculateMessagesHash 是死代码,应该删除。

搜索确认该方法没有任何调用者。getOrCreateSessionId 在所有分支中都不调用此方法,而是直接调用 generateSessionId() 生成新的 session ID。整个代码库中不存在对 calculateMessagesHash 的实际函数调用。

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/lib/session-manager.ts` around lines 262 - 327, Remove the dead static
method calculateMessagesHash from the SessionManager class: delete the entire
calculateMessagesHash implementation (including its internal use of crypto and
related logging) since there are no callers (getOrCreateSessionId uses
generateSessionId instead and the codebase has no references). After removal,
run a quick search for calculateMessagesHash and remove any now-unused imports
(e.g., crypto or logger) from the file if they become unused.
🧹 Nitpick comments (2)
src/app/v1/_lib/proxy/circuit-breaker-accounting.ts (1)

1-9: 建议添加文档注释说明状态码分类逻辑。

当前实现逻辑正确,但缺少解释为何选择这些特定状态码的文档。这有助于未来维护者理解设计意图。

📝 建议的文档注释
+/**
+ * 供应商侧故障状态码(非 5xx):
+ * - 401/402/403: 认证/授权/计费问题
+ * - 408: 请求超时
+ * - 429: 速率限制
+ * - 451: 法律原因不可用
+ */
 const PROVIDER_FAILURE_STATUSES = new Set([401, 402, 403, 408, 429, 451]);

+/**
+ * 判断是否应记录供应商熔断失败。
+ * 仅计入供应商侧状态码(5xx + 特定 4xx),排除客户端驱动的错误(如 400/404/422)。
+ */
 export function shouldRecordProviderCircuitFailure(statusCode: number): boolean {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@src/app/v1/_lib/proxy/circuit-breaker-accounting.ts` around lines 1 - 9, Add
a short doc comment above the PROVIDER_FAILURE_STATUSES constant and the
shouldRecordProviderCircuitFailure function that explains the classification:
treat any 500+ as provider/server failures, and include the rationale for each
explicit code in PROVIDER_FAILURE_STATUSES (401 auth failures, 402/403
billing/forbidden, 408 request timeout, 429 rate limiting, 451
legal/unavailable) being treated as provider-side failures for circuit-breaker
accounting; mention that these are used to decide when to record provider
circuit failures and that the >=500 branch handles generic server errors.
tests/unit/proxy/circuit-breaker-accounting.test.ts (1)

1-16: LGTM! 测试覆盖全面。

测试正确验证了 shouldRecordProviderCircuitFailure 对已知状态码的行为。

可选改进:考虑添加边界情况测试(如 499、501 等)以增强覆盖。

🧪 可选的边界测试
+  test.each([
+    499, 501, 599,
+  ])("handles boundary status %s correctly", (statusCode) => {
+    const expected = statusCode >= 500;
+    expect(shouldRecordProviderCircuitFailure(statusCode)).toBe(expected);
+  });
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/proxy/circuit-breaker-accounting.test.ts` around lines 1 - 16, Add
boundary-case test entries to the existing shouldRecordProviderCircuitFailure
test matrix to cover edge status codes (for example 499 and 501) so we validate
behavior around the 4xx/5xx boundaries; update the test.each arrays in the
describe("shouldRecordProviderCircuitFailure") block to include these additional
status codes and keep expectations consistent with the function
shouldRecordProviderCircuitFailure.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@tests/unit/proxy/response-handler-endpoint-circuit-isolation.test.ts`:
- Line 442: The test passes an extra body argument to createNon200StreamResponse
but the function currently only accepts statusCode and uses a hardcoded body;
update createNon200StreamResponse to accept an optional body parameter (e.g.,
body?: string | object) and use that when constructing the Response (instead of
the hardcoded { error: "rate limit exceeded" }), and update the test call in
response-handler-endpoint-circuit-isolation.test.ts to pass the JSON string (or
an object) so the response body matches the test intent; alternatively, if you
prefer the simpler fix, remove the extra second argument from the test call so
createNon200StreamResponse(statusCode) is used as originally implemented.

---

Outside diff comments:
In `@src/lib/session-manager.ts`:
- Around line 262-327: Remove the dead static method calculateMessagesHash from
the SessionManager class: delete the entire calculateMessagesHash implementation
(including its internal use of crypto and related logging) since there are no
callers (getOrCreateSessionId uses generateSessionId instead and the codebase
has no references). After removal, run a quick search for calculateMessagesHash
and remove any now-unused imports (e.g., crypto or logger) from the file if they
become unused.

In `@src/lib/utils/sse.ts`:
- Around line 68-73: The doc comment above is out of sync with the
implementation: update the comment for isSSEText to reflect that it recognizes
line-start SSE fields beyond just `event:`/`data:` (specifically `id:` and
`retry:` as listed in the sseFieldPrefixes constant) or alternatively narrow
sseFieldPrefixes to match the original comment; modify the prose around the
function (the block comment above isSSEText) to explicitly list `event:`,
`data:`, `id:`, and `retry:` so the documentation and the sseFieldPrefixes array
stay consistent.

---

Nitpick comments:
In `@src/app/v1/_lib/proxy/circuit-breaker-accounting.ts`:
- Around line 1-9: Add a short doc comment above the PROVIDER_FAILURE_STATUSES
constant and the shouldRecordProviderCircuitFailure function that explains the
classification: treat any 500+ as provider/server failures, and include the
rationale for each explicit code in PROVIDER_FAILURE_STATUSES (401 auth
failures, 402/403 billing/forbidden, 408 request timeout, 429 rate limiting, 451
legal/unavailable) being treated as provider-side failures for circuit-breaker
accounting; mention that these are used to decide when to record provider
circuit failures and that the >=500 branch handles generic server errors.

In `@tests/unit/proxy/circuit-breaker-accounting.test.ts`:
- Around line 1-16: Add boundary-case test entries to the existing
shouldRecordProviderCircuitFailure test matrix to cover edge status codes (for
example 499 and 501) so we validate behavior around the 4xx/5xx boundaries;
update the test.each arrays in the
describe("shouldRecordProviderCircuitFailure") block to include these additional
status codes and keep expectations consistent with the function
shouldRecordProviderCircuitFailure.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 55675d5f-b644-4ba2-93a8-7e5118ad2e6d

📥 Commits

Reviewing files that changed from the base of the PR and between f2ab6d2 and 0a37351.

📒 Files selected for processing (18)
  • src/app/v1/_lib/proxy/billing-header-rectifier.ts
  • src/app/v1/_lib/proxy/circuit-breaker-accounting.ts
  • src/app/v1/_lib/proxy/endpoint-family-catalog.ts
  • src/app/v1/_lib/proxy/endpoint-policy.ts
  • src/app/v1/_lib/proxy/response-handler.ts
  • src/lib/session-manager.ts
  • src/lib/utils/sse.ts
  • tests/unit/lib/session-manager-session-id-fallback.test.ts
  • tests/unit/proxy/billing-header-rectifier.test.ts
  • tests/unit/proxy/circuit-breaker-accounting.test.ts
  • tests/unit/proxy/endpoint-family-catalog.test.ts
  • tests/unit/proxy/endpoint-policy-parity.test.ts
  • tests/unit/proxy/endpoint-policy.test.ts
  • tests/unit/proxy/extract-usage-metrics.test.ts
  • tests/unit/proxy/guard-pipeline-warmup.test.ts
  • tests/unit/proxy/proxy-handler-session-id-error.test.ts
  • tests/unit/proxy/response-handler-endpoint-circuit-isolation.test.ts
  • tests/unit/proxy/session.test.ts

upstreamStatusCode: 400,
});

const response = createNon200StreamResponse(400, '{"error":{"message":"Invalid request"}}');
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟡 Minor

函数签名不匹配:多余的参数被忽略。

createNon200StreamResponse 函数(第 276-289 行)只接受一个 statusCode 参数,但这里传递了两个参数。第二个参数 '{"error":{"message":"Invalid request"}}' 会被忽略,实际使用的是硬编码的 { error: "rate limit exceeded" }

测试仍可能通过(因为核心断言是检查 recordFailure 未被调用),但响应体内容与测试意图不符。

🐛 修复方案 A:更新函数以支持自定义 body
-function createNon200StreamResponse(statusCode: number): Response {
-  const body = `data: ${JSON.stringify({ error: "rate limit exceeded" })}\n\n`;
+function createNon200StreamResponse(statusCode: number, errorBody?: string): Response {
+  const body = errorBody
+    ? `data: ${errorBody}\n\n`
+    : `data: ${JSON.stringify({ error: "rate limit exceeded" })}\n\n`;
   const encoder = new TextEncoder();
🐛 修复方案 B:移除多余参数
-    const response = createNon200StreamResponse(400, '{"error":{"message":"Invalid request"}}');
+    const response = createNon200StreamResponse(400);
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
const response = createNon200StreamResponse(400, '{"error":{"message":"Invalid request"}}');
const response = createNon200StreamResponse(400);
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@tests/unit/proxy/response-handler-endpoint-circuit-isolation.test.ts` at line
442, The test passes an extra body argument to createNon200StreamResponse but
the function currently only accepts statusCode and uses a hardcoded body; update
createNon200StreamResponse to accept an optional body parameter (e.g., body?:
string | object) and use that when constructing the Response (instead of the
hardcoded { error: "rate limit exceeded" }), and update the test call in
response-handler-endpoint-circuit-isolation.test.ts to pass the JSON string (or
an object) so the response body matches the test intent; alternatively, if you
prefer the simpler fix, remove the extra second argument from the test call so
createNon200StreamResponse(statusCode) is used as originally implemented.

Copy link

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 0a373518ea

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +52 to +54
allowRetry: false,
allowProviderSwitch: false,
allowCircuitBreakerAccounting: true,

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Count guarded passthrough failures before aborting in forwarder

For /v1/responses/compact, this policy now says allowCircuitBreakerAccounting=true while keeping allowRetry=false. In ProxyForwarder.send(), every !allowRetry endpoint takes the raw-passthrough fast path and throws before any recordFailure() call, and doForward() turns non-2xx upstream responses into ProxyErrors before ProxyResponseHandler can make up the difference. In practice, repeated 401/429/500s on responses/compact will never open the provider circuit, so this endpoint keeps hammering a bad provider even though the new policy claims to restore breaker accounting.

Useful? React with 👍 / 👎.

Comment on lines +394 to 400
// 2. 安全降级:客户端未提供 session_id 时,始终生成新的 opaque session。
// 旧的 content-hash 复用会把不同 key/user 的同构请求错误合并进同一 session。
logger.warn("SessionManager: No client session ID, generating fresh opaque session", {
keyId,
messagesLength,
});
return SessionManager.generateSessionId();

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve a reusable fallback session for clients without session_id

When clientSessionId is missing, this now always generates a brand-new opaque session. Successful responses only attach x-cch-session-id on >=400 responses (error-session-id.ts), and only Claude requests get any best-effort metadata completion, so OpenAI/Codex/Gemini clients that do not already send a session token can never learn or reuse the generated ID. For those callers, every 2xx turn becomes a fresh session, which breaks provider stickiness, request sequencing, and session grouping that previously worked via the fallback path.

Useful? React with 👍 / 👎.

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review Summary

No significant issues identified in this PR.

PR Size: L

  • Lines changed: 567
  • Files changed: 18
  • Split suggestion: Consider splitting into (1) endpoint-policy + passthrough guard changes, (2) circuit-breaker accounting changes, (3) SessionManager session_id fallback change, (4) SSE detection + usage parsing, (5) billing header rectifier redaction.

Review Coverage

  • Logic and correctness - Clean
  • Security (OWASP Top 10) - Clean
  • Error handling - Clean
  • Type safety - Clean
  • Documentation accuracy - Clean
  • Test coverage - Adequate
  • Code clarity - Good

Automated review by Codex AI

Copy link
Contributor

@github-actions github-actions bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Identified open PR: #964 (“fix: harden proxy security fallbacks and passthrough guards”).
  • Calculated PR size: L (567 lines changed across 18 files) and applied label size/L.
  • Completed a diff-scoped review across the requested perspectives; no issues met the >= 80 confidence threshold, so no inline review comments were posted.
  • Submitted the required summary review via gh pr review --comment (includes L-size split suggestions).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area:core area:provider area:session bug Something isn't working size/L Large PR (< 1000 lines)

Projects

Status: Backlog

Development

Successfully merging this pull request may close these issues.

1 participant