fix(desktop): bound problem report size with per-component byte budgets #1471

Merged: Astro-Han merged 3 commits into dev from claude/diag-size-budgets, Jun 23, 2026

@Astro-Han Astro-Han commented Jun 23, 2026

Owner

Summary

Bound the "Prepare Diagnostics Package" export with per-component byte budgets — PR2 of the diagnostics-package rebuild (#1465). Each component is capped to its own budget after redaction and before the overall 5 MB truncation ladder (which becomes a fallback):

  • Per-component budgets — fixed and private (not a configurable API; no production caller tunes them): logTailBytes 1 MB, sessionMessagesBytes 1 MB, sessionMessageBytes 256 KB, rendererErrorDetailsBytes 64 KB (bounds both the renderer-error summary and details), rendererDiagnosticsBytes 1 MB. The cap helpers (capLogTailBytes / capMessageParts / capSessionMessagesBytes / headBytes / capEvents) are exported and unit-tested directly at small limits, with a few default-budget tests covering the end-to-end wiring.
  • Hard caps, no force-keep: capSessionMessagesBytes drops even the newest message if it does not fit, so the cap is a true ceiling; capMessageParts keeps the latest parts of an oversized turn (the tool output / error nearest the failure), matching the ladder's oldest-first drop order.
  • Renderer diagnostics re-budgeted after redaction: the slice is byte-capped at the source, but redaction here can re-expand it, so it is re-bounded with the same capEvents() ruler. The component pass and the overall fallback share that one mechanism (incident events dropped only when a tight limit leaves no alternative), replacing a divergent bespoke fallback loop that could leave an all-protected oversized slice unbounded.
  • UTF-8-safe byte slicing (headBytes / tailBytes) so capping never splits a multi-byte codepoint at the budget boundary.
  • Consistent truncation ledger: every omitted-bytes counter is tracked identically in the component pass and the overall fallback ladder, validated by isTruncation.
  • Structure-agnostic private-key redaction (problem-report-redact.ts): defense moved out of the truncation layer into the redactor as anchored, linear-time (ReDoS-safe) rules covering complete BEGIN..END blocks, BEGIN-truncated keys, markerless ≥2-line base64 bodies, orphaned END markers (separate line, same line, PGP =CRC checksum line), armor-headered bodies, and CRLF.
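
The UTF-8-safe slicing bullet can be illustrated with a minimal sketch. It assumes headBytes and tailBytes take a string and a byte limit and return a string; those signatures and the boundary-walking approach are assumptions for illustration, not the code in problem-report.ts.

```typescript
const utf8Encoder = new TextEncoder();
const utf8Decoder = new TextDecoder();

// A UTF-8 continuation byte has the bit pattern 10xxxxxx.
const isContinuation = (byte: number | undefined): boolean =>
  byte !== undefined && (byte & 0xc0) === 0x80;

/** Keep at most `limit` bytes from the start, backing up to a codepoint boundary. */
function headBytes(text: string, limit: number): string {
  const bytes = utf8Encoder.encode(text);
  if (bytes.length <= limit) return text;
  let end = Math.max(0, limit);
  // If the first dropped byte is a continuation byte, the cut is mid-codepoint.
  while (end > 0 && isContinuation(bytes[end])) end--;
  return utf8Decoder.decode(bytes.subarray(0, end));
}

/** Keep at most `limit` bytes from the end, advancing to a codepoint boundary. */
function tailBytes(text: string, limit: number): string {
  const bytes = utf8Encoder.encode(text);
  if (bytes.length <= limit) return text;
  let start = bytes.length - Math.max(0, limit);
  // Skip forward past any continuation bytes left over from a split codepoint.
  while (start < bytes.length && isContinuation(bytes[start])) start++;
  return utf8Decoder.decode(bytes.subarray(start));
}
```

Both helpers may return fewer bytes than the limit when the boundary falls inside a codepoint, which keeps the cap a ceiling rather than a target.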

Why

The export could grow without bound: a single huge log tail, one giant session message, or an oversized renderer-error blob could dominate the report or trip the overall 5 MB cap in a way that silently drops other components. Per-component budgets make each part's ceiling explicit and keep the report balanced, with the overall ladder only as a fallback. The budget/truncation work also surfaced a real redaction gap: a pre-redaction tail cut could strand a private-key body whose BEGIN marker was removed, defeating a BEGIN-keyed redactor — so private-key defense now lives entirely in the redactor.
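
The redaction gap described above can be reproduced with a small sketch. The function names, the placeholder text, and the exact regular expressions are hypothetical; only the idea (a markerless rule for runs of two or more base64 lines with a 16-character floor) comes from the description. The actual rules in problem-report-redact.ts cover more cases, such as orphaned END markers and armor headers.

```typescript
// Hypothetical placeholder; the real redactor's replacement text is not shown in this PR.
const PLACEHOLDER = "[REDACTED PRIVATE KEY]";

// One base64 body line: at least 16 base64 characters plus optional padding.
// Anchored, with no nested quantifiers, so matching is linear in the line length.
const BASE64_LINE = /^[A-Za-z0-9+\/]{16,}={0,2}$/;

/** A redactor keyed on the BEGIN marker: it only fires when the marker survived. */
function redactBeginKeyed(text: string): string {
  return text.replace(
    /-----BEGIN [A-Z ]*PRIVATE KEY-----[^-]*-----END [A-Z ]*PRIVATE KEY-----/g,
    PLACEHOLDER,
  );
}

/** Line scan: replace every run of two or more consecutive base64 body lines. */
function redactMarkerlessBodies(text: string): string {
  const out: string[] = [];
  let run: string[] = [];
  const flush = (): void => {
    if (run.length >= 2) out.push(PLACEHOLDER);
    else out.push(...run); // a single base64-looking line (hash, token) is kept
    run = [];
  };
  for (const line of text.split("\n")) {
    // Tolerate CRLF by ignoring a trailing carriage return when testing the line.
    if (BASE64_LINE.test(line.replace(/\r$/, ""))) {
      run.push(line);
    } else {
      flush();
      out.push(line);
    }
  }
  flush();
  return out.join("\n");
}
```

When a tail cut removes the BEGIN line, redactBeginKeyed leaves the body untouched, while the line scan still replaces it. This sketch leaves the orphaned END marker in place.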

Related Issue

Refs #1465 (umbrella). This is PR2; PR1 #1470 is merged, PR3 #1472 follows. Rebased flat onto dev after #1470 merged.

Human Review Status

Pending

Review Focus

  • The redact-before-truncate invariant: redaction runs on the full payload before any component capping; the only pre-redaction truncation is tailFile (byte + line tail of the log), which now only drops a partial first line — multi-line-secret defense lives entirely in the redactor.
  • Hard caps are true ceilings (drop the newest message if it does not fit), and the omitted-bytes ledger stays consistent between the component pass and the fallback ladder.
  • Renderer diagnostics now go through one capEvents() ruler in both the component pass and the overall fallback (re-derived from the redacted baseline so the omitted ledger stays correct), instead of a separate fallback loop that broke on protected events.
  • UTF-8-safe byte slicing at budget boundaries.
  • The structure-agnostic private-key redactor rules (anchored, linear-time) and whether the enumerated marker/armor/CRLF cases are complete.
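
As a rough illustration of the hard-cap behavior under review, the following sketch drops messages oldest-first and returns nothing when even the newest message exceeds the budget. The signature, the result shape, and measuring each message by its JSON length are assumptions; the real capSessionMessagesBytes may account for bytes differently.

```typescript
interface CapResult<T> {
  kept: T[];
  omittedBytes: number;
}

// Assumed measure: UTF-8 length of the JSON form of one message.
const jsonBytes = (value: unknown): number =>
  new TextEncoder().encode(JSON.stringify(value)).length;

function capSessionMessagesBytes<T>(messages: T[], limit: number): CapResult<T> {
  let used = 0;
  let firstKept = messages.length;
  // Walk newest to oldest and stop at the first message that would exceed the budget.
  // There is no force-keep: if the newest message alone is too large, nothing is kept.
  for (let i = messages.length - 1; i >= 0; i--) {
    const size = jsonBytes(messages[i]);
    if (used + size > limit) break;
    used += size;
    firstKept = i;
  }
  const omittedBytes = messages
    .slice(0, firstKept)
    .reduce((sum, message) => sum + jsonBytes(message), 0);
  return { kept: messages.slice(firstKept), omittedBytes };
}
```

Returning omittedBytes alongside the kept messages is what lets a caller add the same figure to the truncation ledger regardless of which pass did the trimming.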

Risk Notes

Accepted residuals (confirmed by an independent adversarial review; left as documented tradeoffs rather than risking the ReDoS-safety / false-positive balance):

  • A private-key body wrapped at non-standard narrow widths (< 16 chars/line) whose BEGIN marker was truncated away is not redacted, because the markerless rule keeps the same {16,} per-line floor used everywhere else (the floor avoids redacting legitimate short base64 tokens/hashes). Real tools emit 64–76 char lines, so this only affects deliberately re-wrapped keys. PR3's human-review step is the backstop.
  • The overall fallback ladder still uses UTF-16 slice() for log / renderer-error text. It is effectively unreachable under default budgets (component budgets sum well under 5 MB), so a lone-surrogate split there is cosmetic.
  • No visible UI in this PR (the in-app review panel is PR3, #1472, "feat(desktop): user-reviewed diagnostics package export"), so the UI/copy checklist item below is left unticked; no screenshots.
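
The lone-surrogate residual mentioned for the fallback ladder can be seen in a short sketch. The helper names are hypothetical and this is not code from the PR; it only shows what a UTF-16 slice() can leave behind and one way a code-unit slice could avoid it.

```typescript
/** True when the string ends with an unpaired high surrogate. */
function endsWithLoneSurrogate(s: string): boolean {
  const last = s.charCodeAt(s.length - 1);
  return last >= 0xd800 && last <= 0xdbff;
}

/** Code-unit slice that backs off one unit rather than splitting a surrogate pair. */
function sliceCodeUnitsSafe(s: string, end: number): string {
  const cut = s.slice(0, end);
  // Only back off when something was actually cut away after the high surrogate.
  return cut.length < s.length && endsWithLoneSurrogate(cut) ? cut.slice(0, -1) : cut;
}
```

A plain slice(0, 4) on "ok " followed by U+1F600 ends in half of the emoji's surrogate pair; the safe variant returns "ok " instead.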

How To Verify

bun test (packages/desktop-electron) — 656 pass, 0 fail (full suite, latest head)
bun run typecheck (packages/desktop-electron, tsgo -b) — clean
eslint (changed files) — clean
Targeted coverage: pure-helper edge cases (each cap helper at small limits), default-budget
  end-to-end wiring, hard-cap ceilings, ledger consistency, renderer-diagnostics re-budget +
  all-protected drain, UTF-8 boundary slicing, and ReDoS-perf regression on the private-key redactor.

Screenshots or Recordings

None — no visible UI in this PR (the in-app review panel is PR3 #1472).

Checklist

How to use this checklist:

  • Tick a box by replacing [ ] with [x]. Do not edit, add, or remove items.
  • The bot-applied label items can only be honestly ticked AFTER the PR is opened and the labeler / priority-triage bots have run — return to the PR description and tick them then.
  • Most items are required. The few that are conditional are explicitly marked (conditional); for those, leave unticked if they truly do not apply and explain why in Risk Notes. All other items must be ticked before requesting human review.
  • Type label — this PR carries exactly one of bug, enhancement, task, documentation. Type labels are author-added; the labeler bot does NOT assign them. Add the label in the GitHub UI, then tick this.
  • Routing labels — this PR carries at least one of app, ui, platform, harness, ci. The labeler bot assigns these on PR open based on changed paths. Confirm the bot's choice (or override if wrong), then tick this.
  • Priority label — this PR carries exactly one of P0, P1, P2, P3. The priority-triage bot suggests one on PR open. Confirm or override, then tick this.
  • Human Review Status above is set to Pending, Approved by @<reviewer>, or Not required: <reason> (default is Pending; "not required" is restricted to bot-authored low-risk PRs).
  • I linked the related issue, or stated in Summary why there is no issue.
  • I described the review focus and any meaningful risks.
  • I replaced the example block in How To Verify with the real verification steps and the key result for each.
  • I did not introduce unrelated refactors, dependencies, generated files, or file changes beyond the stated scope.
  • (conditional) I manually checked visible UI or copy changes when needed, with screenshots or recordings. Leave unticked only if no visible UI or copy changed.
  • (conditional) I considered macOS and Windows impact for platform, packaging, updater, signing, paths, shell, or permissions changes. Leave unticked only if no platform/packaging surface was touched.
  • (conditional) I called out docs, release notes, dependencies, permissions, credentials, deletion behavior, generated content, or local file changes when relevant. Leave unticked only if none of those surfaces was touched.
  • I reviewed the final diff for unrelated changes and suspicious dependency changes.
  • I am targeting dev, and my PR title and commit messages use Conventional Commits in English.

@Astro-Han Astro-Han added the bug (Something isn't working), P1 (High priority), and platform (Electron shell, OS integration, packaging, updater, signing, paths, and permissions) labels Jun 23, 2026
coderabbitai Bot commented Jun 23, 2026

Contributor

Warning

Review limit reached

@Astro-Han, we couldn't start this review because you've reached your PR review rate limit.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: 69cf1ad5-ebf1-42f0-bfb8-543c9c66b07a

📥 Commits

Reviewing files that changed from the base of the PR and between da69e10 and baa61f6.

📒 Files selected for processing (8)
  • packages/desktop-electron/src/main/logging.test.ts
  • packages/desktop-electron/src/main/logging.ts
  • packages/desktop-electron/src/main/problem-report-redact.test.ts
  • packages/desktop-electron/src/main/problem-report-redact.ts
  • packages/desktop-electron/src/main/problem-report.test.ts
  • packages/desktop-electron/src/main/problem-report.ts
  • packages/desktop-electron/src/main/renderer-diagnostics-slice.test.ts
  • packages/desktop-electron/src/main/renderer-diagnostics-slice.ts

The 5 MB overall truncation ladder was the only size bound, so a single
large source (a multi-MB log, a long session) could fill the whole report
and crowd out everything else while still passing under the total limit.

Add a per-component budget pass that runs after redaction and before the
overall ladder, bounding each source independently:
- log tail capped to its own budget, keeping the most recent lines
- session messages dropped oldest-first to a total budget
- renderer error details capped to a head budget (keeps the message + top
  of the stack)
The overall maxBytes ladder is now the fallback, not the primary bound.
A new truncation.omittedRendererErrorBytes field keeps the ledger honest.
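
The log-tail rule above ("keeping the most recent lines") might look like the following minimal sketch. The signature and return shape are assumptions for illustration; the real capLogTailBytes may handle an oversized final line differently.

```typescript
function capLogTailBytes(log: string, limit: number): { text: string; omittedBytes: number } {
  const encoder = new TextEncoder();
  const total = encoder.encode(log).length;
  if (total <= limit) return { text: log, omittedBytes: 0 };

  const kept: string[] = [];
  let used = 0;
  // Walk lines newest to oldest and stop at the first one that would exceed the budget.
  for (const line of log.split("\n").reverse()) {
    // +1 accounts for the newline joining this line to the lines already kept.
    const cost = encoder.encode(line).length + (kept.length > 0 ? 1 : 0);
    if (used + cost > limit) break;
    used += cost;
    kept.unshift(line);
  }
  return { text: kept.join("\n"), omittedBytes: total - used };
}
```

Cutting on whole lines means the result never starts mid-line, at the cost of sometimes using less than the full budget.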

Also bound collection and widen the incident window:
- tailFile reads only the last 512 KB of each log via a bounded fd read
  instead of loading the whole file, and caps any single pathological line
- renderer diagnostics slice widens its fixed 5-minute lookback (a manual
  trigger usually happens after the problem); when scoped by session/trace
  it goes wider still, with capEvents/maxBytes bounding the bytes

Refs #1465

Claude-Session: https://claude.ai/code/session_011KyY9wTxQu9oZLy4yPEi3W
@Astro-Han Astro-Han force-pushed the claude/diag-size-budgets branch from ed0094e to c7e3266 Compare June 23, 2026 11:57
@Astro-Han Astro-Han changed the base branch from claude/diag-privacy-redaction to dev June 23, 2026 11:57
- capMessageParts: keep the LATEST parts when trimming an oversized
  message (the tool output / error nearest the failure), matching the
  rest of the ladder which drops oldest-first, instead of keeping the
  start of the turn.
- Drop the test-only `Options.budgets` override: budgets are fixed and
  not a public API. Make COMPONENT_BUDGETS private and export the pure
  cap helpers (headBytes / capLogTailBytes / capMessageParts /
  capSessionMessagesBytes) so their small-limit edge cases are tested
  directly instead of via a production override.
- Bound scoped (session/trace) renderer slices to the same default
  30-min lookback as unscoped slices; an explicit `from` widens the
  window. Removes the unbounded NEGATIVE_INFINITY lower bound so a
  manual trigger collects minimally by default.
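
The lookback rule in the last item can be sketched as follows. The event shape, the option names other than `from`, and the function name are hypothetical; only the 30-minute default and the "explicit `from` widens the window" behavior come from the commit message.

```typescript
interface TimedEvent {
  at: number; // epoch milliseconds
  sessionId?: string;
}

interface SliceOptions {
  now: number;
  from?: number; // explicit lower bound, overrides the default lookback
  sessionId?: string; // optional scope
}

const DEFAULT_LOOKBACK_MS = 30 * 60 * 1000;

function sliceEvents<T extends TimedEvent>(events: T[], options: SliceOptions): T[] {
  // Scoped and unscoped slices share one lower bound; there is no unbounded special case.
  const lowerBound = options.from ?? options.now - DEFAULT_LOOKBACK_MS;
  return events.filter(
    (e) =>
      e.at >= lowerBound &&
      e.at <= options.now &&
      (options.sessionId === undefined || e.sessionId === options.sessionId),
  );
}
```

With this shape, a session-scoped call without `from` ignores an hours-old event from the same session, and passing an earlier `from` brings it back.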

@github-actions github-actions Bot left a comment

Suggested priority: P2 (includes user-path files (packages/desktop-electron/src/main/logging.test.ts, packages/desktop-electron/src/main/logging.ts, packages/desktop-electron/src/main/problem-report-redact.test.ts, packages/desktop-electron/src/main/problem-report-redact.ts, packages/desktop-electron/src/main/problem-report.test.ts, packages/desktop-electron/src/main/problem-report.ts, packages/desktop-electron/src/main/renderer-diagnostics-slice.test.ts, packages/desktop-electron/src/main/renderer-diagnostics-slice.ts)).

P1/P0 are reserved for maintainer confirmation. Please relabel manually if this is a release blocker, security issue, data-loss risk, or updater/runtime failure.

@Astro-Han

Owner Author

Addressed the review in place (no split) — pushed as 86bf454.

Changes

  • P2 — capMessageParts kept the earliest parts. Flipped to keep the latest parts (parts.slice(parts.length - kept) in both the binary search and the final trim). The tail of a turn is the tool output / error nearest the failure, and it now matches the rest of the ladder, which drops oldest-first to keep the most recent context.

  • P3 — Options.budgets was a test-only production API. Removed it, along with DEFAULT_COMPONENT_BUDGETS, the ComponentBudgets type, and resolveBudgets(). Budgets are now a private fixed COMPONENT_BUDGETS const. The small-budget edge cases that previously needed an override are now tested by calling the pure cap helpers directly with tiny limits — headBytes / capLogTailBytes / capMessageParts / capSessionMessagesBytes are exported for this. A lean set of integration tests still proves the helpers are wired into buildProblemReport under the real default budgets (1 MB log / 1 MB session / 256 KB per message / 64 KB per renderer-error field).

  • P2 — scoped renderer slice had no lower time bound. Removed the NEGATIVE_INFINITY special case; scoped (session/trace) slices now use the same bounded ~30-min lookback as unscoped ones, so a manual trigger collects minimally by default and a session's hours-old event is not pulled in. An explicit from still opens a wider window when a reviewer needs older history. The slice test now asserts both: bounded by default, widened with from.
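
The first item (keep the latest parts, found by binary search) can be illustrated with a short sketch. The generic signature and measuring the parts array by its JSON length are assumptions; only the parts.slice(parts.length - kept) form is taken from the comment above.

```typescript
function capMessageParts<T>(parts: T[], limit: number): { parts: T[]; omittedBytes: number } {
  const size = (value: T[]): number =>
    new TextEncoder().encode(JSON.stringify(value)).length;
  const total = size(parts);
  if (total <= limit) return { parts, omittedBytes: 0 };

  // Binary-search the largest number of trailing parts whose serialized size fits.
  // `lo` always fits (keeping zero parts is assumed acceptable); keeping all does not.
  let lo = 0;
  let hi = parts.length - 1;
  while (lo < hi) {
    const mid = Math.ceil((lo + hi) / 2);
    if (size(parts.slice(parts.length - mid)) <= limit) lo = mid;
    else hi = mid - 1;
  }
  // parts.length - lo rather than slice(-lo): slice(-0) would keep everything.
  const kept = parts.slice(parts.length - lo);
  return { parts: kept, omittedBytes: total - size(kept) };
}
```

The search is valid because the serialized size of a suffix only grows as more parts are kept.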

On splitting out the private-key redactor (P3)

Kept as one PR. The private-key body-matching logic lives in problem-report-redact.ts and is part of the same redaction-before-budgeting pipeline this PR bounds; it is small, already has focused unit tests in problem-report-redact.test.ts, and splitting it would create a stacked PR with a trivial-but-real feedback.ts overlap for no review benefit. The diff here is scoped to size-budgeting + the redaction it depends on.

Verification

bun run typecheck      — clean (tsgo -b, 9/9 tasks)
eslint (changed src)   — 0 errors (test files are eslint-ignored by config)
bun test (electron)    — 654 pass, 0 fail, 1989 expect() calls, 72 files
  problem-report + renderer-diagnostics-slice — 47 pass, 0 fail

…Events

Renderer diagnostics were byte-capped only at the source (when the slice
is built). buildProblemReport then redacts them, which can re-expand the
slice, but they were not re-bounded by the component-budget pass — and the
overall fallback ladder used a bespoke event-drop loop that broke on the
first protected (incident / identity-transition) event, leaving an
all-protected oversized slice unbounded (it could overflow maxBytes and
throw).

- Add rendererDiagnosticsBytes (= SESSION_EXPORT_RENDERER_DIAGNOSTICS_MAX_BYTES)
  to the fixed component budgets.
- Re-bound the slice post-redaction with the same capEvents() ruler used at
  the source, re-derived from an immutable redacted baseline so the omitted
  ledger stays correct across the component pass and the overall ladder.
- Replace the bespoke fallback loop with a capEvents-based shrink that can
  drop even incident events once a tight maxBytes leaves no alternative, so
  the slice drains to empty rather than overflowing. Removes the now-unused
  isProtectedRendererDiagnosticEvent duplicate predicate.
- Tests: all-protected oversized slice drains under a tight maxBytes (no
  throw); oversized events are re-bounded to the component budget below the
  overall limit.
@Astro-Han

Owner Author

Addressed both items — pushed as baa61f6.

P2 — renderer diagnostics re-budgeted after redaction (implemented)

Verified the gap: the slice is byte-capped only at the source; buildProblemReport then calls redactRendererDiagnostics(), which can re-expand it, and the component-budget pass skipped it ("not re-budgeted here"). The overall fallback used a bespoke loop (findIndex(!protected), then break) that stopped at the first protected event, so an all-protected oversized slice could ride past maxBytes and the report would throw.

Fix, with capEvents() as the single ruler:

  • Added rendererDiagnosticsBytes (= SESSION_EXPORT_RENDERER_DIAGNOSTICS_MAX_BYTES, 1 MB) to the fixed component budgets.
  • Re-bound the slice post-redaction with capEvents(), re-derived from an immutable redacted baseline so the omitted-bytes ledger stays correct across the component pass and the overall ladder.
  • Replaced the bespoke fallback loop with a capEvents()-based shrink that drops even incident events once a tight maxBytes leaves no alternative, so the slice drains to empty instead of overflowing. Removed the now-dead isProtectedRendererDiagnosticEvent duplicate predicate (its rule already lives inside capEvents).
  • Tests: (a) all-protected oversized slice drains under a 5 KB maxBytes — no throw, report ≤ maxBytes, ledger > 0; (b) 1.2 MB of events under the default 5 MB limit are re-bounded to the 1 MB component budget (only the component cap, not the overall ladder, could have trimmed them).
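
The "drop protected events only when nothing else is left" behavior can be sketched as below. The event shape, the protectedEvent flag, and the drop order among protected events are assumptions for illustration; the real capEvents() in renderer-diagnostics-slice.ts is not shown in this thread and is likely more efficient than re-serializing on every step.

```typescript
interface DiagEvent {
  at: number;
  protectedEvent: boolean; // hypothetical flag for incident / identity-transition events
  detail: string;
}

function capEvents(
  events: DiagEvent[],
  limit: number,
): { events: DiagEvent[]; omittedBytes: number } {
  const size = (list: DiagEvent[]): number =>
    new TextEncoder().encode(JSON.stringify(list)).length;
  const total = size(events);
  const kept = [...events]; // oldest first
  while (kept.length > 0 && size(kept) > limit) {
    const victim = kept.findIndex((e) => !e.protectedEvent);
    // No unprotected event left: drop the oldest protected one instead of giving up,
    // so an all-protected slice can still drain to empty.
    kept.splice(victim === -1 ? 0 : victim, 1);
  }
  return { events: kept, omittedBytes: total - size(kept) };
}
```

Because the loop never breaks on a protected event, a tight limit shrinks the slice to empty rather than leaving it over budget.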

P3 — PR description aligned to current head (done)

Rewrote the description: removed resolveBudgets() / budgets-override / NaN-budget references; it now states fixed private component budgets (incl. rendererDiagnosticsBytes), the exported pure cap helpers tested at small limits, the default-budget end-to-end wiring, and the renderer-diagnostics re-budget above.

Verification

bun run typecheck      — clean (tsgo -b)
eslint (changed src)   — 0 errors
bun test (electron)    — 656 pass, 0 fail, 72 files
  problem-report.test  — 45 pass, 0 fail (incl. the 2 new renderer-diagnostics tests)

@Astro-Han Astro-Han merged commit b004523 into dev Jun 23, 2026
41 checks passed
@Astro-Han Astro-Han deleted the claude/diag-size-budgets branch June 23, 2026 13:25