fix(sentry): silence expected-condition telemetry noise on rc #1141

Open
pedramamini wants to merge 1 commit into rc from fix/sentry-rc-stats-history-groom-noise

pedramamini commented Jun 28, 2026

Collaborator

Summary

Triage of Sentry field crashes on the rc channel. Three issues were best-effort or fully-recovered conditions being reported to Sentry as if they were bugs (the same telemetry-noise-on-expected-condition family addressed in prior triages). Fixed in one branch:

MAESTRO-M9 - RangeError: Invalid string length (regression)

agentSessions.getGlobalStats lost its RangeError carve-out when it was refactored onto statsCache. A session file too large to read into a single V8 string throws RangeError: Invalid string length, and both the Claude and Codex incremental read loops captured it unconditionally. Re-added the guard, mirroring the storage-layer pattern already present in claude-session-storage.ts / codex-session-storage.ts. Still live on 0.18.2-RC (12 recent events).

MAESTRO-FM - EACCES/ENOENT mkdir .maestro/history (553 occ)

writeEntryLocal is a best-effort cross-host history sync write. When the project lives in a permission-restricted or non-existent directory (read-only Dropbox / CloudStorage team folders, /home/.maestro/history), mkdir throws EACCES/ENOENT. The primary history store is unaffected, so we now skip Sentry for expected filesystem error codes and only report genuinely unexpected failures. Added regression tests.
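A minimal sketch of the FM carve-out, for readers who want the shape without opening the diff. The names `EXPECTED_FS_ERROR_CODES` and `isExpectedFsError` come from this PR, but the set contents beyond EACCES/ENOENT, the `handleWriteFailure` wrapper, and the `report` callback (standing in for `captureException`) are assumptions made for illustration, not the merged code.

```typescript
// Assumed allowlist: the PR description names EACCES and ENOENT; any further
// codes in the real set are not shown here.
const EXPECTED_FS_ERROR_CODES: ReadonlySet<string> = new Set(['EACCES', 'ENOENT']);

// True when the thrown value carries a string `code` that is on the allowlist.
function isExpectedFsError(error: unknown): boolean {
	if (typeof error !== 'object' || error === null) return false;
	const code = (error as { code?: unknown }).code;
	return typeof code === 'string' && EXPECTED_FS_ERROR_CODES.has(code);
}

// Hypothetical catch-block body: `report` stands in for captureException.
// Expected filesystem failures are dropped, everything else is reported.
function handleWriteFailure(error: unknown, report: (error: unknown) => void): void {
	if (!isExpectedFsError(error)) {
		report(error);
	}
}
```

Matching on `code` rather than on the message text keeps the check stable across platforms, where the human-readable part of the error can differ.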

MAESTRO-JB - "Grooming error: Session not found" (65 occ)

groupChat resetContext already recovers from a grooming failure by falling back to a fresh session. When the failure is "Session not found" (the participant's session was deleted mid-summary), that's an expected, fully-recovered condition - skip the Sentry report; unexpected grooming failures still surface.

Validation

  • tsc -p tsconfig.main.json clean
  • ESLint + Prettier clean on changed files
  • vitest green: shared-history-manager (12, +2 new), agentSessions + groupChat (74)

Deliberately not included

  • MAESTRO-Q2 (maestro-p --status sample failed, ~5.7k occ): a periodic best-effort sampler reporting every failure mode as a Sentry warning. Deferred in two prior triages pending a human call on maestro-p exit semantics; the reason/stage context isn't indexed as queryable tags, so there's no data-backed clean expected-boundary to carve out. Flagging for a decision rather than guessing.
  • MAESTRO-TP (No handler registered for 'pianola:get-rules'): pianola has zero references in the codebase (unmerged dev-environment feature) - no handler or caller to fix here.
  • Native/build issues (renderer/GPU crashes, spawn EINVAL/E2BIG, GLIBC_2.38) - genuine signal or build-infra, not app-fixable noise.

Note: merged into rc, so GitHub will reference but not auto-close the issues until rc reaches main. The M9/FM/JB code also exists on main; this PR is scoped to rc per request.

Summary by CodeRabbit

  • Bug Fixes
    • Reduced unnecessary error reporting for expected filesystem issues and oversized session data.
    • Improved handling of session and context reset failures so common “not found” cases no longer create noisy alerts.
    • Continued to safely fall back when writes or session parsing fail, while still surfacing unexpected errors.

Three field-crash issues on the rc channel were best-effort/recoverable
conditions being reported to Sentry as if they were bugs:

- MAESTRO-M9 (regression): getGlobalStats lost the RangeError carve-out
  when it was refactored onto statsCache. A session file too big to read
  into one V8 string throws `RangeError: Invalid string length`; both the
  Claude and Codex read loops captured it. Re-add the guard, mirroring the
  storage-layer pattern in claude-/codex-session-storage.ts.

- MAESTRO-FM (553 occ): writeEntryLocal is a best-effort cross-host history
  sync write. When the project lives in a permission-restricted or
  non-existent dir (read-only Dropbox/CloudStorage team folders,
  /home/.maestro/history), mkdir throws EACCES/ENOENT. The primary history
  store is unaffected, so skip Sentry for expected fs error codes.

- MAESTRO-JB: groupChat resetContext already recovers from a grooming
  failure by falling back to a fresh session. When the failure is
  "Session not found" (the participant's session was deleted), that's an
  expected, fully-recovered condition - skip the Sentry report.

Adds regression tests for the FM carve-out.
coderabbitai Bot commented Jun 28, 2026

Walkthrough

Three error handlers are updated to skip captureException for known/expected errors: writeEntryLocal skips expected filesystem error codes, agentSessions:getGlobalStats skips RangeError for oversized session files, and groupChat:resetParticipantContext skips "Session not found" errors. Tests are added for the writeEntryLocal path.

Changes

Selective Sentry Exception Reporting

| Layer / File(s) | Summary |
|---|---|
| writeEntryLocal expected FS error filtering (src/main/shared-history-manager.ts, src/__tests__/main/shared-history-manager.test.ts) | Adds EXPECTED_FS_ERROR_CODES allowlist and isExpectedFsError helper; writeEntryLocal catch block now skips captureException for expected codes. Tests assert captureException is not called for EACCES and is called once for unexpected TypeError. |
| agentSessions RangeError filtering (src/main/ipc/handlers/agentSessions.ts) | Both Claude and Codex parse error loops gain a RangeError branch that logs a "too large to parse" warning without calling captureException; all other errors still call captureException with structured logging. |
| groupChat "Session not found" filtering (src/main/ipc/handlers/groupChat.ts) | resetParticipantContext catch block conditionally skips captureException when the error message matches /Session not found/i. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • RunMaestro/Maestro#585: Adds defensive handling for oversized session files during Claude/Codex parsing, directly related to the RangeError skipping added in agentSessions:getGlobalStats.

Poem

🐇 Hoppy little rabbit checks each error's name,
"EACCES? Too large? Session gone? That's tame!"
No Sentry alarm for the known and the bland,
Only the strange gets reported as planned.
Less noise in the logs, more signal to scan! 🎉

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 75.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly matches the PR’s main goal: suppressing expected-condition Sentry noise on rc. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
greptile-apps Bot commented Jun 28, 2026

Greptile Summary

This PR reduces Sentry noise for recovered or best-effort error paths.

  • Adds oversized-session carve-outs in Claude and Codex global stats parsing.
  • Skips expected filesystem errors during best-effort shared-history sync writes.
  • Skips recovered group-chat grooming failures when a participant session is missing.
  • Adds shared-history tests for expected and unexpected write failures.

Confidence Score: 4/5

The changed flow looks mergeable after narrowing two diagnostic filters.

  • The shared-history change is limited to best-effort sync writes.
  • The stats handlers can hide unrelated RangeError failures.
  • The group-chat handler can hide unexpected grooming errors that include the same text.

src/main/ipc/handlers/agentSessions.ts, src/main/ipc/handlers/groupChat.ts

Important Files Changed

| Filename | Overview |
|---|---|
| src/main/ipc/handlers/agentSessions.ts | Adds RangeError suppression for Claude and Codex global stats parsing, but the predicate is broader than the known oversized-string failure. |
| src/main/ipc/handlers/groupChat.ts | Suppresses Sentry for recovered session-not-found grooming failures, with a substring check that can also hide unrelated grooming errors. |
| src/main/shared-history-manager.ts | Filters expected filesystem errors from best-effort local shared-history writes while leaving unexpected errors reportable. |
| src/__tests__/main/shared-history-manager.test.ts | Adds tests for expected filesystem suppression and unexpected write-error reporting. |

Reviews (1): Last reviewed commit: "fix(sentry): silence expected-condition ..."

Comment on lines +1007 to +1014
if (error instanceof RangeError) {
	logger.warn(`Claude session file too large to parse: ${file.sessionKey}`, LOG_CONTEXT);
} else {
	void captureException(error);
	logger.warn(`Failed to parse Claude session: ${file.sessionKey}`, LOG_CONTEXT, {
		error,
	});
}

P2 RangeError Filter Is Too Broad

This catch block now treats every RangeError from the full read, stat, and parse block as an oversized session file. If parsing or future stats logic throws a different RangeError, that session is skipped from global stats and the bug is not reported to Sentry. Narrowing this to the known Invalid string length condition keeps the expected noisy case quiet without hiding unrelated failures.

Context Used: CLAUDE.md (source)

Comment on lines +1041 to +1048
if (error instanceof RangeError) {
	logger.warn(`Codex session file too large to parse: ${file.sessionKey}`, LOG_CONTEXT);
} else {
	void captureException(error);
	logger.warn(`Failed to parse Codex session: ${file.sessionKey}`, LOG_CONTEXT, {
		error,
	});
}

P2 RangeError Filter Is Too Broad

This branch suppresses every RangeError raised while reading, statting, or parsing a Codex session. A non-size RangeError in the parser or aggregation path would drop that session from global stats and skip Sentry, even though only V8's oversized string error is expected here.

Context Used: CLAUDE.md (source)
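The narrowing requested in both RangeError comments could be shared between the Claude and Codex loops. The following is only a sketch: `isOversizedStringError` and `classifyParseFailure` are hypothetical names that do not appear in the codebase, and the message text is the V8 wording quoted in the PR description.

```typescript
// Hypothetical shared helper: matches only V8's "Invalid string length"
// RangeError, so other RangeErrors stay reportable.
function isOversizedStringError(error: unknown): boolean {
	return error instanceof RangeError && /Invalid string length/i.test(error.message);
}

type ParseFailureAction = 'skip-oversized' | 'report';

// Hypothetical decision function for the catch block of each parse loop.
function classifyParseFailure(error: unknown): ParseFailureAction {
	return isOversizedStringError(error) ? 'skip-oversized' : 'report';
}
```

Both loops would then branch on the same predicate, which keeps the Claude and Codex paths from drifting apart if the carve-out changes again.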

Comment on lines +768 to +771
const message = error instanceof Error ? error.message : String(error);
if (!/Session not found/i.test(message)) {
	void captureException(error);
}

P2 Substring Match Hides Grooming Errors

groomContext wraps process errors as Grooming error: <message>, and unknown provider or tool errors can preserve raw text. If an unexpected grooming failure contains Session not found in its message, this substring check suppresses Sentry even though the failure is not the recovered deleted-session case.

Context Used: CLAUDE.md (source)
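One way to tighten the check is to anchor the pattern to the whole message instead of searching for a substring. This sketch assumes the recovered case arrives as exactly "Session not found", optionally behind the "Grooming error: " prefix described above; the real message format has not been verified here, and `isRecoveredSessionLoss` is a hypothetical name.

```typescript
// Assumption: the recovered deleted-session case produces exactly this text,
// with or without the "Grooming error: " wrapper. Anything longer is treated
// as an unexpected failure and stays reportable.
const RECOVERED_SESSION_LOSS = /^(?:Grooming error:\s*)?Session not found\.?$/i;

// Hypothetical predicate for the resetParticipantContext catch block.
function isRecoveredSessionLoss(error: unknown): boolean {
	const message = error instanceof Error ? error.message : String(error);
	return RECOVERED_SESSION_LOSS.test(message.trim());
}
```

If the grooming path exposes a structured error type or code for this condition, matching on that would be more robust than any message pattern.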

coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/main/ipc/handlers/agentSessions.ts`:
- Around line 1002-1014: Narrow the error carve-out in agentSessions handling so
only the expected oversized-session case is skipped. Update the logic around the
session parsing flow to check for the specific “Invalid string length”
RangeError signature (or a shared helper used by the session storage code)
instead of using `error instanceof RangeError`, and keep all other RangeErrors
flowing through `captureException` and the existing `logger.warn` path in the
relevant session aggregation code paths.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 52798030-b1ac-4805-90da-1ddc86a4ee29

📥 Commits

Reviewing files that changed from the base of the PR and between b330223 and 55f49b8.

📒 Files selected for processing (4)
  • src/__tests__/main/shared-history-manager.test.ts
  • src/main/ipc/handlers/agentSessions.ts
  • src/main/ipc/handlers/groupChat.ts
  • src/main/shared-history-manager.ts

Comment on lines +1002 to +1014
// A session file too large to read into a single V8 string throws
// `RangeError: Invalid string length` (MAESTRO-M9). That's an expected
// boundary for huge sessions, not a bug - skip it and keep aggregating
// the rest. Mirrors the storage-layer carve-out in
// claude-/codex-session-storage.ts.
if (error instanceof RangeError) {
	logger.warn(`Claude session file too large to parse: ${file.sessionKey}`, LOG_CONTEXT);
} else {
	void captureException(error);
	logger.warn(`Failed to parse Claude session: ${file.sessionKey}`, LOG_CONTEXT, {
		error,
	});
}

🎯 Functional Correctness | 🟠 Major | ⚡ Quick win

Narrow the carve-out to the oversized-file RangeError.

error instanceof RangeError suppresses all RangeErrors from fs.readFile, fs.stat, and the parsers here, so unrelated bugs stop reaching Sentry too. The documented expected case is specifically RangeError: Invalid string length; please gate on that signature (or a shared helper) and keep other RangeErrors reportable.

Suggested fix
-					if (error instanceof RangeError) {
+					const isExpectedOversizeError =
+						error instanceof RangeError && /Invalid string length/i.test(error.message);
+					if (isExpectedOversizeError) {
 						logger.warn(`Claude session file too large to parse: ${file.sessionKey}`, LOG_CONTEXT);
 					} else {
 						void captureException(error);
 						logger.warn(`Failed to parse Claude session: ${file.sessionKey}`, LOG_CONTEXT, {
 							error,
@@
-					if (error instanceof RangeError) {
+					const isExpectedOversizeError =
+						error instanceof RangeError && /Invalid string length/i.test(error.message);
+					if (isExpectedOversizeError) {
 						logger.warn(`Codex session file too large to parse: ${file.sessionKey}`, LOG_CONTEXT);
 					} else {
 						void captureException(error);
 						logger.warn(`Failed to parse Codex session: ${file.sessionKey}`, LOG_CONTEXT, {
 							error,

As per coding guidelines, "Handle only expected/recoverable errors explicitly."

Also applies to: 1038-1048

Source: Coding guidelines

pedramamini pushed a commit that referenced this pull request Jun 28, 2026
…oom/oversized-session telemetry

Three stable-channel (main) field-crash fixes, validated against Sentry
(smash-labs/maestro) on releases 0.17.1/0.17.2 (channel:stable). Fixes 2
and 3 mirror rc PR #1141 exactly so the rc->main merge converges with no
conflict.

1. Native bindings packaging (MAESTRO-TE, MAESTRO-Q3, MAESTRO-JV, plus
   the MAESTRO-TC/TD "Could not locate the bindings file" cluster).
   better-sqlite3 depends on bindings, which depends on file-uri-to-path,
   but asarUnpack only unpacked better-sqlite3 and node-pty. When the
   unpacked bindings.js required file-uri-to-path (still inside app.asar),
   resolution could not cross back into the archive ("Cannot find module
   'file-uri-to-path'"); the mixed packed/unpacked layout also produced
   "Could not locate the bindings file" when bindings computed an
   in-archive search root for the .node. Unpack the full native
   dependency closure (bindings + file-uri-to-path) so the whole chain
   resolves from app.asar.unpacked.

2. Recoverable grooming session loss (MAESTRO-JB, 65 events). Group-chat
   context summary spawns a batch agent; if the participant's provider
   session was deleted mid-summary the agent emits a recoverable "Session
   not found" error (error-patterns.ts session_not_found) and we already
   fall back to a fresh session. Skip captureException for that case in
   groupChat.ts; real summary failures still report.

3. Oversized session files (MAESTRO-M9, regressed onto stable). The
   getGlobalStats parse loops in agentSessions.ts read each session file
   into a single string; a file too large throws "RangeError: Invalid
   string length". The #1115 carve-out was lost in the statsCache
   refactor that reached main. Re-skip the expected RangeError in both
   the Claude and Codex loops, mirroring the storage-layer pattern.
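The packaging fix in item 1 amounts to listing the whole native dependency closure in `asarUnpack`. A rough sketch, assuming an electron-builder style list of glob patterns; the exact globs, the `nativeClosure` list, and the `missingFromUnpack` check are illustrative and not taken from the commit.

```typescript
// Packages named in the commit message: the two already unpacked, plus the
// two that bindings.js needs to resolve from app.asar.unpacked.
const nativeClosure: string[] = ['better-sqlite3', 'node-pty', 'bindings', 'file-uri-to-path'];

// Hypothetical asarUnpack value: one glob per package in the closure.
const asarUnpack: string[] = nativeClosure.map((pkg) => `node_modules/${pkg}/**/*`);

// Illustrative check: which packages in the closure are not covered by any
// unpack pattern (simple substring match on the package directory).
function missingFromUnpack(closure: string[], patterns: string[]): string[] {
	return closure.filter(
		(pkg) => !patterns.some((pattern) => pattern.includes(`node_modules/${pkg}/`))
	);
}
```

Run against the earlier two-entry list, a check like this would flag `bindings` and `file-uri-to-path`, which is the gap the commit describes.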