fix(rc20): #524 outbox + #526 watchdog approval-pending follow-ups#556
Merged
Merged
Conversation
Both issues shipped rc19 fixes (PR #555) but /monitor audits on 2026-05-18 showed each regression still firing in production because the rc19 patch landed in only one of two code paths. #524 — outbox silently drops directory entries rc19 surfaced 📎 Outbox skipped notices on the normal-completion path in handle_message but missed two adjacent paths: the pre-auto-continue delivery (subprocess 1 stuck-after-tool-result recovery) and the run_ok=False failed-run branch. Both still silently dropped the agent's intended deliverable. This commit extracts the surfacing logic into _surface_outbox_skipped in runner_bridge.py and wires it into both gap paths. On a failed run the code still skips the actual file send (preserving the original gating) but does a cheap scan_outbox() to collect skipped items and surface them, so the user always learns what the agent intended to ship. Honours the existing outbox_notify_skipped config flag and filters the "..." overflow pseudo-entry from the user-facing block. #526 — approval-pending stalls misclassified rc19 demoted the bridge-side WARN (progress_edits.stall_detected) to a paced INFO (subprocess.approval_pending) when _has_pending_approval() returned true. The watchdog-side detector in runner.py (which emits subprocess.liveness_stall and is the actual signal untether-issue-watcher auto-files on) was untouched, so the daemon kept filing GitHub issues on routine approval-pending sessions and the nsd audit (2026-05-18) showed a user cancelling a productive 15-minute investigation because the chat-side reassurance came too late (1800s threshold). This commit: - Adds _recent_event_is_control_request helper in runner.py — uses the stream's recent_events ring buffer as the approval-pending signal, consistent with the bridge's inline-keyboard predicate but accessible to runner-scope code. - Plumbs the predicate into _watchdog_loop: when the last JSONL event is control_request, emit subprocess.approval_pending INFO instead of liveness_stall WARN. Skip the auto-kill branch entirely. Pace INFO emission to once per 30 min via the shared _APPROVAL_PENDING_REFIRE_S constant (now defined once in runner.py and imported by the bridge). - Splits _STALL_THRESHOLD_APPROVAL into _STALL_THRESHOLD_APPROVAL_FIRST (600s) and the existing 1800s refire so the user gets a reassuring "tap a button above" chat message at 10 min on first occurrence, matching the watchdog's liveness threshold and avoiding the nsd-style early cancellation. - Rewords the chat-side approval reminder copy to make the "tap a button above to proceed (no action needed otherwise)" affordance explicit, directly quoting the audit's recommended text. Tests cover both code paths: - tests/test_outbox_delivery.py (existing) — format helper + settings default unchanged; no new file-level tests needed. - tests/test_exec_bridge.py — failed-run surfacing, notify_skipped=false suppression, only-overflow filter, two-tier first-reminder threshold, reworded copy. - tests/test_exec_runner.py — predicate truth-table coverage, watchdog demotion via integration with a fake codex script emitting control _request, watchdog WARN still fires when no control_request is recent. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
This was referenced May 20, 2026
Merged
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
Both #524 and #526 shipped fixes in
0.35.3rc19(PR #555) but/monitoraudits on 2026-05-18 showed each regression still firing in production because the rc19 patch landed in only one of two code paths. This PR closes the gaps for0.35.3rc20.#524 — outbox silently drops directory entries
run_ok=Falsefailed-run branch (entire outbox delivery skipped, so the user never learns about left-behind directories)_surface_outbox_skippedhelper inrunner_bridge.pyand wired it into both gap paths. Failed runs now do a cheapscan_outbox()(no file send) to collect skipped items and surface them.#526 — approval-pending stalls misclassified
progress_edits.stall_detected). The watchdog-sidesubprocess.liveness_stallinrunner.pywas unchanged, sountether-issue-watcherkept auto-filing GitHub issues on routine approval-pending sessions._recent_event_is_control_request(stream)helper. The watchdog now demotes WARN → INFO when the last JSONL event iscontrol_request, and skips the auto-kill branch entirely._STALL_THRESHOLD_APPROVALinto_STALL_THRESHOLD_APPROVAL_FIRST = 600.0and the existing 1800s refire — first reassurance now fires at 10 min, subsequent reminders gated by 30-min refire.Shared constant
_APPROVAL_PENDING_REFIRE_S = 1800.0torunner.pyand imported intorunner_bridge.pyso the watchdog and bridge agree on a single 30-min cadence.Test plan
Related
guides/) — silent loss of intended deliveries on every session #524 and ENH-PATCH: differentiate approval-pending stalls from genuine hangs in stall warnings + Telegram messaging #526 (issues to be referenced when v0.35.3 stable lands)🤖 Generated with Claude Code