open_file: return immediately with best-available recording (fix 18s timeout) by sonichi · Pull Request #353 · sonichi/sutando

sonichi · 2026-04-15T23:05:02Z

Summary

Fixes the root cause of "I couldn't find the recording at the standard location" on phone calls. The 18s polling loop waiting for the subtitled burn-in exceeded Gemini Live's tool-call timeout — the tool got cancelled mid-retry and the model reported a false negative even while the narrated file was sitting on disk.

Drops the 10-iteration retry loop.
findRecording() already returns best-available in priority order (subtitled > narrated > raw), so one synchronous call is enough.
Adds a subtitled_pending flag + a version field to the return payload.
When subtitled is pending, the returned instruction tells the model to proactively say: "I opened the narrated version. Subtitles are still being generated — want me to switch to the subtitled version when it's ready?" If the user says yes, the model waits ~30 seconds and calls open_file again, which picks up the subtitled version once the burn-in finishes.
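The priority order mentioned above (subtitled > narrated > raw) can be illustrated with a minimal sketch. This is not the code in `src/recording-tools.ts`: the `Version` type, the `PRIORITY` list, and `pickBestAvailable` are hypothetical names, and the `Set` stands in for whatever on-disk check `findRecording()` actually performs.

```typescript
// Hypothetical sketch of a best-available lookup; not the real findRecording().
type Version = "subtitled" | "narrated" | "raw";

// Highest priority first, matching the order stated in the summary.
const PRIORITY: Version[] = ["subtitled", "narrated", "raw"];

// `available` is an assumption: it stands in for a check of which files exist on disk.
function pickBestAvailable(available: Set<Version>): Version | null {
  for (const v of PRIORITY) {
    if (available.has(v)) return v; // first hit is the best version present
  }
  return null; // nothing on disk yet
}
```

Because the lookup is a single pass over at most three candidates, one synchronous call is enough and no retry loop is needed to stay inside a tool-call timeout.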

What this does NOT do (yet)

This PR does not implement the async-notification half of the owner's ask ("can we make the wait async?"), i.e., the voice agent proactively telling the user when the subtitled version is ready without the user asking. That's a bigger feature: it needs a mid-call signal channel to voice-agent, plus background polling with dedicated state. Post-flood I'm cautious about any results/ iteration path, so I'm pausing that work and flagging it for a design conversation before shipping.

This PR provides the user-driven retry pattern, which eliminates the timeout failure and restores a workable UX immediately.

Test plan

npx tsc --noEmit --skipLibCheck clean
Manual: restart voice-agent, start a recording, ask "open it" — tool returns within ~100ms instead of 18s, model speaks the pending message
Manual: call open_file again 30s later once subtitled is on disk — tool returns subtitled version

Diff

src/recording-tools.ts +33/-13

References

Phone-call diagnosis (with 18s timeout root cause): Discord thread 22:42 local
Owner directive: "let users know the subtitled file is not ready and ask them whether they want to wait" (Discord 22:02 local)
Prior meeting notes that captured this as a latent bug: notes/meetings/task-summary-1776292611357.md

🤖 Generated with Claude Code

Fixes #356

Fixes the root cause of "I couldn't find the recording at the standard location" during phone calls that owner diagnosed earlier today. The 18-second polling loop waiting for the subtitled version exceeded Gemini Live's tool-call timeout, so the tool got cancelled mid-retry and the model reported a false negative even when the narrated file was on disk.

### What changes

- Remove the 10-iteration retry loop. `findRecording()` already returns the best-available version in priority order (subtitled > narrated > raw), so one synchronous call is enough.
- Compute a `subtitled_pending` flag when the returned version is not already subtitled AND it's a sutando recording (i.e. the background subtitle burn might still be running).
- When `subtitled_pending` is true, the returned `instruction` string tells the model to proactively inform the user: "I opened the narrated version. Subtitles are still being generated — want me to switch to the subtitled version when it's ready?" If the user says yes, the model can wait ~30 seconds and call open_file again, which will pick up the subtitled version once the burn-in finishes.
- Return payload gains a `version` field (`subtitled | narrated | raw`) so the model knows exactly what it's telling the user.

### What this doesn't do yet

**Async notification**: owner also asked "can we make the wait async?" i.e., can the voice agent proactively tell the user when subtitled is ready without the user having to ask. That's a bigger feature: it needs a signal channel to voice-agent mid-call (post-flood we're cautious about the results/ iteration path), plus background polling with a dedicated state file. Pausing that work here and asking owner for a design discussion first.

The current PR gives owner the non-blocking behavior + "user-driven retry" pattern, which eliminates the timeout failure and ships a workable UX immediately.

### Test plan

- [x] `npx tsc --noEmit --skipLibCheck` clean
- [ ] Manual: restart voice-agent, start a recording, ask "open it"; tool returns within ~100ms instead of 18s, model speaks the subtitled_pending message if applicable
- [ ] Manual: call open_file again 30s later with subtitled now on disk; tool returns subtitled version

### References

- Phone call diagnosis: `notes/meetings/task-summary-1776289358782.md`, `notes/meetings/task-summary-1776292611357.md`
- Reply with diagnosis: in result text at 22:42 local
- Owner design directive: "let users know the subtitled file is not ready and ask them whether they want to wait; if they do, can we make the wait async?" (via Discord 22:02 local)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
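The `subtitled_pending` rule and the extended return payload described in this commit message could look roughly like the sketch below. It is an illustration under stated assumptions, not the merged code: `OpenFileResult`, `buildOpenFileResult`, and the `isSutandoRecording` parameter are hypothetical names, and the instruction strings are paraphrased rather than copied from `src/recording-tools.ts`.

```typescript
// Hypothetical sketch of the return payload; field names follow the PR text,
// everything else (function name, parameter, instruction wording) is assumed.
type Version = "subtitled" | "narrated" | "raw";

interface OpenFileResult {
  version: Version;
  subtitled_pending: boolean;
  instruction: string;
}

function buildOpenFileResult(version: Version, isSutandoRecording: boolean): OpenFileResult {
  // Pending only when the file is not already subtitled AND a subtitle burn
  // may still be running (i.e. it is a sutando recording).
  const subtitled_pending = version !== "subtitled" && isSutandoRecording;

  // Paraphrased instruction text; the real wording lives in the PR diff.
  const instruction = subtitled_pending
    ? `Say the ${version} version is open and ask whether to switch to the subtitled version when it is ready.`
    : `Say the ${version} version is open.`;

  return { version, subtitled_pending, instruction };
}
```

The function returns without waiting on anything, so the retry decision moves out of the tool and into the conversation: if the user agrees, the model calls `open_file` again later and gets whatever version is best at that point.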

sonichi

MacBook review: LGTM. Clean fix — removes the 18s blocking poll, returns immediately with best-available version, flags subtitled_pending for model-driven retry. The retry is now in the prompt instruction, not a blocking loop. Well within Gemini's tool-call timeout. No regressions to normal open_file path.

…#354

Each script reproduces the bug (before the fix) and verifies it's resolved (after the fix). All POCs pass on current main.

- poc-pr353-open-file.sh (11/11): 18s polling timeout in open_file
- poc-pr355-subtitled-pending.sh (9/9): false positive subtitled_pending
- poc-pr332-team-tier-revert.sh (9/9): team-tier -C /tmp broke codex
- poc-pr325-bodhi-dep.sh (7/7): bodhi dep pointed at deleted repo
- poc-pr354-retention-sweep.sh: retention sweep for stale results

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

liususan091219 · 2026-04-16T01:02:35Z

POC: bash scripts/poc-pr353-open-file.sh (11/11 pass). Script in PR #358. Issue: #356

sonichi commented Apr 15, 2026

sonichi merged commit 2be13be into main Apr 15, 2026
1 check passed

sonichi deleted the fix/open-file-nonblocking branch April 15, 2026 23:16

liususan091219 mentioned this pull request Apr 16, 2026

open_file 18s polling timeout causes false 'recording not found' on phone calls #356

Closed

liususan091219 mentioned this pull request Apr 16, 2026

test: POC verification scripts for fix PRs #358

Closed

2 tasks

liususan091219 mentioned this pull request Apr 16, 2026

subtitled_pending false positive on recordings without subtitle burn #359

Closed

sonichi merged 1 commit into main from fix/open-file-nonblocking

sonichi commented Apr 15, 2026 • edited by liususan091219

2 participants
