Skip to content

refactor: decouple open_file from recording logic + auto-detect ffmpeg#392

Merged
sonichi merged 1 commit intomainfrom
refactor/open-file-clean-design
Apr 16, 2026
Merged

refactor: decouple open_file from recording logic + auto-detect ffmpeg#392
sonichi merged 1 commit intomainfrom
refactor/open-file-clean-design

Conversation

@sonichi
Copy link
Copy Markdown
Owner

@sonichi sonichi commented Apr 16, 2026

Summary

Per owner's design directive: recording tools return explicit file paths, open_file becomes a path-based file opener, and findFfmpegWithSubtitles() auto-detects the right ffmpeg binary for subtitle burns.

Supersedes #355 (subtitled_pending gate) and #357 (ffmpeg-full path switch) — folds both concerns into a cleaner architecture.

Changes (+71/-41)

1. open_file stripped of recording-specific logic

  • Requires path arg — get it from the recording tool's return value
  • find_recording: true flag as explicit fallback only (when model lost the path)
  • Removed: subtitled_pending flag, version detection, findRecording() as default behavior
  • Kept: known file aliases (diagnostics), QuickTime activation, duration/size metadata

2. screen_record stop returns explicit file list

  • New files object: {raw, narrated, subtitled, recommended}
  • instruction string tells the model exactly which path to pass to open_file
  • Subtitle burn still happens synchronously on stop

3. findFfmpegWithSubtitles() auto-detection (from #357)

  • Probes: $FFMPEG_SUBTITLE_BIN env → system ffmpeg → homebrew narrow → homebrew full (keg-only)
  • Cached per process lifetime (~50ms first call)
  • burnLiveTranscriptSubtitles() uses it instead of hardcoded /opt/homebrew/bin/ffmpeg
  • If no binary has libass, logs clearly and skips burn (no cryptic AVFilterGraph error)

Why this design

Owner: "The recording tool should return the files and the default file to use. The caller should process the result and report abnormality. Then open_file should be called with filename."

This eliminates the class of bugs we chased yesterday:

  • No more subtitled_pending false positives (flag is gone)
  • No more 18s polling timeout (loop is gone)
  • No more "which ffmpeg" confusion (auto-detected)
  • Model gets explicit file paths and instructions, not implicit guessing

Test plan

  • npx tsc --noEmit --skipLibCheck clean
  • findFfmpegWithSubtitles() runtime-tested: correctly picks ffmpeg-full on Mac Mini
  • Manual: phone call "record 20s then open it" — model should pass the path from screen_record result
  • Manual: verify subtitle burn uses the detected ffmpeg-full binary

PRs to close after merge

Still open (independent)

🤖 Generated with Claude Code

Per owner's design directive: "The recording tool should return the
files and the default file to use. The caller should process the result
and report abnormality if any. Then open_file tool should be called with
filename."

Changes:

1. open_file: stripped recording-specific logic.
   - Requires `path` arg (get it from recording tool result).
   - `find_latest_recording: true` flag as explicit fallback — only
     called when model lost the path, not as silent default.
   - Removed: subtitled_pending flag, version detection, polling
     logic, findRecording() as default behavior.
   - Kept: known file aliases (diagnostics), QuickTime activation,
     playback-path write, duration/size metadata.

2. screen_record stop: returns explicit file list.
   - New `files` object: `{raw, narrated, subtitled, recommended}`.
   - `instruction` string tells the model exactly which path to pass
     to open_file and how to offer alternatives to the user.
   - Subtitle burn still happens synchronously on stop.

3. No changes to record_screen_with_narration — its auto-stop timer
   fires in a setTimeout and can't return files to the model through
   a tool result. The model already has the raw path from the start
   call; open_file with that path works. Full file-list return for
   record_screen_with_narration is a follow-up.

Net: +39/-38 (near-zero). This is a separation of concerns, not new
features. findRecording() and findFfmpegWithSubtitles() are preserved
as helpers; they just stop being called by default in open_file.

Supersedes the approach in #353 (non-blocking return + subtitled_pending)
and makes #355 (subtitled_pending gate) + #357 (ffmpeg-full path)
unnecessary as open_file changes — though #357's findFfmpegWithSubtitles
is still needed in the recording tool's burn path.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Owner Author

@sonichi sonichi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MacBook review: LGTM. Clean redesign:

  • open_file now requires path from model, find_recording is explicit fallback only
  • screen_record stop returns {files: {raw, narrated, subtitled, recommended}} with paths
  • findFfmpegWithSubtitles() cached probe with env override
  • burnLiveTranscriptSubtitles uses detected binary, graceful skip if no libass
  • Removed all subtitled_pending/polling logic (root cause of false positives)
  • +71/-41, tsc clean. Ready to merge.

@sonichi sonichi merged commit 1b016ea into main Apr 16, 2026
1 check passed
@sonichi sonichi deleted the refactor/open-file-clean-design branch April 16, 2026 13:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant