feat: video transcript prototype — generate tests from screen recordings by aidenybai · Pull Request #81 · millionco/expect

aidenybai · 2026-04-05T04:24:11Z

Summary

- Adds apps/video-transcript/, a standalone CLI prototype that extracts structured interaction transcripts from screen recordings using Gemini 2.5 Flash via AI SDK (@ai-sdk/google)
- Pipeline: video file → ffmpeg idle-time cutting (frame diff heuristics) → Gemini transcript extraction → structured output
- Designed to compound with the existing git diff context so the test agent gets both "what changed in code" and "how the feature works" from a video demo
- Updates .specs/video-transcript.md to use AI SDK instead of @google/genai
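
The frame-diff step of the pipeline can be pictured roughly as below. This is a minimal sketch, not the PR's code: the name `classifySegments`, the one-score-per-second sampling, the `threshold` parameter, and the `"idle"` type are assumptions. Only the `{ type, startSeconds, endSeconds }` shape is modeled on fields visible in the reviewed diff.

```typescript
// Hypothetical segment shape, modeled on the fields visible in the review diff.
type Segment = { type: "active" | "idle"; startSeconds: number; endSeconds: number };

// diffs[i] is an assumed difference score between the frames at second i and i + 1.
const classifySegments = (diffs: number[], threshold: number): Segment[] => {
  const segments: Segment[] = [];
  diffs.forEach((diff, second) => {
    const type = diff >= threshold ? "active" : "idle";
    const last = segments[segments.length - 1];
    if (last && last.type === type) {
      last.endSeconds = second + 1; // same kind as the previous second: extend the run
    } else {
      segments.push({ type, startSeconds: second, endSeconds: second + 1 });
    }
  });
  return segments;
};
```

Merging consecutive seconds of the same kind keeps the timeline short, which matters if it is appended to the model prompt.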

Usage

# Full pipeline
GOOGLE_GENERATIVE_AI_API_KEY=... node apps/video-transcript/dist/index.mjs ./demo.mp4

# Output to file
GOOGLE_GENERATIVE_AI_API_KEY=... node apps/video-transcript/dist/index.mjs ./demo.mp4 -o transcript.md

# Just the activity timeline (no Gemini call)
node apps/video-transcript/dist/index.mjs ./demo.mp4 --timeline-only

# Skip ffmpeg preprocessing
GOOGLE_GENERATIVE_AI_API_KEY=... node apps/video-transcript/dist/index.mjs ./demo.mp4 --no-preprocess

Test plan

- Record a short screen recording of interacting with a web app
- Run with --timeline-only to verify ffmpeg frame analysis detects active/idle segments
- Run full pipeline with GOOGLE_GENERATIVE_AI_API_KEY set to verify Gemini transcript extraction
- Run with --no-preprocess to verify raw video upload fallback
- Verify graceful error when ffmpeg is not installed
- Verify clear error message when API key is missing

Note

Medium Risk
Adds a new CLI that shells out to ffmpeg and sends video data to an AI model via Vercel AI Gateway; failures or platform differences (ffmpeg availability, large files, env vars) could affect reliability. Also expands published dependency surfaces for the main CLI/SDK and adds tests to prevent missing runtime deps in bundled outputs.

Overview
Adds a new apps/video-transcript standalone CLI that validates video inputs, optionally preprocesses recordings via ffmpeg frame-diff analysis to trim idle/keep scene changes, and then calls generateText through @ai-sdk/gateway (Gemini 2.5 Flash) to produce a structured interaction transcript.

Updates the spec to use mediaType for AI SDK file parts, adds unit tests covering activity analysis/prompting/transcript extraction, and introduces "runtime dependency safety" tests plus new dependencies in apps/cli and packages/typescript-sdk to ensure bundled dist/ runtime-resolved packages are declared.

Reviewed by Cursor Bugbot for commit be73bdc.

Summary by cubic

Prototype CLI to turn screen recordings into structured interaction transcripts for test generation. Uses ffmpeg to trim idle time and routes requests via Vercel AI Gateway (requires AI_GATEWAY_API_KEY, replacing GOOGLE_GENERATIVE_AI_API_KEY).

New Features
- Adds apps/video-transcript/ CLI: video → ffmpeg activity analysis/trim → transcript via @ai-sdk/gateway + ai.
- Flags: --timeline-only, --no-preprocess, -o/--output, --verbose; supports .mp4, .webm, .mov, .avi, .mkv; clear errors for missing ffmpeg or API key.
- Adds tests for frame diff/segment classification/timeline formatting, prompt building, and extraction; updates .specs/video-transcript.md to use mediaType.
Bug Fixes
- Prevent pnpm runtime resolution failures by declaring runtime-resolved deps in apps/cli and packages/typescript-sdk and adding dist-scanning tests to enforce declarations (e.g., @github/copilot, @google/gemini-cli, accessibility-checker-engine).

Written for commit be73bdc. Summary will update on new commits.

… recordings

Adds apps/video-transcript, a standalone CLI that extracts structured interaction transcripts from screen recordings using Gemini 2.5 Flash via AI SDK. Designed to compound with git diff context so the test agent gets both "what changed" and "how the feature works."

Pipeline: video → ffmpeg idle-time cutting → Gemini transcript extraction

Updates the video-transcript spec to use AI SDK (@ai-sdk/google) instead of @google/genai, matching the existing codebase dependency surface.

vercel · 2026-04-05T04:24:16Z

Project	Deployment	Actions	Updated (UTC)
expect	Ready	Preview, Comment	Apr 5, 2026 6:09am

pkg-pr-new · 2026-04-05T04:25:18Z

npm i https://pkg.pr.new/expect-cli@81

commit: be73bdc

- Replace @ai-sdk/google with @ai-sdk/gateway for provider-agnostic model routing via AI_GATEWAY_API_KEY
- Add 24 tests covering activity-analyzer (frame diff, segment classification, timeline formatting), transcript-prompt (base prompt, timeline appending), and extract-transcript (gateway model, file parts, timeline inclusion, response handling)

cubic-dev-ai

1 issue found across 10 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/video-transcript/src/index.ts">

<violation number="1" location="apps/video-transcript/src/index.ts:115">
P1: `--timeline-only` is ignored when ffmpeg is unavailable, so the CLI still attempts transcript extraction instead of exiting after timeline handling.</violation>
</file>

cubic-dev-ai · 2026-04-05T04:30:29Z

apps/video-transcript/src/index.ts

+      }
+    }
+
+    console.error(pc.cyan("Extracting transcript via Gemini 2.5 Flash..."));


P1: --timeline-only is ignored when ffmpeg is unavailable, so the CLI still attempts transcript extraction instead of exiting after timeline handling.

Prompt for AI agents

Check if this issue is valid. If so, understand the root cause and fix it. At apps/video-transcript/src/index.ts, line 115:
<comment>`--timeline-only` is ignored when ffmpeg is unavailable, so the CLI still attempts transcript extraction instead of exiting after timeline handling.</comment>
<file context>
@@ -0,0 +1,128 @@
+      }
+    }
+
+    console.error(pc.cyan("Extracting transcript via Gemini 2.5 Flash..."));
+
+    const transcript = await extractTranscript(processedVideoPath, timeline);
</file context>

cursor

Cursor Bugbot has reviewed your changes and found 4 potential issues.

Reviewed by Cursor Bugbot for commit ce7912b.

cursor · 2026-04-05T04:31:25Z

apps/video-transcript/src/index.ts

+          } else {
+            console.log(output);
+          }
+          return;


--timeline-only ignored without ffmpeg or with --no-preprocess

High Severity

The --timeline-only early return is nested inside if (hasFfmpeg) within if (options.preprocess), so it's only reachable when both conditions are true. When ffmpeg is unavailable or --no-preprocess is passed, the flag is silently ignored and the code falls through to extractTranscript, which requires AI_GATEWAY_API_KEY — a key the earlier check at line 54 explicitly skipped validating for timelineOnly mode. This causes an unhandled API error for users who only want the timeline.

Additional Locations (1)

apps/video-transcript/src/index.ts#L114-L117
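
One way to address this finding is to resolve the run mode before any ffmpeg or preprocessing branch, so `--timeline-only` can never fall through to extraction. The sketch below is an assumption about how that could look, not the fix that was merged: `planRun`, the `Plan` type, and the error messages are hypothetical names, and the option fields mirror the `options.preprocess` and `timelineOnly` names mentioned in the review.

```typescript
type Options = { timelineOnly: boolean; preprocess: boolean };

type Plan =
  | { kind: "error"; message: string }
  | { kind: "timeline" }
  | { kind: "transcript"; preprocess: boolean };

// Decide what the CLI should do up front, independent of how the branches are nested later.
const planRun = (options: Options, hasFfmpeg: boolean, hasApiKey: boolean): Plan => {
  if (options.timelineOnly) {
    // Timeline mode never calls the model, so it is settled before any API key check.
    if (!hasFfmpeg) return { kind: "error", message: "--timeline-only requires ffmpeg" };
    return { kind: "timeline" };
  }
  if (!hasApiKey) return { kind: "error", message: "AI_GATEWAY_API_KEY is required" };
  // Preprocessing is skipped when the user disabled it or ffmpeg is missing.
  return { kind: "transcript", preprocess: options.preprocess && hasFfmpeg };
};
```

In this sketch `--timeline-only` takes precedence over `--no-preprocess`; rejecting that combination with an explicit error would be an equally reasonable choice.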

cursor · 2026-04-05T04:31:25Z

apps/video-transcript/src/activity-analyzer.ts

+    const frameCount = await extractFrames(videoPath, framesDir);
+    if (frameCount < 2) return [{ type: "active", startSeconds: 0, endSeconds: frameCount }];
+
+    const frameSize = 320 * 180;


Unused variable frameSize declared but never read

Low Severity

const frameSize = 320 * 180 is assigned but never referenced anywhere. It appears to be a leftover from a planned validation step (e.g., verifying raw frame buffer sizes match the expected dimensions). This is dead code that adds confusion about whether frame size checking was intentionally omitted.

apps/video-transcript/package.json

cursor · 2026-04-05T04:31:25Z

apps/video-transcript/src/activity-analyzer.ts

+    "-vf",
+    `select='${selectFilter}',setpts=N/FRAME_RATE/TB`,
+    "-af",
+    `aselect='${selectFilter}',asetpts=N/SR/TB`,


ffmpeg -af crashes on videos without audio

Medium Severity

buildTrimmedVideo unconditionally passes -af aselect='...',asetpts=N/SR/TB to ffmpeg. When the input video has no audio stream — common for screen recordings (macOS screenshot tool, many Linux tools, etc.) — ffmpeg fails with a "matches no streams" error. This crashes the CLI during the trim step, preventing transcript extraction for audio-less recordings even though only the video track is needed.
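
A possible shape for the fix is to build the ffmpeg argument list conditionally. The sketch below assumes the caller already knows whether the input has an audio stream (for example from a prior probe step, not shown here). `buildTrimArgs` and `hasAudio` are hypothetical names, and the placement of `-i` and the output path is assumed, since the diff shows only the filter arguments.

```typescript
const buildTrimArgs = (
  inputPath: string,
  outputPath: string,
  selectFilter: string,
  hasAudio: boolean,
): string[] => {
  const args = ["-i", inputPath, "-vf", `select='${selectFilter}',setpts=N/FRAME_RATE/TB`];
  if (hasAudio) {
    // Keep audio in sync with the trimmed video only when an audio stream exists.
    args.push("-af", `aselect='${selectFilter}',asetpts=N/SR/TB`);
  } else {
    // No audio stream: disable audio output instead of filtering a missing stream.
    args.push("-an");
  }
  args.push(outputPath);
  return args;
};
```

Because only the video track is needed for transcript extraction, always passing `-an` and dropping the audio filter entirely may be an even simpler option.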

github-actions · 2026-04-05T04:36:16Z

Test Results

❌ Website Test: failed

11 passed, 5 failed out of 16 steps — 582s

Step	Status	Duration
Homepage loads — hero section and install commands visible	✅ passed	26s
View demo — navigates to /replay?demo=true and replay player loads	❌ failed	78s
Replay controls — play/pause, speed selector, and step list	❌ failed	116s
Copy button — clipboard contains expected install command	✅ passed	33s
Theme toggle — dark mode changes background, light mode restores it	✅ passed	21s
Footer links — GitHub and X with correct URLs and target="_blank"	✅ passed	24s
Legal page /terms loads with text content	✅ passed	6s
Legal page /privacy loads with text content	✅ passed	4s
Legal page /security loads with text content	✅ passed	18s
Mobile viewport 375×812 — no horizontal scroll, key content visible	✅ passed	39s
Accessibility audit (WCAG)	❌ failed	44s
Performance metrics	✅ passed	44s
Tablet viewport (768×1024) — no overflow, layout intact	✅ passed	10s
WebKit cross-browser — homepage + View demo + copy button	✅ passed	42s
Project healthcheck — pnpm check	❌ failed	13s
Replay time display — verify current time does not exceed total duration	❌ failed	58s

Session Recording

https://github.com/millionco/expect/releases/download/ci-pr-81/d5849a2e310483443dd0b5f534f80f7b.webm

Workflow run #307 | 📎 Download all recordings

vercel · 2026-04-05T04:44:24Z

apps/video-transcript/src/activity-analyzer.ts

+    const frameCount = await extractFrames(videoPath, framesDir);
+    if (frameCount < 2) return [{ type: "active", startSeconds: 0, endSeconds: frameCount }];
+
+    const frameSize = 320 * 180;


Unused variable frameSize is declared but never referenced

vercel · 2026-04-05T04:44:24Z

apps/video-transcript/src/activity-analyzer.ts

+export const formatTimeline = (timeline: ActivityTimeline): string => {
+  const formatTime = (seconds: number): string => {
+    const minutes = Math.floor(seconds / 60);
+    const secs = seconds % 60;


formatTime function produces malformed time strings when receiving fractional seconds (e.g., "00:5.5" instead of "00:05")
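
A minimal sketch of a fix, assuming the intended format is zero-padded `MM:SS` and that fractional seconds should be truncated rather than rounded (the review comment's expected "00:05" for 5.5 suggests truncation):

```typescript
const formatTime = (seconds: number): string => {
  // Truncate to whole seconds first so neither field can carry a fraction.
  const whole = Math.max(0, Math.floor(seconds));
  const minutes = Math.floor(whole / 60);
  const secs = whole % 60;
  return `${String(minutes).padStart(2, "0")}:${String(secs).padStart(2, "0")}`;
};
```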

vercel · 2026-04-05T04:44:24Z

apps/video-transcript/src/activity-analyzer.ts

+    }
+  }
+
+  const segments: ActivitySegment[] = [];


Loop condition accesses undefined array element causing off-by-one error in segment boundary calculation

vercel · 2026-04-05T04:44:24Z

apps/video-transcript/src/index.ts

+
+    let processedVideoPath = videoPath;
+    let timeline: Awaited<ReturnType<typeof analyzeActivity>> | undefined;
+


Missing error handling in async action callback allows unhandled promise rejections when analyzeActivity, buildTrimmedVideo, or extractTranscript fail
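
One way to close this gap is to wrap the action body so that any rejection becomes a reported message and a non-zero exit code. The helper below is a sketch under assumptions: `runSafely` and its `report` parameter are hypothetical names, not part of the PR or of any CLI library.

```typescript
const runSafely = async (
  action: () => Promise<void>,
  report: (message: string) => void = console.error,
): Promise<number> => {
  try {
    await action();
    return 0;
  } catch (error) {
    // Normalize thrown values, since callers may throw non-Error objects.
    const message = error instanceof Error ? error.message : String(error);
    report(`Error: ${message}`);
    return 1;
  }
};
```

The action callback could then assign the returned code to `process.exitCode`, which lets pending output flush before the process ends.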

vercel · 2026-04-05T04:44:24Z

apps/video-transcript/src/index.ts

+          "Error: AI_GATEWAY_API_KEY environment variable is required for transcript extraction.",
+        ),
+      );
+      process.exit(1);


The --timeline-only flag is ignored when ffmpeg is unavailable or --no-preprocess is set, causing transcript extraction to proceed instead of exiting after timeline generation

vercel · 2026-04-05T04:44:25Z

apps/video-transcript/src/activity-analyzer.ts

+  }
+};
+
+export const buildTrimmedVideo = async (


Resource leak: buildTrimmedVideo creates a temporary directory with mkdtempSync but never cleans it up, causing indefinite disk space accumulation
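
A common pattern for this kind of leak is to scope the temporary directory to a callback and remove it in a `finally` block. The sketch below uses hypothetical names (`withTempDir`, `demo`); because the trimmed video has to outlive `buildTrimmedVideo`, the wrapper would need to surround both trimming and extraction.

```typescript
import { existsSync, mkdtempSync, rmSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

const withTempDir = async <T>(prefix: string, work: (dir: string) => Promise<T>): Promise<T> => {
  const dir = mkdtempSync(join(tmpdir(), prefix));
  try {
    return await work(dir);
  } finally {
    // Runs on success and on failure, so the directory never outlives the call.
    rmSync(dir, { recursive: true, force: true });
  }
};

// Usage: the directory exists inside the callback and is gone afterwards.
const demo = async (): Promise<boolean[]> => {
  let seen = "";
  const during = await withTempDir("video-transcript-", async (dir) => {
    seen = dir;
    return existsSync(dir);
  });
  return [during, existsSync(seen)];
};
```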

vercel · 2026-04-05T04:44:25Z

apps/video-transcript/src/extract-transcript.ts

+export const extractTranscript = async (
+  videoPath: string,
+  timeline: ActivityTimeline | undefined,
+): Promise<string> => {


readFileSync loads entire video file into memory, causing memory pressure and potential OOM errors for large files with Vercel AI Gateway
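
A cheap mitigation, short of streaming the upload, is to check the file size before reading it. The sketch below is an assumption, not the PR's behavior: `assertVideoSize`, `checkVideoFile`, and the 20 MiB limit are hypothetical, and the appropriate limit depends on the gateway and model in use.

```typescript
import { statSync } from "node:fs";

// Hypothetical limit; the right value depends on the gateway and model in use.
const MAX_VIDEO_BYTES = 20 * 1024 * 1024;

const assertVideoSize = (sizeBytes: number, maxBytes: number = MAX_VIDEO_BYTES): void => {
  if (sizeBytes > maxBytes) {
    throw new Error(
      `Video is ${sizeBytes} bytes; limit is ${maxBytes} bytes. Trim or re-encode it first.`,
    );
  }
};

// statSync reads file metadata only, so the check runs before any buffer is allocated.
const checkVideoFile = (videoPath: string): void => assertVideoSize(statSync(videoPath).size);
```

This does not reduce memory use for files under the limit; it only turns a potential out-of-memory crash into a clear error.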

vercel · 2026-04-05T04:44:25Z

apps/video-transcript/src/activity-analyzer.ts

+    "-vf",
+    `select='${selectFilter}',setpts=N/FRAME_RATE/TB`,
+    "-af",
+    `aselect='${selectFilter}',asetpts=N/SR/TB`,


ffmpeg command uses -af audio filter unconditionally, causing failure on videos without audio streams

The bundler inlines source from @expect/agent but leaves dynamic require.resolve() calls intact. Consumers with strict node_modules (pnpm) cannot resolve these at runtime unless they are declared as dependencies in the published package.json. Adds runtime-deps tests to both packages that scan the built dist for require.resolve() targets and fail if any are undeclared.
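
The dist-scanning check described in this commit message could look roughly like the sketch below. It is an illustration under assumptions, not the tests added in the PR: the function names are hypothetical, and the regular expression handles only string-literal arguments to `require.resolve()`.

```typescript
// Collect bare-specifier targets of require.resolve("...") calls in bundled source.
const findResolveTargets = (source: string): string[] => {
  const pattern = /require\.resolve\(\s*["'`]([^"'`]+)["'`]/g;
  const targets = new Set<string>();
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(source)) !== null) {
    const specifier = match[1] ?? "";
    // Relative paths and node: builtins do not need a package.json entry.
    if (specifier && !specifier.startsWith(".") && !specifier.startsWith("node:")) {
      targets.add(specifier);
    }
  }
  return Array.from(targets);
};

// Reduce "pkg/sub/path" or "@scope/pkg/sub" to the installable package name.
const packageName = (specifier: string): string => {
  const parts = specifier.split("/");
  return specifier.startsWith("@") ? parts.slice(0, 2).join("/") : (parts[0] ?? specifier);
};

// Return package names that are resolved at runtime but missing from the declared dependencies.
const findUndeclared = (source: string, declared: Record<string, string>): string[] =>
  findResolveTargets(source)
    .map(packageName)
    .filter((name) => !Object.prototype.hasOwnProperty.call(declared, name));
```

A test would read each built file under dist/, pass its contents together with the `dependencies` object from package.json, and fail when the returned list is non-empty.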

vercel bot deployed to Preview April 5, 2026 04:24

vercel bot deployed to Preview April 5, 2026 04:27

cubic-dev-ai bot reviewed Apr 5, 2026

cursor bot reviewed Apr 5, 2026

vercel bot reviewed Apr 5, 2026

vercel bot deployed to Preview April 5, 2026 06:09

