feat(meet_video): mascot canvas as outbound camera in Google Meet #1359

senamakel merged 4 commits into tinyhumansai:main from
Conversation
Override navigator.mediaDevices.getUserMedia in the Meet webview so the agent's outbound video is a programmatically drawn mascot rather than the single-frame Y4M backed by --use-file-for-fake-video-capture. The bridge runs at document-start (installed via the same CDP path as audio_bridge.js / captions_bridge.js), decodes idelMascot.svg + Bookreading.svg into a hidden 640x480 canvas, runs an rAF loop with a subtle bob so codecs don't read it as frozen, and toggles between idle and thinking every 5s. Host-side set_mood helper is in place for the agent state machine to drive transitions instead of the JS interval once that wiring lands. Static Y4M fallback at the process-level flag remains as a safety net when the bridge install fails.
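The render loop described above can be sketched roughly as follows (function names here are illustrative, not the actual `camera_bridge.js` API; the bob amplitude and period are assumptions):

```javascript
// Vertical bob in pixels: a slow sine so successive frames always differ
// and video codecs don't treat the feed as a frozen still image.
function bobOffset(tMs) {
  return Math.sin(tMs / 500) * 4; // roughly +/-4px, ~3s period (assumed values)
}

// Which mascot to draw: flips between idle and thinking every 5s,
// mirroring the bridge's unconditional interval toggle.
function moodAt(tMs) {
  return Math.floor(tMs / 5000) % 2 === 0 ? "idle" : "thinking";
}

// In the page, an interval-driven loop would redraw the hidden canvas:
//   ctx.clearRect(0, 0, 640, 480);
//   ctx.drawImage(images[moodAt(now)], 0, bobOffset(now));
// and canvas.captureStream(30) then supplies the outbound video track.
```

The point of the bob is purely to defeat static-frame detection; any small per-frame perturbation would do.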
…eam splice
Three fixes to make the mascot canvas actually render in the Meet
self-tile without breaking the join flow:
1. Renderer crash on reload. Registering the camera bridge as a third
addScriptToEvaluateOnNewDocument before Page.reload consistently
crashed the CEF 146 renderer ("Target crashed" within ~1s of open).
The 56KB inlined-SVG payload alongside the audio + captions scripts
was the trigger. Switch to post-reload Runtime.evaluate after the
audio bridge confirms the new doc is alive — Meet's first
getUserMedia call only fires after "Ask to join", so we have plenty
of room to inject without racing.
2. WebRTC opus codec collision. The previous getUserMedia override
built a brand-new MediaStream from cloned tracks, which stacked on
top of the audio bridge's existing patch and produced duplicate
audio senders ([111: audio/opus] twice in the BUNDLE). Now we
delegate the full constraints to the existing chain (audio bridge
+ Chromium) and only swap the video track in place via
removeTrack/addTrack. Also dropped the enumerateDevices override
(cosmetic; was returning plain objects that risked breaking Meet's
device-picker iteration).
3. Static / undecoded mascot. The SVG-to-data-URI encoder skipped
spaces, so attribute-value spaces in viewBox/path data left raw
spaces in the URI that Chromium rejects in <img src>. Replace the
ad-hoc escaping with full RFC-3986 percent-encoding of every byte
outside the unreserved + path-safe whitelist. Also switch the
render loop from requestAnimationFrame to setInterval so the camera
keeps producing frames when the Meet window is backgrounded behind
the main openhuman window.
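The encoding fix can be sketched like this (illustrative JavaScript mirror; the real encoder is the Rust `svg_to_data_uri` in `meet_video/mod.rs`, and its exact whitelist may differ):

```javascript
// Every byte outside a conservative unreserved + path-safe set is
// percent-encoded, so spaces become %20 (previously left raw and rejected
// by Chromium in <img src>) and apostrophes become %27 (which otherwise
// could terminate the single-quoted JS literal the URI is templated into).
const SAFE =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~/:";

function svgToDataUri(svg) {
  let body = "";
  for (const byte of new TextEncoder().encode(svg)) {
    const ch = String.fromCharCode(byte);
    body += SAFE.includes(ch)
      ? ch
      : "%" + byte.toString(16).toUpperCase().padStart(2, "0");
  }
  return "data:image/svg+xml;charset=utf-8," + body;
}
```

For example, `svgToDataUri("<svg viewBox='0 0 64 48'/>")` yields a URI containing `%20` and `%27` and no raw quote, so it is safe both in `<img src>` and inside a `'…'` string literal.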
📝 Walkthrough

Adds a new meet_video module that embeds two mascot SVGs, builds a camera_bridge.js payload, injects it into Meet via CDP, exposes host mood/status controls, renders a canvas-backed virtual camera replacing outgoing video while preserving audio, and integrates camera injection into the existing audio-bridge flow with graceful fallback.

Changes: Camera Bridge for Google Meet Video Stream
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~25 minutes
🚥 Pre-merge checks: ✅ Passed checks (5 passed)
Actionable comments posted: 1
🧹 Nitpick comments (1)
app/src-tauri/src/meet_video/inject.rs (1)
109-130: ⚡ Quick win: Propagate rejected mood changes instead of returning Ok(()).

`__openhumanSetMood()` returns `false` when the bridge is missing or the mood is unknown, but this wrapper only checks `exceptionDetails`. Once the host state machine starts calling this, a no-op will look like success.

🔧 Proposed fix
```diff
     let res = cdp
         .call(
             "Runtime.evaluate",
             json!({ "expression": expression, "returnByValue": true }),
             Some(session),
         )
         .await
         .map_err(|e| format!("Runtime.evaluate set_mood: {e}"))?;

     if let Some(exception) = res.get("exceptionDetails") {
         return Err(format!("page exception: {exception}"));
     }

+    let applied = res
+        .get("result")
+        .and_then(|r| r.get("value"))
+        .and_then(|v| v.as_bool())
+        .unwrap_or(false);
+    if !applied {
+        return Err(format!("camera bridge rejected mood change: {mood}"));
+    }
     Ok(())
 }
```

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src-tauri/src/meet_video/inject.rs` around lines 109 - 130, The wrapper set_mood currently only checks Runtime.evaluate errors and exceptionDetails but ignores the actual return value of window.__openhumanSetMood, so failed/no-op mood sets appear as success; after the cdp.call and exceptionDetails check, inspect the evaluation result (res -> "result" -> "value") returned by Runtime.evaluate for a boolean false (or missing/undefined) and convert that into an Err with a clear message (e.g. "set_mood rejected" or "bridge missing/unknown mood") instead of returning Ok(()); update set_mood to return Err when __openhumanSetMood returns false or the value is absent so callers see rejected mood changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@app/src-tauri/src/meet_video/mod.rs`:
- Around line 82-108: The svg_to_data_uri function currently treats apostrophe
(') as unreserved in is_unreserved, which allows raw ' to appear in the data URI
and break single-quoted JS literals; update is_unreserved (used by
svg_to_data_uri) to stop whitelisting b'\'' so that single quotes are
percent-encoded (e.g. %27) when building the URI, and add a unit test that calls
svg_to_data_uri with an SVG containing a single-quote (e.g. "<svg
data-name='mascot'/>") asserting the encoded body contains "%27" and does not
contain a raw '\''.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: e7d5f192-6ef4-45a5-82e7-1fe9e45c06dd
📒 Files selected for processing (5)
- app/src-tauri/src/lib.rs
- app/src-tauri/src/meet_audio/inject.rs
- app/src-tauri/src/meet_video/camera_bridge.js
- app/src-tauri/src/meet_video/inject.rs
- app/src-tauri/src/meet_video/mod.rs
CodeRabbit flag (PR tinyhumansai#1359, review id 3206360961): the SVG-to-data-URI encoder whitelisted `'` as unreserved, but the resulting URI is interpolated into single-quoted JS literals in `camera_bridge.js` (e.g. `MASCOTS = { idle: '...' }`). A raw `'` in the SVG body would have terminated the string literal and broken the bridge install. Drop apostrophe from `is_unreserved` so it gets percent-encoded as `%27` and add a regression test for the JS-string-context interaction.
🧹 Nitpick comments (1)
app/src-tauri/src/meet_video/mod.rs (1)
35-37: ⚡ Quick win: Update stale "base64" wording in comments.

These lines describe base64 data URIs, but the implementation now emits URL-encoded `data:image/svg+xml;charset=utf-8,...` URIs. Aligning comments avoids debugging confusion.

Proposed doc-only patch:

```diff
-//! - This file — embeds the two mascot SVGs at build time and templates
-//!   them into the bridge JS as `data:image/svg+xml;base64,...` URIs,
+//! - This file — embeds the two mascot SVGs at build time and templates
+//!   them into the bridge JS as URL-encoded
+//!   `data:image/svg+xml;charset=utf-8,...` URIs,
 //!   keeping the bridge fully self-contained inside the Meet origin.
@@
-/// Bridge JS template. Two `__OPENHUMAN_MASCOT_*_DATAURI__` tokens are
-/// substituted at install time with base64'd SVG data URIs.
+/// Bridge JS template. Two `__OPENHUMAN_MASCOT_*_DATAURI__` tokens are
+/// substituted at install time with URL-encoded SVG data URIs.
```

Also applies to: 59-60
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@app/src-tauri/src/meet_video/mod.rs` around lines 35 - 37, Update the module doc comment that describes how the two mascot SVGs are embedded and templated into the bridge JS: replace references to "base64" and "data:image/svg+xml;base64,..." with wording that they are URL-encoded and use the `data:image/svg+xml;charset=utf-8,` URI form (and adjust any example URI text accordingly); this change should be applied to the top-level module comment that mentions embedding the mascots and the second occurrence later in the file so both descriptions reflect URL-encoded data URIs instead of base64.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: bfeb4253-6be8-4fb9-95bd-ff83018846a3
📒 Files selected for processing (1)
app/src-tauri/src/meet_video/mod.rs
Summary
- A live `<canvas>` rendered into Meet via a JS bridge that monkey-patches `navigator.mediaDevices.getUserMedia`.
- Follows the existing injection pattern (`meet_audio/inject.rs`), with one important deviation noted below.

Problem
The Meet agent's outbound video has been a static mascot frame ever since `fake_camera/mod.rs` started baking the SVG into a one-frame Y4M and pointing Chromium's `--use-file-for-fake-video-capture` at it. We want the agent's camera tile to reflect what the agent is actually doing — at minimum, alternate between an idle pose and a thinking pose — so participants can read the agent's state at a glance. The Y4M flag is process-level and immutable per call, so dynamic content needs a different mechanism.

Solution
New module `app/src-tauri/src/meet_video/`:

- `camera_bridge.js` — runs in the Meet page, builds a hidden 640×480 canvas, decodes idle + thinking SVGs, drives a `setInterval` render loop with a subtle bob, exposes `canvas.captureStream(30)`, and monkey-patches `getUserMedia` so any `{video: true}` request swaps the canvas track in place on the stream returned by the existing audio bridge. `window.__openhumanSetMood(name)` is exposed for future host-driven control; for now an unconditional 5 s toggle drives the visible behavior.
- `mod.rs` — `include_str!`s `remotion/public/idelMascot.svg` and `Bookreading.svg`, percent-encodes them as `data:image/svg+xml` URIs, and templates them into the bridge JS at install time.
- `inject.rs` — `install_camera_bridge_post_reload` (`Runtime.evaluate`-based), `confirm_bridge_alive`, and a host-side `set_mood` helper for the agent state machine.

Three deliberate decisions worth flagging for review:
Why JS injection (not a CEF patch) — the long-term shape would be a real CEF-side custom video device, but `vendor/tauri-cef` only contains Tauri's Rust bindings; the actual libcef binary is downloaded prebuilt. Patching `FileVideoCaptureDevice` would mean owning a Chromium-from-source build pipeline. JS injection into Meet (already grandfathered for the audio + captions bridges) was the scoped, ship-today path. The module docstring documents this.
Why post-reload `Runtime.evaluate`, not `addScriptToEvaluateOnNewDocument` — registering the camera bridge as a third pre-document script alongside audio + captions reliably crashed the CEF 146 renderer on `Page.reload` ("Target crashed" within ~1 s of opening; `meet-scanner` would log it; the page never came back). The 56 KB inlined-SVG payload is the trigger. Meet's first `getUserMedia` call only fires after "Ask to join" — multiple seconds after navigation — so a post-reload `Runtime.evaluate` lands well before it's needed and keeps the renderer happy.

Why splice the video track in place, not build a new `MediaStream` — the audio bridge already correctly handles `{audio, video}` requests by stripping audio from Chromium's fake-camera stream and adding its own. Building a new MediaStream from cloned tracks on top of that produced duplicate audio senders against the same destination, which manifested as a WebRTC "BUNDLE group contains a codec collision between [111: audio/opus] and [111: audio/opus]" error and broke the Meet join flow. Now we delegate the full constraints to the existing chain and only swap the video track in place via `removeTrack`/`addTrack`.

The static Y4M plumbing in `fake_camera/mod.rs` is intentionally untouched — it's the process-level fallback if our bridge ever fails to install for any reason.
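The in-place track splice can be sketched as follows (illustrative only; the actual `camera_bridge.js` internals may differ — the key property is that the MediaStream object from the existing audio-bridge chain is mutated, never replaced, so no audio sender is ever duplicated):

```javascript
// Swap the video track on an existing stream without constructing a new
// MediaStream. Audio tracks are left untouched, so the audio bridge's
// splicing remains the single owner of the audio path.
function spliceVideoTrack(stream, canvasTrack) {
  for (const t of stream.getVideoTracks()) {
    t.stop();              // release the fake-camera (Y4M) track
    stream.removeTrack(t);
  }
  stream.addTrack(canvasTrack);
  return stream;           // same object identity, audio untouched
}
```

Inside the `getUserMedia` override, this would be applied only when the constraints ask for video, e.g. `const stream = await originalGetUserMedia(constraints); if (constraints.video) spliceVideoTrack(stream, canvas.captureStream(30).getVideoTracks()[0]);` (sketch, not the bridge's literal code).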
Submission Checklist

- docs/TESTING-STRATEGY.md
- Closes #NNN — no tracking issue.

Impact
- Scoped to the Meet webview (https://meet.google.com/). Other webviews keep the existing static-Y4M behavior via the process-level `--use-file-for-fake-video-capture` flag.
- The JS-injection trade-off is documented in `meet_video/mod.rs`'s docstring.
- Pre-existing WebRTC log noise (SEA StreamContext ctor parent is null, occasional SRTP unprotect failures) remains and is out of scope.

Related
Follow-up: wire the agent state machine (`src/openhuman/meet_agent/session.rs`) to call `meet_video::inject::set_mood` over CDP on real phase transitions instead of relying on the JS-side 5 s auto-toggle. Helper exists; just needs invoking.

AI Authored PR Metadata (required for Codex/Linear PRs)
Linear Issue
Commit & Branch
Validation Run
- `pnpm --filter openhuman-app format:check` (pre-push hook)
- `pnpm typecheck` (pre-push hook)
- `cargo test --manifest-path app/src-tauri/Cargo.toml --lib meet_video` — 3/3 passing
- `cargo check --manifest-path app/src-tauri/Cargo.toml` clean

Validation Blocked
Behavior Changes
Parity Contract
- `fake_camera/mod.rs` and the process-level `--use-file-for-fake-video-capture` flag are untouched and still serve as the fallback when bridge install fails.
- The audio bridge's `getUserMedia` chain stays the canonical owner of audio splicing.

Duplicate / Superseded PR Handling
Summary by CodeRabbit
New Features
Bug Fixes