feat(meet_video): mascot canvas as outbound camera in Google Meet #1359

Merged
senamakel merged 4 commits into tinyhumansai:main from senamakel:feat/meet-video on May 8, 2026

@senamakel senamakel commented May 8, 2026

Summary

  • Replace the static mascot Y4M outbound camera with a programmatic 640×480 <canvas> rendered into Meet via a JS bridge that monkey-patches navigator.mediaDevices.getUserMedia.
  • Toggle between an idle pose and a thinking pose every 5 s so the agent's tile reads as alive instead of frozen on a single frame.
  • The bridge runs on the same CDP-injection lifecycle as the existing audio + captions bridges (meet_audio/inject.rs), with one important deviation noted below.

Problem

The Meet agent's outbound video has been a static mascot frame ever since fake_camera/mod.rs started baking the SVG into a one-frame Y4M and pointing Chromium's --use-file-for-fake-video-capture at it. We want the agent's camera tile to reflect what the agent is actually doing — at minimum, alternate between an idle pose and a thinking pose — so participants can read the agent's state at a glance. The Y4M flag is process-level and immutable per call, so dynamic content needs a different mechanism.

Solution

New module app/src-tauri/src/meet_video/:

  • camera_bridge.js: runs in the Meet page, builds a hidden 640×480 canvas, decodes the idle and thinking SVGs, drives a setInterval render loop with a subtle bob, exposes canvas.captureStream(30), and monkey-patches getUserMedia so that any {video:true} request has the canvas track swapped in place on the stream returned by the existing audio bridge. window.__openhumanSetMood(name) is exposed for future host-driven control; for now an unconditional 5 s toggle drives the visible behavior.
  • mod.rs: include_str!s remotion/public/idelMascot.svg and Bookreading.svg, percent-encodes them as data:image/svg+xml URIs, and templates them into the bridge JS at install time.
  • inject.rs: provides install_camera_bridge_post_reload (Runtime.evaluate-based), confirm_bridge_alive, and a host-side set_mood helper for the agent state machine.
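
For reviewers who want a concrete picture of the install-time templating in mod.rs, here is a minimal sketch and not the code in this PR. It assumes a whitelist of only the RFC 3986 unreserved bytes (the real `is_unreserved` may also admit some path-safe bytes), and the two token names are hypothetical stand-ins for the `__OPENHUMAN_MASCOT_*_DATAURI__` tokens.

```rust
/// RFC 3986 "unreserved" bytes only.
/// ASSUMPTION: the PR's real whitelist may be wider; this sketch is stricter.
fn is_unreserved(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_' | b'~')
}

/// Percent-encode every byte outside the whitelist and prepend the
/// URL-encoded SVG data-URI header.
fn svg_to_data_uri(svg: &str) -> String {
    let mut out = String::from("data:image/svg+xml;charset=utf-8,");
    for &b in svg.as_bytes() {
        if is_unreserved(b) {
            out.push(b as char);
        } else {
            out.push_str(&format!("%{b:02X}"));
        }
    }
    out
}

/// Substitute the encoded URIs into the bridge template.
/// HYPOTHETICAL token names, chosen for illustration only.
fn build_bridge_js(template: &str, idle_svg: &str, thinking_svg: &str) -> String {
    template
        .replace("__IDLE_DATAURI__", &svg_to_data_uri(idle_svg))
        .replace("__THINKING_DATAURI__", &svg_to_data_uri(thinking_svg))
}
```

Because spaces, quotes and angle brackets are all encoded, the resulting URI can be dropped into an `<img src>` or a quoted JS string without further escaping.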

Three deliberate decisions worth flagging for review:

  1. Why JS injection (not a CEF patch) — the long-term shape would be a real CEF-side custom video device, but vendor/tauri-cef only contains Tauri's Rust bindings; the actual libcef binary is downloaded prebuilt. Patching FileVideoCaptureDevice would mean owning a Chromium-from-source build pipeline. JS injection into Meet (already grandfathered for the audio + captions bridges) was the scoped, ship-today path. The module docstring documents this.

  2. Why post-reload Runtime.evaluate, not addScriptToEvaluateOnNewDocument — registering the camera bridge as a third pre-document script alongside audio + captions reliably crashed the CEF 146 renderer on Page.reload ("Target crashed" within ~1 s of opening; meet-scanner would log it; the page never came back). The 56 KB inlined-SVG payload is the trigger. Meet's first getUserMedia call only fires after "Ask to join" — multiple seconds after navigation — so a post-reload Runtime.evaluate lands well before it's needed and keeps the renderer happy.

  3. Why splice the video track in place, not build a new MediaStream — the audio bridge already correctly handles {audio,video} requests by stripping audio from Chromium's fake-camera stream and adding its own. Building a new MediaStream from cloned tracks on top of that produced duplicate audio senders against the same destination, which manifested as a WebRTC "BUNDLE group contains a codec collision between [111: audio/opus] and [111: audio/opus]" error and broke the Meet join flow. Now we delegate the full constraints to the existing chain and only swap the video track in place via removeTrack/addTrack.
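
The invariant behind decision 3 can be shown with a toy model. This is a sketch in Rust, not the bridge's JS: `Track`, `Stream` and `splice_canvas_video` are hypothetical names that stand in for MediaStreamTrack, MediaStream and the removeTrack/addTrack swap. The model only shows that an in-place swap leaves the audio bridge's single audio track alone.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Kind {
    Audio,
    Video,
}

#[derive(Clone, Debug, PartialEq)]
struct Track {
    kind: Kind,
    label: &'static str,
}

/// Toy stand-in for the stream the upstream getUserMedia chain returns.
#[derive(Debug, Default)]
struct Stream {
    tracks: Vec<Track>,
}

impl Stream {
    /// Number of tracks of the given kind currently on the stream.
    fn count(&self, kind: Kind) -> usize {
        self.tracks.iter().filter(|t| t.kind == kind).count()
    }
}

/// In-place swap: remove whatever video track the upstream chain produced,
/// then add the canvas track. Audio tracks are never touched or cloned.
fn splice_canvas_video(stream: &mut Stream, canvas: Track) {
    stream.tracks.retain(|t| t.kind != Kind::Video);
    stream.tracks.push(canvas);
}
```

Starting from one audio track and one fake-camera video track, the stream ends up with the same audio track plus the canvas track, so no second audio sender is created.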

The static Y4M plumbing in fake_camera/mod.rs is intentionally untouched — it's the process-level fallback if our bridge ever fails to install for any reason.

Submission Checklist

  • Tests added or updated (happy path + at least one failure / edge case) per docs/TESTING-STRATEGY.md
  • N/A: 80% diff coverage gate — changed surface is JS injected at runtime into a third-party origin and Rust glue that drives CDP I/O. The unit-testable parts (SVG-to-data-URI encoding, bridge JS template substitution) are covered; the rest is verified by manual end-to-end against a live Meet call (see Validation Run).
  • N/A: Coverage matrix updated — behaviour-only change; no new feature row.
  • N/A: All affected feature IDs from the matrix listed — no matrix entries affected.
  • No new external network dependencies introduced (mock backend used per docs/TESTING-STRATEGY.md)
  • N/A: Release smoke checklist updated — no release-cut surfaces touched.
  • N/A: Linked issue closed via Closes #NNN — no tracking issue.

Impact

  • Desktop only. Tauri shell change; no mobile/web/CLI surface.
  • The camera bridge runs only inside the Meet call window (target URL filter on https://meet.google.com/). Other webviews keep the existing static-Y4M behavior via the process-level --use-file-for-fake-video-capture flag.
  • No new external network traffic (SVGs are inlined as data URIs; nothing fetched from outside meet.google.com).
  • One more JS injection on the grandfathered Meet recipe surface. Documented as a deliberate exception in meet_video/mod.rs's docstring.
  • The codec-collision SDP warning from Chromium WebRTC that pre-dates this change still appears in some sessions; this PR makes it benign by stopping the duplicate-audio-track stacking. Pre-existing background errors (SEA StreamContext ctor parent is null, occasional SRTP unprotect failures) remain and are out of scope.

Related

  • Closes:
  • Follow-up PR(s)/TODOs:
    • Wire the meet-agent state machine (src/openhuman/meet_agent/session.rs) to call meet_video::inject::set_mood over CDP on real phase transitions instead of relying on the JS-side 5 s auto-toggle. Helper exists; just needs invoking.
    • Replace the static-SVG-with-bob renderer with a hidden CEF off-screen-render webview that hosts the actual Remotion compositions, once we own a CEF build pipeline.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

  • Key: N/A
  • URL: N/A

Commit & Branch

  • Branch: feat/meet-video
  • Commit SHA: e2f5416

Validation Run

  • pnpm --filter openhuman-app format:check (pre-push hook)
  • pnpm typecheck (pre-push hook)
  • Focused tests: cargo test --manifest-path app/src-tauri/Cargo.toml --lib meet_video — 3/3 passing
  • Rust fmt/check (if changed): cargo fmt + cargo check both clean (pre-push hook ran rustfmt)
  • Tauri fmt/check (if changed): cargo check --manifest-path app/src-tauri/Cargo.toml clean

Validation Blocked

  • command: N/A
  • error: N/A
  • impact: N/A

Behavior Changes

  • Intended behavior change: Meet agent's outbound camera tile now displays an animated mascot that toggles between idle and thinking poses every 5 s instead of a single static frame.
  • User-visible effect: Other meeting participants see a yellow mascot in the agent's tile, with a subtle bob and pose-change every 5 s. The agent itself is unaffected.

Parity Contract

  • Legacy behavior preserved: fake_camera/mod.rs and the process-level --use-file-for-fake-video-capture flag are untouched and still serve as the fallback when bridge install fails.
  • Guard/fallback/dispatch parity checks: bridge install failure is non-fatal (logged, audio + captions paths keep working); audio bridge's getUserMedia chain stays the canonical owner of audio splicing.

Duplicate / Superseded PR Handling

  • Duplicate PR(s): N/A
  • Canonical PR: N/A
  • Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

  • New Features

    • Mascot avatar rendering replaces outgoing video with a canvas-based mascot (idle/thinking) and mood controls (manual + auto-toggle).
    • Integrated audio + camera bridge for seamless virtual camera + preserved audio.
    • App tray integration and macOS floating mascot option.
    • In-app update probing/download with progress reporting.
  • Bug Fixes

    • More reliable startup/shutdown to avoid webview/process races and startup crashes.

senamakel added 3 commits May 7, 2026 21:07
Override navigator.mediaDevices.getUserMedia in the Meet webview so the
agent's outbound video is a programmatically drawn mascot rather than
the single-frame Y4M backed by --use-file-for-fake-video-capture.

The bridge runs at document-start (installed via the same CDP path as
audio_bridge.js / captions_bridge.js), decodes idelMascot.svg +
Bookreading.svg into a hidden 640x480 canvas, runs an rAF loop with a
subtle bob so codecs don't read it as frozen, and toggles between idle
and thinking every 5s. Host-side set_mood helper is in place for the
agent state machine to drive transitions instead of the JS interval
once that wiring lands.

Static Y4M fallback at the process-level flag remains as a safety net
when the bridge install fails.
…eam splice

Three fixes to make the mascot canvas actually render in the Meet
self-tile without breaking the join flow:

1. Renderer crash on reload. Registering the camera bridge as a third
   addScriptToEvaluateOnNewDocument before Page.reload consistently
   crashed the CEF 146 renderer ("Target crashed" within ~1s of open).
   The 56KB inlined-SVG payload alongside the audio + captions scripts
   was the trigger. Switch to post-reload Runtime.evaluate after the
   audio bridge confirms the new doc is alive — Meet's first
   getUserMedia call only fires after "Ask to join", so we have plenty
   of room to inject without racing.

2. WebRTC opus codec collision. The previous getUserMedia override
   built a brand-new MediaStream from cloned tracks, which stacked on
   top of the audio bridge's existing patch and produced duplicate
   audio senders ([111: audio/opus] twice in the BUNDLE). Now we
   delegate the full constraints to the existing chain (audio bridge
   + Chromium) and only swap the video track in place via
   removeTrack/addTrack. Also dropped the enumerateDevices override
   (cosmetic; was returning plain objects that risked breaking Meet's
   device-picker iteration).

3. Static / undecoded mascot. The SVG-to-data-URI encoder skipped
   spaces, so attribute-value spaces in viewBox/path data left raw
   spaces in the URI that Chromium rejects in <img src>. Replace the
   ad-hoc escaping with full RFC-3986 percent-encoding of every byte
   outside the unreserved + path-safe whitelist. Also switch the
   render loop from requestAnimationFrame to setInterval so the camera
   keeps producing frames when the meet window is backgrounded behind
   the main openhuman window.
@senamakel senamakel requested a review from a team May 8, 2026 04:49

coderabbitai Bot commented May 8, 2026

Walkthrough

Adds a new meet_video module that embeds two mascot SVGs, builds a camera_bridge.js payload, injects it into Meet via CDP, exposes host mood/status controls, renders a canvas-backed virtual camera replacing outgoing video while preserving audio, and integrates camera injection into the existing audio-bridge flow with graceful fallback.

Changes

Camera Bridge for Google Meet Video Stream

| Layer / File(s) | Summary |
|---|---|
| Module Setup & Asset Embedding (app/src-tauri/src/lib.rs, app/src-tauri/src/meet_video/mod.rs) | Adds meet_video module, embeds two mascot SVGs, provides build_camera_bridge_js() which percent-encodes SVGs into safe data URIs, and adds unit tests for substitution and escaping. |
| Page-side Bridge Implementation (app/src-tauri/src/meet_video/camera_bridge.js) | In-page JS bridge: canvas renderer (640×480 @30FPS) with idle/thinking moods and bobbing animation, captures canvas stream, monkey-patches navigator.mediaDevices.getUserMedia to splice in the canvas video track while preserving audio, and exposes window.__openhumanSetMood() and window.__openhumanCameraBridgeInfo(). |
| Rust Injection & Host Control (app/src-tauri/src/meet_video/inject.rs) | Implements install_camera_bridge_post_reload() to inject the bridge via CDP Runtime.evaluate, confirm_bridge_alive() to poll __openhumanCameraBridgeInfo, and set_mood() to validate and set mood via CDP evaluation. |
| Audio Bridge Integration (app/src-tauri/src/meet_audio/inject.rs) | Extends install_audio_bridge to call the camera bridge install after audio readiness; on injection failure it logs a warning and falls back without causing the audio path to error. |

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Suggested reviewers

  • M3gA-Mind

Poem

🐰 A tiny rabbit paints a face in code,

On canvas waves the mascot's mode,
It hops in frames at thirty beats,
Replaces video with cheerful feats,
Host can change its mood with one small ode.

🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title clearly and concisely summarizes the main change: introducing a mascot canvas as the outbound camera feed in Google Meet, which aligns perfectly with the changeset's core objective. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🧹 Nitpick comments (1)
app/src-tauri/src/meet_video/inject.rs (1)

109-130: ⚡ Quick win

Propagate rejected mood changes instead of returning Ok(()).

__openhumanSetMood() returns false when the bridge is missing or the mood is unknown, but this wrapper only checks exceptionDetails. Once the host state machine starts calling this, a no-op will look like success.

🔧 Proposed fix
     let res = cdp
         .call(
             "Runtime.evaluate",
             json!({ "expression": expression, "returnByValue": true }),
             Some(session),
         )
         .await
         .map_err(|e| format!("Runtime.evaluate set_mood: {e}"))?;
     if let Some(exception) = res.get("exceptionDetails") {
         return Err(format!("page exception: {exception}"));
     }
+    let applied = res
+        .get("result")
+        .and_then(|r| r.get("value"))
+        .and_then(|v| v.as_bool())
+        .unwrap_or(false);
+    if !applied {
+        return Err(format!("camera bridge rejected mood change: {mood}"));
+    }
     Ok(())
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src-tauri/src/meet_video/inject.rs` around lines 109 - 130, The wrapper
set_mood currently only checks Runtime.evaluate errors and exceptionDetails but
ignores the actual return value of window.__openhumanSetMood, so failed/no-op
mood sets appear as success; after the cdp.call and exceptionDetails check,
inspect the evaluation result (res -> "result" -> "value") returned by
Runtime.evaluate for a boolean false (or missing/undefined) and convert that
into an Err with a clear message (e.g. "set_mood rejected" or "bridge
missing/unknown mood") instead of returning Ok(()); update set_mood to return
Err when __openhumanSetMood returns false or the value is absent so callers see
rejected mood changes.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src-tauri/src/meet_video/mod.rs`:
- Around line 82-108: The svg_to_data_uri function currently treats apostrophe
(') as unreserved in is_unreserved, which allows raw ' to appear in the data URI
and break single-quoted JS literals; update is_unreserved (used by
svg_to_data_uri) to stop whitelisting b'\'' so that single quotes are
percent-encoded (e.g. %27) when building the URI, and add a unit test that calls
svg_to_data_uri with an SVG containing a single-quote (e.g. "<svg
data-name='mascot'/>") asserting the encoded body contains "%27" and does not
contain a raw '\''.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e7d5f192-6ef4-45a5-82e7-1fe9e45c06dd

📥 Commits

Reviewing files that changed from the base of the PR and between 0636b0c and e2f5416.

📒 Files selected for processing (5)
  • app/src-tauri/src/lib.rs
  • app/src-tauri/src/meet_audio/inject.rs
  • app/src-tauri/src/meet_video/camera_bridge.js
  • app/src-tauri/src/meet_video/inject.rs
  • app/src-tauri/src/meet_video/mod.rs

Comment thread app/src-tauri/src/meet_video/mod.rs
CodeRabbit flag (PR tinyhumansai#1359, review id 3206360961): the SVG-to-data-URI
encoder whitelisted `'` as unreserved, but the resulting URI is
interpolated into single-quoted JS literals in `camera_bridge.js`
(e.g. `MASCOTS = { idle: '...' }`). A raw `'` in the SVG body would
have terminated the string literal and broken the bridge install.

Drop apostrophe from `is_unreserved` so it gets percent-encoded as
`%27` and add a regression test for the JS-string-context
interaction.
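
The interaction this commit describes can be reproduced with a small sketch. It is illustrative only: `encode_svg_body`, `js_literal` and `literal_is_well_formed` are hypothetical helpers, the whitelist is assumed to be the RFC 3986 unreserved set, and the `allow_apostrophe` flag exists only to compare the pre-fix and post-fix behavior side by side.

```rust
/// Percent-encode `svg`. `allow_apostrophe = true` mimics the pre-fix
/// whitelist that let a raw `'` through; `false` mimics the fixed one.
fn encode_svg_body(svg: &str, allow_apostrophe: bool) -> String {
    let mut out = String::new();
    for &b in svg.as_bytes() {
        let keep = b.is_ascii_alphanumeric()
            || matches!(b, b'-' | b'.' | b'_' | b'~')
            || (allow_apostrophe && b == b'\'');
        if keep {
            out.push(b as char);
        } else {
            out.push_str(&format!("%{b:02X}"));
        }
    }
    out
}

/// Wrap the encoded body in a single-quoted JS string literal, the shape
/// assumed from `MASCOTS = { idle: '...' }` in camera_bridge.js.
fn js_literal(body: &str) -> String {
    format!("'{body}'")
}

/// A literal is intact only if its sole quotes are the two delimiters.
fn literal_is_well_formed(literal: &str) -> bool {
    literal.matches('\'').count() == 2
}
```

With the input `<svg data-name='mascot'/>`, the pre-fix encoding leaves two raw apostrophes inside the literal and terminates it early, while the post-fix encoding emits `%27` and keeps the literal intact.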
coderabbitai Bot previously approved these changes May 8, 2026

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
app/src-tauri/src/meet_video/mod.rs (1)

35-37: ⚡ Quick win

Update stale “base64” wording in comments.

These lines describe base64 data URIs, but the implementation now emits URL-encoded data:image/svg+xml;charset=utf-8, URIs. Aligning comments avoids debugging confusion.

Proposed doc-only patch
-//! - This file — embeds the two mascot SVGs at build time and templates
-//!   them into the bridge JS as `data:image/svg+xml;base64,...` URIs,
+//! - This file — embeds the two mascot SVGs at build time and templates
+//!   them into the bridge JS as URL-encoded
+//!   `data:image/svg+xml;charset=utf-8,...` URIs,
 //!   keeping the bridge fully self-contained inside the Meet origin.
@@
-/// Bridge JS template. Two `__OPENHUMAN_MASCOT_*_DATAURI__` tokens are
-/// substituted at install time with base64'd SVG data URIs.
+/// Bridge JS template. Two `__OPENHUMAN_MASCOT_*_DATAURI__` tokens are
+/// substituted at install time with URL-encoded SVG data URIs.

Also applies to: 59-60

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src-tauri/src/meet_video/mod.rs` around lines 35 - 37, Update the module
doc comment that describes how the two mascot SVGs are embedded and templated
into the bridge JS: replace references to "base64" and
"data:image/svg+xml;base64,..." with wording that they are URL-encoded and use
the `data:image/svg+xml;charset=utf-8,` URI form (and adjust any example URI
text accordingly); this change should be applied to the top-level module comment
that mentions embedding the mascots and the second occurrence later in the file
so both descriptions reflect URL-encoded data URIs instead of base64.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bfeb4253-6be8-4fb9-95bd-ff83018846a3

📥 Commits

Reviewing files that changed from the base of the PR and between e2f5416 and c5782ca.

📒 Files selected for processing (1)
  • app/src-tauri/src/meet_video/mod.rs

@senamakel senamakel merged commit b6d05cb into tinyhumansai:main May 8, 2026
19 checks passed