feat(meet_video): mascot canvas as outbound camera in Google Meet by senamakel · Pull Request #1359 · tinyhumansai/openhuman

senamakel · 2026-05-08T04:49:48Z

Summary

Replace the static mascot Y4M outbound camera with a programmatic 640×480 <canvas> rendered into Meet via a JS bridge that monkey-patches navigator.mediaDevices.getUserMedia.
Toggle between an idle pose and a thinking pose every 5 s so the agent's tile reads as alive instead of frozen on a single frame.
The bridge runs on the same CDP-injection lifecycle as the existing audio + captions bridges (meet_audio/inject.rs), with one important deviation noted below.

Problem

The Meet agent's outbound video has been a static mascot frame ever since fake_camera/mod.rs started baking the SVG into a one-frame Y4M and pointing Chromium's --use-file-for-fake-video-capture at it. We want the agent's camera tile to reflect what the agent is actually doing — at minimum, alternate between an idle pose and a thinking pose — so participants can read the agent's state at a glance. The Y4M flag is process-level and immutable per call, so dynamic content needs a different mechanism.

Solution

New module app/src-tauri/src/meet_video/:

camera_bridge.js — runs in the Meet page, builds a hidden 640×480 canvas, decodes idle + thinking SVGs, drives a setInterval render loop with a subtle bob, exposes canvas.captureStream(30), and monkey-patches getUserMedia so any {video:true} request swaps the canvas track in place on the stream returned by the existing audio bridge. window.__openhumanSetMood(name) is exposed for future host-driven control; for now an unconditional 5 s toggle drives the visible behavior.
mod.rs — include_str!s remotion/public/idelMascot.svg and Bookreading.svg, percent-encodes them as data:image/svg+xml URIs, templates them into the bridge JS at install time.
inject.rs — install_camera_bridge_post_reload (Runtime.evaluate-based), confirm_bridge_alive, and a host-side set_mood helper for the agent state machine.
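
The install-time templating described for mod.rs could be sketched roughly as follows. This is a minimal illustration, not the module's actual code: the two concrete token names, the inline `TEMPLATE` string, and the `render_bridge_js` helper are assumptions (the review further down only confirms the `__OPENHUMAN_MASCOT_*_DATAURI__` token pattern, and the real template comes from camera_bridge.js).

```rust
/// Hypothetical token names; the PR only confirms the
/// `__OPENHUMAN_MASCOT_*_DATAURI__` pattern.
const IDLE_TOKEN: &str = "__OPENHUMAN_MASCOT_IDLE_DATAURI__";
const THINKING_TOKEN: &str = "__OPENHUMAN_MASCOT_THINKING_DATAURI__";

/// Stand-in for the real template, which would be `include_str!`-ed
/// from camera_bridge.js.
const TEMPLATE: &str = "const MASCOTS = { idle: '__OPENHUMAN_MASCOT_IDLE_DATAURI__', thinking: '__OPENHUMAN_MASCOT_THINKING_DATAURI__' };";

/// Substitute both data URIs into the template and refuse to return a
/// payload that still contains a mascot placeholder.
fn render_bridge_js(template: &str, idle_uri: &str, thinking_uri: &str) -> Result<String, String> {
    let js = template
        .replace(IDLE_TOKEN, idle_uri)
        .replace(THINKING_TOKEN, thinking_uri);
    if js.contains("__OPENHUMAN_MASCOT_") {
        return Err("unsubstituted mascot token left in bridge JS".to_string());
    }
    Ok(js)
}
```

Checking for leftover placeholders before injection keeps a template/token mismatch from surfacing later as a broken image load inside the Meet page.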

Three deliberate decisions worth flagging for review:

Why JS injection (not a CEF patch) — the long-term shape would be a real CEF-side custom video device, but vendor/tauri-cef only contains Tauri's Rust bindings; the actual libcef binary is downloaded prebuilt. Patching FileVideoCaptureDevice would mean owning a Chromium-from-source build pipeline. JS injection into Meet (already grandfathered for the audio + captions bridges) was the scoped, ship-today path. The module docstring documents this.
Why post-reload Runtime.evaluate, not addScriptToEvaluateOnNewDocument — registering the camera bridge as a third pre-document script alongside audio + captions reliably crashed the CEF 146 renderer on Page.reload ("Target crashed" within ~1 s of opening; meet-scanner would log it; the page never came back). The 56 KB inlined-SVG payload is the trigger. Meet's first getUserMedia call only fires after "Ask to join" — multiple seconds after navigation — so a post-reload Runtime.evaluate lands well before it's needed and keeps the renderer happy.
Why splice the video track in place, not build a new MediaStream — the audio bridge already correctly handles {audio,video} requests by stripping audio from Chromium's fake-camera stream and adding its own. Building a new MediaStream from cloned tracks on top of that produced duplicate audio senders against the same destination, which manifested as a WebRTC "BUNDLE group contains a codec collision between [111: audio/opus] and [111: audio/opus]" error and broke the Meet join flow. Now we delegate the full constraints to the existing chain and only swap the video track in place via removeTrack/addTrack.
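
To make the second decision concrete: after the post-reload Runtime.evaluate, the host can poll until the bridge reports itself alive. The sketch below is a simplified, blocking version under stated assumptions. The `probe` closure stands in for a CDP evaluation of `window.__openhumanCameraBridgeInfo()`, and the timeout/interval parameters and return type are illustrative; the real `confirm_bridge_alive` is presumably async and its signature is not shown in this PR.

```rust
use std::thread::sleep;
use std::time::{Duration, Instant};

/// Poll `probe` until it reports the bridge is alive or `timeout` elapses.
/// Returns the number of probes it took on success.
/// `probe` is a hypothetical stand-in for a Runtime.evaluate round trip.
fn confirm_bridge_alive<F>(mut probe: F, timeout: Duration, interval: Duration) -> Result<u32, String>
where
    F: FnMut() -> bool,
{
    let deadline = Instant::now() + timeout;
    let mut attempts = 0;
    loop {
        attempts += 1;
        if probe() {
            return Ok(attempts);
        }
        if Instant::now() >= deadline {
            return Err(format!("camera bridge not alive after {attempts} probes"));
        }
        // Back off briefly so the page has time to finish evaluating the bridge.
        sleep(interval);
    }
}
```

Because Meet only calls getUserMedia after "Ask to join", a short bounded poll like this has room to finish before the patched function is needed.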

The static Y4M plumbing in fake_camera/mod.rs is intentionally untouched — it's the process-level fallback if our bridge ever fails to install for any reason.

Submission Checklist

Tests added or updated (happy path + at least one failure / edge case) per docs/TESTING-STRATEGY.md
N/A: 80% diff coverage gate — changed surface is JS injected at runtime into a third-party origin and Rust glue that drives CDP I/O. The unit-testable parts (SVG-to-data-URI encoding, bridge JS template substitution) are covered; the rest is verified by manual end-to-end against a live Meet call (see Validation Run).
N/A: Coverage matrix updated — behaviour-only change; no new feature row.
N/A: All affected feature IDs from the matrix listed — no matrix entries affected.
No new external network dependencies introduced (mock backend used per docs/TESTING-STRATEGY.md)
N/A: Release smoke checklist updated — no release-cut surfaces touched.
N/A: Linked issue closed via Closes #NNN — no tracking issue.

Impact

Desktop only. Tauri shell change; no mobile/web/CLI surface.
The camera bridge runs only inside the Meet call window (target URL filter on https://meet.google.com/). Other webviews keep the existing static-Y4M behavior via the process-level --use-file-for-fake-video-capture flag.
No new external network traffic (SVGs are inlined as data URIs; nothing fetched from outside meet.google.com).
One more JS injection on the grandfathered Meet recipe surface. Documented as a deliberate exception in meet_video/mod.rs's docstring.
The codec-collision SDP warning from Chromium WebRTC that pre-dates this change still appears in some sessions; this PR makes it benign by stopping the duplicate-audio-track stacking. Pre-existing background errors (SEA StreamContext ctor parent is null, occasional SRTP unprotect failures) remain and are out of scope.

Related

Closes:
Follow-up PR(s)/TODOs:
- Wire the meet-agent state machine (src/openhuman/meet_agent/session.rs) to call meet_video::inject::set_mood over CDP on real phase transitions instead of relying on the JS-side 5 s auto-toggle. Helper exists; just needs invoking.
- Replace the static-SVG-with-bob renderer with a hidden CEF off-screen-render webview that hosts the actual Remotion compositions, once we own a CEF build pipeline.

AI Authored PR Metadata (required for Codex/Linear PRs)

Linear Issue

Key: N/A
URL: N/A

Commit & Branch

Branch: feat/meet-video
Commit SHA: e2f5416

Validation Run

pnpm --filter openhuman-app format:check (pre-push hook)
pnpm typecheck (pre-push hook)
Focused tests: cargo test --manifest-path app/src-tauri/Cargo.toml --lib meet_video — 3/3 passing
Rust fmt/check (if changed): cargo fmt + cargo check both clean (pre-push hook ran rustfmt)
Tauri fmt/check (if changed): cargo check --manifest-path app/src-tauri/Cargo.toml clean

Validation Blocked

command: N/A
error: N/A
impact: N/A

Behavior Changes

Intended behavior change: Meet agent's outbound camera tile now displays an animated mascot that toggles between idle and thinking poses every 5 s instead of a single static frame.
User-visible effect: Other meeting participants see a yellow mascot in the agent's tile, with a subtle bob and pose-change every 5 s. The agent itself is unaffected.

Parity Contract

Legacy behavior preserved: fake_camera/mod.rs and the process-level --use-file-for-fake-video-capture flag are untouched and still serve as the fallback when bridge install fails.
Guard/fallback/dispatch parity checks: bridge install failure is non-fatal (logged, audio + captions paths keep working); audio bridge's getUserMedia chain stays the canonical owner of audio splicing.
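
The non-fatal install contract above can be sketched as follows. This is a simplified, synchronous model rather than the actual code: `install_camera_bridge` with its `simulate_failure` flag and the `CameraSource` enum are hypothetical, and `eprintln!` stands in for the logging call.

```rust
/// Which video source participants end up seeing (hypothetical type).
#[derive(Debug, PartialEq)]
enum CameraSource {
    MascotCanvas,
    StaticY4mFallback,
}

/// Hypothetical stand-in for the camera bridge install; the real function
/// talks to CDP.
fn install_camera_bridge(simulate_failure: bool) -> Result<(), String> {
    if simulate_failure {
        Err("Runtime.evaluate failed".to_string())
    } else {
        Ok(())
    }
}

/// The audio path returns Ok either way; a camera failure only downgrades
/// the video source to the process-level Y4M frame.
fn install_audio_bridge(simulate_camera_failure: bool) -> Result<CameraSource, String> {
    // ... audio bridge install and readiness check would happen here ...
    match install_camera_bridge(simulate_camera_failure) {
        Ok(()) => Ok(CameraSource::MascotCanvas),
        Err(e) => {
            // Logged, not propagated.
            eprintln!("[meet-video] camera bridge install failed, keeping Y4M fallback: {e}");
            Ok(CameraSource::StaticY4mFallback)
        }
    }
}
```

Swallowing the camera error at this one call site is what keeps audio and captions working when the mascot bridge cannot be installed.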

Duplicate / Superseded PR Handling

Duplicate PR(s): N/A
Canonical PR: N/A
Resolution (closed/superseded/updated): N/A

Summary by CodeRabbit

New Features
- Mascot avatar rendering replaces outgoing video with a canvas-based mascot (idle/thinking) and mood controls (manual + auto-toggle).
- Integrated audio + camera bridge for seamless virtual camera + preserved audio.
- App tray integration and macOS floating mascot option.
- In-app update probing/download with progress reporting.
Bug Fixes
- More reliable startup/shutdown to avoid webview/process races and startup crashes.

Override navigator.mediaDevices.getUserMedia in the Meet webview so the agent's outbound video is a programmatically drawn mascot rather than the single-frame Y4M backed by --use-file-for-fake-video-capture. The bridge runs at document-start (installed via the same CDP path as audio_bridge.js / captions_bridge.js), decodes idelMascot.svg + Bookreading.svg into a hidden 640x480 canvas, runs an rAF loop with a subtle bob so codecs don't read it as frozen, and toggles between idle and thinking every 5s. Host-side set_mood helper is in place for the agent state machine to drive transitions instead of the JS interval once that wiring lands. Static Y4M fallback at the process-level flag remains as a safety net when the bridge install fails.

…eam splice

Three fixes to make the mascot canvas actually render in the Meet self-tile without breaking the join flow:

1. Renderer crash on reload. Registering the camera bridge as a third addScriptToEvaluateOnNewDocument before Page.reload consistently crashed the CEF 146 renderer ("Target crashed" within ~1s of open). The 56KB inlined-SVG payload alongside the audio + captions scripts was the trigger. Switch to post-reload Runtime.evaluate after the audio bridge confirms the new doc is alive. Meet's first getUserMedia call only fires after "Ask to join", so we have plenty of room to inject without racing.

2. WebRTC opus codec collision. The previous getUserMedia override built a brand-new MediaStream from cloned tracks, which stacked on top of the audio bridge's existing patch and produced duplicate audio senders ([111: audio/opus] twice in the BUNDLE). Now we delegate the full constraints to the existing chain (audio bridge + Chromium) and only swap the video track in place via removeTrack/addTrack. Also dropped the enumerateDevices override (cosmetic; was returning plain objects that risked breaking Meet's device-picker iteration).

3. Static / undecoded mascot. The SVG-to-data-URI encoder skipped spaces, so attribute-value spaces in viewBox/path data left raw spaces in the URI that Chromium rejects in <img src>. Replace the ad-hoc escaping with full RFC-3986 percent-encoding of every byte outside the unreserved + path-safe whitelist. Also switch the render loop from requestAnimationFrame to setInterval so the camera keeps producing frames when the meet window is backgrounded behind the main openhuman window.

coderabbitai · 2026-05-08T04:50:01Z

📝 Walkthrough

Walkthrough

Adds a new meet_video module that embeds two mascot SVGs, builds a camera_bridge.js payload, injects it into Meet via CDP, exposes host mood/status controls, renders a canvas-backed virtual camera replacing outgoing video while preserving audio, and integrates camera injection into the existing audio-bridge flow with graceful fallback.

Changes

Camera Bridge for Google Meet Video Stream

Layer / File(s)	Summary
Module Setup & Asset Embedding `app/src-tauri/src/lib.rs`, `app/src-tauri/src/meet_video/mod.rs`	Adds `meet_video` module, embeds two mascot SVGs, provides `build_camera_bridge_js()` which percent-encodes SVGs into safe data URIs, and adds unit tests for substitution and escaping.
Page-side Bridge Implementation `app/src-tauri/src/meet_video/camera_bridge.js`	In-page JS bridge: canvas renderer (640×480 at 30 FPS) with idle/thinking moods and bobbing animation, captures canvas stream, monkey-patches `navigator.mediaDevices.getUserMedia` to splice in the canvas video track while preserving audio, and exposes `window.__openhumanSetMood()` and `window.__openhumanCameraBridgeInfo()`.
Rust Injection & Host Control `app/src-tauri/src/meet_video/inject.rs`	Implements `install_camera_bridge_post_reload()` to inject the bridge via CDP `Runtime.evaluate`, `confirm_bridge_alive()` to poll `__openhumanCameraBridgeInfo`, and `set_mood()` to validate and set mood via CDP evaluation.
Audio Bridge Integration `app/src-tauri/src/meet_audio/inject.rs`	Extends `install_audio_bridge` to call the camera bridge install after audio readiness; on injection failure it logs a warning and falls back without causing the audio path to error.
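
As an illustration of the "validate and set mood" step in the inject.rs row, a host-side helper might allow-list the mood before building the expression sent over `Runtime.evaluate`. This is a sketch under assumptions: `set_mood_expression` is a hypothetical name, and the `typeof` guard in the generated JS is an assumed way to get `false` back when the bridge is not installed.

```rust
/// Build the JS expression for Runtime.evaluate, rejecting unknown moods
/// before anything is sent to the page. Hypothetical helper.
fn set_mood_expression(mood: &str) -> Result<String, String> {
    // Only the two moods described in this PR are accepted.
    if !matches!(mood, "idle" | "thinking") {
        return Err(format!("unknown mood: {mood}"));
    }
    // Safe to interpolate: the allow-list contains only plain ASCII words,
    // so the single-quoted JS literal cannot be broken out of.
    Ok(format!(
        "typeof window.__openhumanSetMood === 'function' && window.__openhumanSetMood('{mood}')"
    ))
}
```

Validating on the host side means an arbitrary string never reaches the page, and the expression evaluates to a boolean the caller can inspect.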

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

Possibly related issues

Let the mascot join Google Meet as a rendered character avatar #1273: Implements the mascot-as-avatar pipeline requested in the issue by adding camera bridge JS, injection helpers, and getUserMedia splicing.

Possibly related PRs

tinyhumansai/openhuman#1355: Directly related — adds meet_video module and inject APIs invoked from meet_audio injection flow.
tinyhumansai/openhuman#721: Related — both modify CEF/Tauri startup flags (SharedArrayBuffer) in lib.rs.

Suggested reviewers

M3gA-Mind

Poem

🐰 A tiny rabbit paints a face in code,

On canvas waves the mascot's mode,
It hops in frames at thirty beats,
Replaces video with cheerful feats,
Host can change its mood with one small ode.

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and concisely summarizes the main change: introducing a mascot canvas as the outbound camera feed in Google Meet, which aligns perfectly with the changeset's core objective.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 1

🧹 Nitpick comments (1)

app/src-tauri/src/meet_video/inject.rs (1)

109-130: ⚡ Quick win

Propagate rejected mood changes instead of returning Ok(()).

__openhumanSetMood() returns false when the bridge is missing or the mood is unknown, but this wrapper only checks exceptionDetails. Once the host state machine starts calling this, a no-op will look like success.

🔧 Proposed fix

     let res = cdp
         .call(
             "Runtime.evaluate",
             json!({ "expression": expression, "returnByValue": true }),
             Some(session),
         )
         .await
         .map_err(|e| format!("Runtime.evaluate set_mood: {e}"))?;
     if let Some(exception) = res.get("exceptionDetails") {
         return Err(format!("page exception: {exception}"));
     }
+    let applied = res
+        .get("result")
+        .and_then(|r| r.get("value"))
+        .and_then(|v| v.as_bool())
+        .unwrap_or(false);
+    if !applied {
+        return Err(format!("camera bridge rejected mood change: {mood}"));
+    }
     Ok(())
 }

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src-tauri/src/meet_video/inject.rs` around lines 109 - 130, The wrapper
set_mood currently only checks Runtime.evaluate errors and exceptionDetails but
ignores the actual return value of window.__openhumanSetMood, so failed/no-op
mood sets appear as success; after the cdp.call and exceptionDetails check,
inspect the evaluation result (res -> "result" -> "value") returned by
Runtime.evaluate for a boolean false (or missing/undefined) and convert that
into an Err with a clear message (e.g. "set_mood rejected" or "bridge
missing/unknown mood") instead of returning Ok(()); update set_mood to return
Err when __openhumanSetMood returns false or the value is absent so callers see
rejected mood changes.

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@app/src-tauri/src/meet_video/mod.rs`:
- Around line 82-108: The svg_to_data_uri function currently treats apostrophe
(') as unreserved in is_unreserved, which allows raw ' to appear in the data URI
and break single-quoted JS literals; update is_unreserved (used by
svg_to_data_uri) to stop whitelisting b'\'' so that single quotes are
percent-encoded (e.g. %27) when building the URI, and add a unit test that calls
svg_to_data_uri with an SVG containing a single-quote (e.g. "<svg
data-name='mascot'/>") asserting the encoded body contains "%27" and does not
contain a raw '\''.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: e7d5f192-6ef4-45a5-82e7-1fe9e45c06dd

📥 Commits

Reviewing files that changed from the base of the PR and between 0636b0c and e2f5416.

📒 Files selected for processing (5)

app/src-tauri/src/lib.rs
app/src-tauri/src/meet_audio/inject.rs
app/src-tauri/src/meet_video/camera_bridge.js
app/src-tauri/src/meet_video/inject.rs
app/src-tauri/src/meet_video/mod.rs

CodeRabbit flag (PR tinyhumansai#1359, review id 3206360961): the SVG-to-data-URI encoder whitelisted `'` as unreserved, but the resulting URI is interpolated into single-quoted JS literals in `camera_bridge.js` (e.g. `MASCOTS = { idle: '...' }`). A raw `'` in the SVG body would have terminated the string literal and broken the bridge install. Drop apostrophe from `is_unreserved` so it gets percent-encoded as `%27` and add a regression test for the JS-string-context interaction.
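
A minimal sketch of the encoder behaviour this fix describes, for illustration only. The function names `svg_to_data_uri` and `is_unreserved` and the `data:image/svg+xml;charset=utf-8,` prefix come from the review comments; the exact whitelist is an assumption. This version keeps only the RFC 3986 unreserved set, which is stricter than the "unreserved + path-safe" whitelist the commit mentions.

```rust
/// Bytes emitted as-is: RFC 3986 unreserved characters only (assumed
/// whitelist). The apostrophe is deliberately absent so it becomes %27.
fn is_unreserved(b: u8) -> bool {
    b.is_ascii_alphanumeric() || matches!(b, b'-' | b'.' | b'_' | b'~')
}

/// Percent-encode every other byte so the URI is safe both in `<img src>`
/// and inside a single-quoted JS string literal.
fn svg_to_data_uri(svg: &str) -> String {
    const HEX: &[u8; 16] = b"0123456789ABCDEF";
    let mut out = String::from("data:image/svg+xml;charset=utf-8,");
    for &b in svg.as_bytes() {
        if is_unreserved(b) {
            out.push(b as char);
        } else {
            out.push('%');
            out.push(HEX[(b >> 4) as usize] as char);
            out.push(HEX[(b & 0x0F) as usize] as char);
        }
    }
    out
}
```

With this whitelist, an input such as `<svg data-name='mascot'/>` yields a body containing `%27` and no raw apostrophe or space, which is the property the regression test is meant to pin down.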

coderabbitai

🧹 Nitpick comments (1)

app/src-tauri/src/meet_video/mod.rs (1)

35-37: ⚡ Quick win

Update stale “base64” wording in comments.

These lines describe base64 data URIs, but the implementation now emits URL-encoded data:image/svg+xml;charset=utf-8, URIs. Aligning comments avoids debugging confusion.

Proposed doc-only patch

-//! - This file — embeds the two mascot SVGs at build time and templates
-//!   them into the bridge JS as `data:image/svg+xml;base64,...` URIs,
+//! - This file — embeds the two mascot SVGs at build time and templates
+//!   them into the bridge JS as URL-encoded
+//!   `data:image/svg+xml;charset=utf-8,...` URIs,
 //!   keeping the bridge fully self-contained inside the Meet origin.
@@
-/// Bridge JS template. Two `__OPENHUMAN_MASCOT_*_DATAURI__` tokens are
-/// substituted at install time with base64'd SVG data URIs.
+/// Bridge JS template. Two `__OPENHUMAN_MASCOT_*_DATAURI__` tokens are
+/// substituted at install time with URL-encoded SVG data URIs.

Also applies to: 59-60

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@app/src-tauri/src/meet_video/mod.rs` around lines 35 - 37, Update the module
doc comment that describes how the two mascot SVGs are embedded and templated
into the bridge JS: replace references to "base64" and
"data:image/svg+xml;base64,..." with wording that they are URL-encoded and use
the `data:image/svg+xml;charset=utf-8,` URI form (and adjust any example URI
text accordingly); this change should be applied to the top-level module comment
that mentions embedding the mascots and the second occurrence later in the file
so both descriptions reflect URL-encoded data URIs instead of base64.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: bfeb4253-6be8-4fb9-95bd-ff83018846a3

📥 Commits

Reviewing files that changed from the base of the PR and between e2f5416 and c5782ca.

📒 Files selected for processing (1)

app/src-tauri/src/meet_video/mod.rs

senamakel added 3 commits May 7, 2026 21:07

chore(meet_audio): rustfmt single-line log::warn!

e2f5416

senamakel requested a review from a team May 8, 2026 04:49

coderabbitai Bot requested changes May 8, 2026

Comment thread app/src-tauri/src/meet_video/mod.rs

coderabbitai Bot previously approved these changes May 8, 2026

senamakel dismissed coderabbitai[bot]’s stale review via c5782ca May 8, 2026 05:18

coderabbitai Bot reviewed May 8, 2026

coderabbitai Bot approved these changes May 8, 2026

senamakel merged commit b6d05cb into tinyhumansai:main May 8, 2026
19 checks passed

senamakel merged 4 commits into tinyhumansai:main from senamakel:feat/meet-video
