
web-client: formant-weighted radial waveform for speaking state #348

Merged
sonichi merged 1 commit into main from feat/formant-weighted-waveform on Apr 15, 2026

@sonichi sonichi commented Apr 15, 2026

Summary

Stacks on merged #338. Makes the 24-bar radial waveform respond to phoneme changes, not just overall amplitude. Top-3 peak bins in the frequency spectrum act as formant proxies (F1/F2/F3); bars within 6 bins of a peak get up to 1.8× their raw height. The ring shape visibly rotates/morphs when the vowel changes ("ahhh" → "eee"), which reads as "the avatar is articulating" rather than "a VU meter next to the avatar."
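As a rough illustration of the weighting described above, here is a minimal sketch. The names `formantBoost`, `weightedValue`, `PEAK_RADIUS`, and `MAX_BOOST` are hypothetical and not taken from src/web-client.ts, and the linear falloff between the peak and the 6-bin edge is an assumption; the PR only states the 6-bin radius, the 1.8× ceiling, and the `minDist = 999` sentinel.

```typescript
// Hypothetical constants mirroring the numbers quoted in the summary.
const PEAK_RADIUS = 6; // bars within 6 bins of a peak are boosted
const MAX_BOOST = 0.8; // largest multiplier is 1 + 0.8 = 1.8

// peakIdx holds up to three peak bin indices; -1 marks "no peak found".
function formantBoost(bin: number, peakIdx: number[]): number {
  let minDist = 999; // sentinel: stays 999 when no peaks exist
  for (const p of peakIdx) {
    if (p < 0) continue; // skip unused slots
    const d = Math.abs(bin - p);
    if (d < minDist) minDist = d;
  }
  if (minDist > PEAK_RADIUS) return 1.0; // too far from any peak: no boost
  // Assumed linear falloff: 1.8 at the peak itself, 1.0 at distance 6.
  return 1.0 + MAX_BOOST * (1 - minDist / PEAK_RADIUS);
}

// Raw analyser byte (0..255) for a bar's bin, scaled by the boost.
function weightedValue(raw: number, bin: number, peakIdx: number[]): number {
  return (raw / 255) * formantBoost(bin, peakIdx);
}
```

With no peaks (`[-1, -1, -1]`) the multiplier is 1.0 for every bin, so `weightedValue` reduces to `raw / 255`, which matches the silent-input behavior described under Graceful degradation.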

Why this approach

Converged with @sutando#9708 in #dev today after discussing three scope options:

  1. Bump hero avatar to 200–300px and do an SVG overlay — unblocked a bigger hero-screen redesign. Deferred as follow-up.
  2. SVG overlay on current avatar sizes — rejected: at 44×44/80×80 an overlay drawn on the image competes with character art at ~5% scale and doesn't visually resolve.
  3. Formant-weighted radial waveform ← this PR. Ring lives in the 24–30 radius margin outside the image, proven visible from feat: avatar animation — speaking (green) + working (blue) #338.

What changes

  • src/web-client.ts — inside existing startSpeakingDetection() closure: add findPeaks() (linear top-K local-max scan, K=3), and replace the bar-draw loop's val = buf[…]/255 with a formant-weighted value.
  • No new deps, no CSS edits, no new files, no asset changes.
  • +37/-1 lines total.
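For readers without the diff open, a linear top-K local-max scan along the lines described above might look like the sketch below. This is not the code from the PR: the signature, the `floor` noise threshold, and the plateau handling are assumptions made for illustration. Only K=3 and the `-1` "no peak" slots come from the description.

```typescript
// Returns the indices of the k strongest local maxima, strongest first.
// Slots with no peak stay at -1.
function findPeaks(buf: Uint8Array, k: number = 3, floor: number = 16): number[] {
  const peakIdx: number[] = new Array<number>(k).fill(-1);
  const peakVal: number[] = new Array<number>(k).fill(0);
  for (let i = 1; i < buf.length - 1; i++) {
    const v = buf[i];
    // Keep only local maxima above the (assumed) noise floor.
    // ">=" on the left and ">" on the right yields one peak per plateau.
    if (v < floor || v < buf[i - 1] || v <= buf[i + 1]) continue;
    // Insert into the small top-k list, kept sorted by value.
    for (let j = 0; j < k; j++) {
      if (v > peakVal[j]) {
        peakIdx.splice(j, 0, i);
        peakVal.splice(j, 0, v);
        peakIdx.length = k; // drop the weakest entry pushed off the end
        peakVal.length = k;
        break;
      }
    }
  }
  return peakIdx;
}
```

The scan is a single pass over the spectrum with a constant-size insertion step, so the cost per animation frame stays linear in the number of bins.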

Test plan

  • npx tsc --noEmit --skipLibCheck clean
  • Embedded browser JS extracted via Python and checked with node --check: clean (per feedback_web_client_embedded_js_no_ts)
  • Manual: hard-refresh, start voice, speak sustained "ahhh" vs "eee" — ring shape visibly differs
  • Silence: speaking=false gates the draw — existing behavior preserved (no peaks found → boost=1.0 → raw amplitude → same as today)

Graceful degradation

When the spectrum is low (silence), findPeaks leaves all peakIdx[k] = -1, so minDist stays 999, boost = 1.0, and val = raw. Output is byte-identical to the pre-PR behavior. If the whole analyser pipeline fails, speaking = false still toggles the border CSS on the <img>, so the visor keeps its color animation even without the canvas overlay.

Scope explicitly NOT in this PR

  • Full SVG-overlay avatar redesign on a larger hero surface (Option 1 in the scope discussion). MacBook is filing a follow-up issue to track it.
  • Working-state redesign (Option G from the research memo). Separate PR if/when owner wants it.

References

  • Research memo: notes/avatar-animation-research.md — Option D
  • Pre-written spec: notes/avatar-formant-waveform-spec.md
  • Two-bot convergence thread: today in #dev

🤖 Generated with Claude Code

Stacks on merged PR #338. Makes the 24-bar radial waveform respond to
phoneme changes, not just overall amplitude. Top-3 peak bins in the
frequency spectrum act as formant proxies (approximating F1/F2/F3);
bars within 6 bins of a peak get up to 1.8× their raw height. The ring
shape visibly rotates/morphs when the vowel changes ("ahhh" → "eee"),
reading as "the avatar is articulating" rather than "a VU meter next
to the avatar."

No new deps. No CSS changes. No asset changes. ~40 added lines inside
the existing `startSpeakingDetection()` closure in src/web-client.ts.
Preserves the existing graceful-degradation path: silence → `findPeaks`
leaves all peakIdx=-1 → boost=1.0 → identical to prior behavior.

Ring stays in the canvas margin (radii 24–30) outside the 44×44 image —
proven visible at current display size. Bigger SVG-overlay redesign
for the hero screen is tracked as a separate follow-up per the
Mini/MacBook two-bot consensus today.

Spec at notes/avatar-formant-waveform-spec.md.
Research memo: notes/avatar-animation-research.md (Option D).

Verified:
- npx tsc --noEmit --skipLibCheck clean
- Embedded browser-JS extracted and `node --check`-ed cleanly
  (per feedback_web_client_embedded_js_no_ts rule — no TS syntax inside
  template literal)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@sonichi sonichi left a comment

MacBook review: LGTM. Clean formant extraction — top-3 peak detection with proximity-weighted boost. The 6-bin radius and 0.8 max boost factor are reasonable defaults. No TS-in-JS mistakes this time. Merge when ready.

@sonichi sonichi merged commit a4e0d99 into main Apr 15, 2026
1 check passed
@sonichi sonichi deleted the feat/formant-weighted-waveform branch April 15, 2026 20:52