Add streaming Silero VAD runner for real-time speech detection by seyeong-han · Pull Request #18507 · pytorch/executorch

seyeong-han · 2026-03-25T21:58:59Z

Summary

Add a streaming CLI entry point (silero_vad_stream_runner) for the Silero VAD model that enables real-time, frame-by-frame voice activity detection from stdin. This powers the "hey torch" wake-up feature in the Voxtral Realtime macOS app.

Changes

New: `silero_vad_stream_runner`

A CLI that reads 16 kHz mono float32 PCM from stdin and outputs per-frame speech probabilities via a line protocol:

READY
PROB 0.032 0.039
PROB 0.064 0.808
PROB 0.096 0.980

This enables any app to run Silero VAD as a subprocess — pipe audio in, parse probabilities out. The Voxtral macOS app uses this for hands-free wake-up detection.
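
A consumer of this protocol only needs to split each line on whitespace. The following is a minimal sketch of such a parser, assuming the two line shapes shown above (`READY` and `PROB <time> <probability>`) are the only ones. `VadEvent` and `parse_vad_line` are hypothetical names used for illustration and are not part of this PR.

```cpp
#include <optional>
#include <sstream>
#include <string>

// One parsed line of the runner's stdout protocol (hypothetical type).
struct VadEvent {
  enum class Kind { Ready, Prob } kind;
  double time_sec = 0.0;     // second field of a PROB line
  double probability = 0.0;  // third field of a PROB line
};

// Parses "READY" or "PROB <time> <probability>".
// Returns std::nullopt for empty, unknown, or incomplete lines.
std::optional<VadEvent> parse_vad_line(const std::string& line) {
  std::istringstream in(line);
  std::string tag;
  if (!(in >> tag)) {
    return std::nullopt;  // empty or whitespace-only line
  }
  if (tag == "READY") {
    return VadEvent{VadEvent::Kind::Ready};
  }
  if (tag == "PROB") {
    VadEvent ev{VadEvent::Kind::Prob};
    if (in >> ev.time_sec >> ev.probability) {
      return ev;
    }
  }
  return std::nullopt;  // unknown tag or missing numeric fields
}
```

A host application would typically call `parse_vad_line` on each line read from the subprocess's stdout and ignore lines that return `std::nullopt`, so that extra log output does not break the consumer.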

New: Streaming API on `SileroVadRunner`

- `reset_stream()`: re-initialize LSTM state and context buffers
- `process_frame(audio_data, num_samples)`: process a single 512-sample chunk, return its speech probability, and carry the LSTM state forward

The existing detect() method now uses process_frame() internally, so offline and streaming paths share the same inference code.
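
The relationship between the offline and streaming paths can be sketched as a loop over fixed-size frames. This is an illustration only: `FakeVadRunner` is a hypothetical stand-in whose "inference" is a toy amplitude average, `detect_by_frames` is an invented name, and the real `SileroVadRunner` signatures, return types, and handling of a trailing partial frame are not shown in this PR text.

```cpp
#include <cstddef>
#include <vector>

constexpr std::size_t kFrameSamples = 512;  // chunk size stated in the PR

// Hypothetical stand-in for SileroVadRunner. The real class runs the model;
// this one only mimics "state carried between frames".
struct FakeVadRunner {
  float state = 0.0f;  // stands in for the LSTM state

  void reset_stream() { state = 0.0f; }

  // Toy "inference": exponentially smoothed mean absolute amplitude.
  float process_frame(const float* audio_data, std::size_t num_samples) {
    float sum = 0.0f;
    for (std::size_t i = 0; i < num_samples; ++i) {
      sum += audio_data[i] < 0 ? -audio_data[i] : audio_data[i];
    }
    state = 0.5f * state + 0.5f * (sum / static_cast<float>(num_samples));
    return state;
  }
};

// Offline detection expressed as a loop over the streaming call.
// Assumption: trailing samples that do not fill a frame are dropped.
template <typename Runner>
std::vector<float> detect_by_frames(Runner& runner,
                                    const std::vector<float>& samples) {
  runner.reset_stream();  // start from a clean state for each recording
  std::vector<float> probs;
  for (std::size_t off = 0; off + kFrameSamples <= samples.size();
       off += kFrameSamples) {
    probs.push_back(runner.process_frame(samples.data() + off, kFrameSamples));
  }
  return probs;
}
```

Because both paths go through the same per-frame call, feeding the same samples frame by frame or all at once should yield the same probabilities, which is the property the test plan checks for the offline runner.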

Build changes

- `CMakeLists.txt`: add `silero_vad_stream_runner` target alongside `silero_vad_runner`
- Remove unnecessary `extension_llm_runner` link dependency that caused `string_view` ambiguity with sentencepiece headers
- Makefile `silero-vad-cpu` target: build both runners, configure with `-DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=OFF`
- `README.md`: document streaming usage, architecture, and line protocol

Usage

# Build
make silero-vad-cpu

# Run on a file
ffmpeg -i input.wav -ar 16000 -ac 1 -f f32le pipe:1 | \
  ./cmake-out/examples/models/silero_vad/silero_vad_stream_runner \
    --model_path silero_vad.pte

# Run on live mic
ffmpeg -f avfoundation -i ":0" -ar 16000 -ac 1 -f f32le pipe:1 | \
  ./cmake-out/examples/models/silero_vad/silero_vad_stream_runner \
    --model_path silero_vad.pte
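
Since a pipe can deliver bytes in arbitrary chunk sizes, a stdin reader has to accumulate raw `f32le` bytes until a full 512-sample frame is available. The sketch below shows one way to do that; `FrameAssembler` is a hypothetical helper, not the code in `stream_main.cpp`, and it assumes a little-endian host. The end-of-frame timestamp it computes (512 / 16000 = 0.032 s per frame) is one reading that is consistent with the `0.032`, `0.064`, `0.096` column in the sample output.

```cpp
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

constexpr std::size_t kSampleRate = 16000;  // 16 kHz mono input
constexpr std::size_t kFrameSamples = 512;  // samples per VAD frame

// Collects raw f32le bytes into whole frames (hypothetical helper).
// Assumes a little-endian host so bytes can be copied directly into floats.
class FrameAssembler {
 public:
  // Appends bytes and returns every frame completed by them.
  std::vector<std::vector<float>> push(const char* data, std::size_t len) {
    pending_.insert(pending_.end(), data, data + len);
    std::vector<std::vector<float>> frames;
    const std::size_t frame_bytes = kFrameSamples * sizeof(float);
    std::size_t off = 0;
    for (; off + frame_bytes <= pending_.size(); off += frame_bytes) {
      std::vector<float> frame(kFrameSamples);
      std::memcpy(frame.data(), pending_.data() + off, frame_bytes);
      frames.push_back(std::move(frame));
      ++frames_emitted_;
    }
    // Keep only the bytes that belong to the next, still incomplete frame.
    pending_.erase(pending_.begin(),
                   pending_.begin() + static_cast<std::ptrdiff_t>(off));
    return frames;
  }

  // End time, in seconds, of the most recently emitted frame.
  double last_frame_end_sec() const {
    return static_cast<double>(frames_emitted_ * kFrameSamples) / kSampleRate;
  }

 private:
  std::vector<char> pending_;
  std::size_t frames_emitted_ = 0;
};
```

In a real reader, `push` would be fed from a loop that reads whatever stdin currently has available, and each returned frame would be passed to the per-frame inference call.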

Test plan

- `make silero-vad-cpu` builds both `silero_vad_runner` and `silero_vad_stream_runner`
- Offline runner produces the same results as before on test WAV files
- Stream runner on saved audio produces correct speech probabilities (0.8-1.0 on speech, <0.01 on silence)
- Stream runner on live mic input detects speech in real time
- Integration tested with the Voxtral Realtime macOS app wake-up flow

Authored with assistance from Claude.

Made with Cursor

Add a new `silero_vad_stream_runner` CLI that reads 16 kHz mono float32 PCM from stdin and outputs per-frame speech probabilities via a simple line protocol (`PROB <time> <probability>`). This enables real-time VAD as a subprocess for apps like the Voxtral Realtime macOS dictation app.

Changes:
- Add `reset_stream()` and `process_frame()` to SileroVadRunner for stateful frame-by-frame inference with persistent LSTM state
- Add `stream_main.cpp` as the streaming CLI entry point
- Update CMakeLists.txt to build both `silero_vad_runner` (offline) and `silero_vad_stream_runner` (streaming) targets
- Remove unnecessary `extension_llm_runner` dependency that caused build conflicts with sentencepiece headers
- Update Makefile `silero-vad-cpu` target to build both runners with `-DEXECUTORCH_BUILD_EXTENSION_LLM_RUNNER=OFF`
- Update README with streaming usage and architecture docs

Authored with assistance from Claude.
Made-with: Cursor

pytorch-bot · 2026-03-25T21:59:04Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/18507

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

github-actions · 2026-03-25T21:59:45Z

This PR needs a `release notes:` label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

seyeong-han requested review from kirklandsign, larryliu0820 and lucylq as code owners March 25, 2026 21:59

meta-cla bot added the CLA Signed label Mar 25, 2026

seyeong-han mentioned this pull request Mar 25, 2026

Voxtral macOS: Add replacements, snippets, wake-up, and sidebar navigation meta-pytorch/executorch-examples#226

Merged

8 tasks

mergennachin approved these changes Mar 26, 2026

seyeong-han and others added 3 commits March 31, 2026 12:49

Merge branch 'main' into silero-vad/streaming-runner

7df9074

Silero VAD: fix stream runner formatting

cfff2f0

Adjust `stream_main.cpp` to match the formatter output so the remaining lintrunner failure is resolved without changing behavior. Authored with assistance from Claude. Made-with: Cursor

Silero VAD: fix CMake formatting

0468e1c

Rewrite the silero VAD runner link-library lists to match cmake-format so the remaining lintrunner failure is cleared without changing build behavior. Authored with assistance from Claude. Made-with: Cursor

seyeong-han merged commit 3616c3d into pytorch:main Mar 31, 2026
147 of 153 checks passed

seyeong-han merged 4 commits into pytorch:main from seyeong-han:silero-vad/streaming-runner

2 participants
