| Tool | Version | Why | Install |
|---|---|---|---|
| Node.js | 20+ | Frontend build + Tauri CLI | https://nodejs.org |
| Rust + Cargo | stable | Tauri shell compilation | https://rustup.rs |
| Python | 3.10+ | Audio engine sidecar | https://python.org |
| VB-Cable | any | Virtual mic for meeting routing | https://vb-audio.com/Cable |
| Sarvam AI key | — | STT + TTS | https://dashboard.sarvam.ai |
| Groq key | — | Translation (Llama 3.3 70B) | https://console.groq.com |

Two API keys are required. There is no way to run the full pipeline with only one.

    git clone https://github.com/Soumyadipgithub/SyncSpeak.git
    cd SyncSpeak
    npm install
    python -m venv venv
    venv\Scripts\activate
    pip install -r requirements.txt

`webrtcvad` requires a pre-built wheel on Windows (the source package needs a C compiler):

    venv\Scripts\pip install webrtcvad-wheels

Both API keys are entered through the app's Settings modal on first launch. No `.env` file is needed.
- Sarvam key: Settings → AI Authentication → Sarvam API Key → Activate
- Groq key: Settings → AI Authentication → Groq API Key → Activate
Keys are saved to %APPDATA%\com.syncspeak.app\config.json and automatically re-injected into the Python engine on every subsequent launch. You only enter them once.
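
The re-injection step can be pictured with a short sketch: read the saved keys from the config file and turn them into environment variables for the Python engine. The JSON field names (`sarvam_api_key`, `groq_api_key`) and the environment variable names are assumptions made for illustration; the real schema of `config.json` may differ.

```python
import json
import os
from pathlib import Path

# Hypothetical field names mapped to hypothetical environment variables.
KEY_FIELDS = {"sarvam_api_key": "SARVAM_API_KEY", "groq_api_key": "GROQ_API_KEY"}


def default_config_path() -> Path:
    # Mirrors the location given above: %APPDATA%/com.syncspeak.app/config.json
    return Path(os.environ.get("APPDATA", "")) / "com.syncspeak.app" / "config.json"


def load_key_env(config_path: Path) -> dict:
    """Return environment variables for every non-empty key saved in config_path."""
    if not config_path.is_file():
        return {}  # first launch: nothing has been saved yet
    data = json.loads(config_path.read_text(encoding="utf-8"))
    return {env: data[field] for field, env in KEY_FIELDS.items() if data.get(field)}
```

A launcher could merge the result into `os.environ` before starting the sidecar process.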
VB-Cable is a Windows virtual audio driver. You can install it either:
- Manually from https://vb-audio.com/Cable
- Via the app's Guide tab (click "Install VB-Cable" — it downloads and launches the installer automatically)
After installation, set your microphone to CABLE Input (VB-Audio) in Google Meet / Zoom settings.
`npm run dev`

This starts both the Vite dev server and the Tauri shell. The Python sidecar launches automatically via `venv\Scripts\python.exe python\sidecar_main.py`.

`Start_SyncSpeak.bat`

This batch file:
- Adds Rust/Cargo to the PATH
- Verifies Python engine dependencies (auto-installs if missing)
- Runs `npm run dev`

`npm run build`

This runs `npx vite build`, then `tauri build`. The bundled app includes the Python sidecar, compiled to a native binary by PyInstaller (`binaries/syncspeaker-sidecar`).

- The app opens with a glass UI showing "READY"
- Enter your Groq key in Settings → Groq Authentication
- Enter your Sarvam key in Settings → Sarvam Authentication
- Click ↻ to scan audio devices
- Select your microphone from Input Settings
- Select "CABLE Output (VB-Audio)" from Output Settings — a "Ready" badge appears
- Select a voice (click "Sample" to preview)
- Click START TRANSLATION
- Speak Hindi — you will see "Hearing..." → "Thinking..." → English text in the log
Run the test suite to verify each component individually:

`Test_Pipeline.bat`

Or directly:

`venv\Scripts\python.exe python\test_pipeline.py`

The Python sidecar failed to start. Likely causes:
- Missing `groq` package: `venv\Scripts\pip install groq`
- Missing `webrtcvad-wheels`: `venv\Scripts\pip install webrtcvad-wheels`
- Missing `sarvamai`: `venv\Scripts\pip install sarvamai`
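
To find out which of these packages is missing without starting the app, a small check like the one below can be run with the venv interpreter. This is a sketch, not part of the project: the function name is hypothetical, and it only tests whether each module can be located, not whether it imports cleanly.

```python
import importlib.util

# Module name -> pip package name. The webrtcvad-wheels distribution
# installs a module named "webrtcvad".
REQUIRED = {"groq": "groq", "webrtcvad": "webrtcvad-wheels", "sarvamai": "sarvamai"}


def missing_packages(required=REQUIRED):
    """Return the pip names of required modules that cannot be found."""
    return [pip_name for module, pip_name in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    for name in missing_packages():
        print(f"Missing: venv\\Scripts\\pip install {name}")
```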
Another app (Teams, Zoom, Discord) is holding the microphone. Either:
- Close the other app and click ↻ to rescan
- Or select a different microphone from the dropdown
The pipeline automatically calls sd._terminate() + sd._initialize() before each stream open to clear stale WASAPI state.
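
The reset-then-open sequence can be sketched as follows. `_terminate()` and `_initialize()` are private functions of the `sounddevice` module, so they may change between releases. The helper name, the 16 kHz sample rate, and the block size are assumptions for illustration; the module is passed in as `sd` so the helper can be exercised without audio hardware.

```python
def open_fresh_input_stream(sd, device, samplerate=16000, blocksize=480, callback=None):
    """Reset PortAudio, then open and start a mono 16-bit input stream.

    `sd` is the imported sounddevice module.
    """
    # Force PortAudio to shut down and re-enumerate devices, which drops
    # any stale host-API state left behind by a previous stream.
    sd._terminate()
    sd._initialize()
    stream = sd.InputStream(device=device, samplerate=samplerate, channels=1,
                            dtype="int16", blocksize=blocksize, callback=callback)
    stream.start()
    return stream
```

Typical use would be `import sounddevice as sd` followed by `open_fresh_input_stream(sd, device=None)` to open the default input device.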
The Sarvam API returned 400. Common causes:
- Wrong model name (must be `saarika:v2.5`; the older `saarika:v2` is deprecated)
- Malformed WAV (rare; indicates a bug in `_pcm_to_wav_bytes`)
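
For comparison when debugging a malformed WAV, here is a minimal reference conversion built on the standard-library `wave` module. It is not the project's `_pcm_to_wav_bytes`; it assumes 16-bit little-endian mono PCM at 16 kHz, which may not match the engine's actual capture format.

```python
import io
import wave


def pcm_to_wav_bytes(pcm: bytes, sample_rate: int = 16000, channels: int = 1) -> bytes:
    """Wrap raw 16-bit little-endian PCM in a WAV container, in memory."""
    if len(pcm) % (2 * channels) != 0:
        raise ValueError("PCM length is not a whole number of 16-bit frames")
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(channels)
        wav.setsampwidth(2)            # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)           # header sizes are patched on close
    return buf.getvalue()
```

Reading the result back with `wave.open(io.BytesIO(data), "rb")` is a quick way to confirm the header fields before sending audio to the API.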
This means Llama is reversing the translation direction. Check the system prompt in _MEETING_PROMPT — the "Already English → output as-is" rule must be present.
The TTS voice is being picked up by the microphone. This is handled automatically by the _tts_active flag (mic is blocked while TTS plays). If it still happens, it means the VB-Cable loopback is routing TTS audio back to the input device. Check your Windows audio routing — the input device (mic) and output device (CABLE Output) should not form a loop through Windows audio settings.
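
The gating idea behind `_tts_active` can be sketched with a `threading.Event`. The names below are hypothetical and the real engine may structure this differently; the point is that microphone frames are dropped for the full duration of playback and the gate always reopens afterwards.

```python
import threading

tts_active = threading.Event()   # set while synthesized speech is playing
captured = []                    # stand-in for the real downstream queue


def on_mic_frame(frame: bytes) -> bool:
    """Forward a mic frame unless TTS playback is in progress."""
    if tts_active.is_set():
        return False             # drop the frame so the TTS voice is not re-transcribed
    captured.append(frame)
    return True


def play_tts(play, audio: bytes) -> None:
    """Run the blocking `play` callable with the mic gated for its whole duration."""
    tts_active.set()
    try:
        play(audio)
    finally:
        tts_active.clear()       # always reopen the mic, even if playback fails
```

A gate like this only prevents feedback inside the app. It cannot help if Windows itself routes the cable's audio back into the selected microphone.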
The webrtcvad pre-buffer (500 ms) catches speech onset. If very short words ("kya", "yes") are still being missed, try increasing Voice Sensitivity in the UI.
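
How a pre-buffer recovers the start of a word can be shown with a bounded `deque`. This is a simplified sketch rather than the engine's code: the 30 ms frame size is an assumption, the speech detector is passed in as a plain callable, and there is no hangover logic for trailing silence.

```python
from collections import deque

FRAME_MS = 30                            # webrtcvad accepts 10, 20, or 30 ms frames
PRE_BUFFER_MS = 500
PRE_FRAMES = PRE_BUFFER_MS // FRAME_MS   # 16 frames, about 480 ms of audio


def segment_with_prebuffer(frames, is_speech):
    """Yield speech frames, preceded by the audio captured just before onset."""
    pre = deque(maxlen=PRE_FRAMES)       # oldest frames fall off automatically
    in_speech = False
    for frame in frames:
        if is_speech(frame):
            if not in_speech:
                in_speech = True
                yield from pre           # replay the audio just before onset
                pre.clear()
            yield frame
        else:
            in_speech = False
            pre.append(frame)
```

With the real detector, `is_speech` could be `lambda f: vad.is_speech(f, 16000)` where `vad = webrtcvad.Vad(2)`; the aggressiveness level of 2 is only an example value.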
The current requirements.txt reflects the full set of required packages. Two packages require manual installation because pip may have difficulty installing them on some systems:
- `webrtcvad-wheels`: installs via pip, but must be the `-wheels` variant on Windows (not `webrtcvad`, which requires a C compiler)
- `groq`: standard pip install
Python 3.10 or higher is required. The `match`/`case` statement syntax was introduced in Python 3.10, so code that uses it fails to parse on older interpreters. Python 3.11 or 3.12 is recommended.
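
A startup guard along these lines makes a version mismatch obvious instead of surfacing as a syntax error. It is a sketch with hypothetical names, not something the project is known to ship.

```python
import sys

MIN_PYTHON = (3, 10)


def python_ok(version=None) -> bool:
    """Return True if the (major, minor) version meets the minimum."""
    version = sys.version_info if version is None else version
    return tuple(version[:2]) >= MIN_PYTHON


if __name__ == "__main__":
    if not python_ok():
        found = ".".join(map(str, sys.version_info[:3]))
        sys.exit(f"Python 3.10+ is required, found {found}")
```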
Node.js 20 LTS is recommended. The @tauri-apps/cli v2 package requires Node 18+.