Real-time Hindi → English voice translator for corporate meetings.
Speak Hindi. Your colleagues hear English. Instantly.
You speak Hindi into your mic. SyncSpeak translates it to English and plays it through a virtual audio cable, so everyone in your Google Meet or Zoom call hears natural English in real time.
You speak Hindi
↓
Mic → webrtcvad (speech detection)
↓
Sarvam Saarika v2.5 (Hindi/Hinglish → text)
↓
Groq Llama 3.3 70B (translation with conversation context)
↓
Sarvam Bulbul v3 (English text → voice)
↓
VB-Cable virtual mic → Google Meet / Zoom hears English
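
The flow above can be sketched as three chained stages with a short rolling history that gives the translator conversation context. This is a minimal illustration, not the code in `python/sidecar_main.py`: the `Pipeline` class, the stage signatures, and the `context_turns` parameter are all hypothetical, and the real stages would call Sarvam and Groq rather than the stubs used here.

```python
from collections import deque
from typing import Callable, Deque, List, Tuple

# Hypothetical stage signatures (assumptions, not the project's real API).
Turn = Tuple[str, str]                        # (Hindi source, English translation)
Transcribe = Callable[[bytes], str]           # speech audio -> Hindi text
Translate = Callable[[str, List[Turn]], str]  # Hindi text + recent turns -> English
Synthesize = Callable[[str], bytes]           # English text -> audio bytes


class Pipeline:
    """Chains STT -> translation -> TTS and remembers the last few turns."""

    def __init__(self, stt: Transcribe, llm: Translate, tts: Synthesize,
                 context_turns: int = 4) -> None:
        self.stt, self.llm, self.tts = stt, llm, tts
        # deque(maxlen=...) silently drops the oldest turn once it is full.
        self.context: Deque[Turn] = deque(maxlen=context_turns)

    def process(self, speech: bytes) -> bytes:
        hindi = self.stt(speech)
        english = self.llm(hindi, list(self.context))
        self.context.append((hindi, english))
        return self.tts(english)


if __name__ == "__main__":
    # Stub stages so the sketch runs without any API keys.
    demo = Pipeline(
        stt=lambda audio: audio.decode("utf-8"),
        llm=lambda text, ctx: f"[{len(ctx)} turns of context] {text}",
        tts=lambda text: text.encode("utf-8"),
    )
    print(demo.process("namaste".encode("utf-8")))
```

Capping the history keeps each translation request small, which matters when every extra token adds to the round trip.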
Three layers that keep latency low:
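- webrtcvad: the voice activity detector from Google's WebRTC project (the stack behind Chrome/Meet); a lightweight GMM-based classifier rather than a neural network, it separates speech from noise with 92%+ accuracy
- HTTP session pooling: reuses the TCP/TLS connection to Sarvam across TTS calls, avoiding a fresh handshake per request
- Sentence pipelining: synthesizes sentence 2 while sentence 1 is playing, so playback does not pause between sentences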
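
Sentence pipelining can be sketched with a single background worker: while one sentence plays, the next is already being synthesized. This is an illustration under assumptions, not the implementation in `python/translator.py`. The function names, the naive regex splitter, and the `synthesize` / `play` callables are all hypothetical stand-ins for the real TTS call and audio output.

```python
import re
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def split_sentences(text: str) -> List[str]:
    # Naive splitter: break after ., ! or ? when followed by whitespace.
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def speak_pipelined(text: str,
                    synthesize: Callable[[str], bytes],
                    play: Callable[[bytes], None]) -> None:
    """Play sentences in order while prefetching the next one's audio."""
    sentences = split_sentences(text)
    if not sentences:
        return
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(synthesize, sentences[0])
        for upcoming in sentences[1:]:
            audio = pending.result()                     # wait for current sentence
            pending = pool.submit(synthesize, upcoming)  # start the next one now
            play(audio)                                  # plays while synthesis runs
        play(pending.result())                           # last sentence
```

As long as synthesizing a sentence takes less time than playing the previous one, the listener only waits for the first sentence.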
| Tool | Why |
|---|---|
| Node.js 20+ | Frontend build |
| Rust + Cargo | Tauri shell |
| Python 3.10+ | Audio engine sidecar |
| VB-Cable | Virtual mic (routes translated audio into meetings) |
| Sarvam AI key | STT (Saarika v2.5) + TTS (Bulbul v3) |
| Groq key | Translation (Llama 3.3 70B) |
Two API keys are required. VB-Cable is a free Windows driver.
1. Clone and install
git clone https://github.com/Soumyadipgithub/SyncSpeak.git
cd SyncSpeak
npm install
2. Set up Python environment
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
pip install webrtcvad-wheels # Windows pre-built wheel
3. Run
npm run dev
4. Enter your API keys in the app
Open Settings → AI Authentication → paste both keys → click Activate.
Keys are saved automatically and reloaded on every launch. No .env file needed.
For a full setup walkthrough, see docs/setup.md.
SyncSpeak/
├── python/ # Audio engine (webrtcvad + Sarvam + Groq)
│ ├── sidecar_main.py # VAD, pipeline orchestration, command loop
│ └── translator.py # API wrappers, TTS pipelining, HTTP session
├── src/renderer/ # React 19 UI (TypeScript)
│ ├── pages/ # TranslatePage, HistoryPage, VoicesPage
│ └── components/ # TitleBar, TabBar, LiquidTerminal
├── src-tauri/ # Rust / Tauri v2 shell
├── website/ # Marketing site (Astro) — syncspeak.soumg.workers.dev
└── docs/ # Technical documentation
| Layer | Technology |
|---|---|
| Desktop shell | Tauri v2 (Rust) |
| UI | React 19, TypeScript, Zustand |
| Design | Custom "Liquid Glass" CSS (no Tailwind) |
| VAD | webrtcvad (Google WebRTC VAD, GMM-based) |
| STT | Sarvam Saarika v2.5 (REST) |
| Translation | Groq Llama 3.3 70B |
| TTS | Sarvam Bulbul v3 (REST) |
| Audio routing | sounddevice + VB-Cable |
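
For the audio routing row, one way to locate the VB-Cable playback device is to scan the device list by name. The sketch below is an assumption about how this could be done, not the project's code: `find_output_device` is a hypothetical helper, and the default hint `"CABLE Input"` is the name VB-Cable typically gives its playback endpoint, which may differ on your machine. It operates on plain dicts shaped like the entries returned by `sounddevice.query_devices()`, so it runs without audio hardware.

```python
from typing import Iterable, Mapping, Optional


def find_output_device(devices: Iterable[Mapping],
                       name_hint: str = "CABLE Input") -> Optional[int]:
    """Return the index of the first output-capable device matching the hint."""
    hint = name_hint.lower()
    for index, dev in enumerate(devices):
        has_output = dev.get("max_output_channels", 0) > 0
        if has_output and hint in str(dev.get("name", "")).lower():
            return index
    return None  # VB-Cable not installed, or named differently
```

With sounddevice installed, `find_output_device(sounddevice.query_devices())` would yield an index that can be passed as the `device=` argument when opening an output stream; the meeting app then selects the matching "CABLE Output" endpoint as its microphone.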
14 built-in voices (Sarvam Bulbul v3): shubh, sumit, amit, manan, rahul, ratan, ritu, pooja, simran, kavya, priya, ishita, shreya, shruti.
Local WAV previews cached in resources/samples/ — no API tokens used for previews.
| Document | Contents |
|---|---|
| docs/architecture.md | Three-tier design, file map, full IPC protocol |
| docs/pipeline.md | Audio pipeline deep dive (VAD → STT → LLM → TTS) |
| docs/api-reference.md | Sarvam AI + Groq endpoints, models, parameters |
| docs/setup.md | Full installation and troubleshooting guide |
| docs/design-system.md | Liquid Glass tokens, rules, component patterns |
| CONTRIBUTING.md | Workflow, design rules, PR checklist |
See CONTRIBUTING.md for the full workflow, design rules, and checklist.
Quick version:
- Fork → branch → PR
- Read CLAUDE.md before making any changes — it has the design rules and key behaviours
- Keep all panels as glass cards (no solid colors — see docs/design-system.md)
- Do not swap out AI providers without discussion
To report a security vulnerability, see SECURITY.md.
The marketing site lives in website/ and is built with Astro.
It deploys to syncspeak.soumg.workers.dev (Cloudflare
Workers) on every push to main. See website/README.md
for local-dev commands.
MIT — see LICENSE