Sync Speak

Real-time Hindi → English voice translator for corporate meetings.
Speak Hindi. Your colleagues hear English. Instantly.

What does it do?

You speak Hindi into your mic. Sync Speak translates it to English and plays it through a virtual audio cable — so everyone in your Google Meet / Zoom hears natural English in real time.

How it works

You speak Hindi
      ↓
Mic → webrtcvad (speech detection)
      ↓
Sarvam Saarika v2.5 (Hindi/Hinglish → text)
      ↓
Groq Llama 3.3 70B (translation with conversation context)
      ↓
Sarvam Bulbul v3 (English text → voice)
      ↓
VB-Cable virtual mic → Google Meet / Zoom hears English

Three layers that keep latency low:

webrtcvad — Google's neural VAD (used in Chrome/Meet) separates speech from noise with 92%+ accuracy
HTTP session pooling — reuses the TCP/TLS connection to Sarvam across TTS calls
Sentence pipelining — synthesizes sentence 2 while sentence 1 is playing; zero gaps

Prerequisites

Tool	Why
Node.js 20+	Frontend build
Rust + Cargo	Tauri shell
Python 3.10+	Audio engine sidecar
VB-Cable	Virtual mic (routes translated audio into meetings)
Sarvam AI key	STT (Saarika v2.5) + TTS (Bulbul v3)
Groq key	Translation (Llama 3.3 70B)

Two API keys are required. VB-Cable is a free Windows driver.

Setup

1. Clone and install

git clone https://github.com/Soumyadipgithub/SyncSpeak.git
cd SyncSpeak
npm install

2. Set up Python environment

python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
pip install webrtcvad-wheels   # Windows pre-built wheel

3. Run

npm run dev

4. Enter your API keys in the app

Open Settings → AI Authentication → paste both keys → click Activate. Keys are saved automatically and reloaded on every launch. No .env file needed.

For a full setup walkthrough, see docs/setup.md.

Project Structure

SyncSpeak/
├── python/                # Audio engine (webrtcvad + Sarvam + Groq)
│   ├── sidecar_main.py    # VAD, pipeline orchestration, command loop
│   └── translator.py      # API wrappers, TTS pipelining, HTTP session
├── src/renderer/          # React 19 UI (TypeScript)
│   ├── pages/             # TranslatePage, HistoryPage, VoicesPage
│   └── components/        # TitleBar, TabBar, LiquidTerminal
├── src-tauri/             # Rust / Tauri v2 shell
├── website/               # Marketing site (Astro) — syncspeak.soumg.workers.dev
└── docs/                  # Technical documentation

Tech Stack

Layer	Technology
Desktop shell	Tauri v2 (Rust)
UI	React 19, TypeScript, Zustand
Design	Custom "Liquid Glass" CSS (no Tailwind)
VAD	webrtcvad (Google Neural VAD)
STT	Sarvam Saarika v2.5 (REST)
Translation	Groq Llama 3.3 70B
TTS	Sarvam Bulbul v3 (REST)
Audio routing	sounddevice + VB-Cable

Voices

14 built-in voices (Sarvam Bulbul v3): shubh, sumit, amit, manan, rahul, ratan, ritu, pooja, simran, kavya, priya, ishita, shreya, shruti.

Local WAV previews cached in resources/samples/ — no API tokens used for previews.

Documentation

Document	Contents
docs/architecture.md	Three-tier design, file map, full IPC protocol
docs/pipeline.md	Audio pipeline deep dive (VAD → STT → LLM → TTS)
docs/api-reference.md	Sarvam AI + Groq endpoints, models, parameters
docs/setup.md	Full installation and troubleshooting guide
docs/design-system.md	Liquid Glass tokens, rules, component patterns
CONTRIBUTING.md	Workflow, design rules, PR checklist

Contributing

See CONTRIBUTING.md for the full workflow, design rules, and checklist.

Quick version:

Fork → branch → PR
Read CLAUDE.md before making any changes — it has the design rules and key behaviours
Keep all panels as glass cards (no solid colors — see docs/design-system.md)
Do not swap out AI providers without discussion

To report a security vulnerability, see SECURITY.md.

Website

The marketing site lives in website/ and is built with Astro. It deploys to syncspeak.soumg.workers.dev (Cloudflare Workers) on every push to main. See website/README.md for local-dev commands.

License

MIT — see LICENSE

Name		Name	Last commit message	Last commit date
Latest commit History 120 Commits
.github		.github
docs		docs
python		python
resources/samples		resources/samples
scripts		scripts
src-tauri		src-tauri
src/renderer		src/renderer
tests		tests
website		website
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CLAUDE.md		CLAUDE.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
Start_SyncSpeak.bat		Start_SyncSpeak.bat
Test_Pipeline.bat		Test_Pipeline.bat
package-lock.json		package-lock.json
package.json		package.json
requirements.txt		requirements.txt
tsconfig.json		tsconfig.json
tsconfig.node.json		tsconfig.node.json
vite.config.ts		vite.config.ts

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Sync Speak

What does it do?

How it works

Prerequisites

Setup

Project Structure

Tech Stack

Voices

Documentation

Contributing

Website

License

About

Uh oh!

Releases 7

Packages

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

Sync Speak

What does it do?

How it works

Prerequisites

Setup

Project Structure

Tech Stack

Voices

Documentation

Contributing

Website

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 7

Packages 0

Uh oh!

Contributors

Uh oh!

Languages

Packages