Text-to-speech running entirely in your browser. No server, no GPU, no install.
Built on KittenTTS using ONNX Runtime Web and eSpeak-NG WASM for phonemization. Everything runs client-side — nothing leaves your device.
| Model | Params | Download size (first load) |
|---|---|---|
| Nano int8 | 15M | ~20 MB |
| Nano | 15M | ~57 MB |
| Micro | 40M | ~45 MB |
| Mini | 80M | ~82 MB |
Models are fetched from Hugging Face and cached by your browser.
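
The README does not say how the caching is implemented; it may simply rely on the browser's ordinary HTTP cache. As a rough illustration of the fetch-once-then-reuse idea, here is a minimal cache-first loader. Every name in it (`ByteStore`, `Downloader`, `loadModelBytes`, `MemoryStore`) is hypothetical and not taken from this project, and the in-memory store only stands in for whatever storage the browser provides.

```typescript
// Hypothetical storage abstraction: anything that can keep bytes under a key.
interface ByteStore {
  get(key: string): Promise<Uint8Array | undefined>;
  set(key: string, bytes: Uint8Array): Promise<void>;
}

// Hypothetical downloader: takes a URL and resolves to the file's bytes.
type Downloader = (url: string) => Promise<Uint8Array>;

// Cache-first load: return stored bytes if present, otherwise download
// once, store the result, and return it.
async function loadModelBytes(
  url: string,
  store: ByteStore,
  download: Downloader,
): Promise<Uint8Array> {
  const cached = await store.get(url);
  if (cached !== undefined) {
    return cached;
  }
  const bytes = await download(url);
  await store.set(url, bytes);
  return bytes;
}

// In-memory stand-in for browser storage, for demonstration only.
class MemoryStore implements ByteStore {
  private readonly entries = new Map<string, Uint8Array>();

  async get(key: string): Promise<Uint8Array | undefined> {
    return this.entries.get(key);
  }

  async set(key: string, bytes: Uint8Array): Promise<void> {
    this.entries.set(key, bytes);
  }
}
```

In a real page, the store could be backed by the browser's Cache API (`caches.open`, `cache.match`, `cache.put`) and the downloader by `fetch`. Keeping both behind small interfaces means the same logic also runs outside a browser.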
Available voices: Bella, Jasper, Luna, Bruno, Rosie, Hugo, Kiki, Leo.
To run it locally, clone the repository and serve the `docs` directory:

    git clone https://github.com/DipFlip/KittenTTSWeb.git
    cd KittenTTSWeb/docs
    python3 -m http.server 8080

Then open http://localhost:8080.