Unified text-to-speech for voice pipelines, wrapping multiple TTS engines behind common Rust traits. Same pattern as wavekat-vad and wavekat-turn.
> **Warning:** Early development. API may change between minor versions.
| Backend | Feature flag | License |
|---|---|---|
| Qwen3-TTS | `qwen3-tts` | Apache 2.0 |
| CosyVoice | `cosyvoice` | Apache 2.0 |
```sh
cargo add wavekat-tts --features qwen3-tts
```

```rust
use wavekat_tts::{TtsBackend, SynthesizeRequest};
use wavekat_tts::backends::qwen3_tts::Qwen3Tts;

// Auto-downloads model files (~3.8 GB) on first run:
let tts = Qwen3Tts::new()?;
// Or load from an explicit directory:
// let tts = Qwen3Tts::from_dir("models/qwen3-tts-0.6b")?;

let request = SynthesizeRequest::new("Hello, world");
let audio = tts.synthesize(&request)?;
println!("{}s at {} Hz", audio.duration_secs(), audio.sample_rate());
```

Model files are cached at `$WAVEKAT_MODEL_DIR` or `~/.cache/wavekat/qwen3-tts-0.6b/`.
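The lookup order for the cache directory (environment variable first, then the user cache) can be sketched with the standard library alone. This is an illustrative sketch of that resolution logic, not the crate's actual implementation; the function name `model_cache_dir` is an assumption.

```rust
use std::env;
use std::path::PathBuf;

/// Illustrative sketch (not the crate's real code): prefer
/// $WAVEKAT_MODEL_DIR, falling back to ~/.cache/wavekat/<model>/.
fn model_cache_dir(model: &str) -> PathBuf {
    match env::var("WAVEKAT_MODEL_DIR") {
        Ok(dir) => PathBuf::from(dir).join(model),
        Err(_) => {
            // Fall back to the per-user cache under $HOME.
            let home = env::var("HOME").unwrap_or_else(|_| ".".into());
            PathBuf::from(home)
                .join(".cache")
                .join("wavekat")
                .join(model)
        }
    }
}

fn main() {
    println!("{}", model_cache_dir("qwen3-tts-0.6b").display());
}
```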
All backends produce `AudioFrame<'static>` from `wavekat-core` — the same type consumed by `wavekat-vad` and `wavekat-turn`.
```
wavekat-vad   → "is someone speaking?"
wavekat-turn  → "are they done speaking?"
wavekat-tts   → "synthesize the response"
      │                 │                 │
      └─────────────────┴─────────────────┘
                        │
             AudioFrame (wavekat-core)
```
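The shared-frame contract above can be illustrated with a simplified stand-in. The field names and the exact shape of `AudioFrame` here are assumptions for the sketch; the real type lives in `wavekat-core`.

```rust
/// Simplified stand-in for wavekat-core's AudioFrame: mono PCM
/// samples plus a sample rate. Field names are illustrative only.
struct AudioFrame {
    samples: Vec<f32>,
    sample_rate: u32,
}

impl AudioFrame {
    /// Duration in seconds: sample count divided by sample rate.
    fn duration_secs(&self) -> f32 {
        self.samples.len() as f32 / self.sample_rate as f32
    }
}

fn main() {
    // One second of silence at 24 kHz.
    let frame = AudioFrame {
        samples: vec![0.0; 24_000],
        sample_rate: 24_000,
    };
    println!("{}s at {} Hz", frame.duration_secs(), frame.sample_rate);
}
```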
Two trait families:
- `TtsBackend` — batch synthesis: text → `AudioFrame<'static>`
- `StreamingTtsBackend` — streaming: text → iterator of `AudioFrame<'static>` chunks
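The split between the two families can be sketched with mock types. Everything below is illustrative: the trait signatures, the `synthesize_streaming` method name, and the `MockTts` backend are assumptions, not the crate's real API.

```rust
/// Illustrative stand-in for wavekat-core's frame type.
struct AudioFrame {
    samples: Vec<f32>,
    sample_rate: u32,
}

/// Batch family: whole utterance in, one frame out.
trait TtsBackend {
    fn synthesize(&self, text: &str) -> AudioFrame;
}

/// Streaming family: frames are yielded as chunks become ready.
/// Method name is an assumption for this sketch.
trait StreamingTtsBackend {
    fn synthesize_streaming(&self, text: &str) -> Box<dyn Iterator<Item = AudioFrame>>;
}

/// A silent mock backend, handy for wiring up a pipeline in tests.
struct MockTts;

impl TtsBackend for MockTts {
    fn synthesize(&self, text: &str) -> AudioFrame {
        // Pretend each character costs 50 ms of audio at 24 kHz.
        let n = text.chars().count() * 1_200;
        AudioFrame { samples: vec![0.0; n], sample_rate: 24_000 }
    }
}

fn main() {
    let audio = MockTts.synthesize("Hello");
    println!("{} samples at {} Hz", audio.samples.len(), audio.sample_rate);
}
```

A mock like this lets the rest of a voice pipeline be exercised without downloading model weights.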
Generate a WAV file from text (model files are auto-downloaded on first run):
```sh
cargo run --example synthesize --features qwen3-tts,hound -- "Hello, world!"
cargo run --example synthesize --features qwen3-tts,hound -- --language zh "你好世界"
cargo run --example synthesize --features qwen3-tts,hound -- --model-dir /path/to/model --output hello.wav "Hello"
```

| Flag | Default | Description |
|---|---|---|
| `qwen3-tts` | off | Qwen3-TTS local ONNX inference |
| `cosyvoice` | off | CosyVoice local ONNX inference |
Licensed under Apache 2.0.
Copyright 2026 WaveKat.