Typhoon ASR Realtime — CPU API

FastAPI server hosting typhoon-ai/typhoon-asr-realtime (NeMo FastConformer-Transducer, 114M) for CPU-only inference.

Install

Python 3.10+ recommended.

pip install -r requirements.txt

NeMo also requires libsndfile and ffmpeg on the system, e.g.:

sudo apt-get install -y libsndfile1 ffmpeg

Run

CUDA_VISIBLE_DEVICES="" python app.py
# or pick a port:
PORT=8001 CUDA_VISIBLE_DEVICES="" python app.py

The model is downloaded from the Hugging Face Hub on first run (cached under ~/.cache/huggingface) and loaded at startup. Setting CUDA_VISIBLE_DEVICES="" hides any GPUs from the process, which forces CPU inference.
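If the server is started from another Python process (a test harness or a supervisor script, for example), the same two environment variables can be set programmatically. This is a minimal sketch, not part of this repo: the helper names cpu_only_env and launch_server are hypothetical, and the default port of 8000 is assumed from the curl examples in this README.

```python
import os
import subprocess
import sys


def cpu_only_env(port: int = 8000) -> dict[str, str]:
    """Copy the current environment and add the CPU-only settings.

    Hypothetical helper. The default port 8000 is an assumption taken
    from the curl examples in this README.
    """
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = ""  # empty string hides all GPUs from the child process
    env["PORT"] = str(port)           # read by app.py, as in `PORT=8001 python app.py`
    return env


def launch_server(port: int = 8000, script: str = "app.py") -> subprocess.Popen:
    """Start the server as a child process using the current interpreter."""
    return subprocess.Popen([sys.executable, script], env=cpu_only_env(port))
```

Calling launch_server(8001) would be roughly equivalent to the second shell command above. The returned Popen handle can be used to terminate() the server when it is no longer needed.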

Endpoints

GET /health → {"status":"ok","loaded":true}
POST /transcribe: multipart form upload, field name file (wav/mp3/m4a; auto-resampled to 16 kHz mono)
- add the query parameter ?with_timestamps=true to also return character and word timestamps

curl -X POST http://localhost:8000/transcribe -F "file=@audio.wav"
# {"text":"..."}

curl -X POST "http://localhost:8000/transcribe?with_timestamps=true" -F "file=@audio.wav"

Interactive docs at http://localhost:8000/docs.
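The curl calls above can also be made from Python. The sketch below uses only the standard library, so it has no dependency beyond the interpreter; the function names (encode_multipart, is_ready, transcribe) are illustrative and not part of this repo, and the base URL assumes the server is listening on localhost:8000.

```python
import json
import mimetypes
import urllib.request
import uuid
from pathlib import Path


def encode_multipart(field: str, filename: str, payload: bytes) -> tuple[bytes, str]:
    """Build a multipart/form-data body holding one file part.

    Returns (body, content_type_header_value).
    """
    boundary = uuid.uuid4().hex
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        f"Content-Type: {mime}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def is_ready(base_url: str = "http://localhost:8000") -> bool:
    """Return True once GET /health reports status ok and loaded true."""
    with urllib.request.urlopen(f"{base_url}/health") as resp:
        payload = json.load(resp)
    return payload.get("status") == "ok" and payload.get("loaded") is True


def transcribe(path: str, base_url: str = "http://localhost:8000",
               with_timestamps: bool = False) -> dict:
    """POST an audio file to /transcribe and return the decoded JSON response."""
    audio = Path(path)
    # "file" is the form field name documented for this endpoint.
    body, content_type = encode_multipart("file", audio.name, audio.read_bytes())
    url = f"{base_url}/transcribe"
    if with_timestamps:
        url += "?with_timestamps=true"
    req = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

With the server running, transcribe("audio.wav")["text"] should give the transcript string shown in the first curl example. The extra keys returned when with_timestamps=True are not documented here, so inspect the response before relying on them.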

Notes

Thai-language model; non-Thai audio produces garbage output (expected).
torch.set_num_threads() is called with the total number of CPU cores, so intra-op parallelism uses every core.
For production, run with uvicorn directly, e.g. CUDA_VISIBLE_DEVICES="" uvicorn app:app --host 0.0.0.0 --port 8000. Each worker loads its own copy of the model into RAM, so add --workers N only if you have the memory.
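To make the memory trade-off for --workers concrete, here is a rough sizing sketch. It is an illustration only: plan_workers is a hypothetical helper, and the per-worker RAM figure must be measured on your own machine (for example, by watching the resident memory of a single running worker), since this README does not state one. The sketch also divides cores between workers, on the assumption that several workers each running a thread pool sized to all cores would compete for the same CPUs.

```python
def plan_workers(available_ram_gb: float, per_worker_ram_gb: float,
                 cpu_cores: int) -> tuple[int, int]:
    """Return (workers, threads_per_worker) for a given RAM and core budget.

    per_worker_ram_gb is a measured value supplied by the caller; nothing
    here knows the real footprint of the model.
    """
    if per_worker_ram_gb <= 0:
        raise ValueError("per_worker_ram_gb must be positive")
    if cpu_cores < 1:
        raise ValueError("cpu_cores must be at least 1")
    # How many model copies fit in memory.
    by_memory = int(available_ram_gb // per_worker_ram_gb)
    # Always run at least one worker, and never more workers than cores.
    workers = max(1, min(by_memory, cpu_cores))
    # Split the cores evenly so the workers do not oversubscribe the CPU.
    threads_per_worker = max(1, cpu_cores // workers)
    return workers, threads_per_worker
```

For example, with 8 GB free, a measured 3 GB per worker, and 8 cores, plan_workers(8.0, 3.0, 8) returns (2, 4): two workers fit in memory, and each would get four threads. Applying the thread count would require changing the torch.set_num_threads() call in app.py, which currently uses all cores.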
