03 Mar 16:24

dharmendrach

0c5cb75

v0.4.0 Latest

NVIDIA Pipecat 0.4.0 (3 March 2026)

Added

Multilingual support for the voice agent
AI agent deployment skill for WebRTC, WebSocket, and NAT agent examples
Jetson Thor edge deployment support
OpenTelemetry tracing support for ASR, TTS, and LLM services
Riva Text Filter option to clean and normalize LLM output before TTS processing
Display of unfiltered transcripts in the WebRTC UI
Support for the Nemotron 3 Nano LLM

Changed

BREAKING: Renamed RivaASRService to NemotronASRService and RivaTTSService to NemotronTTSService to better reflect the underlying Nemotron Speech technology. The old names remain available as deprecated aliases.
Upgraded to pipecat 0.0.98
Migrated to RTVIObserver for the WebRTC UI
Changed default TTS sample rate to 22.05 kHz for WebRTC examples
Updated the Jetson Thor guide to use the public Riva 2.24.0 release
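
The rename above keeps the old class names available as deprecated aliases. These notes do not say how nvidia-pipecat implements the aliases, so the following is only a minimal sketch of one common Python pattern. The class bodies, the `server` parameter, and the warning text are assumptions made for illustration, not the library's code.

```python
import warnings


class NemotronASRService:
    """Stand-in for the renamed service; the real class lives in nvidia-pipecat."""

    def __init__(self, server: str = "localhost:50051"):
        # `server` is a hypothetical parameter used only in this sketch.
        self.server = server


class RivaASRService(NemotronASRService):
    """Deprecated alias: warns on construction, then behaves like the new class."""

    def __init__(self, *args, **kwargs):
        warnings.warn(
            "RivaASRService is deprecated; use NemotronASRService instead.",
            DeprecationWarning,
            stacklevel=2,  # point the warning at the caller, not at this file
        )
        super().__init__(*args, **kwargs)
```

With a pattern like this, existing code that constructs `RivaASRService` keeps working and emits a `DeprecationWarning`, so callers can migrate to the new names at their own pace.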

Fixed

Chat history truncation logic for Nemotron models
Riva generate_interruptions logic
TTS chunk cutoff at the WebSocket transport layer, fixed by appending silence at the end of each TTS response
TTS text normalization in NemotronTTSService
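
The chunk cutoff fix above pads each TTS response with trailing silence. As a rough illustration of that idea (not the project's actual implementation), the sketch below appends zero-valued samples to raw PCM audio. The function name, the 200 ms duration, and the 16-bit mono format are assumptions; the 22050 Hz default is borrowed from the WebRTC sample rate mentioned in these notes and may not match the WebSocket examples.

```python
def append_silence(
    pcm: bytes,
    sample_rate: int = 22050,
    silence_ms: int = 200,
    sample_width: int = 2,
    channels: int = 1,
) -> bytes:
    """Return `pcm` followed by `silence_ms` of digital silence (zero samples)."""
    # Number of audio frames that fit in the requested duration.
    num_frames = sample_rate * silence_ms // 1000
    # Each frame holds `channels` samples of `sample_width` bytes; zeros are silence
    # for signed PCM.
    return pcm + b"\x00" * (num_frames * sample_width * channels)
```

Padding the tail this way gives the transport a few extra buffers to flush, so the last audible samples are less likely to be dropped when the stream ends.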

Removed

Riva NMT processor and BlingFire Text Aggregator

07 Nov 17:54

dharmendrach

v0.3.0

ceb1b5f

NVIDIA Pipecat 0.3.0 (7 November 2025)

New Features

Added WebRTC-based voice agent example and custom UI
NeMo Agent Toolkit integration and voice agent example with agentic AI
Scripts for benchmarking voice agent latency and throughput
Support for dynamic LLM prompt ingestion and TTS voice selection in the WebRTC UI
Full-Duplex-Bench evaluation inference client script
BlingFireTextAggregator for TTS Service
Added steps for LLM deployment with KV Cache support

Improvements

Updated pipecat to version 0.0.85
Renamed GitHub repository to voice-agent-examples
Switched to Magpie TTS Multilingual model
Hardcoded NIM version tags in examples

Fixed

Fixed user transcriptions and Docker Compose volume issues
Split long TTS sentences to avoid the Riva TTS character limit error
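
The sentence splitting fix above can be pictured as chunking text so that no single TTS request exceeds the service's character limit. The sketch below is a hypothetical illustration, not the code shipped in this release: the function name and the 400-character default are assumptions, and the actual Riva TTS limit is not stated in these notes.

```python
def split_text(text: str, max_chars: int = 400) -> list[str]:
    """Split `text` into chunks of at most `max_chars`, breaking at spaces where possible."""
    chunks: list[str] = []
    current = ""
    for word in text.split():
        # A single word longer than the limit has to be hard-split.
        while len(word) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(word[:max_chars])
            word = word[max_chars:]
        candidate = f"{current} {word}" if current else word
        if len(candidate) <= max_chars:
            current = candidate
        else:
            # Adding this word would exceed the limit, so start a new chunk.
            chunks.append(current)
            current = word
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent to the TTS service as a separate request. A production version would likely prefer sentence or clause boundaries over plain spaces to keep prosody natural.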

Removed

Removed Animation and Audio2Face support
Removed ACE naming references

18 Jun 11:58

dharmendrach

v0.2.0

d746b90

NVIDIA Pipecat 0.2.0 (17 June 2025)

New Features

Support for deepseek, mistral-ai, and llama-nemotron models in NVIDIA LLM Service
Support for BotSpeakingFrame in animation graph service

Improvements

Upgraded Riva Client version to 2.20.0
Upgraded to pipecat 0.0.68
Improved animation graph stream handling
Improved task cancellation support in NVIDIA LLM and NVIDIA RAG Service

Fixed

Fixed transcription synchronization for multiple final ASR transcripts
Fixed edge case where the mouth of the avatar would not close
Fixed animation stream handling for broken streams
Fixed ElevenLabs edge case issues with multilingual use cases
Fixed chunk truncation issues in RAG Service
Fixed dangling tasks and pipeline cleanup issues

23 Apr 16:08

dharmendrach

v0.1.0

f272bf7

NVIDIA Pipecat 0.1.0 (23 April 2025)

The NVIDIA Pipecat library extends the Pipecat framework with additional frame processors and services, as well as new multimodal frames to enhance avatar interactions. This is the first release of the NVIDIA Pipecat library.

New Features

Added Pipecat services for Riva ASR (Automatic Speech Recognition), Riva TTS (Text to Speech), and Riva NMT (Neural Machine Translation) models.
Added Pipecat frames, processors, and services to support multimodal avatar interactions and use cases. This includes Audio2Face3DService, AnimationGraphService, FacialGestureProviderProcessor, and PostureProviderProcessor.
Added ACETransport, which is specifically designed to support integration with existing ACE microservices. This includes a FastAPI-based HTTP and WebSocket server implementation compatible with ACE.
Added NvidiaLLMService for NIM LLM models and NvidiaRAGService for the NVIDIA RAG Blueprint.
Added UserTranscriptSynchronization processor for user speech transcripts and BotTranscriptSynchronization processor for synchronizing bot transcripts with bot audio playback.
Added custom context aggregators and processors to enable Speculative Speech Processing to reduce latency.
Added UserPresence, Proactivity, and AcknowledgementProcessor frame processors to improve human-bot interactions.
Released source code for the voice assistant example using nvidia-pipecat, along with the pipecat-ai library service, to showcase NVIDIA services with ACETransport.

Improvements

Added ElevenLabsTTSServiceWithEndOfSpeech, an extended version of the ElevenLabs TTS service with end-of-speech events for use in avatar interactions.

Releases: NVIDIA/voice-agent-examples
