The NVIDIA Pipecat library extends the Pipecat framework with additional frame processors and NVIDIA services. It integrates NVIDIA services and NIMs such as Nemotron Speech ASR (Parakeet), Nemotron Speech TTS (Magpie), LLM NIMs, NAT (NeMo Agent Toolkit), and Foundational RAG. It also introduces several processors aimed at improving the end-user experience of multimodal conversational agents, along with speculative speech processing, which reduces latency so the bot can respond faster.
The nvidia-pipecat source code can be found in the GitHub repository.
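
To make the idea of speculative speech processing more concrete, here is a minimal, library-independent sketch in plain Python. It assumes that the latency gain comes from starting response generation on interim ASR transcripts and reusing that work when the final transcript matches; this overview does not spell out the mechanism. The names `Transcript`, `SpeculativeResponder`, and `fake_llm` are hypothetical and are not part of the nvidia-pipecat or Pipecat APIs.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Transcript:
    """Hypothetical ASR result: interim results have is_final=False."""
    text: str
    is_final: bool


class SpeculativeResponder:
    """Illustrative only: generates a reply early from interim transcripts
    and reuses it if the final transcript turns out to be the same."""

    def __init__(self, generate: Callable[[str], str]):
        self._generate = generate          # stand-in for an LLM call
        self._cached_text: Optional[str] = None
        self._cached_reply: Optional[str] = None
        self.generate_calls = 0            # counts how often we hit the "LLM"

    def _speculate(self, text: str) -> None:
        # Only regenerate when the transcript text actually changed.
        if text != self._cached_text:
            self._cached_reply = self._generate(text)
            self._cached_text = text
            self.generate_calls += 1

    def on_transcript(self, transcript: Transcript) -> Optional[str]:
        # Normalize so trivial casing/whitespace changes do not discard work.
        normalized = transcript.text.strip().lower()
        self._speculate(normalized)
        if not transcript.is_final:
            return None                    # keep the speculative reply private
        reply = self._cached_reply
        # Reset state for the next user turn.
        self._cached_text = None
        self._cached_reply = None
        return reply


def fake_llm(text: str) -> str:
    """Hypothetical generator used only for this demonstration."""
    return f"reply to: {text}"


if __name__ == "__main__":
    responder = SpeculativeResponder(fake_llm)
    for t in [
        Transcript("what is", is_final=False),
        Transcript("what is pipecat", is_final=False),
        Transcript("What is Pipecat", is_final=True),
    ]:
        print(t.is_final, responder.on_transcript(t))
```

In this sketch the reply for the final transcript is already available when it arrives, because the last interim transcript had the same normalized text. A real pipeline would run generation asynchronously and cancel stale work rather than compute it inline.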