A comprehensive ComfyUI integration for Microsoft's VibeVoice text-to-speech model, enabling high-quality single and multi-speaker voice synthesis directly within your ComfyUI workflows.
Updated Feb 18, 2026 - Python
A ComfyUI custom node integration for local, multi-engine, multi-language Text-to-Speech and Voice Conversion. Supports RVC, Echo-TTS, Qwen3-TTS, Cozy Voice 3, Step Audio EditX, IndexTTS-2, Chatterbox (classic and multilingual), F5-TTS, Higgs Audio 2, and VibeVoice, with unlimited text length, SRT timing, character support, and many audio tools.
Free real-time AI Noise Gate VST3/AU plugin. Removes coughs, sneezes, and other artifacts from your live streams, podcasts, and videos.
Soundstorm is a cutting-edge AI-powered audio manipulation application designed to provide a rich yet simplified experience for sound designers, algorithmic composers, and experimental audio enthusiasts. From sample pack creation and algorithmic composition to AI text-to-audio and onscreen ChatGPT, Soundstorm is a sonic powerhouse.
Real-Time Deepfake Pipeline
Local-first CLI that turns Markdown scripts into multi-speaker podcast-style audio using Coqui XTTS v2.
Music Generation Using Deep Learning🎶🎵
AI Voice Agents: Exploring the Next Generation of Human-Machine Interaction! 🎙️🤖🎧
Community list of AI tools for audio and music
AudioInsight is a web application that processes audio, generates transcriptions, and allows users to ask questions about the related audio.
An approach to Andrej Karpathy's LLM challenge, as outlined here: https://twitter.com/karpathy/status/1760740503614836917
A local-first EPUB reader with high-fidelity neural text-to-speech, word-level synchronization, and a Next.js/FastAPI/ONNX stack.
Professional Yocto BSP Layer for Dynamic Devices Edge Computing Platforms - AI Audio Processing, E-Ink Displays, Power Management, Wireless Connectivity, i.MX8MM/i.MX93 Support
AI Audio Framework 🎵
High-performance KittenTTS API server with a built-in web UI, OpenAI-compatible routes, long-form text support, and optional CUDA acceleration.
IntelliMix is an AI-powered web app for transforming and editing audio with ease. Create mashups with just one prompt, trim audio, batch process, and download media, all in one streamlined interface. Built with React, Flask, and integrated AI tools.
Maya Voice AI is an open-source project that demonstrates the Maya1 model, capable of generating realistic voice audio from text input with rich emotional and descriptive control. This repository provides a demo for text-to-speech synthesis using advanced language models and the SNAC codec, focusing on high-quality audio at 24kHz.
Open-source Chinese TTS workstation for humans, AI, and agents. CLI first, WebUI on the roadmap.
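AI sound effect generation platform: describe a scene in one sentence and get professional-grade sound effects in seconds. For video creators, game developers, and podcast hosts. 🎵 aiwave.art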
Official Unity SDK for VARCO Voice API. High-quality AI text-to-speech, real-time LipSync, and 80+ professional DSP presets for game characters.