VoiceMode brings voice conversations to AI coding assistants. It works as both an MCP server for Claude Code and as a standalone CLI tool.
VoiceMode provides:
- MCP Server: Adds voice tools to Claude Code - no installation needed
- CLI Tool: Use VoiceMode's tools directly from your terminal
- Local Services: Optional privacy-focused speech processing
The fastest way to get started is using VoiceMode with Claude Code.
Install UV package manager (if not already installed), then run the VoiceMode installer:
# Install UV package manager (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh
# Install VoiceMode and configure services
uvx voice-mode-install
# Add to Claude Code MCP
claude mcp add --scope user voicemode -- uvx --refresh voice-mode
The installer will:
- Install missing system dependencies (FFmpeg, PortAudio, etc.)
- Set up your environment for VoiceMode
- Offer to install local voice services (Whisper STT and Kokoro TTS)
Alternative UV installation methods:
- macOS: brew install uv
- With pip: pip install uv
Learn more: UV Installation Guide
Set your OpenAI API key as an environment variable:
export OPENAI_API_KEY="sk-your-api-key-here"
Or add it to your shell configuration file (~/.bashrc, ~/.zshrc, etc.).
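Before starting a session that relies on OpenAI, it can help to confirm that the variable is actually visible in the current shell. The helper below is a minimal sketch; `check_api_key` is a hypothetical name and not part of VoiceMode. It only reports whether the variable is set and never prints the key.

```shell
# check_api_key: succeed only if OPENAI_API_KEY is set and non-empty.
# Hypothetical helper for illustration; VoiceMode does not ship it.
check_api_key() {
    if [ -z "${OPENAI_API_KEY:-}" ]; then
        echo "OPENAI_API_KEY is not set" >&2
        return 1
    fi
    # Report status only; the key itself is never echoed.
    echo "OPENAI_API_KEY is set"
}

# Usage:
#   check_api_key || echo "export the key or add it to your shell config"
```

If the check fails in a new terminal but passes in the one where you ran `export`, the variable has probably not been added to your shell configuration file yet.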
# Check that VoiceMode is connected
claude mcp list
You should see voicemode in the list of connected servers.
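For scripted setups, the same check can be automated by searching the command's output for the server name. This is a rough sketch: `has_voicemode` is a hypothetical helper, and it assumes nothing about the output format of `claude mcp list` beyond the name `voicemode` appearing as a whole word. It confirms that the server is registered, not that it is healthy.

```shell
# has_voicemode: read text on stdin and succeed if any line contains the
# whole word "voicemode". Hypothetical helper; it does not parse the output.
has_voicemode() {
    grep -qw "voicemode"
}

# Usage (requires the claude CLI on PATH):
#   claude mcp list | has_voicemode && echo "voicemode is registered"
```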
By default, Claude Code prompts for permission each time VoiceMode tools are used. To enable automatic approval, add to ~/.claude/settings.json:
{
"permissions": {
"allow": [
"mcp__voicemode__converse",
"mcp__voicemode__service"
]
}
}
This allows voice conversations and service management without prompts. For more permission options, see the Permissions Guide.
In Claude Code, simply type:
converse
Speak when you hear the chime, and Claude will respond with voice!
If you want to use VoiceMode from the command line:
# Install with uv
uv tool install voice-mode
# Or install from source in editable mode
git clone https://github.com/mbailey/voicemode
cd voicemode
uv tool install -e .
# Set your API key
export OPENAI_API_KEY="sk-your-api-key-here"
# Start a voice conversation
voicemode converse
For complete privacy, you can run voice services locally instead of using OpenAI.
# Install local services
voicemode service install whisper # Speech-to-text
voicemode service install kokoro # Text-to-speech
# Start services
voicemode service start whisper
voicemode service start kokoro
# Check status of all services
voicemode service status
VoiceMode will automatically detect and use these local services when available.
To have services start automatically at login:
# Enable services to start at boot/login
voicemode service enable whisper
voicemode service enable kokoro
On macOS, this creates launchd agents. On Linux, it creates systemd user services.
| Service | Download Size | Disk Space | First Start Time |
|---|---|---|---|
| Whisper (tiny) | ~75MB | ~150MB | 30 seconds |
| Whisper (base) | ~150MB | ~300MB | 1-2 minutes |
| Whisper (small) | ~460MB | ~1GB | 2-3 minutes |
| Kokoro TTS | ~350MB | ~700MB | 2-3 minutes |
Recommended: Whisper base + Kokoro = ~500MB download, ~1GB disk space.
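The recommended totals follow from the table: 150MB + 350MB is about 500MB of downloads, and 300MB + 700MB is about 1GB on disk. A quick way to check for that much room before installing is sketched below. `has_free_mb` is a hypothetical helper, and the install location is not stated in this guide, so the directory passed to it is an assumption.

```shell
# has_free_mb DIR REQUIRED_MB: succeed if the filesystem holding DIR has at
# least REQUIRED_MB megabytes free. Hypothetical helper for illustration.
has_free_mb() {
    dir="$1"
    required_mb="$2"
    # df -Pk prints one POSIX-format line per filesystem; column 4 is free KB.
    free_kb="$(df -Pk "$dir" | awk 'NR==2 {print $4}')"
    [ "$((free_kb / 1024))" -ge "$required_mb" ]
}

# Usage: ~1GB for Whisper base + Kokoro, taken from the table above.
# "$HOME" is an assumed location, not a documented install path.
#   has_free_mb "$HOME" 1000 || echo "less than 1GB free"
```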
After installation, services download models on first start. Wait for them to be ready:
# Wait for Whisper (port 2022)
while ! nc -z localhost 2022 2>/dev/null; do sleep 2; done
echo "Whisper ready"
# Wait for Kokoro (port 8880)
while ! nc -z localhost 8880 2>/dev/null; do sleep 2; done
echo "Kokoro ready"
Learn more: Whisper Setup Guide | Kokoro Setup Guide
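The loops above wait forever if a service never comes up, for example when a model download fails. A bounded variant is sketched below. `wait_until` is a hypothetical helper, and the 300-second limit in the usage comment is an assumption chosen to sit above the first-start times in the table. The ports are the ones given above.

```shell
# wait_until TIMEOUT INTERVAL CMD...: rerun CMD every INTERVAL seconds until
# it succeeds or about TIMEOUT seconds of sleeping have accumulated.
# Returns 0 on success and 1 on timeout. Hypothetical helper.
wait_until() {
    timeout="$1"
    interval="$2"
    shift 2
    elapsed=0
    while ! "$@" >/dev/null 2>&1; do
        if [ "$elapsed" -ge "$timeout" ]; then
            return 1
        fi
        sleep "$interval"
        # Only sleep time is counted, so the real limit is slightly longer.
        elapsed=$((elapsed + interval))
    done
    return 0
}

# Usage (requires nc):
#   wait_until 300 2 nc -z localhost 2022 && echo "Whisper ready"
#   wait_until 300 2 nc -z localhost 8880 && echo "Kokoro ready"
```

When the wait times out, the service logs are the natural next place to look.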
VoiceMode works out of the box with sensible defaults. To customize:
# OpenAI voices
export VOICEMODE_VOICES="nova,shimmer"
# Or Kokoro voices (if using local TTS)
export VOICEMODE_VOICES="af_sky,am_adam"
Available OpenAI voices: alloy, echo, fable, onyx, nova, shimmer
Create .voicemode.env in your project:
export VOICEMODE_VOICES="af_nova,nova"
export VOICEMODE_TTS_SPEED=1.2
Learn more: Configuration Guide
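To see what a project file would set without changing your current shell, you can source it in a subshell and list the resulting variables. This is only an inspection aid under stated assumptions: `show_voicemode_env` is a hypothetical helper, and it says nothing about how VoiceMode itself loads the file. Variables already exported in your environment also appear in the output.

```shell
# show_voicemode_env [FILE]: source FILE in a subshell and print the exported
# VOICEMODE_* variables visible afterwards. The calling shell is unchanged.
# Hypothetical helper for illustration.
show_voicemode_env() {
    file="${1:-.voicemode.env}"
    # A bare filename would be looked up on PATH by ".", so anchor it to ./
    case "$file" in
        */*) ;;
        *) file="./$file" ;;
    esac
    if [ ! -f "$file" ]; then
        echo "no such file: $file" >&2
        return 1
    fi
    ( . "$file" && env | grep '^VOICEMODE_' | sort )
}

# Usage, from the project directory:
#   show_voicemode_env
```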
- Check MCP connection:
  claude mcp list
- Verify OPENAI_API_KEY is set in your MCP configuration
- Add to your MCP config:
  "env": { "OPENAI_API_KEY": "sk-..." }
# List audio devices
voicemode diag devices
# Test TTS and STT
voicemode converse
# Check service status
voicemode service status # All services
voicemode service status whisper # Specific service
# View logs
voicemode service logs whisper -n 50
voicemode service logs kokoro -n 50
# Check if service is responding
voicemode service health whisper
voicemode service health kokoro
For remote access or persistent operation, run VoiceMode as a background service:
# Start the VoiceMode HTTP server
voicemode service start voicemode
# Enable auto-start at boot/login
voicemode service enable voicemode
# Check all services
voicemode service status
The HTTP server enables remote access from other machines on your network or via secure tunnels.
For security best practices when running remotely, see the Configuration Guide.
- Configuration Guide - Customize VoiceMode
- Development Setup - Contribute to VoiceMode
- Service Guides - Set up Whisper, Kokoro, or LiveKit
- CLI Reference - All available commands
- GitHub Issues: github.com/mbailey/voicemode/issues
- Discord: Join our community for support
Welcome to voice-enabled AI coding! 🎙️