Jarvis

A voice-controlled AI assistant for macOS, powered by Claude Code. Say "Jarvis" and it listens, transcribes, thinks, and speaks back.

Features

Wake word detection - Say "Jarvis" to activate (faster-whisper tiny, rolling buffer)
Real-time transcription - See your words appear as you speak (faster-whisper preview + mlx-whisper turbo final)
Claude Code backend - Full access to tools: Bash, file system, web search, code editing
Natural voice - ElevenLabs multilingual v2 text-to-speech
Streaming responses - Text appears token by token, TTS plays sentence by sentence
Conversation mode - Ask follow-up questions without repeating "Jarvis"
Auto-end detection - Jarvis stops listening when you're done or when you start talking to someone else
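
As an illustration of the "streaming responses" feature, here is a minimal sketch of how a token stream could be cut into sentences so that each one can be sent to TTS as soon as it is complete. The function name and the sentence-boundary rule (`.`, `!` or `?` followed by whitespace) are assumptions made for this example; they are not taken from speaker.py or assistant.py.

```python
import re
from typing import Iterable, Iterator

# Assumed rule: a sentence ends at ., ! or ? followed by whitespace.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def sentences_from_tokens(tokens: Iterable[str]) -> Iterator[str]:
    """Yield complete sentences as soon as they appear in a token stream."""
    buffer = ""
    for token in tokens:
        buffer += token
        # Split off every finished sentence; keep the unfinished tail buffered.
        parts = _SENTENCE_END.split(buffer)
        for sentence in parts[:-1]:
            if sentence.strip():
                yield sentence.strip()
        buffer = parts[-1]
    # Flush whatever is left when the stream ends.
    if buffer.strip():
        yield buffer.strip()


if __name__ == "__main__":
    stream = ["Hel", "lo the", "re. How", " are", " you? I am", " fine"]
    for sentence in sentences_from_tokens(stream):
        print(sentence)
```

Each yielded sentence could then be handed to a background TTS worker while later tokens are still arriving.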

Architecture

Mic --> Wake word (faster-whisper tiny, rolling buffer)
    --> Record + live preview (faster-whisper tiny)
    --> Final transcription (mlx-whisper turbo on Apple Silicon GPU)
    --> Claude Code CLI (persistent process, stream-json)
    --> ElevenLabs TTS (sentence by sentence, background thread)
    --> Speaker
    --> Conversation mode (mic stays open) or back to wake word
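
The "rolling buffer" used in the wake word stage can be pictured as a fixed-size window over the most recent audio. The sketch below uses only the standard library; the class name, the 16 kHz sample rate and the 2-second window are hypothetical choices for illustration, and the real wake_word.py may be organized differently.

```python
from collections import deque


class RollingBuffer:
    """Keep only the most recent `max_samples` audio samples."""

    def __init__(self, max_samples: int) -> None:
        # A deque with maxlen discards the oldest items automatically.
        self._samples = deque(maxlen=max_samples)

    def push(self, chunk) -> None:
        """Append a chunk of samples, dropping the oldest if the buffer is full."""
        self._samples.extend(chunk)

    def window(self) -> list:
        """Return a snapshot of the current window, oldest sample first."""
        return list(self._samples)


if __name__ == "__main__":
    SAMPLE_RATE = 16_000   # assumed sample rate
    WINDOW_SECONDS = 2     # assumed window length
    buf = RollingBuffer(SAMPLE_RATE * WINDOW_SECONDS)
    buf.push([0.0] * 1024)  # one microphone chunk
    print(len(buf.window()))
```

On each new microphone chunk, the current window would be passed to the small Whisper model and the transcript checked for "Jarvis".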

Requirements

macOS with Apple Silicon (M1/M2/M3/M4)
Python 3.9+
Claude Code installed and authenticated
ElevenLabs API key (free tier: 10,000 chars/month)
ffmpeg (brew install ffmpeg)

Setup

git clone https://github.com/TDS-Upec/Jarvis.git
cd Jarvis
python3 -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
cp .env.example .env
# Edit .env with your ElevenLabs API key
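
For reference, a .env file is a list of KEY=VALUE lines. The sketch below shows one way such a file could be parsed with the standard library. The variable name `ELEVENLABS_API_KEY` is an assumption; check .env.example for the name the project actually expects. The project may also load the file differently, for instance through a dedicated library.

```python
def parse_env(text: str) -> dict:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


if __name__ == "__main__":
    # ELEVENLABS_API_KEY is a hypothetical variable name used for illustration.
    sample = '# ElevenLabs credentials\nELEVENLABS_API_KEY="your-key-here"\n'
    print(parse_env(sample))
```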

Usage

source .venv/bin/activate
python main.py

Say "Jarvis" to activate, then speak naturally in French.

Project Structure

main.py          - Main loop, mic buffer, conversation flow
assistant.py     - Claude Code CLI integration (persistent process)
transcriber.py   - Speech-to-text via mlx-whisper (Apple Silicon GPU)
wake_word.py     - Wake word detection via faster-whisper (rolling buffer)
speaker.py       - Text-to-speech via ElevenLabs
audio.py         - Audio capture and silence detection
config.py        - Configuration and constants
ui.py            - Terminal UI (styled output)

Configuration

Edit config.py to adjust:

SILENCE_DURATION - How long a pause must last before the recording is cut (default: 0.8s)
CONVERSATION_TIMEOUT - How long to wait in silence before ending conversation mode (default: 5s)
ELEVENLABS_VOICE_ID - Change the voice
WHISPER_MODEL - Change the transcription model
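
To show how a setting like SILENCE_DURATION could drive end-of-speech detection, here is a minimal sketch. The default of 0.8 seconds comes from the list above; `SILENCE_THRESHOLD`, the function names and the RMS-based test are assumptions for illustration and are not taken from audio.py or config.py.

```python
import math

SILENCE_DURATION = 0.8     # seconds, default listed above
SILENCE_THRESHOLD = 0.01   # hypothetical RMS level below which a frame counts as silent


def rms(frame) -> float:
    """Root-mean-square level of one frame of float samples."""
    if not frame:
        return 0.0
    return math.sqrt(sum(s * s for s in frame) / len(frame))


def should_stop(frames, frame_seconds: float,
                silence_duration: float = SILENCE_DURATION,
                threshold: float = SILENCE_THRESHOLD) -> bool:
    """Return True once the trailing run of silent frames lasts long enough."""
    silent_frames = 0
    # Walk backwards from the newest frame until a non-silent one is found.
    for frame in reversed(frames):
        if rms(frame) >= threshold:
            break
        silent_frames += 1
    return silent_frames * frame_seconds >= silence_duration


if __name__ == "__main__":
    loud, quiet = [0.5] * 4, [0.0] * 4
    print(should_stop([loud] + [quiet] * 4, frame_seconds=0.25))
```

A lower SILENCE_DURATION makes Jarvis respond sooner but risks cutting you off mid-sentence; a higher value tolerates longer pauses.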

Built With

Claude Code - AI backend with full tool access
mlx-whisper - Speech-to-text optimized for Apple Silicon
faster-whisper - Lightweight STT for wake word
ElevenLabs - Natural text-to-speech
sounddevice - Audio capture
