AI-powered voice interaction client for Android — Jetpack Compose front-end that captures microphone input, sends it to an AI pipeline (Whisper STT, LLM reasoning, ElevenLabs TTS), and plays back synthesised speech responses.
Part of the Coding-Autopilot-System ecosystem: gsd-orchestrator | Promptimprover | autogen
See also: OgeonX-Ai/enterprise-ai-gateway — vendor-agnostic AI service bus
```mermaid
flowchart LR
    Mic[Microphone\nMediaRecorder] --> Upload[Audio Upload\nOkHttp multipart]
    Text[Text Input\nCompose UI] --> TTS_req[TTS Request\nOkHttp JSON]
    Upload --> Backend[FastAPI Backend\nWhisper STT / LLM / ElevenLabs TTS]
    TTS_req --> Backend
    Backend --> Player[Audio Playback\nMediaPlayer MP3]
```
The app provides two interaction paths. Voice input: the user records audio with `MediaRecorder`, and the app uploads the recording as a multipart M4A file to the FastAPI backend. The backend transcribes the speech (Whisper STT), generates a response (LLM), synthesises audio (ElevenLabs TTS), and returns an MP3 stream. Text input: the user types a message and selects a voice persona; the app sends a JSON request to the backend, which returns synthesised speech. Both paths end with `MediaPlayer` playback of the MP3 response.
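The two paths can be sketched on the backend side as a small orchestrator with injectable stages. This is a minimal illustration, not the repository's actual code: the `Pipeline` class, its method names, and the stage signatures are hypothetical, and it assumes the text path goes straight to TTS, as the "TTS Request" label in the diagram suggests. The `voice` argument on the voice path is likewise an assumption.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical stage signatures. A real backend would bind these to
# Whisper, an LLM, and ElevenLabs respectively.
Transcribe = Callable[[bytes], str]        # M4A bytes -> transcript
Generate = Callable[[str], str]            # transcript -> reply text
Synthesise = Callable[[str, str], bytes]   # (text, voice) -> MP3 bytes


@dataclass(frozen=True)
class Pipeline:
    transcribe: Transcribe
    generate: Generate
    synthesise: Synthesise

    def handle_voice(self, audio: bytes, voice: str) -> bytes:
        """Voice path: STT, then LLM, then TTS."""
        if not audio:
            raise ValueError("empty audio upload")
        transcript = self.transcribe(audio)
        reply = self.generate(transcript)
        return self.synthesise(reply, voice)

    def handle_text(self, text: str, voice: str) -> bytes:
        """Text path (assumed): the typed message is sent directly to TTS."""
        if not text.strip():
            raise ValueError("empty text")
        return self.synthesise(text, voice)
```

Passing the stages in as plain callables keeps the ordering logic testable without network access or API keys; stubs can stand in for each service.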
- Voice capture: `MediaRecorder` M4A audio with runtime permission handling
- AI voice pipeline: microphone input to Whisper STT to LLM reasoning to ElevenLabs TTS
- Text-to-speech: text input with selectable voice personas (Kim, Milla, John, Lily)
- Jetpack Compose UI: Material 3 interface with gradient background, voice dropdown, and action buttons
- FastAPI backend: included Python backend with Whisper, Hugging Face LLM, and ElevenLabs integrations
- JVM unit tests: `MainActivityTest` validating backend URL configuration and voice list integrity
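The kind of check described in the last bullet (backend URL configuration and voice list integrity) might look roughly like the sketch below. It is written in Python for brevity and is not a port of `MainActivityTest`: the function names and the specific rules are assumptions, and `BACKEND_URL` is only a placeholder. The persona names are taken from the feature list above.

```python
from urllib.parse import urlparse

# Persona names come from the feature list; the URL is a placeholder.
VOICES = ["Kim", "Milla", "John", "Lily"]
# Assumption: an emulator reaching a backend on the host machine.
# 10.0.2.2 is the Android emulator's alias for the host loopback.
BACKEND_URL = "http://10.0.2.2:8000"


def backend_url_problems(url: str) -> list[str]:
    """Return a list of problems found in the backend URL (empty if none)."""
    parsed = urlparse(url)
    problems = []
    if parsed.scheme not in ("http", "https"):
        problems.append("scheme must be http or https")
    if not parsed.hostname:
        problems.append("host is missing")
    return problems


def voice_list_problems(voices: list[str]) -> list[str]:
    """Return a list of problems found in the voice list (empty if none)."""
    problems = []
    if not voices:
        problems.append("voice list is empty")
    if any(not v.strip() for v in voices):
        problems.append("blank voice name")
    # Case-insensitive comparison so "Kim" and "kim" count as duplicates.
    if len({v.lower() for v in voices}) != len(voices):
        problems.append("duplicate voice name")
    return problems
```

Returning a list of problems rather than a boolean makes a failing check say what is wrong, which is useful when the URL is edited by hand.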
- Clone the repository: `git clone https://github.com/OgeonX-Ai/android.git`
- Open the project in Android Studio
- Set `backendUrl` in `MainActivity.kt` to your FastAPI backend endpoint
- Start the backend: `cd backend && pip install -r requirements.txt && uvicorn main:app --port 8000`
- Run on a device or emulator (min SDK 26 / Android 8.0)
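Before launching the app, it can help to confirm from the development machine that the backend is answering. The sketch below only builds a JSON text-to-speech request with the standard library. The `/tts` route and the `text` and `voice` field names are hypothetical and not taken from the backend source, so check the backend's route definitions for the real ones.

```python
import json
from urllib.request import Request


def build_tts_request(base_url: str, text: str, voice: str) -> Request:
    """Build a POST request carrying a JSON text-to-speech payload."""
    # "/tts", "text", and "voice" are placeholders for illustration only.
    body = json.dumps({"text": text, "voice": voice}).encode("utf-8")
    return Request(
        url=base_url.rstrip("/") + "/tts",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

With the backend running on port 8000, `urllib.request.urlopen(build_tts_request("http://localhost:8000", "Hello", "Kim"))` would send the request; if the route matches, the response body should be the MP3 audio described above.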