Add Mobile App Companion for Voice Copilot#1

Open
Securiteru wants to merge 1 commit into main from mobile
Mobile App Companion for Voice Copilot

This PR adds a complete mobile app companion that allows you to send voice recordings to your Voice Copilot service from your phone.

🎯 What's New

📱 Mobile App

  • React Native app with large, intuitive microphone button
  • Real-time recording feedback and connection status
  • Professional Material Design UI with cross-platform support
  • Easy server configuration and connection testing
  • Transcription results display with clear status messages

🖥️ API Server

  • Flask-based REST API with /health, /status, and /transcribe endpoints
  • Audio file upload handling with Whisper integration
  • CORS support for mobile app compatibility
  • Flexible configuration (host, port, model size)
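A minimal sketch of how the host, port, and model size options could be parsed. `--host` and `--port` appear in the Quick Start commands; the `--model-size` flag name, the list of accepted sizes, and all defaults are assumptions made for illustration and may not match `api_server.py`.

```python
import argparse

# Whisper's published model sizes. Which of these the project accepts is an assumption.
MODEL_SIZES = ("tiny", "base", "small", "medium", "large")


def build_parser() -> argparse.ArgumentParser:
    """Build a parser for the API server options named in this PR."""
    parser = argparse.ArgumentParser(prog="voice-transcriber-api")
    # 0.0.0.0 makes the server reachable from other devices on the network.
    parser.add_argument("--host", default="127.0.0.1", help="interface to bind")
    parser.add_argument("--port", type=int, default=8000, help="TCP port to listen on")
    # Hypothetical flag name; the PR only says the model size is configurable.
    parser.add_argument("--model-size", choices=MODEL_SIZES, default="base")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(f"host={args.host} port={args.port} model={args.model_size}")
```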

🔧 Services & Scripts

  • Combined service to run desktop hotkeys + API server together
  • Helper scripts for network setup and ngrok integration
  • Testing scripts for API validation
  • Demo script for easy setup
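One common way to run two long-lived components in a single process is to put the API server on a background thread and keep the desktop hotkey loop on the main thread. The sketch below shows that pattern only; `serve_api` and `run_hotkeys` are hypothetical callables, and `combined_service.py` may be structured differently.

```python
import threading
from typing import Callable


def run_combined(
    serve_api: Callable[[], None],
    run_hotkeys: Callable[[], None],
) -> threading.Thread:
    """Start the API server in the background, then block on the hotkey loop."""
    # daemon=True lets the process exit when the hotkey loop returns,
    # even if the server thread is still serving requests.
    api_thread = threading.Thread(target=serve_api, name="api-server", daemon=True)
    api_thread.start()
    run_hotkeys()  # blocks until the desktop loop exits
    return api_thread


if __name__ == "__main__":
    # Stand-ins for the real components, used only to show the call shape.
    started = threading.Event()
    run_combined(started.set, lambda: started.wait(timeout=1))
    print("API thread ran:", started.is_set())
```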

🌐 Network Options

Local WiFi (Recommended)

  • Direct connection to your computer
  • Fast and private, since audio stays on your local network
  • No external dependencies
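For the local WiFi option, the phone needs the computer's LAN address. The sketch below shows one common standard-library way to find it; it is not necessarily what `scripts/get-local-ip.py` does, and the probe address and the `server_url` helper are assumptions.

```python
import socket


def get_local_ip(probe_host: str = "8.8.8.8", probe_port: int = 80) -> str:
    """Return the IPv4 address of the interface used for outbound traffic."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # Connecting a UDP socket sends no packets; it only selects a route,
        # which reveals the local address the OS would use.
        sock.connect((probe_host, probe_port))
        return sock.getsockname()[0]
    except OSError:
        # No usable route (for example, offline): fall back to loopback.
        return "127.0.0.1"
    finally:
        sock.close()


def server_url(ip: str, port: int = 8000) -> str:
    """Format the URL to enter in the mobile app (hypothetical helper)."""
    return f"http://{ip}:{port}"


if __name__ == "__main__":
    print(server_url(get_local_ip()))
```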

Internet via ngrok

  • Access from anywhere in the world
  • Automatic tunnel setup
  • Handles firewall/NAT issues

🚀 Quick Start

1. Start the API Server

# API server only
poetry run voice-transcriber-api --host 0.0.0.0 --port 8000

# Combined (desktop + API)
poetry run voice-transcriber-combined --api-host 0.0.0.0 --api-port 8000

# With ngrok for internet access
./scripts/start-with-ngrok.sh

2. Setup Mobile App

cd mobile-app
npm install
npm start

3. Connect and Use

  1. Install Expo Go on your phone
  2. Scan the QR code shown in the terminal
  3. Enter your server URL in the app
  4. Test the connection and start recording

📋 Features

Mobile App Features

  • 🎤 Large microphone button with tap-and-hold recording
  • 📊 Real-time recording duration and processing status
  • 🔄 Automatic connection status detection
  • ⚙️ Simple server URL configuration
  • 📝 Clear transcription results display
  • 🎨 Professional Material Design UI

Server Features

  • 🔌 RESTful API with standard HTTP endpoints
  • 🔄 Multipart file upload handling
  • 🧠 Integration with existing Whisper transcription
  • 📊 Health and configuration monitoring
  • 🌐 CORS support for web/mobile access
  • 🔧 Configurable model sizes and network settings
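A sketch of how an uploaded recording could be handed to a file-based transcription engine while still guaranteeing that the temporary file is removed. The `transcribe` callable stands in for the existing Whisper integration, and the `{"text": ...}` response shape is an assumption; the PR only says the server returns JSON with the transcribed text.

```python
import os
import tempfile
from typing import Callable


def transcribe_upload(
    audio_bytes: bytes,
    transcribe: Callable[[str], str],
    suffix: str = ".wav",
) -> dict:
    """Write the upload to a temp file, transcribe it, and always clean up."""
    fd, path = tempfile.mkstemp(suffix=suffix)
    try:
        with os.fdopen(fd, "wb") as handle:
            handle.write(audio_bytes)
        # `transcribe` receives a file path and returns the recognized text.
        return {"text": transcribe(path)}
    finally:
        # Runs on success and on error, so failed requests leave no files behind.
        if os.path.exists(path):
            os.remove(path)
```

Doing the cleanup in `finally` rather than after the call means a transcription error cannot leak recordings onto disk.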

Network Features

  • 🏠 Local WiFi network support
  • 🌍 Internet access via ngrok tunneling
  • 🔒 Local transcription (audio is processed on your own machine, not by a cloud service)
  • ⚡ Direct connection for optimal performance

📁 Files Added

Core Components

  • voice_transcriber/api_server.py - Main Flask API server
  • voice_transcriber/combined_service.py - Combined desktop+API service
  • mobile-app/ - Complete React Native mobile app

Helper Scripts

  • scripts/start-with-ngrok.sh - Easy ngrok tunnel setup
  • scripts/get-local-ip.py - Network configuration helper
  • test_api.py - API endpoint testing
  • demo.py - Complete demo and setup script

Documentation

  • MOBILE_SETUP.md - Comprehensive setup guide
  • MOBILE_APP_SUMMARY.md - Implementation summary
  • Updated README.md with mobile app information

🧪 Testing

The implementation includes comprehensive testing:

  • ✅ API server startup and health checks
  • ✅ Audio upload and transcription processing
  • ✅ Error handling and edge cases
  • ✅ Network connectivity options
  • ✅ Mobile app integration

🔧 Technical Details

API Endpoints

  • GET /health - Server health check
  • GET /status - Configuration and model information
  • POST /transcribe - Audio file upload and transcription
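The two GET endpoints can be probed from any machine with a few lines of standard-library Python. This sketch assumes both endpoints return JSON bodies, which the PR does not state explicitly, and the helper names are illustrative rather than taken from `test_api.py`.

```python
import json
import urllib.request


def endpoint_url(base_url: str, path: str) -> str:
    """Join a server base URL and an endpoint path without doubling slashes."""
    return base_url.rstrip("/") + "/" + path.lstrip("/")


def get_json(base_url: str, path: str, timeout: float = 5.0) -> dict:
    """GET an endpoint and decode its body as JSON (assumed response format)."""
    with urllib.request.urlopen(endpoint_url(base_url, path), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def check_server(base_url: str) -> tuple:
    """Fetch /health and /status; raises urllib.error.URLError if unreachable."""
    return get_json(base_url, "/health"), get_json(base_url, "/status")
```

Calling `check_server("http://<your-server>:8000")` against a running server would return the two decoded responses; their exact fields depend on the server implementation.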

Audio Processing

  1. Mobile app records WAV audio (16 kHz, mono)
  2. Uploads via multipart/form-data
  3. Server processes with existing Whisper engine
  4. Returns JSON with transcribed text
  5. Automatic cleanup of temporary files
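The client side of the steps above (a 16 kHz mono WAV sent as multipart/form-data to `/transcribe`) can be sketched in Python for testing without a phone. The form field name `audio`, the file name, and the 16-bit sample width are assumptions; check the server code for the names it actually expects.

```python
import io
import urllib.request
import uuid
import wave


def make_silent_wav(seconds: float = 1.0, rate: int = 16000) -> bytes:
    """Create a 16-bit mono WAV of silence at the given sample rate."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as writer:
        writer.setnchannels(1)   # mono
        writer.setsampwidth(2)   # 16-bit samples (assumed)
        writer.setframerate(rate)
        writer.writeframes(b"\x00\x00" * int(seconds * rate))
    return buf.getvalue()


def encode_multipart(field: str, filename: str, payload: bytes) -> tuple:
    """Encode one file as a multipart/form-data body; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: audio/wav\r\n\r\n"
    ).encode("ascii")
    tail = f"\r\n--{boundary}--\r\n".encode("ascii")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def build_upload_request(base_url: str, wav_bytes: bytes, field: str = "audio"):
    """Build (but do not send) the POST /transcribe request."""
    body, content_type = encode_multipart(field, "recording.wav", wav_bytes)
    return urllib.request.Request(
        base_url.rstrip("/") + "/transcribe",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Passing the returned request to `urllib.request.urlopen` would send it; the response should be the JSON described in step 4.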

Dependencies Added

  • flask ^3.0.0 - Web framework for API
  • flask-cors ^4.0.0 - CORS support for mobile access

📖 Documentation

Complete setup guides and documentation are included:

  • Step-by-step setup instructions
  • Troubleshooting guides
  • Network configuration help
  • API reference documentation
  • Development and customization guides

🎉 Result

This PR delivers a complete mobile app companion that provides:

  • Large microphone button on your phone
  • Voice recording transmission to your desktop service
  • Works over wireless (same WiFi network)
  • Works over internet (via ngrok domain)
  • Professional, easy-to-use interface
  • Seamless integration with existing Voice Copilot service

The mobile app successfully bridges the gap between your phone and desktop, allowing you to use Voice Copilot from anywhere!

- Created Flask API server with /health, /status, and /transcribe endpoints
- Built React Native mobile app with large microphone button and recording
- Added combined service to run desktop hotkeys + API server together
- Implemented local WiFi and internet (ngrok) connectivity options
- Added helper scripts for network setup and ngrok integration
- Created comprehensive setup guides and documentation
- Added API testing and demo scripts
- Updated dependencies and project configuration

Features:
- Large, intuitive microphone button for easy recording
- Real-time recording feedback and connection status
- Server URL configuration and connection testing
- Professional Material Design UI
- Cross-platform support (iOS/Android via Expo)
- Local network and internet access via ngrok
- Integration with existing Whisper transcription engine
- Complete documentation and setup guides