This project demonstrates how to implement a real-time language tutoring experience using the firebase_ai SDK for Flutter, leveraging the multimodal (text and voice) capabilities of Google's Gemini models via Firebase Vertex AI. Users can chat via text, speak to the AI tutor, and receive responses in both text and audio formats.
- 💬 Real-time Text Chat: Send text messages to the AI tutor and receive instant responses.
- 🎤 Voice Input (Speech-to-Text): Press and hold the microphone button to speak directly to the tutor. Your speech is captured and sent to Gemini.
- 🗣️ Voice Output (Text-to-Speech): Enable audio response mode to hear the tutor's replies, aiding pronunciation and listening comprehension.
- 🔄 Switchable Output Modality: Easily toggle between receiving responses as text or as audio.
- 🤖 Custom AI Tutor Persona: Gemini is instructed (via a system prompt) to act as a patient and encouraging language teacher.
- 🔥 Powered by Gemini & Firebase AI: Utilizes the gemini-2.0-flash-live-preview-04-09 model (or a similar live-compatible model) through FirebaseAI.vertexAI() for low-latency, bidirectional communication (see the configuration sketch after this list).
- 🧱 Modular Architecture: Code is organized into services (GeminiAiService, ChatService) for better maintainability and clarity.
- 🔊 Seamless Audio Integration: Uses the flutter_sound package for audio recording and playback.
- ✨ Reactive State Management: Extensive use of Streams for handling asynchronous data flow.
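As referenced above, the tutor persona and the output modality are both set when the live model is created. A hedged sketch of what that configuration might look like with firebase_ai (the parameter and type names used here, such as LiveGenerativeModel, systemInstruction, liveGenerationConfig, and ResponseModalities, may differ between package versions):

```dart
import 'package:firebase_ai/firebase_ai.dart';

/// Sketch: build a live model with a tutor persona and a chosen output
/// modality. Names are assumptions based on the firebase_ai package and
/// may differ between versions.
LiveGenerativeModel buildTutorModel({required bool audioReplies}) {
  return FirebaseAI.vertexAI().liveGenerativeModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    systemInstruction: Content.text(
      'You are a patient and encouraging language teacher. '
      'Correct mistakes gently and keep your answers short.',
    ),
    liveGenerationConfig: LiveGenerationConfig(
      responseModalities: [
        audioReplies ? ResponseModalities.audio : ResponseModalities.text,
      ],
    ),
  );
}
```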
- Flutter (v3.27.3): UI toolkit for building natively compiled applications.
- Dart (v3.7): Programming language for Flutter.
- Firebase AI SDK (firebase_ai): For interacting with Google's Gemini models, specifically via FirebaseAI.vertexAI().liveGenerativeModel().
- Google Gemini: Advanced multimodal AI model (gemini-2.0-flash-live-preview-04-09 or similar).
- Firebase Vertex AI: Platform for deploying and serving ML models at scale.
- flutter_sound: For audio recording and playback.
- permission_handler: For requesting microphone permissions (see the audio capture sketch below).
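For the audio path, the app asks for microphone access with permission_handler before recording with flutter_sound. A simplified sketch (the real app streams PCM audio to Gemini rather than recording to a file, and the flutter_sound call signatures here are assumptions that vary by package version):

```dart
import 'package:flutter_sound/flutter_sound.dart';
import 'package:permission_handler/permission_handler.dart';

/// Sketch: request microphone access, then record a short clip.
/// Simplified: records to a file instead of streaming to Gemini.
Future<String?> recordClip() async {
  final status = await Permission.microphone.request();
  if (!status.isGranted) return null; // user denied microphone access

  final recorder = FlutterSoundRecorder();
  await recorder.openRecorder();
  await recorder.startRecorder(toFile: 'tutor_input.aac', codec: Codec.aacADTS);

  // ...record while the mic button is held; fixed delay used here...
  await Future<void>.delayed(const Duration(seconds: 3));

  final path = await recorder.stopRecorder();
  await recorder.closeRecorder();
  return path;
}
```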
- Flutter SDK: Ensure you have Flutter installed (or use FVM).
- Firebase Project: Create a project in the Firebase Console and enable the Vertex AI API.
- Firebase CLI:
- How to install the Firebase CLI:
$ npm install -g firebase-tools
- Log in:
$ firebase login
- FlutterFire CLI:
- How to install the FlutterFire CLI:
$ dart pub global activate flutterfire_cli
- Clone the Repository:
$ git clone https://github.com/alfredobs97/gemini_talk.git
$ cd gemini_talk
- Configure Firebase for your Flutter App: From the root of your cloned Flutter project, run:
$ flutterfire configure
- Select your Firebase project created earlier.
- Choose the platforms you want to configure (android, ios, web, etc.). This will generate the lib/firebase_options.dart file with your project's configuration.
- Get Flutter Dependencies:
$ flutter pub get
Once setup is complete, ensure you have an emulator running or a device connected, then run the app:
$ flutter run
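For reference, the generated lib/firebase_options.dart is consumed when Firebase is initialized at app startup. A minimal sketch of the standard FlutterFire pattern (the repo's actual main.dart may differ; MyApp here is just a placeholder root widget):

```dart
import 'package:firebase_core/firebase_core.dart';
import 'package:flutter/material.dart';

import 'firebase_options.dart'; // generated by `flutterfire configure`

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  // Initialize Firebase with the platform-specific options generated above.
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );
  runApp(const MyApp());
}

// Placeholder root widget so the sketch is self-contained.
class MyApp extends StatelessWidget {
  const MyApp({super.key});

  @override
  Widget build(BuildContext context) => const MaterialApp(home: Scaffold());
}
```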
UI layer: Captures user input (text or voice, using flutter_sound). Displays chat messages and the speaker animation. Orchestrates interactions with GeminiAiService and ChatService. Manages UI state (e.g., audio mode on/off, recording).
GeminiAiService: Establishes and manages the LiveSession with the Gemini model via FirebaseAI.vertexAI().liveGenerativeModel(). Sends user text or the audio stream (InlineDataPart) to Gemini. Receives the stream of responses from Gemini (LiveServerMessage). Transforms responses into AiEvents (TextMessage, BytesMessage, TurnCompleted).
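A condensed, hypothetical sketch of such a service. The Live API names used here (connect, send, receive, LiveServerContent) are assumptions based on the firebase_ai package and may differ between versions; the real service also maps audio bytes and turn completion into AiEvents:

```dart
import 'package:firebase_ai/firebase_ai.dart';

/// Hypothetical, simplified sketch of the live-session wrapper.
/// LiveSession method names and message types are assumptions and
/// may differ between firebase_ai versions.
class GeminiAiService {
  GeminiAiService(this._model);

  final LiveGenerativeModel _model;
  LiveSession? _session;

  Future<void> connect() async {
    _session = await _model.connect();
  }

  /// Sends one complete user text turn to Gemini.
  Future<void> sendText(String text) =>
      _session!.send(input: Content.text(text), turnComplete: true);

  /// Yields the text chunks of the model's replies.
  /// (The real service also emits audio bytes and turn-completion events.)
  Stream<String> textChunks() async* {
    await for (final response in _session!.receive()) {
      final message = response.message;
      if (message is LiveServerContent) {
        for (final part in message.modelTurn?.parts ?? const <Part>[]) {
          if (part is TextPart) yield part.text;
        }
      }
    }
  }
}
```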
AiEvent: Defines a clear API for the types of responses the AI can generate, making them easier to handle in the UI.
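For illustration, those event types could be modeled as a small sealed hierarchy (the actual definitions live in the repo and may differ):

```dart
import 'dart:typed_data';

/// Illustrative sketch of the AI event types named above.
sealed class AiEvent {
  const AiEvent();
}

/// A chunk of text produced by the model during the current turn.
class TextMessage extends AiEvent {
  const TextMessage(this.text);
  final String text;
}

/// A chunk of audio (e.g. PCM bytes) produced by the model.
class BytesMessage extends AiEvent {
  const BytesMessage(this.bytes);
  final Uint8List bytes;
}

/// Signals that the model has finished its current turn.
class TurnCompleted extends AiEvent {
  const TurnCompleted();
}
```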
ChatService: Manages the list of chat messages. Assembles text chunks (TextMessage) received from Gemini into complete messages, using TurnCompleted to finalize each message.
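A hypothetical sketch of that chunk-assembly logic:

```dart
import 'dart:async';

/// Hypothetical sketch of a chat service that assembles streamed text
/// chunks into complete messages and exposes them reactively.
class ChatService {
  final List<String> messages = [];
  final StringBuffer _pending = StringBuffer();
  final _messagesController = StreamController<List<String>>.broadcast();

  /// Emits the full message list whenever a message is finalized.
  Stream<List<String>> get messagesStream => _messagesController.stream;

  /// Appends a text chunk (TextMessage) to the message being built.
  void addChunk(String chunk) => _pending.write(chunk);

  /// Finalizes the in-progress message when a TurnCompleted event arrives.
  void completeTurn() {
    if (_pending.isEmpty) return;
    messages.add(_pending.toString());
    _pending.clear();
    _messagesController.add(List.unmodifiable(messages));
  }

  void dispose() => _messagesController.close();
}
```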
User Input: UI -> GeminiAiService -> Gemini.
AI Response: Gemini -> GeminiAiService (processes to AiEvent) -> UI (reacts to AiEvent, updates ChatService, plays audio).
Everything is handled reactively using Streams.
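Putting it together, the UI side could subscribe to the event stream roughly like this (a sketch reusing the hypothetical AiEvent and ChatService types from above):

```dart
import 'dart:typed_data';

// Sketch of the UI-side subscription, reusing the hypothetical AiEvent
// and ChatService types from the sketches above.
void listenToAi(Stream<AiEvent> events, ChatService chat) {
  events.listen((event) {
    switch (event) {
      case TextMessage(:final text):
        chat.addChunk(text); // accumulate partial text for the current turn
      case BytesMessage(:final bytes):
        playAudioChunk(bytes); // e.g. feed bytes to a flutter_sound player
      case TurnCompleted():
        chat.completeTurn(); // finalize the tutor's message in the chat list
    }
  });
}

// Placeholder for the audio playback path (flutter_sound in the real app).
void playAudioChunk(Uint8List bytes) {
  // Intentionally left as a stub in this sketch.
}
```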
Contributions are welcome! If you have ideas for improving the app or find a bug, please open an issue or submit a pull request.
Thanks to the Google Developer Relations team for the #AISprint initiative and for the support via GCP credits to run the project.