This project demonstrates how to implement a real-time language tutoring experience using the firebase_ai SDK for Flutter, leveraging the multimodal (text and voice) capabilities of Google's Gemini models via Firebase Vertex AI. Users can chat via text, speak to the AI tutor, and receive responses in both text and audio formats.
- 💬 Real-time Text Chat: Send text messages to the AI tutor and receive instant responses.
- 🎤 Voice Input (Speech-to-Text): Press and hold the microphone button to speak directly to the tutor. Your speech is captured and sent to Gemini.
- 🗣️ Voice Output (Text-to-Speech): Enable audio response mode to hear the tutor's replies, aiding pronunciation and listening comprehension.
- 🔄 Switchable Output Modality: Easily toggle between receiving responses as text or as audio.
- 🤖 Custom AI Tutor Persona: Gemini is instructed (via a system prompt) to act as a patient and encouraging language teacher.
- 🔥 Powered by Gemini & Firebase AI: Utilizes the gemini-2.0-flash-live-preview-04-09 model (or a similar live-compatible model) through FirebaseAI.vertexAI() for low-latency, bidirectional communication (see the configuration sketch after this list).
- 🧱 Modular Architecture: Code is organized into services (GeminiAiService, ChatService) for better maintainability and clarity.
- 🔊 Seamless Audio Integration: Uses the flutter_sound package for audio recording and playback.
- ✨ Reactive State Management: Extensive use of Streams for handling asynchronous data flow.
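As referenced above, the tutor persona and the output modality are both set when the live model is created. A hedged sketch of what that configuration might look like with firebase_ai (the parameter and type names used here, such as LiveGenerativeModel, systemInstruction, liveGenerationConfig, and ResponseModalities, may differ between package versions):

```dart
import 'package:firebase_ai/firebase_ai.dart';

/// Sketch: build a live model with a tutor persona and a chosen output
/// modality. Names are assumptions based on the firebase_ai package and
/// may differ between versions.
LiveGenerativeModel buildTutorModel({required bool audioReplies}) {
  return FirebaseAI.vertexAI().liveGenerativeModel(
    model: 'gemini-2.0-flash-live-preview-04-09',
    systemInstruction: Content.text(
      'You are a patient and encouraging language teacher. '
      'Correct mistakes gently and keep your answers short.',
    ),
    liveGenerationConfig: LiveGenerationConfig(
      responseModalities: [
        audioReplies ? ResponseModalities.audio : ResponseModalities.text,
      ],
    ),
  );
}
```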
- Flutter (v3.27.3): UI toolkit for building natively compiled applications.
- Dart (v3.7): Programming language for Flutter.
- Firebase AI SDK (firebase_ai): For interacting with Google's Gemini models, specifically via FirebaseAI.vertexAI().liveGenerativeModel().
- Google Gemini: Advanced multimodal AI model (gemini-2.0-flash-live-preview-04-09 or similar).
- Firebase Vertex AI: Platform for deploying and serving ML models at scale.
- flutter_sound: For audio recording and playback.
- permission_handler: For requesting microphone permissions (see the audio capture sketch below).
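For the audio path, the app asks for microphone access with permission_handler before recording with flutter_sound. A simplified sketch (the real app streams PCM audio to Gemini rather than recording to a file, and the flutter_sound call signatures here are assumptions that vary by package version):

```dart
import 'package:flutter_sound/flutter_sound.dart';
import 'package:permission_handler/permission_handler.dart';

/// Sketch: request microphone access, then record a short clip.
/// Simplified: records to a file instead of streaming to Gemini.
Future<String?> recordClip() async {
  final status = await Permission.microphone.request();
  if (!status.isGranted) return null; // user denied microphone access

  final recorder = FlutterSoundRecorder();
  await recorder.openRecorder();
  await recorder.startRecorder(toFile: 'tutor_input.aac', codec: Codec.aacADTS);

  // ...record while the mic button is held; fixed delay used here...
  await Future<void>.delayed(const Duration(seconds: 3));

  final path = await recorder.stopRecorder();
  await recorder.closeRecorder();
  return path;
}
```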
- Flutter SDK: Ensure you have Flutter installed (or use FVM).
- Firebase Project: Create a project in the Firebase Console and enable the Vertex AI API.
- Firebase CLI:
- How to install the Firebase CLI:
$ npm install -g firebase-tools
- Log in:
$ firebase login
- FlutterFire CLI:
- How to install the FlutterFire CLI:
$ dart pub global activate flutterfire_cli
- Clone the Repository:
$ git clone https://github.com/alfredobs97/gemini_talk.git
$ cd gemini_talk
- Configure Firebase for your Flutter App: From the root of your cloned Flutter project, run:
$ flutterfire configure
- Select your Firebase project created earlier.
- Choose the platforms you want to configure (android, ios, web, etc.). This will generate the lib/firebase_options.dart file with your project's configuration.
- Get Flutter Dependencies:
$ flutter pub get
Once setup is complete, ensure you have an emulator running or a device connected, then run the app:
$ flutter run
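For reference, the generated lib/firebase_options.dart is consumed when Firebase is initialized at app startup. A minimal sketch of the standard FlutterFire pattern (the repo's actual main.dart may differ; MyApp here is just a placeholder root widget):

```dart
import 'package:firebase_core/firebase_core.dart';
import 'package:flutter/material.dart';

import 'firebase_options.dart'; // generated by `flutterfire configure`

Future<void> main() async {
  WidgetsFlutterBinding.ensureInitialized();
  // Initialize Firebase with the platform-specific options generated above.
  await Firebase.initializeApp(
    options: DefaultFirebaseOptions.currentPlatform,
  );
  runApp(const MyApp());
}

// Placeholder root widget so the sketch is self-contained.
class MyApp extends StatelessWidget {
  const MyApp({super.key});

  @override
  Widget build(BuildContext context) => const MaterialApp(home: Scaffold());
}
```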
UI layer: Captures user input (text or voice, using flutter_sound). Displays chat messages and the speaker animation. Orchestrates interactions with GeminiAiService and ChatService. Manages UI state (e.g., audio mode on/off, recording).
GeminiAiService: Establishes and manages the LiveSession with the Gemini model via FirebaseAI.vertexAI().liveGenerativeModel(). Sends user text or the audio stream (InlineDataPart) to Gemini. Receives the stream of responses from Gemini (LiveServerMessage). Transforms responses into AiEvents (TextMessage, BytesMessage, TurnCompleted).
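A condensed, hypothetical sketch of such a service. The Live API names used here (connect, send, receive, LiveServerContent) are assumptions based on the firebase_ai package and may differ between versions; the real service also maps audio bytes and turn completion into AiEvents:

```dart
import 'package:firebase_ai/firebase_ai.dart';

/// Hypothetical, simplified sketch of the live-session wrapper.
/// LiveSession method names and message types are assumptions and
/// may differ between firebase_ai versions.
class GeminiAiService {
  GeminiAiService(this._model);

  final LiveGenerativeModel _model;
  LiveSession? _session;

  Future<void> connect() async {
    _session = await _model.connect();
  }

  /// Sends one complete user text turn to Gemini.
  Future<void> sendText(String text) =>
      _session!.send(input: Content.text(text), turnComplete: true);

  /// Yields the text chunks of the model's replies.
  /// (The real service also emits audio bytes and turn-completion events.)
  Stream<String> textChunks() async* {
    await for (final response in _session!.receive()) {
      final message = response.message;
      if (message is LiveServerContent) {
        for (final part in message.modelTurn?.parts ?? const <Part>[]) {
          if (part is TextPart) yield part.text;
        }
      }
    }
  }
}
```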
AiEvent: Defines a clear API for the types of responses the AI can generate, making them easier to handle in the UI.
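For illustration, those event types could be modeled as a small sealed hierarchy (the actual definitions live in the repo and may differ):

```dart
import 'dart:typed_data';

/// Illustrative sketch of the AI event types named above.
sealed class AiEvent {
  const AiEvent();
}

/// A chunk of text produced by the model during the current turn.
class TextMessage extends AiEvent {
  const TextMessage(this.text);
  final String text;
}

/// A chunk of audio (e.g. PCM bytes) produced by the model.
class BytesMessage extends AiEvent {
  const BytesMessage(this.bytes);
  final Uint8List bytes;
}

/// Signals that the model has finished its current turn.
class TurnCompleted extends AiEvent {
  const TurnCompleted();
}
```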
ChatService: Manages the list of chat messages. Assembles text chunks (TextMessage) received from Gemini into complete messages, using TurnCompleted to finalize each message.
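A hypothetical sketch of that chunk-assembly logic:

```dart
import 'dart:async';

/// Hypothetical sketch of a chat service that assembles streamed text
/// chunks into complete messages and exposes them reactively.
class ChatService {
  final List<String> messages = [];
  final StringBuffer _pending = StringBuffer();
  final _messagesController = StreamController<List<String>>.broadcast();

  /// Emits the full message list whenever a message is finalized.
  Stream<List<String>> get messagesStream => _messagesController.stream;

  /// Appends a text chunk (TextMessage) to the message being built.
  void addChunk(String chunk) => _pending.write(chunk);

  /// Finalizes the in-progress message when a TurnCompleted event arrives.
  void completeTurn() {
    if (_pending.isEmpty) return;
    messages.add(_pending.toString());
    _pending.clear();
    _messagesController.add(List.unmodifiable(messages));
  }

  void dispose() => _messagesController.close();
}
```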
User Input: UI -> GeminiAiService -> Gemini.
AI Response: Gemini -> GeminiAiService (processes to AiEvent) -> UI (reacts to AiEvent, updates ChatService, plays audio).
Everything is handled reactively using Streams.
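Putting it together, the UI side could subscribe to the event stream roughly like this (a sketch reusing the hypothetical AiEvent and ChatService types from above):

```dart
import 'dart:typed_data';

// Sketch of the UI-side subscription, reusing the hypothetical AiEvent
// and ChatService types from the sketches above.
void listenToAi(Stream<AiEvent> events, ChatService chat) {
  events.listen((event) {
    switch (event) {
      case TextMessage(:final text):
        chat.addChunk(text); // accumulate partial text for the current turn
      case BytesMessage(:final bytes):
        playAudioChunk(bytes); // e.g. feed bytes to a flutter_sound player
      case TurnCompleted():
        chat.completeTurn(); // finalize the tutor's message in the chat list
    }
  });
}

// Placeholder for the audio playback path (flutter_sound in the real app).
void playAudioChunk(Uint8List bytes) {
  // Intentionally left as a stub in this sketch.
}
```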
Contributions are welcome! If you have ideas for improving the app or find a bug, please open an issue or submit a pull request.
Thanks to the Google Developer Relations team for the #AISprint initiative and for the support via GCP credits to run the project.