api-reference/server/services/s2s/gemini-live-vertex.mdx (3 changes: 2 additions & 1 deletion)
@@ -24,7 +24,7 @@ description: "A real-time, multimodal conversational AI service powered by Googl
 <Card
   title="Example Implementation"
   icon="play"
-  href="https://github.com/pipecat-ai/pipecat/blob/main/examples/realtime/realtime-gemini-live-vertex-function-calling.py"
+  href="https://github.com/pipecat-ai/pipecat/blob/main/examples/realtime/realtime-gemini-live-vertex.py"
 >
   Complete Gemini Live Vertex AI function calling example
 </Card>
@@ -252,4 +252,5 @@ llm = GeminiLiveVertexLLMService(
 - **Authentication priority**: The service tries credentials in this order: (1) `credentials` JSON string, (2) `credentials_path` file, (3) Application Default Credentials (ADC).
 - **File API not supported**: The Gemini File API is not available through Vertex AI. Use Google Cloud Storage for file handling instead.
 - **Model naming**: Vertex AI uses different model identifiers (e.g., `"google/gemini-live-2.5-flash-native-audio"`) compared to the Google AI variant.
+- **Async tool limitation**: Vertex AI's Gemini Live endpoint does not currently support NON_BLOCKING tool calls. Functions registered with `cancel_on_interruption=False` will log a one-time warning and fall back to synchronous behavior (the conversation pauses while the tool runs). Use `cancel_on_interruption=True` (the default) or use a non-realtime LLM service if your tool requires async semantics.
 - **All other features** (VAD, context compression, thinking, function calling, etc.) work identically to the base [Gemini Live](/api-reference/server/services/s2s/gemini-live) service.
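
For orientation, here is a minimal sketch of how the credential-priority, model-naming, and sync-only tool notes above fit together for the Vertex variant. The import paths, the `FunctionCallParams` handler signature, and the `get_weather` tool are illustrative assumptions and are not taken from this diff; treat it as a sketch rather than a drop-in snippet.

```python
# Illustrative sketch only: import paths, handler signature, and the tool
# itself are assumptions, not part of this change.
from pipecat.services.gemini_live.vertex import GeminiLiveVertexLLMService  # assumed path
from pipecat.services.llm_service import FunctionCallParams  # assumed path

llm = GeminiLiveVertexLLMService(
    # Credential resolution order: `credentials` JSON string, then
    # `credentials_path`, then Application Default Credentials (ADC).
    credentials_path="/path/to/service-account.json",
    # Vertex AI model identifiers differ from the Google AI variant.
    model="google/gemini-live-2.5-flash-native-audio",
)

async def get_weather(params: FunctionCallParams):
    # Hypothetical tool body: return the result through the provided callback.
    await params.result_callback({"conditions": "sunny", "temperature_f": 72})

# Vertex AI has no NON_BLOCKING tool support, so keep the default
# cancel_on_interruption=True; passing False only logs a one-time warning
# and falls back to blocking (synchronous) behavior.
llm.register_function("get_weather", get_weather, cancel_on_interruption=True)
```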
api-reference/server/services/s2s/gemini-live.mdx (5 changes: 3 additions & 2 deletions)
@@ -23,9 +23,9 @@ description: "A real-time, multimodal conversational AI service powered by Googl
 <Card
   title="Example Implementation"
   icon="play"
-  href="https://github.com/pipecat-ai/pipecat/blob/main/examples/realtime/realtime-gemini-live-function-calling.py"
+  href="https://github.com/pipecat-ai/pipecat/blob/main/examples/realtime/realtime-gemini-live-async-tool.py"
 >
-  Complete Gemini Live function calling example
+  Complete Gemini Live async tool calling example
 </Card>
 <Card
   title="Gemini Documentation"
@@ -321,6 +321,7 @@ llm = GeminiLiveLLMService(
 ## Notes

 - **Model support**: The service supports both Gemini 2.5 and Gemini 3.x models. The service automatically detects and handles model-specific behavior.
+- **Async tool support**: Functions registered with `cancel_on_interruption=False` use Gemini's NON_BLOCKING tool mechanism on models that support it (currently Gemini 2.x), allowing the conversation to continue while the tool runs in the background. The result is delivered via the async-tool mechanism and integrated into the model's next turn. On models that don't support NON_BLOCKING (Gemini 3.x), the service logs a one-time warning explaining the limitation. Note: An intermittent 1008 error can occasionally occur on Gemini 2.5 during long-running tool calls; the service auto-reconnects when this happens.
 - **System instruction precedence**: The `system_instruction` from service settings takes precedence over an initial system message in the LLM context. A warning is logged when both are set.
 - **VAD modes**: By default, Gemini Live uses server-side VAD for detecting when the user starts and stops speaking. To use local VAD (e.g., Silero), set `vad=GeminiVADParams(disabled=True)` and configure an external VAD analyzer in your `LLMUserAggregatorParams`. The service will automatically send activity signals to the Gemini API when local VAD detects speech.
 - **Tools precedence**: Similarly, tools provided in the context override tools provided at init time.
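
As a companion to the new async-tool note just above, here is a minimal sketch of registering a non-blocking tool with the base Gemini Live service. The import paths, constructor arguments, model identifier, and the `lookup_order` tool are illustrative assumptions, not taken from this diff.

```python
# Illustrative sketch only: import paths, constructor arguments, and the
# model identifier are assumptions, not part of this change.
import asyncio

from pipecat.services.gemini_live.gemini import GeminiLiveLLMService  # assumed path
from pipecat.services.llm_service import FunctionCallParams  # assumed path

llm = GeminiLiveLLMService(
    api_key="YOUR_GOOGLE_API_KEY",
    # NON_BLOCKING tools currently require a Gemini 2.x model; on Gemini 3.x
    # the service logs a one-time warning and the call blocks instead.
    model="gemini-live-2.5-flash-native-audio",  # assumed model id
)

async def lookup_order(params: FunctionCallParams):
    # Stand-in for a slow backend call. With cancel_on_interruption=False the
    # conversation keeps going while this runs, and the result is folded into
    # the model's next turn when it arrives.
    await asyncio.sleep(5)
    order_id = params.arguments.get("order_id")
    await params.result_callback({"order_id": order_id, "status": "shipped"})

# cancel_on_interruption=False opts this function into Gemini's NON_BLOCKING
# (async) tool mechanism on models that support it.
llm.register_function("lookup_order", lookup_order, cancel_on_interruption=False)
```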