2 changes: 1 addition & 1 deletion api-reference/server/services/tts/elevenlabs.mdx
@@ -245,7 +245,7 @@ async with aiohttp.ClientSession() as session:
- **Multilingual models required for `language`**: Setting `language` with a non-multilingual model (e.g. `eleven_turbo_v2_5`) has no effect. Use `eleven_multilingual_v2` or similar.
- **WebSocket vs HTTP**: The WebSocket service supports word-level timestamps and interruption handling, making it significantly better for interactive conversations. The HTTP service is simpler but lacks these features.
- **Text aggregation**: Sentence aggregation is enabled by default (`text_aggregation_mode=TextAggregationMode.SENTENCE`). Buffering until sentence boundaries produces more natural speech. Set `text_aggregation_mode=TextAggregationMode.TOKEN` to stream tokens directly for lower latency. The `auto_mode` parameter is automatically configured based on the aggregation mode for optimal quality.
-- **Word timestamp accuracy**: Word timestamps accurately reflect the spoken audio, not just the input text. When using pronunciation dictionaries or text normalization (`apply_text_normalization`), the service consumes ElevenLabs' normalized alignment data to ensure downstream consumers (captions, transcripts, context aggregation) match what the listener actually hears.
+- **Word timestamp accuracy**: Word timestamps reflect the original input text by default, preserving non-Latin scripts in transcripts and LLM context. When pronunciation dictionaries are configured via `pronunciation_dictionary_locators`, the service switches to ElevenLabs' normalized alignment to avoid duplicate words caused by dictionary substitutions. Text normalization (`apply_text_normalization`) does not affect which alignment field is used.
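
The alignment selection described in the updated bullet can be sketched roughly as follows. This is an illustration only, not the service's actual code: `select_alignment` is a hypothetical helper, and the `alignment` / `normalizedAlignment` keys are assumed to match the fields of an ElevenLabs WebSocket audio message.

```python
from typing import Optional, Sequence


def select_alignment(
    message: dict,
    pronunciation_dictionary_locators: Optional[Sequence] = None,
) -> Optional[dict]:
    """Pick the alignment payload used to build word timestamps.

    Hypothetical helper: the key names below are assumptions about the
    message shape, not a confirmed part of the service's implementation.
    """
    # Only configured pronunciation dictionaries switch to the normalized
    # alignment; apply_text_normalization plays no role in the choice.
    use_normalized = bool(pronunciation_dictionary_locators)
    key = "normalizedAlignment" if use_normalized else "alignment"
    return message.get(key)
```

With no locators (or an empty list) the helper returns the original-text alignment, which is the default behaviour the bullet describes.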

## Event Handlers
