diff --git a/README.md b/README.md
index 4e2b1ca..cd4ffdf 100644
--- a/README.md
+++ b/README.md
@@ -176,6 +176,8 @@ Binary audio messages contain timestamps in the server's time domain indicating
 - When a client cannot maintain sync (e.g., buffer underrun), it should send `state: 'error'` via [`client/state`](#client--server-clientstate), mute its audio output, and continue buffering until it can resume synchronized playback, at which point it should send `state: 'synchronized'`
 - The server is unaware of individual client synchronization accuracy - it simply broadcasts timestamped audio
 - The server sends audio to late-joining clients with future timestamps only, allowing them to buffer and start playback in sync with existing clients
+- After sending [`stream/start`](#server--client-streamstart) or [`stream/clear`](#server--client-streamclear) messages, servers should schedule the first audio timestamp far enough in the future so clients can receive and queue initial chunks without missing playback start (see [`required_lead_time_ms`](#client--server-clientstate-player-object))
+- For live streams, servers may need to delay playback to build and maintain players' [`min_buffer_ms`](#client--server-clientstate-player-object) targets
 - Audio chunks may arrive with timestamps in the past due to network delays or buffering; clients should drop these late chunks to maintain sync
 - Clients subtract their [`static_delay_ms`](#client--server-clientstate-player-object) from server timestamps before scheduling playback
 - Servers factor in each client's `static_delay_ms` when calculating how far ahead to send audio, keeping effective buffer headroom constant
@@ -455,6 +457,19 @@ The `player@v1_support` object in [`client/hello`](#client--server-clienthello)

 **Note:** Servers must support all audio codecs: 'opus', 'flac', and 'pcm'.
+**Note:** [`required_lead_time_ms`](#client--server-clientstate-player-object) and [`min_buffer_ms`](#client--server-clientstate-player-object) are reported via [`client/state`](#client--server-clientstate-player-object). Players should report the lowest values that reliably prevent buffer underruns and start-of-stream truncation under expected conditions, to ensure the lowest possible latency for real-time applications. Both should factor in expected network delay/jitter (small on LAN/Wi-Fi, larger for remote or high-latency clients). Do not include `static_delay_ms` in these values; the server applies `static_delay_ms` separately when calculating send-ahead.
+
+**Server behavior:**
+- For startup/restart timing, compute per-player send-ahead using `required_lead_time_ms + static_delay_ms`.
+- For grouped startup/restart, use a common send-ahead of `max(required_lead_time_ms + static_delay_ms)` across grouped players.
+- For ongoing playback timing, compute per-player send-ahead using `min_buffer_ms + static_delay_ms`.
+- For live streams or other real-time content with grouped playback, use a common ongoing send-ahead of `max(min_buffer_ms + static_delay_ms)` across grouped players. Recompute when players join, leave, or update their timing parameters.
+- When the max `min_buffer_ms` decreases mid-stream (player leaves group, or updates timing), the server may keep the current send-ahead unchanged or reduce it toward the new max. The choice depends on the implementation and the priorities of the server.
+- Especially for live streams, servers must keep each player's ongoing buffer duration at or above its `min_buffer_ms`, capped by the maximum buffer size advertised in `buffer_capacity`. If `min_buffer_ms` worth of audio exceeds `buffer_capacity`, `buffer_capacity` takes precedence; players must size `buffer_capacity` to fit their own `min_buffer_ms`.
+- For buffered streams, prefer filling each player's queue near `buffer_capacity` to maximize stability.
+- `buffer_capacity` is a hard per-player byte limit; servers should not send data that would cause a player's queued compressed audio to exceed this limit.
+- Servers may rate-limit, debounce, or coalesce a player's timing updates to prevent disruption from frequent or small changes.
+
 **PCM Encoding Convention:** For the `pcm` codec, samples are encoded as little-endian signed integers (two's complement). 24-bit samples are packed as 3 bytes per sample.

 ### Client → Server: `client/state` player object
@@ -469,10 +484,14 @@ State updates must be sent whenever any state changes, including when the volume
  - `volume?`: integer - range 0-100, must be included if 'volume' is in `supported_commands` from [`player@v1_support`](#client--server-clienthello-playerv1-support-object)
  - `muted?`: boolean - mute state, must be included if 'mute' is in `supported_commands` from [`player@v1_support`](#client--server-clienthello-playerv1-support-object)
  - `static_delay_ms`: integer - static delay in milliseconds (0-5000), always required for players
+ - `required_lead_time_ms`: integer - minimum startup lead time in milliseconds (e.g., codec init, decode warmup, audio backend buffering, DAC latency), always required for players. Measured from server transmit time of the start/restart trigger ([`stream/start`](#server--client-streamstart) or [`stream/clear`](#server--client-streamclear)) to the timestamp of the first subsequent audio chunk.
+ - `min_buffer_ms`: integer - requested minimum ongoing buffer duration in milliseconds during playback (primarily for live streams), used to absorb network jitter and ongoing decode/playback timing variance. Always required for players.
  - `supported_commands?`: string[] - subset of: 'set_static_delay'

 **Static delay:** The default is 0, meaning audio exits the device's audio port at the timestamp. `static_delay_ms` compensates for additional delay beyond the port (external speakers, amplifiers). Negative values are not supported and should never be required for any compliant implementation. Clients must persist `static_delay_ms` locally across reboots and server reconnections. Clients may update `static_delay_ms` and `supported_commands` when audio output changes (e.g., external speaker connected), persisting separate delays per output.

+**Timing parameters:** Clients may update `required_lead_time_ms` and `min_buffer_ms` at any time (e.g., after empirically measuring lead time post-warmup, or on link-type change). Servers must factor in updated values for subsequent playback timing. Clients should debounce updates locally, reporting changes only after a shift in conditions appears sustained, not on transient fluctuations.
+
 ### Client → Server: `stream/request-format` player object

 The `player` object in [`stream/request-format`](#client--server-streamrequest-format) has this structure:
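
The send-ahead rules added under **Server behavior** can be sketched as follows. This is a minimal illustration, not part of the specification: `PlayerTiming`, `startup_send_ahead_ms`, `ongoing_send_ahead_ms`, and `next_ongoing_send_ahead_ms` are hypothetical names, and the `buffer_capacity` byte limit is deliberately left out because converting bytes to milliseconds depends on the codec.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class PlayerTiming:
    """Hypothetical holder for the values a player reports via client/state (ms)."""
    static_delay_ms: int
    required_lead_time_ms: int
    min_buffer_ms: int


def startup_send_ahead_ms(players: Iterable[PlayerTiming]) -> int:
    """Common send-ahead for a grouped start/restart:
    max(required_lead_time_ms + static_delay_ms) across the group."""
    return max((p.required_lead_time_ms + p.static_delay_ms for p in players), default=0)


def ongoing_send_ahead_ms(players: Iterable[PlayerTiming]) -> int:
    """Common ongoing send-ahead for grouped live playback:
    max(min_buffer_ms + static_delay_ms) across the group."""
    return max((p.min_buffer_ms + p.static_delay_ms for p in players), default=0)


def next_ongoing_send_ahead_ms(
    current_ms: int, players: Iterable[PlayerTiming], shrink: bool = False
) -> int:
    """Recompute after a join, leave, or timing update.

    Increases always apply. Decreases apply only when `shrink` is True,
    since the text lets the server keep the current value or reduce it.
    """
    target = ongoing_send_ahead_ms(players)
    if target >= current_ms or shrink:
        return target
    return current_ms


# Two grouped players; the second has a large static delay (e.g., external amplifier).
group = [
    PlayerTiming(static_delay_ms=0, required_lead_time_ms=150, min_buffer_ms=200),
    PlayerTiming(static_delay_ms=250, required_lead_time_ms=80, min_buffer_ms=100),
]
startup = startup_send_ahead_ms(group)   # max(150 + 0, 80 + 250) = 330
ongoing = ongoing_send_ahead_ms(group)   # max(200 + 0, 100 + 250) = 350

# The second player leaves: the server may keep 350 or shrink toward 200.
kept = next_ongoing_send_ahead_ms(ongoing, group[:1])
shrunk = next_ongoing_send_ahead_ms(ongoing, group[:1], shrink=True)
```

Making the decrease opt-in through `shrink` mirrors the diff's statement that the choice depends on the implementation; a real server would likely reduce gradually instead of jumping to the new maximum.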