feat: add Lyrics role (LRC and CDG) #80
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -7,11 +7,12 @@ Sendspin is a multi-room music experience protocol. The goal of the protocol is | |
| ## Definitions | ||
|
|
||
| - **Sendspin Server** - orchestrates all devices, generates audio streams, manages players and clients, provides metadata | ||
| - **Sendspin Client** - a client that can play audio, visualize audio, display metadata, display colors, or provide music controls. Has different possible roles (player, metadata, controller, artwork, visualizer, color). Every client has a unique identifier | ||
| - **Sendspin Client** - a client that can play audio, visualize audio, display metadata, display colors, or provide music controls. Has different possible roles (player, metadata, controller, artwork, lyrics, visualizer, color). Every client has a unique identifier | ||
| - **Player** - receives audio and plays it in sync. Has its own volume and mute state and preferred format settings | ||
| - **Controller** - controls the Sendspin group this client is part of | ||
| - **Metadata** - displays text metadata (title, artist, album, etc.) | ||
| - **Artwork** - displays artwork images. Has preferred format for images | ||
| - **Lyrics** - displays track lyrics or karaoke subtitles. Has preferred format for lyrics data | ||
| - **Visualizer** - visualizes music. Has preferred format for audio features | ||
| - **Color** - receives colors derived from the current audio | ||
| - **Sendspin Group** - a group of clients. Each client belongs to exactly one group, and every group has at least one client. Every group has a unique identifier. Each group has the following states: list of member clients, volume, mute, and playback state | ||
|
|
@@ -21,7 +22,7 @@ Sendspin is a multi-room music experience protocol. The goal of the protocol is | |
|
|
||
| Roles define what capabilities and responsibilities a client has. All roles use explicit versioning with the `@` character: `<role>@<version>` (e.g., `player@v1`, `controller@v1`). | ||
|
|
||
| This specification defines the following roles: [`player`](#player-messages), [`controller`](#controller-messages), [`metadata`](#metadata-messages), [`artwork`](#artwork-messages), [`visualizer`](#visualizer-messages), [`color`](#color-messages). All servers must implement all versions of these roles described in this specification. | ||
| This specification defines the following roles: [`player`](#player-messages), [`controller`](#controller-messages), [`metadata`](#metadata-messages), [`artwork`](#artwork-messages), [`lyrics`](#lyrics-messages), [`visualizer`](#visualizer-messages), [`color`](#color-messages). All servers must implement all versions of these roles described in this specification. | ||
|
|
||
| All role names and versions not starting with `_` are reserved for future revisions of this specification. | ||
|
|
||
|
|
@@ -147,7 +148,7 @@ Binary message IDs typically use **bits 7-2** for role type and **bits 1-0** for | |
| - `000000xx` (0-3): Reserved for future use | ||
| - `000001xx` (4-7): Player role | ||
| - `000010xx` (8-11): Artwork role | ||
| - `000011xx` (12-15): Reserved for a future role | ||
| - `000011xx` (12-15): Lyrics role | ||
| - `00010xxx` (16-23): Visualizer role | ||
| - Roles 6-47 (IDs 24-191): Reserved for future roles | ||
| - Roles 48-63 (IDs 192-255): Available for use by [application-specific roles](#application-specific-roles) | ||
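As a rough illustration of the ID allocation listed above, the following sketch maps a binary message type byte to the role that owns it. The helper name `role_for_message_type` and the returned labels are made up for this example and are not part of the specification; the lyrics range reflects the allocation proposed in this PR.

```python
# Hypothetical lookup table: the ranges are copied from the allocation list
# above, the labels are illustrative only.
_RANGES = [
    (0, 3, "reserved"),
    (4, 7, "player"),
    (8, 11, "artwork"),
    (12, 15, "lyrics"),  # allocation proposed in this PR
    (16, 23, "visualizer"),
    (24, 191, "reserved"),
    (192, 255, "application-specific"),
]


def role_for_message_type(message_type: int) -> str:
    """Return the label of the range that contains the given message type byte."""
    if not 0 <= message_type <= 255:
        raise ValueError("message type must fit in one byte")
    for low, high, label in _RANGES:
        if low <= message_type <= high:
            return label
    raise AssertionError("unreachable: the ranges cover 0-255")
```

A client could use a lookup like this to drop binary messages that belong to roles it did not announce.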
|
|
@@ -217,6 +218,9 @@ sequenceDiagram | |
| alt Artwork role | ||
| Server->>Client: binary Types 8-11 (artwork channels 0-3) | ||
| end | ||
| alt Lyrics role | ||
| Server->>Client: binary Type 12 (lyrics data) | ||
| end | ||
| alt Visualizer role | ||
| Server->>Client: binary Type 16 (visualization data) | ||
| end | ||
|
|
@@ -275,10 +279,12 @@ Players that can output audio should have the role `player`. | |
| - `controller@v1` - controls the current Sendspin group | ||
| - `metadata@v1` - displays text metadata describing the currently playing audio | ||
| - `artwork@v1` - displays artwork images | ||
| - `lyrics@v1` - displays track lyrics or karaoke subtitles | ||
| - `visualizer@v1` - visualizes audio | ||
| - `color@v1` - receives colors derived from the current audio | ||
| - `player@v1_support?`: object - only if `player@v1` is listed ([see player@v1 support object details](#client--server-clienthello-playerv1-support-object)) | ||
| - `artwork@v1_support?`: object - only if `artwork@v1` is listed ([see artwork@v1 support object details](#client--server-clienthello-artworkv1-support-object)) | ||
| - `lyrics@v1_support?`: object - only if `lyrics@v1` is listed ([see lyrics@v1 support object details](#client--server-clienthello-lyricsv1-support-object)) | ||
| - `visualizer@v1_support?`: object - only if `visualizer@v1` is listed ([see visualizer@v1 support object details](#client--server-clienthello-visualizerv1-support-object)) | ||
|
|
||
| **Note:** Each role version may have its own support object (e.g., `player@v1_support`, `player@v2_support`). Application-specific roles or role versions follow the same pattern (e.g., `_myapp_display@v1_support`, `player@_experimental_support`). | ||
|
|
@@ -381,6 +387,7 @@ Starts a stream for one or more roles. If sent for a role that already has an ac | |
|
|
||
| - `player?`: object - only sent to clients with the `player` role ([see player object details](#server--client-streamstart-player-object)) | ||
| - `artwork?`: object - only sent to clients with the `artwork` role ([see artwork object details](#server--client-streamstart-artwork-object)) | ||
| - `lyrics?`: object - only sent to clients with the `lyrics` role ([see lyrics object details](#server--client-streamstart-lyrics-object)) | ||
| - `visualizer?`: object - only sent to clients with the `visualizer` role ([see visualizer object details](#server--client-streamstart-visualizer-object)) | ||
|
|
||
| [Application-specific roles](#application-specific-roles) may also include objects in this message (keys starting with `_`). | ||
|
|
@@ -399,6 +406,7 @@ Request different stream format (upgrade or downgrade). Available for clients wi | |
|
|
||
| - `player?`: object - only for clients with the `player` role ([see player object details](#client--server-streamrequest-format-player-object)) | ||
| - `artwork?`: object - only for clients with the `artwork` role ([see artwork object details](#client--server-streamrequest-format-artwork-object)) | ||
| - `lyrics?`: object - only for clients with the `lyrics` role ([see lyrics object details](#client--server-streamrequest-format-lyrics-object)) | ||
|
|
||
| [Application-specific roles](#application-specific-roles) may also include objects in this message (keys starting with `_`). | ||
|
|
||
|
|
@@ -410,7 +418,7 @@ Response: [`stream/start`](#server--client-streamstart) for the requested role(s | |
|
|
||
| Ends the stream for one or more roles. When received, clients should stop output and clear buffers for the specified roles. | ||
|
|
||
| - `roles?`: string[] - roles to end streams for ('player', 'artwork', 'visualizer'). If omitted, ends all active streams | ||
| - `roles?`: string[] - roles to end streams for ('player', 'artwork', 'lyrics', 'visualizer'). If omitted, ends all active streams | ||
|
|
||
| [Application-specific roles](#application-specific-roles) may also be included in this array (names starting with `_`). | ||
|
|
||
|
|
@@ -703,6 +711,54 @@ The timestamp indicates when this artwork should be displayed. Clients must tran | |
|
|
||
| **Clearing artwork:** To clear the currently displayed artwork on a specific channel, the server sends an empty binary message (only the message type byte and timestamp, with no image data) for that channel. | ||
|
|
||
| ## Lyrics messages | ||
| This section describes messages specific to clients with the `lyrics` role, which handle display of track lyrics and karaoke subtitles. Lyrics clients receive lyrics data in their preferred format delivered as binary messages. | ||
|
|
||
| **Supported formats:** | ||
| - `lrc` - LRC text format with embedded timestamps (UTF-8 encoded) | ||
| - `cdg` - CD+G karaoke format (binary) | ||
|
|
||
| **No transport timestamps:** Unlike audio or artwork, lyrics binary messages do not carry a server clock timestamp. Timing information is embedded within the lyrics data itself (e.g., `[mm:ss.xx]` tags in LRC, frame timing in CDG). The server sends lyrics at the start of each track; clients begin processing as soon as they receive them. | ||
|
Member
How will the lyrics stay in sync, then, if they're not aligned to any timestamp?
Author
The spec instructs clients how to calculate track progress. Can't we use that progress value to show the right lyrics? This is what I'm currently doing in my client (which downloads lyrics independently), and it's working well.
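The approach described in the reply above can be sketched as follows: parse the `[mm:ss.xx]` tags once, then pick the latest line whose timestamp is not after the current track progress. This is a minimal sketch and not the author's client code. The names `parse_lrc` and `line_at` and the millisecond `progress_ms` input are assumptions; how a client derives track progress is defined elsewhere in the spec.

```python
import bisect
import re

# Matches time tags such as [01:23.45]; metadata tags like [ar:Artist] do not match.
_TIME_TAG = re.compile(r"\[(\d+):(\d{1,2})(?:[.:](\d{1,3}))?\]")


def parse_lrc(text: str) -> list[tuple[int, str]]:
    """Return (timestamp_ms, lyric) pairs sorted by timestamp."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        stamps = []
        pos = 0
        # A line may carry several leading time tags that share one lyric.
        while (match := _TIME_TAG.match(line, pos)) is not None:
            minutes, seconds, fraction = match.groups()
            fraction_ms = int((fraction or "0").ljust(3, "0"))
            stamps.append(int(minutes) * 60_000 + int(seconds) * 1_000 + fraction_ms)
            pos = match.end()
        lyric = line[pos:].strip()
        entries.extend((stamp, lyric) for stamp in stamps)
    entries.sort(key=lambda entry: entry[0])
    return entries


def line_at(entries: list[tuple[int, str]], progress_ms: int):
    """Return the lyric active at progress_ms, or None before the first tag."""
    times = [stamp for stamp, _ in entries]
    index = bisect.bisect_right(times, progress_ms) - 1
    return entries[index][1] if index >= 0 else None
```

A display loop would call `line_at` with the latest progress value on every refresh, so no per-message transport timestamp is needed for LRC.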
||
|
|
||
| ### Client → Server: `client/hello` lyrics@v1 support object | ||
|
|
||
| The `lyrics@v1_support` object in [`client/hello`](#client--server-clienthello) has this structure: | ||
|
|
||
| - `lyrics@v1_support`: object | ||
| - `supported_formats`: string[] - list of supported lyrics formats in priority order (first is preferred), e.g. `['lrc', 'cdg']` | ||
| - `max_size_bytes`: integer - maximum size in bytes of lyrics data the client can handle per binary message. If the lyrics for the current track exceed this limit, the server must send [`stream/end`](#server--client-streamend) for the lyrics role instead of the binary data. | ||
|
|
||
| **Note:** Servers must support all lyrics formats: 'lrc' and 'cdg'. | ||
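To make the negotiation concrete, here is a hedged sketch of how a server might act on a `lyrics@v1_support` object: take the first client-listed format for which it has data, and signal `stream/end` when that data exceeds `max_size_bytes`. The function `choose_lyrics` and the `available` mapping are hypothetical. The text above defines only the fields and the size rule, so whether a server may fall through to a smaller format is left open here and not attempted.

```python
from typing import Optional


def choose_lyrics(support: dict, available: dict[str, bytes]) -> Optional[tuple[str, bytes]]:
    """Return (format, data) to send, or None when stream/end should be sent.

    `support` is the parsed lyrics@v1_support object; `available` maps a
    format name to the lyrics bytes the server has for the current track.
    """
    for fmt in support["supported_formats"]:  # priority order, first is preferred
        data = available.get(fmt)
        if data is None:
            continue
        if len(data) > support["max_size_bytes"]:
            return None  # too large for this client: send stream/end instead
        return fmt, data
    return None  # no lyrics in any supported format: send stream/end
```

For example, a client that sends `{"supported_formats": ["lrc", "cdg"], "max_size_bytes": 65536}` would receive LRC whenever the server has it and it fits within the limit.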
|
|
||
| ### Client → Server: `stream/request-format` lyrics object | ||
|
|
||
| The `lyrics` object in [`stream/request-format`](#client--server-streamrequest-format) has this structure: | ||
|
|
||
| Request the server to change the lyrics format. | ||
|
|
||
| After receiving this message, the server responds with [`stream/start`](#server--client-streamstart) for the lyrics role with the new format, followed by an immediate lyrics update through a binary message. | ||
|
|
||
| - `lyrics`: object | ||
| - `format`: 'lrc' | 'cdg' - requested lyrics format | ||
|
|
||
| ### Server → Client: `stream/start` lyrics object | ||
|
|
||
| The `lyrics` object in [`stream/start`](#server--client-streamstart) has this structure: | ||
|
|
||
| - `lyrics`: object | ||
| - `format`: 'lrc' | 'cdg' - format of the lyrics data | ||
|
|
||
| ### Server → Client: Lyrics (Binary) | ||
|
|
||
| Binary messages should be rejected if there is no active stream. | ||
|
|
||
| - Byte 0: message type `12` (uint8) | ||
| - Rest of bytes: lyrics data (UTF-8 encoded text for `lrc`; binary data for `cdg`) | ||
|
OnFreund marked this conversation as resolved.
|
||
|
|
||
| The server sends lyrics data at the start of each track and resends after a track change. Clients begin processing the lyrics data immediately upon receipt. | ||
|
|
||
| When no lyrics are available for the current track (or lyrics exceed the client's `max_size_bytes`), the server sends [`stream/end`](#server--client-streamend) for the lyrics role. When lyrics become available again (e.g., on the next track), the server sends a new [`stream/start`](#server--client-streamstart) followed by the binary lyrics data. | ||
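The binary layout described above (one type byte followed by the lyrics data) can be sketched as a pair of helpers. This is an illustration under stated assumptions: the function names are invented, and `active_format` stands for whatever format the client recorded from the most recent `stream/start` (`None` meaning no active lyrics stream).

```python
from typing import Optional, Union

LYRICS_MESSAGE_TYPE = 12  # byte 0 of a lyrics binary message


def encode_lyrics_message(data: bytes) -> bytes:
    """Prefix the lyrics data with the message type byte."""
    return bytes([LYRICS_MESSAGE_TYPE]) + data


def decode_lyrics_message(message: bytes, active_format: Optional[str]) -> Union[str, bytes]:
    """Return LRC text (str) or raw CDG data (bytes) from a binary message."""
    if active_format is None:
        # Binary messages are rejected when there is no active stream.
        raise ValueError("no active lyrics stream")
    if not message or message[0] != LYRICS_MESSAGE_TYPE:
        raise ValueError("not a lyrics message")
    payload = message[1:]
    if active_format == "lrc":
        return payload.decode("utf-8")
    if active_format == "cdg":
        return payload
    raise ValueError(f"unknown lyrics format: {active_format!r}")
```

Because the message itself carries no format field, the decoder depends entirely on the format announced in `stream/start`.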
|
|
||
| ## Visualizer messages | ||
| This section describes messages specific to clients with the `visualizer` role, which create visual representations of the audio being played. Visualizer clients receive audio analysis data like FFT information that corresponds to the current audio timeline. | ||
|
|
||
|
|
||
lrc files are pretty ubiquitous, but I haven't heard of CD+G before.
Why should we support CD+G? If it's really niche, we can't depend on Sendspin implementations supporting it.
I don't have the data to back this up, unfortunately, but karaoke subtitles are not niche and are well supported by both hardware and software.
@maximmaxim345 CD+G is not niche at all. It is an official standard for audio CDs with graphics (Audio Compact Discs carrying digital images) that all professional karaoke machines use (such as those in karaoke bars and at home), so almost all songs are available in that format. The "G" stands for Graphics, and as I understand the format, it is just images for each character, so it does not need a font engine to display unusual characters in every language. Anyway, more common nowadays is MP3+G, which is what you get when the Red Book audio CD part has been converted to MP3 and the graphics have been broken out into separate files (.cdg).
For a player code reference, I suggest checking out the Kodi (formerly XBMC) codebase, which supports CD+G and MP3+G.
Quote from that Wikipedia article:
Along with dedicated karaoke machines, other consumer devices that play CD+G format CDs include the NEC TurboGrafx-CD (a CD-ROM peripheral for the TurboGrafx-16) and Turbo Duo, as well as the Japan-only successor the PC-FX, the Philips CD-i, the Sega CD, Sega Saturn,[5] the JVC X'Eye, the 3DO Interactive Multiplayer, the Amiga CD32 and Commodore CDTV, and the Atari Jaguar CD (an attachment for the Atari Jaguar). Some CD-ROM drives can also read this data. Pioneer's LaserActive player can also play CD+G discs, as long as either the PAC-S1/S-10 or PAC-N1/N10 game modules are installed.
Since 2003, some standalone DVD players have supported the CD+G format. Regular audio CD players will output only the audio tracks as if it was a normal music CD, unless otherwise designed to read the extra data (lyrics and images).[6]
CD+G karaoke albums are still made today by several UK and US manufacturers including Sunfly, Zoom Entertainments, SBI Karaoke and Vocal Star. Although the popularity of CD sales are dwindling the format is still widely used as MP3+G downloads.
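On the synchronization question raised earlier in this thread, CD+G data can also be addressed by track progress, because the format is a fixed-rate stream of 24-byte subcode packets at 300 packets per second. The sketch below is an illustration only: `packets_due` and `iter_cdg_packets` are hypothetical names, and rendering the instructions is out of scope.

```python
from typing import Iterator

CDG_PACKET_SIZE = 24  # bytes per subcode packet in a .cdg stream
CDG_PACKETS_PER_SECOND = 300  # 75 sectors per second, 4 packets per sector
CDG_COMMAND = 0x09  # low 6 bits of byte 0 mark a CD+G packet


def packets_due(progress_ms: int) -> int:
    """Number of packets that should have been processed at this track progress."""
    return progress_ms * CDG_PACKETS_PER_SECOND // 1000


def iter_cdg_packets(data: bytes, start: int, end: int) -> Iterator[tuple[int, bytes]]:
    """Yield (instruction, 16 data bytes) for CD+G packets with index in [start, end)."""
    total = len(data) // CDG_PACKET_SIZE
    for index in range(start, min(end, total)):
        packet = data[index * CDG_PACKET_SIZE:(index + 1) * CDG_PACKET_SIZE]
        if packet[0] & 0x3F != CDG_COMMAND:
            continue  # packets without the CD+G command carry no graphics
        # Byte 1 holds the instruction (low 6 bits); bytes 4-19 hold the data.
        yield packet[1] & 0x3F, packet[4:20]
```

A client would remember how many packets it has already applied and, on each refresh, feed the range from that count up to `packets_due(progress_ms)` into its renderer.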