WIP: Add encryption and authentication support #77
| PSK = HKDF-SHA256(dh_shared_secret, salt="sendspin-pairing", info="sendspin-psk") | ||
| ``` | ||
| This PSK bootstraps a Noise NNpsk0 handshake. Inside the resulting encrypted session, both sides exchange long-term static public keys for future reconnections. See [encryption messages](#encryption-messages) for the full message sequence. |
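For reference, the PSK derivation quoted above could be sketched with only the Python standard library as follows. This is a minimal sketch, not the spec's implementation: the function names `hkdf_sha256` and `derive_pairing_psk` are hypothetical, and the 32-byte output length is an assumption based on Noise requiring 32-byte PSKs. The salt and info strings are the ones from the diff.

```python
import hashlib
import hmac


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """HKDF (RFC 5869) with SHA-256: extract a PRK, then expand to `length` bytes."""
    hash_len = hashlib.sha256().digest_size
    if not 0 < length <= 255 * hash_len:
        raise ValueError("invalid output length for HKDF-SHA256")
    # Extract step: PRK = HMAC-SHA256(key=salt, msg=IKM)
    prk = hmac.new(salt, ikm, hashlib.sha256).digest()
    # Expand step: T(i) = HMAC-SHA256(PRK, T(i-1) || info || i)
    okm = b""
    block = b""
    counter = 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


def derive_pairing_psk(dh_shared_secret: bytes) -> bytes:
    """Hypothetical helper: derive the pairing PSK with the salt/info from the spec text."""
    return hkdf_sha256(
        dh_shared_secret,
        salt=b"sendspin-pairing",
        info=b"sendspin-psk",
        length=32,  # assumption: Noise PSKs are 32 bytes
    )
```

Note that this only derives the PSK; it does nothing about the unauthenticated DH exchange that produces `dh_shared_secret`.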
This version of the spec does not secure against MITM attacks.
We need to support mandatory PIN confirmation or similar, at least for commercial devices. Probably still optional for DIY clients.
PIN is difficult though; only very few speakers have screens.
We could also just accept the risk, since the MITM attack needs to happen during setup. After that the connection is protected.
And for devices that implement `pairing_method: 'button'`, the LRU eviction issue can only happen with physical access.
| The receiver decrypts and checks the first byte: `0x00` means strip it and parse the rest as JSON; any other value is handled as a binary message using the existing [binary message ID structure](#binary-message-id-structure). | ||
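The receive-side dispatch described above could look roughly like this. It is a sketch under assumptions: `parse_decrypted` and the `"json"`/`"binary"` labels are hypothetical names, and the input is assumed to be the plaintext already returned by the Noise transport state.

```python
import json
from typing import Any, Tuple, Union

JSON_MARKER = 0x00  # first plaintext byte that marks a JSON message


def parse_decrypted(plaintext: bytes) -> Tuple[str, Union[Any, bytes]]:
    """Classify one decrypted transport message by its first byte."""
    if not plaintext:
        raise ValueError("empty plaintext")
    if plaintext[0] == JSON_MARKER:
        # Strip the marker byte and parse the remainder as UTF-8 JSON.
        return "json", json.loads(plaintext[1:].decode("utf-8"))
    # Any other first byte: hand the whole message to the existing
    # binary message handling, ID byte included.
    return "binary", plaintext
```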
| Nonce management is handled automatically by the Noise transport state. |
Noise has a 65535-byte transport message limit. Large artwork images may exceed this.
Reserve a bit of the role byte. A 0 (existing behavior) says it's a finished message. A 1 says it's unfinished and will continue with follow-up messages until one is marked finished.
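A rough sketch of how that continuation bit could work, to make the idea concrete. Assumptions: the top bit (`0x80`) is the reserved bit (the thread does not say which one), the names `fragment` and `Reassembler` are hypothetical, and each fragment repeats the role byte. The payload budget is the 65535-byte Noise limit minus the 16-byte AEAD tag and the one role byte.

```python
from typing import List, Optional

CONTINUES_FLAG = 0x80      # assumption: which bit gets reserved is not decided here
NOISE_MAX_MESSAGE = 65535  # Noise transport message limit (ciphertext incl. tag)
AEAD_TAG_LEN = 16          # ChaChaPoly and AESGCM both append a 16-byte tag
MAX_FRAGMENT_PAYLOAD = NOISE_MAX_MESSAGE - AEAD_TAG_LEN - 1  # minus the role byte


def fragment(role: int, payload: bytes, max_payload: int = MAX_FRAGMENT_PAYLOAD) -> List[bytes]:
    """Split one binary message into plaintexts that each fit a Noise message."""
    if not 0 <= role <= 0xFF or role & CONTINUES_FLAG:
        raise ValueError("role byte must leave the continuation bit clear")
    chunks = [payload[i:i + max_payload] for i in range(0, len(payload), max_payload)] or [b""]
    out = []
    for index, chunk in enumerate(chunks):
        last = index == len(chunks) - 1
        # Bit clear = finished message, bit set = more fragments follow.
        header = role if last else role | CONTINUES_FLAG
        out.append(bytes([header]) + chunk)
    return out


class Reassembler:
    """Collects fragments until one arrives with the continuation bit clear."""

    def __init__(self) -> None:
        self._role: Optional[int] = None
        self._parts: List[bytes] = []

    def feed(self, message: bytes) -> Optional[bytes]:
        """Return role byte + full payload once finished, else None."""
        if not message:
            raise ValueError("empty message")
        role = message[0] & ~CONTINUES_FLAG & 0xFF
        if self._role is not None and role != self._role:
            raise ValueError("role changed in the middle of a fragmented message")
        self._role = role
        self._parts.append(message[1:])
        if message[0] & CONTINUES_FLAG:
            return None
        complete = bytes([role]) + b"".join(self._parts)
        self._role, self._parts = None, []
        return complete
```

A message that fits in one fragment is encoded exactly as today, so existing behavior is unchanged for small messages. A real implementation would probably also want an upper bound on the reassembled size.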
Mitigates MITM and rogue server attacks. Physical button press gates pairing and LRU eviction. Unpaired servers cannot displace active connections or trigger pairing from discovery. Identity changes between sessions surface a distinct warning before re-pairing is attempted.
| Clients advertise their pairing capability via the `pairing_method` field in [`client/hello`](#client--server-clienthello): | ||
| - `button` — physical pairing button on the device. Pairing requires both server UI approval and a button press. **Recommended.** |
What is that button? Play/pause? A long press on play/pause? A dedicated sync/pairing button?
Not clear.
Remember that not all speakers have buttons though (so it would be nice if pairing could be forced without a button).
I guess an alternative to a physical pairing button could be what many Zigbee device manufacturers do when a device has no buttons: have the user power the device on and off a certain number of times within a set time window. The downside is that it is not user-friendly, and pairing Zigbee devices that use this method can sometimes require repeated attempts before it works (which is also why many manufacturers recommend always factory-resetting such Zigbee devices before pairing them).
Yep, this is a tricky problem to solve though.
Power cycling wouldn't really work here since you need to keep the WebSocket connection during this "accept pairing" action. (and I hate switching the power on and off every time I need to pair a lightbulb haha)
As an alternative there is still the `none` pairing option. But here it's a compromise between security and convenience. For example, anybody with access to the same network can blast music at full volume.
Another idea I'm thinking about is to add a password pairing option. Here any server that wants to pair with the client needs to know a password, for example a factory-provided one (maybe printed on a sticker).
Or a device without a button or factory-provided password can be "upgraded" to this more secure form.
I think most users wouldn't set a password, but with this option you could set up any Sendspin client (even without a button) to be secure enough for use on public Wi-Fi (for example in a cafe). At least when looking only at the protocol itself as an attack surface.
I'm pretty sure this password approach is used by at least a couple AirPlay speakers.
| - `encryption?`: object - omitted if client does not support [encryption](#encryption) | ||
| - `suite`: string - preferred cipher suite: `'25519_ChaChaPoly_SHA256'` or `'25519_AESGCM_SHA256'` | ||
| - `pairing_method`: string - `'button'` or `'none'`. See [pairing methods](#pairing-methods) | ||
| - `server_static_key?`: string - Base64 encoded static public key of this server from a previous pairing. Enables [reconnection](#pairing-and-reconnection) via Noise KK handshake |
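To make the shape of this object concrete, here is a small sketch of how a client might build the `encryption` field of `client/hello`. The field names and allowed values are the ones listed above; the helper name `build_encryption_field` and the 32-byte key length check (Curve25519 public keys are 32 bytes) are assumptions of this sketch.

```python
import base64
from typing import Optional

SUITES = ("25519_ChaChaPoly_SHA256", "25519_AESGCM_SHA256")
PAIRING_METHODS = ("button", "none")


def build_encryption_field(
    suite: str,
    pairing_method: str,
    server_static_key: Optional[bytes] = None,
) -> dict:
    """Hypothetical helper: build the optional `encryption` object for client/hello."""
    if suite not in SUITES:
        raise ValueError(f"unsupported cipher suite: {suite!r}")
    if pairing_method not in PAIRING_METHODS:
        raise ValueError(f"unknown pairing method: {pairing_method!r}")
    field = {"suite": suite, "pairing_method": pairing_method}
    if server_static_key is not None:
        if len(server_static_key) != 32:
            raise ValueError("expected a 32-byte Curve25519 public key")
        # The draft specifies Base64 for the static public key.
        field["server_static_key"] = base64.b64encode(server_static_key).decode("ascii")
    return field
```

A client without encryption support would omit the `encryption` object entirely rather than call this.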
The client doesn't know yet what server is connecting. How can then the client send the correct static key?
But thinking about it, we don't even need this field. The server and client can just try to build a connection; if it fails, we treat it as invalid.
And if the client suddenly acts as if it isn't paired, the server should also show a warning.
Client sends first, server responds, removing an ambiguity that could lead to race conditions or divergent implementations.
| - `button` — physical pairing button on the device. Pairing requires both server UI approval and a button press. **Recommended.** | ||
| - `none` — no physical pairing mechanism. Pairing requires only server UI approval. **Discouraged** — vulnerable to rogue server pairing with no physical presence check. |
Alternative idea to always require the button press:
Clients allow all servers to connect, but some roles require a more secure pairing process.
In that case playing to a speaker is always as frictionless as possible (and still protects the server from zero-click attacks).
But more sensitive roles are still secured behind additional steps (for example the future source role).
For example, a client implementing the player and source role will only be able to use the player role until the user does the full pairing process.
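Very roughly, the gating could be as simple as the sketch below. Which roles count as sensitive is an assumption here (only `source` is named above), and `usable_roles` is a hypothetical helper, not something from the draft.

```python
from typing import Iterable, List

# Assumption: only the future source role is treated as sensitive in this sketch.
SENSITIVE_ROLES = frozenset({"source"})


def usable_roles(supported_roles: Iterable[str], fully_paired: bool) -> List[str]:
    """Roles a server may use: sensitive ones only after the full pairing process."""
    return [
        role
        for role in supported_roles
        if fully_paired or role not in SENSITIVE_ROLES
    ]
```

So a client with the player and source roles exposes only `player` to an unpaired server, and both after full pairing.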
Rough draft of what encryption support could look like.
Related: Sendspin/backlog#32
Still very WIP, with a couple of limitations and potential security vulnerabilities.