
e2ee: retain previous receive header key for cross-poll stragglers #298

@danmarg

Description

Problem

The cross-epoch reorder guarantee in §5.3 applies only within a single batch: decryptAndSort orders previous-epoch frames before current-epoch ones, and the cache path uses a pre-decrypted header. A previous-epoch straggler that arrives in a later poll than the one that triggered the ratchet fails tryDecryptHeader, because we retain only the current and next receive header keys (§8.3.1(6)), and is dropped.

Realistic vectors

  • WAL retry on flaky network: Alice's outbox holds a pre-encrypted epoch-N blob across her own ratchet to N+1; when connectivity returns, the WAL retransmits the old blob to its original epoch-N token.
  • Server pagination on backlog drain: Bob comes back online after a long absence; page 1 trips the ratchet, and page 2's previous-epoch messages arrive in the next poll. This is a real path given MAX_MESSAGES_PER_POLL = 500.
  • Brief polling-token race during epoch transition.
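
The pagination vector can be illustrated with a toy model. This is only a sketch: Frame, Receiver, and processBatch are hypothetical names, frames carry just an epoch and a sequence number, and a small page size stands in for MAX_MESSAGES_PER_POLL. It assumes the receiver can open only current- and next-epoch headers, as described above.

```kotlin
// Hypothetical toy model: a frame carries only its epoch and sequence number.
data class Frame(val epoch: Int, val seq: Int)

// Receiver that, like the current code, can open only current- and next-epoch headers.
class Receiver(var epoch: Int) {
    val delivered = mutableListOf<Frame>()
    val dropped = mutableListOf<Frame>()

    // One poll = one batch. Within a batch, previous-epoch frames are ordered first.
    fun processBatch(batch: List<Frame>) {
        for (frame in batch.sortedWith(compareBy({ it.epoch }, { it.seq }))) {
            when (frame.epoch) {
                epoch -> { delivered.add(frame) }
                epoch + 1 -> { epoch += 1; delivered.add(frame) } // ratchet step
                else -> { dropped.add(frame) } // no key retained for this epoch
            }
        }
    }
}

fun main() {
    // Server-side backlog in arrival order; the epoch-0 frame with seq 3 arrives last.
    val backlog = listOf(Frame(0, 1), Frame(0, 2), Frame(1, 1), Frame(0, 3))
    val pageSize = 3 // small stand-in for MAX_MESSAGES_PER_POLL

    // Paged delivery: the straggler lands in a later poll than the ratchet trigger.
    val paged = Receiver(epoch = 0)
    backlog.chunked(pageSize).forEach(paged::processBatch)
    println("paged: dropped=${paged.dropped}")

    // Same frames in one batch: in-batch sorting rescues the straggler.
    val single = Receiver(epoch = 0)
    single.processBatch(backlog)
    println("single batch: dropped=${single.dropped}")
}
```

In this model the same four frames are all delivered when they share a batch, but the last epoch-0 frame is dropped once pagination splits it into a later poll.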

Impact

  • Location samples: invisible, because the next update supersedes the lost sample.
  • Sticky state transitions (stop, stationary): the friend is still shown as moving / sharing until the sender's next message arrives. This is a brief UI glitch if the sender keeps sending, but if the sender goes quiet after a stop (which is the whole point of a stop), the correction never comes.

Proposed enhancement

Retain the previous receive header key for a bounded window:

  • Add prevRecvHeaderKey: ByteArray to SessionState (shared/.../Types.kt).
  • In Session.performDhRatchet (Session.kt:355), copy the old headerKey into prevRecvHeaderKey before overwriting.
  • In E2eeProtocol.tryDecryptHeader and Session.decryptMessage's header-decrypt block, try prevRecvHeaderKey after headerKey and nextHeaderKey.
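
The three steps above could look roughly like the following. This is a minimal sketch under stated assumptions, not the actual code: the SessionState here is a stripped-down stand-in with only the header-key fields, SealedHeader and KeySlot are hypothetical helpers, and the "sealing" is a plain key comparison standing in for the real authenticated header decryption.

```kotlin
// Hypothetical, simplified stand-in; the real SessionState has many more fields.
class SessionState(
    var headerKey: ByteArray,
    var nextHeaderKey: ByteArray,
    var prevRecvHeaderKey: ByteArray = ByteArray(0), // empty = nothing retained
)

// Toy "sealed header": opens only under the exact key that sealed it.
// Stands in for authenticated header decryption; this is not real cryptography.
class SealedHeader(private val key: ByteArray, private val seq: Int) {
    fun open(candidate: ByteArray): Int? =
        if (candidate.isNotEmpty() && candidate.contentEquals(key)) seq else null
}

enum class KeySlot { CURRENT, NEXT, PREVIOUS }

// Trial order from the proposal: headerKey, nextHeaderKey, then prevRecvHeaderKey.
fun tryDecryptHeader(state: SessionState, header: SealedHeader): Pair<KeySlot, Int>? {
    val candidates = listOf(
        KeySlot.CURRENT to state.headerKey,
        KeySlot.NEXT to state.nextHeaderKey,
        KeySlot.PREVIOUS to state.prevRecvHeaderKey,
    )
    for ((slot, key) in candidates) {
        val seq = header.open(key) ?: continue
        return slot to seq
    }
    return null
}

// Receive-side part of the DH ratchet step: save the old key before overwriting it.
fun performDhRatchet(state: SessionState, freshNextHeaderKey: ByteArray) {
    state.prevRecvHeaderKey = state.headerKey.copyOf()
    state.headerKey = state.nextHeaderKey
    state.nextHeaderKey = freshNextHeaderKey
}

fun main() {
    val epochN = byteArrayOf(1, 1, 1)
    val epochN1 = byteArrayOf(2, 2, 2)
    val state = SessionState(headerKey = epochN, nextHeaderKey = epochN1)
    val straggler = SealedHeader(epochN, seq = 2)

    performDhRatchet(state, freshNextHeaderKey = byteArrayOf(3, 3, 3))
    // The epoch-N straggler now opens via the retained previous key.
    println(tryDecryptHeader(state, straggler))
}
```

Because each ratchet step overwrites prevRecvHeaderKey, this sketch retains a stale key for exactly one DH step; a second ratchet makes the straggler undecryptable again. Whether that matches the retention window the fix should use is left open here.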

Compatibility

Purely local state — no wire change:

  • Envelope, AAD, handshake formats are unchanged.
  • SessionState JSON deserializers use ignoreUnknownKeys = true, so rolled-back code ignores the extra field. Old persisted blobs deserialize fine as long as prevRecvHeaderKey is declared with a default value (missing field → default empty ByteArray).

Out of scope

  • Bounded retention window: TBD, but should be short enough that an attacker who somehow learns a stale header key cannot replay arbitrarily far back. A single-ratchet window (one DH step) is likely sufficient.
  • A sender-side approach (re-emit stop/stationary periodically in keepalives) is an alternative for the UI-stickiness symptom that doesn't require this state change.

Triggering

Reproducible via PreviousEpochStragglerTest: deliver a previous-epoch straggler with seq >= 2 in a separate poll after the ratchet. The test currently asserts the drop; flipping the assertion would verify the fix.

Labels: enhancement