Spider: unit-test the reconnect backoff/blackout state machine #101

Description

@kwsantiago

Follow-up from #99. The reconnect backoff/blackout decision logic in relayLoop (src/spider.zig) has no unit coverage, so regressions like the one fixed in #99 (productive sessions escalating into blackouts) can only be caught by running against live relays.

Problem

The productive / quick-disconnect / blackout classification is interleaved with interruptibleSleep, network calls, and milliTimestamp(), so it cannot be tested in isolation. #99 reached readLoop with the wrong event count and silently relocated the bug; a pure decision function would have caught both.

Fix direction

Extract a pure function, e.g.:

fn classifyOutcome(success: bool, last_session_events: u64, connection_duration: i64) Action

returning the next reconnect-delay / sleep / blackout decision, and unit-test the boundaries:

  • productive short session (events > 0, duration < QUICK_DISCONNECT_MS) -> reset delay, no escalation
  • unproductive quick disconnect (events == 0, duration < QUICK_DISCONNECT_MS) -> escalate
  • long uptime, no events -> reset delay
  • connection failure -> escalate
  • Nth consecutive failure -> blackout boundary (currently ~5 failures with MAX_RECONNECT_DELAY_MS = 5m)
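
A minimal sketch of how the extraction and its boundary tests might look. It is not the code in src/spider.zig. It keeps the proposed classifyOutcome signature and adds a hypothetical helper, applyAction, that turns the decision into the next delay. Assumptions: the Action variants, the Step struct, the doubling rule, and the values of QUICK_DISCONNECT_MS and INITIAL_RECONNECT_DELAY_MS are placeholders. Only MAX_RECONNECT_DELAY_MS = 5m comes from this issue.

```zig
const std = @import("std");

// Placeholder values (assumptions), except MAX_RECONNECT_DELAY_MS = 5m from the issue.
const QUICK_DISCONNECT_MS: i64 = 30_000;
const INITIAL_RECONNECT_DELAY_MS: u64 = 10_000;
const MAX_RECONNECT_DELAY_MS: u64 = 5 * 60 * 1000;

// Hypothetical decision type; the real Action may carry more information.
const Action = enum { reset_delay, escalate };

// Pure classification: no sleeping, no network calls, no clock reads.
fn classifyOutcome(success: bool, last_session_events: u64, connection_duration: i64) Action {
    if (!success) return .escalate; // connection failure
    if (last_session_events > 0) return .reset_delay; // productive, even if short
    if (connection_duration >= QUICK_DISCONNECT_MS) return .reset_delay; // long uptime, no events
    return .escalate; // unproductive quick disconnect
}

// Hypothetical helper: next delay and whether the blackout boundary was reached.
const Step = struct { delay_ms: u64, blackout: bool };

fn applyAction(action: Action, current_delay_ms: u64) Step {
    switch (action) {
        .reset_delay => return Step{ .delay_ms = INITIAL_RECONNECT_DELAY_MS, .blackout = false },
        .escalate => {
            // Assumed rule: double the delay; reaching the cap triggers a blackout.
            const doubled = current_delay_ms *| 2;
            if (doubled >= MAX_RECONNECT_DELAY_MS) {
                return Step{ .delay_ms = MAX_RECONNECT_DELAY_MS, .blackout = true };
            }
            return Step{ .delay_ms = doubled, .blackout = false };
        },
    }
}

test "productive short session resets the delay" {
    try std.testing.expectEqual(Action.reset_delay, classifyOutcome(true, 3, QUICK_DISCONNECT_MS - 1));
}

test "unproductive quick disconnect escalates" {
    try std.testing.expectEqual(Action.escalate, classifyOutcome(true, 0, QUICK_DISCONNECT_MS - 1));
}

test "long uptime without events resets the delay" {
    try std.testing.expectEqual(Action.reset_delay, classifyOutcome(true, 0, QUICK_DISCONNECT_MS));
}

test "connection failure escalates" {
    try std.testing.expectEqual(Action.escalate, classifyOutcome(false, 0, 0));
}

test "consecutive failures reach the blackout boundary" {
    var delay: u64 = INITIAL_RECONNECT_DELAY_MS;
    var failures: u32 = 0;
    var blackout = false;
    while (!blackout) {
        const step = applyAction(classifyOutcome(false, 0, 0), delay);
        delay = step.delay_ms;
        blackout = step.blackout;
        failures += 1;
    }
    // With the placeholder 10s initial delay, doubling crosses 5m on the 5th failure.
    try std.testing.expectEqual(@as(u32, 5), failures);
}
```

Keeping the delay arithmetic in a second pure function lets the classification boundaries and the blackout threshold be tested separately, so relayLoop only has to call both and then sleep.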

Notes
