fix(auth): F8 — stop intermittent mobile forced re-login (token loss)#90

Merged: ahmetabdullahgultekin merged 3 commits into main from fix/f8-mobile-token-loss on Jun 12, 2026

@ahmetabdullahgultekin

Contributor

F8 — mobile app prompts for login randomly/inconsistently

By design: access token TTL = 15 min, refresh TTL = 24 h (sliding). An actively-used app should never re-prompt; the only normal forced re-login is after ~24 h of inactivity. The intermittent premature prompts ("sometimes asks, sometimes doesn't") were bugs. Full analysis: docs/LOGIN_TRIAGE_2026-06-12.md (F8 verdict).

All changes are in :shared (commonMain), so they apply to Android + Desktop. Pure auth-resilience hardening — no API/contract change.

Root cause 1 (primary) — session wiped on ANY transient refresh failure — commit b4cf47d3

NetworkModule.refreshAccessToken cleared the entire persisted session on any refresh failure: the blanket catch (_) { clearTokens() } plus the non-200 branches wiped tokens on every timeout / dropped keep-alive / brief offline / Traefik HTTP/2 stale-conn RST / 5xx / 429. A transient blip logged the user fully out → intermittent re-prompt.

Fix: distinguish a definitive auth failure (HTTP 400/401 invalid_grant from the refresh-token grant → token truly dead) from a transient failure (IOException/timeout/RST, or a non-auth non-200 such as 5xx/429). clearTokens() runs only on a definitive failure; transient failures return false without clearing, so only the in-flight request fails and the session survives for the next attempt. Added isDefinitiveAuthFailure(status).
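The classification can be sketched roughly as follows. This is a minimal illustration, not the code from the commit: only the names `isDefinitiveAuthFailure` and `clearTokens` come from the PR, while `InMemoryTokenStore`, `handleRefresh`, and the `attempt` callback are hypothetical stand-ins. It also assumes the status code alone decides the classification.

```kotlin
/** Assumption: a 400/401 from the refresh-token grant means the refresh token is dead. */
fun isDefinitiveAuthFailure(status: Int): Boolean = status == 400 || status == 401

/** Hypothetical in-memory stand-in for the persisted session. */
class InMemoryTokenStore(var accessToken: String?, var refreshToken: String?) {
    fun clearTokens() {
        accessToken = null
        refreshToken = null
    }
}

/**
 * Applies the classification to one refresh attempt.
 * `attempt` (hypothetical) returns the HTTP status, or throws on a transport error.
 * Returns true only when the refresh succeeded.
 */
fun handleRefresh(store: InMemoryTokenStore, attempt: () -> Int): Boolean {
    val status = try {
        attempt()
    } catch (e: Exception) {
        return false // transient transport failure: keep the session
    }
    return when {
        status == 200 -> true
        isDefinitiveAuthFailure(status) -> {
            store.clearTokens() // token is truly dead: force re-login
            false
        }
        else -> false // 5xx, 429, and similar: transient, keep the session
    }
}

fun main() {
    val store = InMemoryTokenStore("access", "refresh")
    handleRefresh(store) { 503 }
    println(store.refreshToken) // prints "refresh": the session survived
    handleRefresh(store) { 401 }
    println(store.refreshToken) // prints "null": the session was cleared
}
```

In a suspending implementation, a broad `catch` would also swallow coroutine cancellation, so the real helper may need to rethrow it; that detail is not covered by this sketch.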

Root cause 2 — refresh-token reuse-detection family-revoke from uncoordinated refresh — commit 81b25ac3

The server revokes the whole rotation family if an already-rotated refresh token is re-presented. refreshMutex only serialized refreshes within the identity client; the biometric client had NO refresh interceptor and was outside the mutex, so a biometric 401 racing an identity refresh (or presenting a token rotated out from under it) could re-present the old token → family revoke → all sessions logged out non-deterministically.

Fix: extract the identity client's HttpSend refresh-on-401 + single-retry into a shared installRefreshOn401(client, tokenManager) and install it on both clients. Both now funnel every 401 through the same refreshAccessToken and the same module-level refreshMutex, so only one rotation is ever in flight. The refresh targets the identity token endpoint via absolute URLs, so driving it from the biometric client is correct. (Identity path = pure extraction, no behavior change.)
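The effect of funnelling both clients through one lock can be illustrated with a small sketch. Everything here is hypothetical except the name `refreshAccessToken`: `FakeAuthServer` imitates reuse detection, and a JVM `ReentrantLock` stands in for the module-level `refreshMutex` (the shared code presumably uses a coroutine mutex; the swap keeps the sketch stdlib-only). The key point is that the refresh token is read inside the lock, so a caller never presents a token that another caller has already rotated.

```kotlin
import java.util.concurrent.locks.ReentrantLock
import kotlin.concurrent.thread
import kotlin.concurrent.withLock

// Hypothetical stand-in for the module-level refreshMutex.
val refreshLock = ReentrantLock()

/** Hypothetical shared session state used by both clients. */
class TokenStore(var accessToken: String, var refreshToken: String)

/** Fake server: rotates refresh tokens and counts re-presented (stale) ones. */
class FakeAuthServer(private var current: String) {
    var reuseDetections = 0
        private set
    private var counter = 0

    @Synchronized
    fun rotate(presented: String): Pair<String, String>? {
        if (presented != current) {
            reuseDetections++ // a real server would revoke the whole family here
            return null
        }
        counter++
        current = "refresh-$counter"
        return "access-$counter" to current
    }
}

/** Single refresh path for every client; only one rotation runs at a time. */
fun refreshAccessToken(store: TokenStore, server: FakeAuthServer): Boolean {
    return refreshLock.withLock {
        // Read the token inside the lock so it is always the latest one.
        val rotated = server.rotate(store.refreshToken) ?: return@withLock false
        store.accessToken = rotated.first
        store.refreshToken = rotated.second
        true
    }
}

fun main() {
    val server = FakeAuthServer("refresh-0")
    val store = TokenStore("access-0", "refresh-0")
    // Two "clients" hit a 401 at the same time and both trigger a refresh.
    val workers = List(2) { thread { refreshAccessToken(store, server) } }
    workers.forEach { it.join() }
    println("reuse detections: ${server.reuseDetections}") // prints "reuse detections: 0"
}
```

Without the shared lock, both callers could read `refresh-0` before either rotation completes, and the second presentation would count as reuse.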

Root cause 3 (hardening) — empty-string refresh token — commit 9e31ce3a

AuthDto.toModel() maps a missing refresh_token to "". On a 200 refresh that omits a new refresh token (rotation disabled), the old code overwrote the stored token with "" → the next refresh had no token → forced re-login.

Fix:

  • On a 200 refresh, keep the existing refresh token when the response omits one (oauth.refreshToken.ifBlank { refreshToken }), in both the OAuth and legacy /auth/refresh branches (mirrors the desktop RefreshInterceptor's "reuse if rotation disabled").
  • TokenManager.getAccessToken/getRefreshToken treat a blank "" stored value as absent (return null), so a "" never looks like a usable token and isAuthenticated() falls back correctly.
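Both bullets can be sketched together. This is an in-memory illustration under stated assumptions, not the project's code: the real `TokenManager` persists tokens, `OAuthTokens` and `applyRefresh` are hypothetical names, and `isAuthenticated()` is assumed to depend on a usable refresh token.

```kotlin
/** Hypothetical in-memory TokenManager; the real one persists its values. */
class TokenManager {
    private var access: String? = null
    private var refresh: String? = null

    fun save(accessToken: String, refreshToken: String) {
        access = accessToken
        refresh = refreshToken
    }

    // A blank stored value is reported as absent (null).
    fun getAccessToken(): String? = access?.takeIf { it.isNotBlank() }
    fun getRefreshToken(): String? = refresh?.takeIf { it.isNotBlank() }

    // Assumption: a session counts as authenticated when a usable refresh token exists.
    fun isAuthenticated(): Boolean = getRefreshToken() != null
}

/** Tokens parsed from a 200 refresh response; a missing refresh_token arrives as "". */
data class OAuthTokens(val accessToken: String, val refreshToken: String)

/** Keeps the presented refresh token when the response omits a new one. */
fun applyRefresh(manager: TokenManager, presentedRefreshToken: String, oauth: OAuthTokens) {
    manager.save(oauth.accessToken, oauth.refreshToken.ifBlank { presentedRefreshToken })
}

fun main() {
    val manager = TokenManager()
    manager.save("access-1", "refresh-1")

    // Rotation disabled: the server returns a new access token but no refresh token.
    applyRefresh(manager, "refresh-1", OAuthTokens("access-2", ""))
    println(manager.getRefreshToken()) // prints "refresh-1"

    // A blank value that was already stored never looks like a usable token.
    manager.save("", "")
    println(manager.isAuthenticated()) // prints "false"
}
```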

Tests

./gradlew :shared:desktopTest — 12 new unit tests, all green:

  • RefreshTokenResilienceTest (8): transient 503 / transport-error / legacy 5xx keep the session; definitive 400/401 clear it; 200 happy path; 200-without-refresh_token preserves the old token; isDefinitiveAuthFailure classification.
  • TokenManagerBlankTokenTest (4): blank access/refresh read back as null; not-authenticated on blank-only refresh; authenticated on a real refresh token.

Compile result

  • ./gradlew :shared:compileDebugKotlinAndroid → BUILD SUCCESSFUL (Android target).
  • ./gradlew :shared:desktopTest → BUILD SUCCESSFUL, 12/12 green.
  • No emulator/instrumented tests run (host has no KVM) — by project policy.

Release / rollback note (live mobile auth path)

This is the live mobile auth path. To verify on a real device, the owner must rebuild a signed APK and cut a new release tag (e.g. v5.3.2) — the change ships only in the APK:
gh workflow run android-build.yml -R Rollingcat-Software/client-apps --ref main -f build_type=release
Reversibility: if anything regresses, roll back by installing the prior release APK (same signing cert since v5.2.x → installs in place); no server change is involved.

Scope note

This PR is F8 only (client-apps / Kotlin). The other live-test findings (F1–F7, F9, F10) are web-app / backend and tracked separately; coordinator messages about web-app PR #204 / the approve-QR→MFA bridge / picker gating are a different repo and out of scope here.

Ahmet Abdullah Gultekin added 3 commits June 12, 2026 13:37
…nvalid_grant (F8)

Root cause 1 (primary) of the intermittent mobile forced re-login.

`refreshAccessToken` wiped the ENTIRE persisted session on ANY refresh
failure: the blanket `catch (_) { clearTokens() }` plus the non-200
branches cleared tokens on every timeout, dropped keep-alive, brief
offline, Traefik HTTP/2 stale-connection RST, 5xx, or 429. A transient
network blip therefore logged the user fully out — surfacing as the
"sometimes asks for login, sometimes doesn't" complaint.

Now a DEFINITIVE auth failure (HTTP 400/401 invalid_grant from the
refresh-token grant → token truly dead) is distinguished from a TRANSIENT
failure (IOException/timeout/RST, or a non-auth non-200 such as 5xx/429).
clearTokens() runs ONLY on a definitive failure; transient failures return
false WITHOUT clearing, so only the in-flight request fails and the
session survives for the next attempt.

Adds `isDefinitiveAuthFailure(status)` + makes the refresh helper
`internal` for unit testing. RefreshTokenResilienceTest covers transient
503 / transport-error / legacy 5xx (session kept), definitive 400/401
(session cleared), and the 200 happy path.

By design: access TTL 15 min, refresh TTL 24 h (sliding) — an actively
used app should never re-prompt; the only normal forced re-login is after
~24 h of inactivity.
…resh (F8)

Root cause 2 of the intermittent mobile forced re-login: refresh-token
reuse-detection family-revoke from uncoordinated refresh.

The server revokes the entire refresh-token rotation family when an
already-rotated token is re-presented. `refreshMutex` serialized refreshes
only WITHIN the identity client; the biometric client had NO refresh-on-401
interceptor and was outside the mutex. A biometric 401 racing an identity
refresh — or presenting a token rotated out from under it — could re-present
the old refresh token, triggering a family revoke that logs out ALL sessions
non-deterministically.

Extract the identity client's HttpSend refresh-on-401 + single-retry into a
shared `installRefreshOn401(client, tokenManager)` and install it on BOTH
clients. Both now funnel every 401 through the same `refreshAccessToken` and
therefore the same module-level `refreshMutex`, so only one rotation is ever
in flight. The refresh targets the identity token endpoint via absolute URLs,
so driving it from the biometric client is correct (no behavior change to the
identity path — pure extraction).
…s absent (F8)

Root cause 3 (hardening) of the intermittent mobile forced re-login.

`AuthDto.toModel()` maps a missing `refresh_token`/`access_token` to "". On
a refresh 200 that omits a new refresh token (rotation disabled), the old
code overwrote the stored refresh token with "" — so the very next refresh
had no token and forced a full re-login.

- NetworkModule: on a 200 refresh, keep the existing refresh token when the
  response omits one (`oauth.refreshToken.ifBlank { refreshToken }`), in both
  the OAuth and legacy `/auth/refresh` branches (mirrors the desktop
  RefreshInterceptor's "reuse if rotation disabled").
- TokenManager.getAccessToken/getRefreshToken: treat a blank ("") stored
  value as ABSENT (return null), so a "" never looks like a usable token and
  `isAuthenticated()` falls back correctly instead of routing a session-less
  user to the dashboard or POSTing an empty grant the server rejects.

Tests: RefreshTokenResilienceTest covers the 200-without-refresh_token
preserve case; TokenManagerBlankTokenTest covers blank-as-absent for both
tokens and isAuthenticated.
@ahmetabdullahgultekin ahmetabdullahgultekin merged commit 14785c8 into main Jun 12, 2026
3 checks passed
ahmetabdullahgultekin added a commit that referenced this pull request Jun 12, 2026
#91)

Bumps versionCode 13→14 / versionName 5.3.1→5.3.2 so the signed APK carrying
the F8 fix (transient refresh no longer wipes session; biometricClient shared
refresh; empty-string refresh hardening) installs over v5.3.1.

Co-authored-by: Ahmet Abdullah Gultekin <rollingcat.help@gmail.com>