fix(auth): F8 — stop intermittent mobile forced re-login (token loss)#90
Merged
added 3 commits
June 12, 2026 13:37
…nvalid_grant (F8)
Root cause 1 (primary) of the intermittent mobile forced re-login.
`refreshAccessToken` wiped the ENTIRE persisted session on ANY refresh
failure: the blanket `catch (_) { clearTokens() }` plus the non-200
branches cleared tokens on every timeout, dropped keep-alive, brief
offline, Traefik HTTP/2 stale-connection RST, 5xx, or 429. A transient
network blip therefore logged the user fully out — surfacing as the
"sometimes asks for login, sometimes doesn't" complaint.
Now a DEFINITIVE auth failure (HTTP 400/401 invalid_grant from the
refresh-token grant → token truly dead) is distinguished from a TRANSIENT
failure (IOException/timeout/RST, or a non-auth non-200 such as 5xx/429).
clearTokens() runs ONLY on a definitive failure; transient failures return
false WITHOUT clearing, so only the in-flight request fails and the
session survives for the next attempt.
Adds `isDefinitiveAuthFailure(status)` + makes the refresh helper
`internal` for unit testing. RefreshTokenResilienceTest covers transient
503 / transport-error / legacy 5xx (session kept), definitive 400/401
(session cleared), and the 200 happy path.
By design: access TTL 15 min, refresh TTL 24 h (sliding) — an actively
used app should never re-prompt; the only normal forced re-login is after
~24 h of inactivity.
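The definitive-versus-transient split described in this commit can be sketched as follows. This is a minimal illustration, not the code in `NetworkModule`: only `isDefinitiveAuthFailure` and `clearTokens` are names from the commit, while `SessionStore`, `RefreshOutcome`, and `applyRefreshOutcome` are hypothetical. It also assumes the classification looks at the status code alone; the real helper may additionally inspect the `invalid_grant` error body.

```kotlin
// Sketch only: everything except isDefinitiveAuthFailure and clearTokens
// is a hypothetical name chosen for illustration.

/** True when the refresh grant itself was rejected (HTTP 400/401). */
fun isDefinitiveAuthFailure(status: Int): Boolean = status == 400 || status == 401

/** Hypothetical in-memory stand-in for the persisted session. */
class SessionStore(var accessToken: String?, var refreshToken: String?) {
    fun clearTokens() {
        accessToken = null
        refreshToken = null
    }
}

/** Hypothetical outcome of one refresh attempt. */
sealed interface RefreshOutcome {
    data class Success(val accessToken: String, val refreshToken: String) : RefreshOutcome
    data class HttpFailure(val status: Int) : RefreshOutcome
    data class TransportFailure(val cause: Throwable) : RefreshOutcome
}

/** Returns true if the session was refreshed; clears it only on a definitive failure. */
fun applyRefreshOutcome(store: SessionStore, outcome: RefreshOutcome): Boolean =
    when (outcome) {
        is RefreshOutcome.Success -> {
            store.accessToken = outcome.accessToken
            store.refreshToken = outcome.refreshToken
            true
        }
        is RefreshOutcome.HttpFailure -> {
            // 5xx/429 and other non-auth statuses fall through without clearing.
            if (isDefinitiveAuthFailure(outcome.status)) store.clearTokens()
            false
        }
        // Timeout, reset, offline: keep the session for the next attempt.
        is RefreshOutcome.TransportFailure -> false
    }
```

In this sketch a 503 or a transport error returns `false` and leaves both tokens in place, whereas a 401 returns `false` and clears them.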
…resh (F8)
Root cause 2 of the intermittent mobile forced re-login: refresh-token
reuse-detection family-revoke from uncoordinated refresh.
The server revokes the entire refresh-token rotation family when an
already-rotated token is re-presented. `refreshMutex` serialized refreshes
only WITHIN the identity client; the biometric client had NO
refresh-on-401 interceptor and was outside the mutex. A biometric 401
racing an identity refresh (or presenting a token rotated out from under
it) could re-present the old refresh token, triggering a family revoke
that logs out ALL sessions non-deterministically.
Extract the identity client's HttpSend refresh-on-401 + single-retry into
a shared `installRefreshOn401(client, tokenManager)` and install it on
BOTH clients. Both now funnel every 401 through the same
`refreshAccessToken` and therefore the same module-level `refreshMutex`,
so only one rotation is ever in flight. The refresh targets the identity
token endpoint via absolute URLs, so driving it from the biometric client
is correct (no behavior change to the identity path: pure extraction).
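The effect of routing both clients through one mutex can be sketched without Ktor. The block below is an illustration under stated assumptions, not the `installRefreshOn401` implementation: `TokenStore`, `FakeAuthServer`, and `sendWithRefreshOn401` are hypothetical, and the fake server only imitates the reuse-detection rule described in this commit. It assumes the refresh token is read inside the lock, so a caller that waited for the mutex presents the already-rotated token rather than the stale one.

```kotlin
import kotlinx.coroutines.sync.Mutex
import kotlinx.coroutines.sync.withLock

// Module-level lock shared by every client (the commit calls it refreshMutex).
val refreshMutex = Mutex()

// Hypothetical stand-in for the persisted tokens.
class TokenStore(var accessToken: String, var refreshToken: String)

// Hypothetical server: rotates on each refresh, revokes the family on reuse.
class FakeAuthServer {
    private var currentRefresh = "r0"
    var rotations = 0
        private set
    var familyRevoked = false
        private set

    fun rotate(presented: String): Pair<String, String>? {
        if (familyRevoked || presented != currentRefresh) {
            familyRevoked = true // an already-rotated token was re-presented
            return null
        }
        rotations++
        currentRefresh = "r$rotations"
        return "a$rotations" to currentRefresh
    }
}

// Every client funnels through this function, so only one rotation is in flight.
suspend fun refreshAccessToken(store: TokenStore, server: FakeAuthServer): Boolean =
    refreshMutex.withLock {
        // Read the refresh token INSIDE the lock to present the latest one.
        val rotated = server.rotate(store.refreshToken) ?: return@withLock false
        store.accessToken = rotated.first
        store.refreshToken = rotated.second
        true
    }

// Refresh-on-401 with a single retry; `send` returns the HTTP status code.
suspend fun sendWithRefreshOn401(
    store: TokenStore,
    server: FakeAuthServer,
    send: suspend (accessToken: String) -> Int,
): Int {
    val status = send(store.accessToken)
    if (status != 401) return status
    if (!refreshAccessToken(store, server)) return status
    return send(store.accessToken) // retried once, never looped
}
```

A client outside this path would keep its own copy of the old refresh token and could present it after another client had rotated it, which is the family-revoke scenario the commit removes.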
…s absent (F8)
Root cause 3 (hardening) of the intermittent mobile forced re-login.
`AuthDto.toModel()` maps a missing `refresh_token`/`access_token` to "". On
a refresh 200 that omits a new refresh token (rotation disabled), the old
code overwrote the stored refresh token with "" — so the very next refresh
had no token and forced a full re-login.
- NetworkModule: on a 200 refresh, keep the existing refresh token when the
response omits one (`oauth.refreshToken.ifBlank { refreshToken }`), in both
the OAuth and legacy `/auth/refresh` branches (mirrors the desktop
RefreshInterceptor's "reuse if rotation disabled").
- TokenManager.getAccessToken/getRefreshToken: treat a blank ("") stored
value as ABSENT (return null), so a "" never looks like a usable token and
`isAuthenticated()` falls back correctly instead of routing a session-less
user to the dashboard or POSTing an empty grant the server rejects.
Tests: RefreshTokenResilienceTest covers the 200-without-refresh_token
preserve case; TokenManagerBlankTokenTest covers blank-as-absent for both
tokens and isAuthenticated.
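The two hardening steps in this commit can be sketched together. This is a minimal illustration under assumptions: `SketchTokenManager` and `storeRefreshResponse` are hypothetical names, the map stands in for whatever persistent storage the real `TokenManager` uses, and `isAuthenticated()` is assumed to depend on a usable refresh token. Only the `ifBlank` fallback and the blank-as-absent getters come from the commit message.

```kotlin
// Hypothetical stand-in for TokenManager, backed by an in-memory map.
class SketchTokenManager {
    private val storage = mutableMapOf<String, String>()

    fun saveTokens(accessToken: String, refreshToken: String) {
        storage["access_token"] = accessToken
        storage["refresh_token"] = refreshToken
    }

    // A blank ("") stored value is treated as absent, so it never looks usable.
    fun getAccessToken(): String? = storage["access_token"]?.takeIf { it.isNotBlank() }

    fun getRefreshToken(): String? = storage["refresh_token"]?.takeIf { it.isNotBlank() }

    // Assumption: authenticated means a usable refresh token exists.
    fun isAuthenticated(): Boolean = getRefreshToken() != null
}

// On a 200 refresh, keep the existing refresh token when the response omits one.
fun storeRefreshResponse(
    manager: SketchTokenManager,
    newAccessToken: String,
    newRefreshToken: String, // "" when the DTO mapped a missing refresh_token
) {
    val previous = manager.getRefreshToken().orEmpty()
    manager.saveTokens(newAccessToken, newRefreshToken.ifBlank { previous })
}
```

With rotation disabled, the response carries only a new access token and the stored refresh token is left unchanged; with rotation enabled, the new refresh token replaces it as before.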
ahmetabdullahgultekin added a commit that referenced this pull request on Jun 12, 2026
#91) Bumps versionCode 13→14 / versionName 5.3.1→5.3.2 so the signed APK carrying the F8 fix (transient refresh no longer wipes session; biometricClient shared refresh; empty-string refresh hardening) installs over v5.3.1. Co-authored-by: Ahmet Abdullah Gultekin <rollingcat.help@gmail.com>
F8 — mobile app prompts for login randomly/inconsistently
By design: access token TTL = 15 min, refresh TTL = 24 h (sliding). An actively used app should never re-prompt; the only normal forced re-login is after ~24 h of inactivity. The intermittent premature prompts ("sometimes asks, sometimes doesn't") were bugs. Full analysis: `docs/LOGIN_TRIAGE_2026-06-12.md` (F8 verdict).

All changes are in `:shared` (commonMain), so they apply to Android + Desktop. Pure auth-resilience hardening, with no API/contract change.

Root cause 1 (primary): session wiped on ANY transient refresh failure (commit b4cf47d3)

`NetworkModule.refreshAccessToken` cleared the entire persisted session on any refresh failure: the blanket `catch (_) { clearTokens() }` plus the non-200 branches wiped tokens on every timeout / dropped keep-alive / brief offline / Traefik HTTP/2 stale-conn RST / 5xx / 429. A transient blip logged the user fully out → intermittent re-prompt.

Fix: distinguish a definitive auth failure (HTTP 400/401 `invalid_grant` from the refresh-token grant → token truly dead) from a transient failure (IOException/timeout/RST, or a non-auth non-200 such as 5xx/429). `clearTokens()` runs only on a definitive failure; transient failures return `false` without clearing, so only the in-flight request fails and the session survives for the next attempt. Added `isDefinitiveAuthFailure(status)`.

Root cause 2: refresh-token reuse-detection family-revoke from uncoordinated refresh (commit 81b25ac3)

The server revokes the whole rotation family if an already-rotated refresh token is re-presented. `refreshMutex` only serialized refreshes within the identity client; the biometric client had NO refresh interceptor and was outside the mutex, so a biometric 401 racing an identity refresh (or presenting a token rotated out from under it) could re-present the old token → family revoke → all sessions logged out non-deterministically.

Fix: extract the identity client's `HttpSend` refresh-on-401 + single-retry into a shared `installRefreshOn401(client, tokenManager)` and install it on both clients. Both now funnel every 401 through the same `refreshAccessToken` and the same module-level `refreshMutex`, so only one rotation is ever in flight. The refresh targets the identity token endpoint via absolute URLs, so driving it from the biometric client is correct. (Identity path = pure extraction, no behavior change.)

Root cause 3 (hardening): empty-string refresh token (commit 9e31ce3a)

`AuthDto.toModel()` maps a missing `refresh_token` to `""`. On a 200 refresh that omits a new refresh token (rotation disabled), the old code overwrote the stored token with `""` → the next refresh had no token → forced re-login.

Fix:
- `NetworkModule`: on a 200 refresh, keep the existing refresh token when the response omits one (`oauth.refreshToken.ifBlank { refreshToken }`), in both the OAuth and legacy `/auth/refresh` branches (mirrors the desktop `RefreshInterceptor`'s "reuse if rotation disabled").
- `TokenManager.getAccessToken`/`getRefreshToken` treat a blank `""` stored value as absent (return null), so a `""` never looks like a usable token and `isAuthenticated()` falls back correctly.

Tests

`./gradlew :shared:desktopTest`: 12 new unit tests, all green:
- `RefreshTokenResilienceTest` (8): transient 503 / transport-error / legacy 5xx keep the session; definitive 400/401 clear it; 200 happy path; 200-without-`refresh_token` preserves the old token; `isDefinitiveAuthFailure` classification.
- `TokenManagerBlankTokenTest` (4): blank access/refresh read back as null; not-authenticated on blank-only refresh; authenticated on a real refresh token.

Compile result
- `./gradlew :shared:compileDebugKotlinAndroid`: BUILD SUCCESSFUL (Android target).
- `./gradlew :shared:desktopTest`: BUILD SUCCESSFUL, 12/12 green.

Release / rollback note (live mobile auth path)

This is the live mobile auth path. To verify on a real device, the owner must rebuild a signed APK and cut a new release tag (e.g. v5.3.2); the change ships only in the APK:

`gh workflow run android-build.yml -R Rollingcat-Software/client-apps --ref main -f build_type=release`

Reversibility: if anything regresses, roll back by installing the prior release APK (same signing cert since v5.2.x → installs in place); no server change is involved.

Scope note
This PR is F8 only (client-apps / Kotlin). The other live-test findings (F1–F7, F9, F10) are web-app / backend and tracked separately; coordinator messages about web-app PR #204 / the approve-QR→MFA bridge / picker gating are a different repo and out of scope here.