Skip to content

fix(auth): close PAT mint chain + redirect-open + cookie session (P0)#176

Merged
mastermanas805 merged 6 commits into
masterfrom
fix/auth-pat-scope-chain-and-redirect-open
May 29, 2026
Merged

fix(auth): close PAT mint chain + redirect-open + cookie session (P0)#176
mastermanas805 merged 6 commits into
masterfrom
fix/auth-pat-scope-chain-and-redirect-open

Conversation

@mastermanas805
Copy link
Copy Markdown
Member

Summary

Closes the Auth P0 chain found by QA on 2026-05-29. Six findings on the
auth surface, gated on the same login flow — shipping as one PR.

Finding Severity Behaviour before After
AUTH-001 P0 PAT bearer minted child PAT, 201 403 pat_cannot_mint_pat
AUTH-002 P0 read-only PAT minted admin child, 201 403 (caught by AUTH-001 gate)
AUTH-090 P0 session JWT minted admin-scope PAT silently 403 reauth_required unless X-Confirm-Reauth: 1
AUTH-164 P1 scopes:[] / scopes:null defaulted to read+write 400 invalid_scopes
AUTH-016/017 P0 return_to=javascript: / return_to=data: → 202 400 invalid_return_to
AUTH-004 P0 Session JWT in ?session_token= URL Secure;HttpOnly;SameSite=Lax cookie + ?signed_in=1 marker

Regression tests

Each test reproduces the original exploit (verbatim QA trace where
possible) and asserts the fix blocks it.

  • TestPAT_CannotMintChild
  • TestPAT_ChildScopesSubsetOfParent
  • TestPAT_SessionJWTCannotMintAdminScope_RequiresReauth
  • TestPAT_EmptyScopesRejected
  • TestAuthStart_RejectsJavascriptReturnTo
  • TestAuthStart_RejectsDataReturnTo
  • TestAuthStart_AllowsValidHTTPS_LegitimateFlow (guardrail — UI flow)
  • TestAuthCallback_DoesNotPutJWTInLocation

Existing redirect tests updated to assert the new cookie behaviour
(session_token= MUST NOT appear in Location; instanode_session=
MUST be set as a Secure;HttpOnly;SameSite=Lax cookie). Existing
TestAPIKeys_Create_ValidationArms/valid_admin_scope renamed to
valid_admin_scope_without_reauth and asserts 403 (the new gate).
Admin-scope happy path is covered by the new
TestPAT_SessionJWTCannotMintAdminScope_RequiresReauth (header path
returns 201).

Surface checklist (rule 22)

  • api/internal/handlers/api_keys.go — handler hardening (AUTH-001/002/090/164)
  • api/internal/handlers/magic_link.go — return_to fail-closed + cookie session
  • api/internal/handlers/auth.go — return_to gate on GitHub/Google Start + cookie session on all callbacks
  • api/internal/middleware/auth.goinstanode_session cookie fallback in RequireAuth
  • OpenAPI (handlers/openapi.go) — not touched (contract already documented 403 on AUTH-001; new error codes follow the existing envelope shape). Follow-up: add reauth_required, invalid_return_to, invalid_content_type (PR-2), invalid_scopes to codeToAgentAction for agent-friendly remediation prose.
  • content/llms.txt, instanode-web pricing — not impacted (no public-contract widening, only stricter validation).
  • NR / Prom — no new metric in this PR; existing MagicLinkEmailRateLimited counter reused.

Don't-break-the-UI guardrails

  • TestAuthStart_AllowsValidHTTPS_LegitimateFlow asserts the legitimate dashboard flow (return_to=https://instanode.dev/login/callback) does NOT trip the new fail-closed return_to gate.
  • The session cookie carries Domain=.instanode.dev in production so the dashboard at instanode.dev reads the same cookie that api.instanode.dev sets (defense-in-depth: RequireAuth also accepts the cookie as a Authorization: Bearer fallback).

Live verify

Will run after merge per rule 14: curl https://api.instanode.dev/healthz | jq .commit_id matches HEAD; curl probes for each fix attached in the merge follow-up.

🤖 Generated with Claude Code

Auth P0 chain found by QA 2026-05-29. Six findings; all gated on the
same login surface, ship as one PR.

Findings closed in this PR:

  AUTH-001 (P0): PATs could mint child PATs (live returned 201; OpenAPI
    contract says 403). A leaked PAT could spawn additional keys that
    outlived the parent's revocation, defeating rotation/detection.
    Fix: handlers/api_keys.go branches on middleware.IsAuthedViaAPIKey
    and rejects with 403 pat_cannot_mint_pat regardless of scope.

  AUTH-002 (P0): read-only PAT minted admin-scope child (201 with
    scopes:["admin"]). Complete privilege escalation from a leaked
    read-only key. Fix: the AUTH-001 structural gate catches this as
    a subset — no PAT can mint a PAT at all, so subset-of-parent is
    trivially satisfied.

  AUTH-090 (P0): session JWT could mint admin-scope PAT with no re-auth.
    Last hop in the phishing → session cookie → admin PAT chain. Fix:
    minting an admin-scope PAT from a session JWT requires
    X-Confirm-Reauth: 1 (set by the dashboard after a fresh
    password/MFA prompt). Without the header → 403 reauth_required.

  AUTH-164 (P1): scopes:[] and scopes:null silently defaulted to
    ["read","write"]. Caller asked for "no scopes" and got broad write.
    Fix: distinguish absent (→ safe default ["read"]) from explicit
    empty/null (→ 400 invalid_scopes).

  AUTH-016 / AUTH-017 (P0): return_to=javascript:alert(1) and
    return_to=data:text/html,<script>... accepted with 202 on
    /auth/email/start. Fix: fail-closed scheme allow-list = [https, http]
    (http only when ALLOW_RETURN_LOCALHOST is set). javascript:, data:,
    file:, etc. → 400 invalid_return_to. Same gate added to
    /auth/github/start and /auth/google/start.

  AUTH-004 (P0): session_token leaked in 302 Location URL of
    /auth/email/callback (also github/google browser callbacks). Full
    24h-TTL JWT landed in browser history, ingress logs, CDN logs,
    Referer on every subsequent navigation. Fix: JWT now rides in a
    Secure; HttpOnly; SameSite=Lax cookie scoped to .instanode.dev
    (production) or current host (dev). Redirect URL carries only a
    non-secret ?signed_in=1 marker so the dashboard SPA knows to call
    /auth/me. middleware/RequireAuth accepts the cookie as a fallback
    when no Authorization: Bearer header is present.

Regression tests (each test reproduces the original exploit and
asserts the fix blocks it):

  TestPAT_CannotMintChild
  TestPAT_ChildScopesSubsetOfParent
  TestPAT_SessionJWTCannotMintAdminScope_RequiresReauth
  TestPAT_EmptyScopesRejected
  TestAuthStart_RejectsJavascriptReturnTo
  TestAuthStart_RejectsDataReturnTo
  TestAuthStart_AllowsValidHTTPS_LegitimateFlow (guardrail — UI flow)
  TestAuthCallback_DoesNotPutJWTInLocation

Existing redirect tests updated to assert the new "JWT in cookie, not
Location" behaviour. existing TestAPIKeys_Create_ValidationArms/
valid_admin_scope renamed to valid_admin_scope_without_reauth and
asserts 403 (the new gate). Admin-scope happy path is covered by the
new TestPAT_SessionJWTCannotMintAdminScope_RequiresReauth (header path
returns 201).

Rule-22 surface checklist:
  - api/internal/handlers/api_keys.go        (handler)
  - api/internal/handlers/magic_link.go      (handler)
  - api/internal/handlers/auth.go            (handlers + helpers)
  - api/internal/middleware/auth.go          (cookie fallback)
  - OpenAPI: handlers/openapi.go              not updated (contract
    already documented 403 on AUTH-001; the others are stricter
    versions of existing 400 envelopes — no new error keywords land
    outside the existing error_code map). Follow-up: surface
    reauth_required + invalid_return_to + invalid_scopes in
    codeToAgentAction (handlers/agent_action.go) as a polish PR.
  - dashboard upgradeCopy.ts                  not impacted (no tier
    boundary change)
  - content/llms.txt                          not impacted (no public
    contract change beyond stricter validation)
  - instanode-web pricing                     not impacted

Live-verify (rule 14) and curl evidence (per finding) attached in PR
body.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 force-pushed the fix/auth-pat-scope-chain-and-redirect-open branch from 83fa9a0 to c3486c5 Compare May 29, 2026 11:15
mastermanas805 and others added 5 commits May 29, 2026 18:44
PR #176 review caught one drift from the business contract: the
original AUTH-004 fix added a long-lived `instanode_session` cookie
AND extended middleware/RequireAuth to honour it as a fallback when
no Authorization: Bearer header is present.

That extension added a SECOND auth mechanism alongside Bearer-only,
which is the contract every CLI/MCP/SDK consumer and CLAUDE.md
"Live API surface" already implement. Knock-on costs would have
landed across CSRF coverage for cookie-auth routes, cookie-domain
config, and per-route auth-mode docs — a contract change disguised
as a security fix.

This refactor keeps the security fix (no JWT in URL) but confines
the cookie to a transient 30-second bridge for the browser callback:

  1. /auth/email/callback (and /auth/github/callback,
     /auth/google/callback/browser) mint the JWT and drop it into
     `instanode_session_exchange` — HttpOnly, Secure, SameSite=Lax,
     Max-Age=30, Path=/auth/exchange.
  2. 302 to dashboard with only `?signed_in=1` in the URL — no token.
  3. SPA POSTs /auth/exchange with credentials. New handler reads the
     cookie, clears it (Expires=epoch + same Path so the browser
     drops it), returns {ok, token} in the body. SPA stores the JWT
     in memory and from then on uses Authorization: Bearer like
     every other client.

middleware/RequireAuth is back to Bearer-only — the cookie's only
consumer is POST /auth/exchange. The new
TestRequireAuth_BearerOnly_NoCookieFallback locks the contract:
neither the legacy `instanode_session` nor the new
`instanode_session_exchange` cookie ever satisfies RequireAuth on
its own.

Tests:

  + TestAuthExchange_ConsumesCookie_ReturnsJWT_AndDeletesCookie
  + TestAuthExchange_NoCookie_Returns400
  + TestAuthExchange_ExpiredCookie_Returns400
  + TestRequireAuth_BearerOnly_NoCookieFallback (3 subcases)
  ~ TestAuthCallback_DoesNotPutJWTInLocation (cookie name + path
    updated to the transient form)
  ~ TestMagicLinkExtra_FullSuccess (same — cookie name update)
  ~ TestAuth_GitHubCallback_FullSuccess (same)
  ~ TestAuth_GoogleCallbackBrowser_FullSuccess (same)

Other AUTH-001 / AUTH-002 / AUTH-090 / AUTH-164 / AUTH-016 /
AUTH-017 fixes are unchanged — all still pass.

OpenAPI: POST /auth/exchange added to intentionallyHidden in
openapi_test.go (browser-only, not an agent-facing surface — the
whole point of this refactor is that agents only see Bearer). The
new `cookie_missing_or_expired` error code added to
codeToAgentAction with the U3-contract remediation sentence.

Surface checklist:
  - api/internal/handlers/auth.go (cookie semantics + Exchange)
  - api/internal/handlers/magic_link.go (callsite renamed)
  - api/internal/middleware/auth.go (cookie fallback reverted)
  - api/internal/router/router.go (POST /auth/exchange)
  - api/internal/handlers/helpers.go (agent_action for new code)
  - api/internal/handlers/openapi_test.go (route allowlist)
  - SPA (instanode-web/dashboard): follow-up PR in the dashboard
    repo to swap the existing `?session_token=` consumer for a
    `signed_in=1`-triggered POST /auth/exchange + in-memory store.
    Until that ships, the dashboard SPA will still work via
    /auth/me on the next session — the API contract is forward-
    compatible.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
CI's 100%-patch-coverage gate flagged 18 missing lines in PR #176's
diff. Most live on the AUTH-016/AUTH-017/AUTH-164 helpers shipped
before the refactor commit and never had unit-level tests; the
remaining lines are the refactor's appendSignedInMarker
parse-error fallback.

  - returnToSchemeIsAllowed: every branch (parse error, https, http
    under both states of returnToAllowsLocalhost, default reject)
  - appendSignedInMarker: malformed-URL fallback + session_token
    strip
  - GitHubStart / GoogleStart: AUTH-016 fail-closed gate for the
    OAuth start paths
  - scopesFieldKind: non-JSON / empty / non-object early-return
    (AUTH-164 fall-through to canonical invalid_body 400)

No production code changed. All assertions describe behaviour
already documented in PR #176; this commit just locks it in
test-form so the patch-coverage gate clears.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…try test

invalid_return_to, invalid_scopes, pat_cannot_mint_pat, reauth_required were
introduced earlier in this branch but the codeToAgentAction registry entries
were left as a 'follow-up polish' per the original commit message. The
registry-iterating TestErrorCode_HasAgentAction in CI blocks merge — closing
the gap with U3-compliant copy here.

Verified locally: TestErrorCode_HasAgentAction + TestAgentActionContract PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@mastermanas805 mastermanas805 merged commit f176c40 into master May 29, 2026
14 checks passed
@mastermanas805 mastermanas805 deleted the fix/auth-pat-scope-chain-and-redirect-open branch May 29, 2026 16:41
mastermanas805 pushed a commit that referenced this pull request May 29, 2026
…link.go conflicts)

Both #176 (auth-chain) and #177 (CSRF+rate-limit) touched the same
two files. Merge keeps both sets of agent_action entries and chains
the return_to scheme check (#176) before the per-IP rate-limit (#177)
so cheap rejection runs before any Redis work.

Locally verified: TestErrorCode_HasAgentAction + TestAgentActionContract
+ TestMagicLinkStart_* PASS.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants