v2 security: canonical identity (2.0.0) + OpenIddict authorization design (ADR-0002, prototype-proven)#26
Merged
Merged
Conversation
…RevocationCleanupService, TokenRevocationService, and validation logic - Implement tests for TenantAccessService covering access checks and role management. - Create tests for TenantOperationService to validate tenant scope execution. - Add tests for TokenRevocationCleanupService to ensure cleanup logic works as expected. - Develop comprehensive tests for TokenRevocationService to verify token revocation and cleanup behavior. - Introduce FluentValidatorTests and ValidatorsTests to validate request and password validation logic.
…umentation accordingly
- TenantOperationService.ExecuteInTenantScopeAsync now captures and restores the ambient AsyncLocal tenant context in a try/finally. Without restore, nested delegates and throws leaked the inner tenant into the outer request scope — latent cross-tenant write corruption. - Admin mutations (Create/Delete tenant, Grant/Revoke tenant access) promoted from RequireSysUser to RequireSysAdmin. SysSupport could previously create tenants and grant itself access to any tenant, escalating to arbitrary tenant admin. - Listings (GetAllTenants, GetUserTenants) stay on RequireSysUser — SysSupport keeps legitimate read access. - Grant/Revoke reject self-target with General.SelfTarget; guard runs before any DB lookup to avoid timing oracle.
- General.SelfTarget reclassified Validation → Forbidden.
Self-target is policy denial, not a malformed request;
matches repo convention (Auth.Forbidden, Tenant.Inactive,
CannotDeleteDefault, User.Inactive).
- Grant/Revoke endpoint switches now map ErrorType.Forbidden
to Forbid() and ErrorType.Unauthorized to Unauthorized();
Results<...> unions extended with ForbidHttpResult and
UnauthorizedHttpResult.
- Null-UserId fail-closed branch in both handlers now returns
Auth.Unauthorized (401) instead of General.Unexpected (500).
Branch unreachable behind RequireSysAdminPolicy; honest
status for defense in depth.
- Self-target integration tests updated to assert 403.
BREAKING CHANGE: POST and DELETE on
/admin/users/{userId}/tenants/{tenantIdentifier} now return
403 instead of 400 when userId == caller.UserId. OpenAPI spec
adds 401 and 403 response variants on both routes.
Phase 1 step 1: pure-additive scaffolding for canonical identity
migration. SysRoleKind values map 1:1 to IdmtDefaultRoleTypes string
constants so existing RequireRole("SysAdmin"/"SysSupport") policies
keep matching after the SysRole column replaces per-tenant role rows.
PendingEmail column reserved for OOB email-change flow (later step).
No removals; IdmtUser.TenantId stays until step 2.
Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
BREAKING CHANGE: IdmtUser is no longer per-tenant. The TenantId column,
the IsMultiTenant() filter, and the legacy (Email, UserName, TenantId)
unique index are removed. Consumers must add an EF migration that drops
the column + index and adds a global unique index on NormalizedEmail.
IdmtUserClaimsPrincipalFactory ctor now takes IMultiTenantContextAccessor
instead of IMultiTenantStore; tenant claim is sourced from the ambient
context and the factory throws InvalidOperationException when the
ambient tenant is null (fail-closed, per audit CD-4).
- IdmtUser.GetTenantId() returns null so audit rows for global entity
writes carry no tenant. IdmtAuditLog.TenantId is already nullable.
- SysRole projected as Claim(ClaimTypes.Role, SysRole.ToString()) when
!= None so existing RequireRole("SysAdmin"/"SysSupport") policies
continue to match without code change.
- DiscoverTenants no longer reads Users.TenantId (gone); TenantAccess
is the sole source of cross-tenant membership.
- 6 integration tests fail by design and are bounded by the plan:
3 cross-tenant logins succeed because the TenantAccess gate is not
yet wired at the login path (deferred to a later step that wires the
uniform gate); 3 GET /manage/info return 400 because seeded SysAdmin
has no per-tenant IdentityRole (deferred to the role-seeding shrink).
Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
Under the canonical-identity model the handler no longer materialises a per-tenant shadow IdmtUser; the canonical row already exists and is reachable via UserManager.FindByIdAsync. Collapses three coupled defects into one transaction: - C4: shadow row was created by copying the source PasswordHash verbatim, drifting from the canonical hash on rotation. - N1: token revocation keyed on the shadow row's UserId, so a revoke in tenant A never reached the shadow session in tenant B. - N3: the shadow row was committed by UserManager.CreateAsync inside ExecuteInTenantScopeAsync before the outer TenantAccess save, leaving an unreferenced user if the outer save failed. Handler flow: FindByIdAsync (global) → tenant lookup + active check → upsert TenantAccess (Add or flip IsActive + ExpiresAt) → single SaveChangesAsync. Phase 0 self-grant guard kept verbatim. Drop ITenantOperationService injection; no compensation path remains. Also closes H7: lookups no longer compare raw Email/UserName strings. Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
Under shadow-row identity, RevokeUserTokensAsync was keyed on the caller's UserId from the outer scope, which never matched the shadow row's UserId in the target tenant — revocations issued in tenant A silently failed to cut sessions in tenant B (audit N1). Handler flow: FindByIdAsync (global) → tenant lookup → TenantAccess by (UserId, TenantId) → flip IsActive=false → single SaveChangesAsync → RevokeUserTokensAsync only on successful save. No ExecuteInTenantScopeAsync; no shadow-user deactivation; no case-sensitive Email/UserName lookup (closes H7). Token revocation is sequenced after the save so a write failure does not leave a desynchronised "revoked but still active" state. Atomicity test asserts RevokeUserTokensAsync is never called when SaveChangesAsync throws. Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
BREAKING CHANGE: ConfirmEmailRequest no longer accepts TenantIdentifier in the POST body and the GET endpoint no longer accepts it as a query parameter. Tenant is resolved from the ambient Finbuckle context (header/route per consumer config). Clients that pass TenantIdentifier in the body will have it silently ignored; clients reading it from the link URL will need to remove it after Step 8 strips it from generated links. ConfirmEmailAsync's DataProtector token is bound to (user.Id, SecurityStamp, "EmailConfirmation") and was already tenant-agnostic. Under the now-global IdmtUser the same token validates regardless of which tenant context the request resolves under, which is the intended canonical behaviour. Handler ctor drops ITenantOperationService and uses UserManager directly; the ExecuteInTenantScopeAsync wrap is gone. This closes the C3 hygiene gap where body-supplied tenant decoupled token handling from the request's tenant strategy and would have become exploitable the moment anyone reintroduced shadow-row Id/SecurityStamp copying. Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
BREAKING CHANGE: ResetPasswordRequest no longer accepts TenantIdentifier in the body, and a successful reset no longer flips EmailConfirmed = true on the user. The implicit side-effect was the second leg of an account-takeover chain (audit C7): an attacker with a temp session stages a new email via PUT /manage/info, then issues forgot-password against the new address they control, and the silent confirmation flip on reset bound the account to the attacker without any out-of-band proof against the victim's mailbox. Reset only proves possession of the current Email; confirmation of a new address belongs to the OOB email-change flow landing in a later step. Handler ctor drops ITenantOperationService and uses UserManager directly; the ExecuteInTenantScopeAsync wrap is gone (tenant resolved from the ambient Finbuckle context). Body-supplied TenantIdentifier is silently ignored. Two regression tests pin the C7 fix from both sides (EmailConfirmed=false stays false; =true stays true) plus a Moq Times.Never assertion that UpdateAsync is not called for the flip. An integration test reads EmailConfirmed back from the real DB. Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
BREAKING CHANGE: PUT /manage/info no longer mutates Email immediately when NewEmail is set. The new email is staged in IdmtUser.PendingEmail and a confirmation link is sent to that address; Email is committed only when the recipient POSTs to /auth/confirm-email-change with the token. The endpoint returns 202 Accepted (Location: /auth/confirm-email-change) instead of 200 in this case. Existing clients that treated 200 as success must accept 202 and surface the "check your inbox" prompt. Closes the C7 account-takeover chain in tandem with the previous ResetPassword change: an attacker can no longer rebind the Email column to an address they control without out-of-band proof against that mailbox. Invariant: every stamp-rotating Identity mutation in the request (SetUserNameAsync, ChangePasswordAsync) completes before GenerateChangeEmailTokenAsync. The handler flushes via SaveChangesAsync and reloads the user so the token is bound to the persisted post-rotation SecurityStamp; otherwise the token would fail ChangeEmailAsync at confirm time. Tests F25 (password+email) and F44 (username+email) cover both rotation paths end-to-end via real Identity token generation against SQLite. Token revocation is now gated on credential changes only. Email-only staging keeps the bearer session alive so the user can read the confirmation mail in the same browser tab; ChangeEmailAsync rotates the stamp at confirm time and invalidates outstanding sessions then. Other surface: - New POST /auth/confirm-email-change (AllowAnonymous) — pre-checks PendingEmail to avoid wasting Identity token validation on stale state, then ChangeEmailAsync (atomic Email + EmailConfirmed + stamp) and clears PendingEmail. - New IIdmtLinkGenerator.GenerateConfirmEmailChangeLink and ApplicationOptions.ConfirmEmailChangeFormPath (default /confirm-email-change). No tenantIdentifier embedded. - New errors Email.NoPendingChange and Email.PendingMismatch. - UpdateUserInfoResult is internal sealed; both 200 and 202 carry empty bodies, so the result type does not leak into the API surface. Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
BREAKING CHANGE: GenerateConfirmEmailLink and GeneratePasswordResetLink
no longer embed tenantIdentifier as a query parameter in either the
ServerConfirm or ClientForm branches. After Steps 5 and 6 the request
records also stopped accepting it, so any consumer SPA that read
tenantIdentifier from the link URL and echoed it back in the body must
switch to host/path-based routing — the body field is silently
ignored, the query param is gone.
BuildClientFormUrl loses its tenantIdentifier parameter; both callers
updated. Route-based tenant strategies still inject the configured
route token (default __tenant__) as a path value, so /{tenant}/...
links are unaffected.
The hardened AddTenantRouteParameter guard skips injection when a
custom route strategy is configured under the literal name
"tenantIdentifier" — a stricter version of an existing accepted
carry-forward (custom names other than the literal can still leak as
?<custom>=, documented inline).
Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
BREAKING CHANGE: IdmtDefaultRoleTypes.DefaultRoles no longer contains SysAdmin or SysSupport — those identities are now expressed via IdmtUser.SysRole and projected as role claims at sign-in. The string constants IdmtDefaultRoleTypes.SysAdmin / SysSupport remain so existing RequireRole policies and the SysRoleKind.ToString() equivalence keep holding. CreateTenant no longer seeds those two roles into new tenants; existing per-tenant rows from older deployments become inert (no users should be assigned to them after migration). CreateTenantHandler now requires ICurrentUserService and inserts a TenantAccess(invokerUserId, newTenantId, IsActive=true) row in the same inner-scope transaction as default-role seeding. This closes the chicken-and-egg HS-4: under the uniform TenantAccess gate a SysAdmin who created a tenant could not subsequently reach it. invokerUserId is captured outside ExecuteInTenantScopeAsync (V2-CRIT-2) — the inner DI scope's CurrentUserService.User is null by design (invariant #11). Bootstrap is wrapped in BeginTransactionAsync/CommitAsync; if role creation or the TenantAccess insert fails, both roll back atomically. A defensive guard rejects creation when the tenant store does not populate IdmtTenantInfo.Id after AddAsync (would otherwise insert a TenantAccess row with TenantId = null). GetUserInfoHandler now surfaces SysRole as a role string when the user has no per-tenant IdentityRole rows. SysAdmins (whose only "role" is SysRole=SysAdmin and not a per-tenant assignment) used to fail authorisation against /manage/info with NoRolesAssigned; the response now includes the union of per-tenant roles and the SysRole projection. Production SeedDefaultDataAsync was rewritten to seed the default tenant directly via IMultiTenantStore + ITenantOperationService rather than through the now-auth-gated handler — boot has no invoker context and the handler's fail-closed correctly refuses. Pre-existing direct-handler integration tests for CreateTenant and DeleteTenant in AdminIntegrationTests + MultiTenancyIntegrationTests were converted to HTTP-driven flows since the handler can no longer be invoked without a current user. 3 cross-tenant login tests remain failing by design (KR-1, deferred to the login-gate step). Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
BREAKING CHANGE: LoginHandler and TokenLoginHandler now reject authentication when the credential-verified user has no active TenantAccess row for the request's resolved tenant. The check fires after CheckPasswordSignInAsync (so a wrong password still returns the same Auth.Unauthorized as a tenant-mismatch — no enumeration oracle) and before any cookie/token is issued. Closes KR-1: under canonical IdmtUser without this gate a user with credentials in tenant A could log in to tenant B simply by hitting B's login endpoint. The TenantAccess gate is uniform per locked decision #4 — SysRole does not short-circuit. Five sanity tests pin this invariant against TenantAccessService directly so a future contributor adding a SysRole fast-path for ergonomics fails CI. RegisterUser now writes the inviting tenant's TenantAccess row in the same transaction as user creation. Without this, a user who registers via /manage/users could not subsequently log in (the new gate would reject them on the very next request). The Add is safe against unique collisions because UserManager.CreateAsync rejects duplicate emails before reaching the TenantAccess insert. AdminIntegrationTests' GetUserTenants assertion was updated to reflect the new invariant (freshly registered user has exactly one TenantAccess for the registering tenant, not zero). Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
Phase 1 ships a CLI harness for moving an existing deployment from the
shadow-row model to the canonical model. The migrator is a documented
harness, not a production-grade tool — it covers the happy path, the
F42 security invariant (pre-migration bearer rejected after
SecurityStamp rotation), and the dry-run/apply ack handshake.
CanonicalIdentityDataMigrator (Idmt.Plugin/Migration/):
- DryRunAsync groups duplicate IdmtUsers by NormalizedEmail, picks
the oldest GUID v7 as the canonical Id, folds SysRole, and emits a
SHA-256 fingerprint of the migration plan.
- ApplyAsync refuses to run unless its computed fingerprint matches
the operator-supplied --ack-dryrun-fingerprint. The whole mutation
block (TenantAccess UserId rewrite, IdmtAuditLog UserId rewrite,
IdentityUserRole + AspNetUserTokens rewrites, RevokedToken legacy
row deletion, duplicate IdmtUser deletion, SysRole fold,
per-survivor SecurityStamp rotation) is wrapped in
BeginTransactionAsync / CommitAsync. Bulk ExecuteUpdate /
ExecuteDelete operations participate in the ambient transaction,
so a SaveChangesAsync failure rolls everything back.
- MigrationCurrentUserService stubs ICurrentUserService for the
context-less migration path so audit emission does not NRE.
AddIdmtMigration replaces the runtime ICurrentUserService
registration with the stub.
tools/Idmt.Migrator: net10.0 console host. Args: --dry-run,
--apply, --ack-dryrun-fingerprint <sha>,
--accept-cross-tenant-merges <ids>, --provider {sqlite,sqlserver}.
Trailing value-taking switches now emit a clear "missing value for
flag" error rather than the misleading "unknown argument" fallback.
Tests: 12 unit cases against an in-memory SQLite fixture exercise
the dry-run determinism, ack handshake, individual rewrites,
SysRole fold, stamp rotation, and an explicit
Apply_SaveChangesAsyncFails_RollsBackBulkOperations regression that
triggers a unique-index collision after a bulk delete and asserts
the row is restored. F42 integration test mints a bearer ticket via
the live login flow, runs the migrator's stamp rotation against the
canonical user, and asserts the previously-issued refresh token is
rejected — closes KR-2 / V2-CRIT-3 alternative path.
Documented residuals: F41 (Phase 0 DDL snapshot fidelity test) and
F47 (audit-emission exact-count assertion) deferred per harness
scope; rationale captured in MigrationApplyTests class header. The
--accept-cross-tenant-merges flag is wired but currently merges all
duplicate groups unconditionally — reserved for future hardening.
Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
Bumps Idmt.Plugin to 2.0.0 to reflect the breaking schema, API, and behavioural changes shipped over the previous 11 commits. CLAUDE.md's Multi-Tenancy and Authentication sections are rewritten so future contributors and AI assistants see the canonical model rather than the shadow-row description that pre-dated this work. CHANGELOG.md (new) captures the breaking change list, the new surface, the consumer migration runbook, the security-audit findings closed (C3, C4, C7, N1, N3, H7), and the residual risks that intentionally roll into Phase 2+ (KR-1 access-token expiry window, KR-3 EmailConfirmed=false bearer continuation, R18 email-spam vector pending rate-limit policy). Refs SECURITY_PHASE_1_CANONICAL_IDENTITY.md
Introduce the v2 authorization-layer design and the throwaway spike that gates it. - adr/0002: "own the policy, rent the protocol" — OpenIddict as the protocol engine (reference tokens, per-request revocation, BFF session for browsers), IDMT owning the canonical-identity/TenantAccess/SysRole policy. Support tokens mint server-side via IOpenIddictTokenManager.CreateAsync inside an IDMT-owned transaction (not a public RFC 8693 grant) so the audit row shares the token-store transaction. - adr/0001 + 0002 sketches + evaluation: supporting design record. - SECURITY_AUDIT + phase docs: the findings motivating the rewrite. - spike/: standalone host + xUnit project (out of Idmt.slnx/CI) proving §7.0 gates 1-4 on real infra — dual-context composition, instant reference-token revocation, per-request audience binding, and same-transaction support-audit atomicity. 8/8 green.
Phase B of the OpenIddict spike proves the remaining prototype-gate items on real infrastructure (.NET 10, OpenIddict 7.5.0, Finbuckle 10.0.3, SQLite); the full suite is 16/16. ADR-0002 flips Proposed -> Accepted with the corrections the spike surfaced. - Gate 6 (SecurityStamp revocation): all-user revoke via RevokeBySubjectAsync; single-tenant revoke via RevokeByAuthorizationIdAsync on a per-(user,tenant) OpenIddict authorization. A token entry has no audience column (audience lives in the encrypted payload), so single-tenant uses authorization grouping, not an audience filter. UserTokenMint is the shared prerequisite. Proven on a 100-token user; cost is one store call. - Gate 5 (anti-subtraction seam): a last-wins PostConfigure lock re-clamps locked options, and IdmtSelfCheckStartupFilter throws a typed IdmtSecurityInvariantException at host start if an invariant was subtracted after the lock. - Gate 7 (BFF): a server-side session store keeps the reference token; the cookie is only an opaque session id. A resolver maps cookie -> bearer before authentication so the cookie path runs the same audience handler a raw bearer does; a mutating request without an anti-forgery token is rejected. The session token is acquired by a client-credentials back-channel stand-in (subject=client; user identity in the session); auth-code+PKCE stays a §7.1 open question. ADR §2.7 corrected (RevokeBySubjectAsync exists in 7.5.0; no audience column), §2.9 locked-set entry updated, §7.0 records the prototype outcome and the scoped stand-ins, Status -> Accepted.
…ser) Replaces gate 7's client-credentials back-channel stand-in with a real interactive authorization-code + PKCE flow, so the BFF session token carries subject = the authenticated user. Full spike suite 19/19. - /auth/login establishes an interactive authorization-server cookie session (AuthServerLogin), distinct from bff_session and the API validation scheme. - /connect/authorize (OpenIddict passthrough) issues a code for the logged-in user, audienced to the tenant via a custom 'tenant' param + SetAudiences (the standard RFC 8707 'resource' param trips OpenIddict invalid_target on an unregistered URN). /connect/token gains an authorization_code branch. - BFF /bff/login-pkce generates the S256 PKCE pair, stores the verifier server-side, and redirects; /bff/callback exchanges code+verifier via the back-channel and stores the reference token in the server-side session — the browser holds only the opaque bff_session cookie. Public spike-spa client (ClientType=Public) requires PKCE. - Gate 8 tests: full browser flow yields whoami subject = user id; the session runs the same audience handler (acme 200 / globex 401); a no-challenge authorize is rejected. Verified by mutation review. Security note: the spike's OAuth `state` is not bound to the initiating browser (login-CSRF) — documented in-code as a SPIKE LIMITATION and carried to the v2 plan as a hardening requirement; it is not a composition unknown this gate exists to prove. ADR §7.0 records the gate 8 outcome (19 tests, subject=user supersedes the gate 7 stand-in); §7.1 marks interactive first-party auth resolved to auth-code + PKCE, leaving machine-client auth and the scale-out backplane open.
Reconciles the branch with main's #25, which restructured src/Idmt.Plugin to Idmt.Plugin and made old-model V1 fixes. Conflicts resolved in favor of the branch's canonical-identity 2.0.0 rewrite, which supersedes #25's old-model handlers (those still referenced the dropped user.TenantId and shadow-user logic). #25's other content is already on the branch: the service unit tests, the rate-limiting default, and the directory move are present via shared commits, and the CI workflows are identical. Net result equals the branch tip. Verified: dotnet build Idmt.slnx clean, 503/503 tests pass (366 unit + 137 integration), dotnet format clean.
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Summary
This branch advances IDMT's security architecture in two layers: a shipped
identity refactor (2.0.0), and an accepted, prototype-proven design for the v2
authorization rewrite.
What's in here
1. Canonical identity + uniform TenantAccess gate (2.0.0, breaking)
IdmtUseris now the global canonical identity (one row per human);TenantIddropped from the user.
IdmtRolestays per-tenant.SysRoleKindis a globalsystem-role flag projected as a role claim at sign-in.
TenantAccessgate is uniform across all users (no SysRole short-circuit)and is enforced at login by
LoginHandler/TokenLoginHandler.Grant/RevokeTenantAccess)rewritten around canonical
UserId; OOB email-change staging viaPendingEmail;tenantIdentifierstripped from confirm/reset links. Canonical-identity datamigrator harness included.
adr/0001-canonical-identity-and-tenant-access.md).2. v2 authorization design: ADR-0002, Accepted (
adr/0002-...)reference (opaque) access tokens with per-request revocation, an IDMT-owned
per-request handler binding token
audto the Finbuckle-resolved tenant,server-side support-token mint, and a backend-for-frontend browser session.
compositions that "cannot be settled on paper" (ADR section 7.0).
3. Prototype spike (
spike/, throwaway, OUT ofIdmt.slnx/CI)SQLite: dual-context composition, instant reference-token revocation, per-request
audience binding, same-transaction support-audit atomicity, SecurityStamp
revocation (authorization grouping), the two-layer anti-subtraction seam, the BFF
server-side session + CSRF, and real auth-code + PKCE browser login (subject =
user). Each mechanism was mutation-verified.
Validation
dotnet formatclean. Run:dotnet test spike/Idmt.Spike.slnx.testing; all findings folded in or recorded.
Notes for reviewers
spike/tree is design validation, not product: it is excluded fromIdmt.slnxand CI and is not meant to ship. The v2 packages(
Idmt.Core/Idmt/Idmt.Mfa) are the next milestone, not part of this PR.CHANGELOG before merging.
backplane, out-of-process resource servers; plus a recorded spike-only OAuth
stateto browser binding (login-CSRF) to harden in the v2 build.