feat(db): V61 audit_logs.tenant_id NOT NULL (lock down P1 from senior reviews 2026-05-12)#99
Conversation
… reviews 2026-05-12) V59 backfilled audit_logs.tenant_id (140 NULLs -> 0 / 1168 total) and added a writer-side sentinel UUID for anonymous events. The column was deliberately left nullable so any straggler writer would surface as a metric rather than a 500 during the soak window. That window has elapsed (one week clean, verified 2026-05-11 session notes). V61 closes the gate: * Pre-flight DO block re-counts NULLs and RAISE EXCEPTION if any are present (Flyway fails the deploy rather than silently mutating prod inside a constraint migration). * ALTER COLUMN ... SET NOT NULL. Metadata-only change on PG12+ when zero NULLs exist; no full table rewrite. * No column DEFAULT — sentinel writes are the writer's responsibility (AuditLogAdapter). A DEFAULT would hide writer regressions. JPA mapping AuditLog.tenantId is updated with @column(nullable = false) so ddl-auto=validate stays consistent with the new DB constraint. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
CI note (2026-05-12) The failing V40 (audit_logs partman partitioning, 2026-04-20) requires the The Testcontainers postgres image used by CI is also bare, but Flyway in CI runs with stricter settings than prod (the This is unblocked by either:
Unit tests (Maven test) pass; GitGuardian + scan pass. |
|
Update on #101: my fixture resolves the first V40 failure (the |
…ision) (#101) * test(flyway): V39_5 fixture renames audit_logs_pkey to unblock V40 in CI V40 (partition audit_logs) renames audit_logs to audit_logs_legacy, then creates a new partitioned audit_logs root with a composite PK named audit_logs_pkey. On a fresh Testcontainers DB this fails with: ERROR: relation "audit_logs_pkey" already exists Location: db/migration/V40__partition_audit_logs.sql because PG keeps the V5-declared inline PK index/constraint name attached to the renamed (legacy) table. The new composite-PK ADD collides on the shared name space. Prod does not hit this: V40 was BASELINE-SKIPPED in 2026-04 (see V57 header comment + DB review 2026-04-30 §4), so the migration is recorded as success=t in flyway_schema_history but never ran. Prod's audit_logs is still relkind='r' and V57 (idempotent partman-aware partitioning) is the prod-side conversion path; it's fail-soft when partman is missing (current pgvector/pgvector:pg17 image) so deploys are safe. CI runs every migration end-to-end on a fresh DB, so it hits the V40 PG name collision. The Integration tests (Testcontainers) CI job has been red on every PR (including #99 V61 and #100 P0 bundle) since pre-existing 2026-05-04 baseline; it's continue-on-error so it does not block merges but is permanent noise on every PR's checks page. Fix: test-only fixture V39_5 loaded only by the integration profile via spring.flyway.locations=classpath:db/migration,classpath:db/test-fixtures. It renames audit_logs_pkey -> audit_logs_v5_pkey before V40 runs, freeing the global name. Idempotent: only renames when audit_logs is still a plain heap with the V5-default constraint name. Matches the V28_5__align_default_login_flow_for_v29.sql precedent exactly (test-only fixture pre-fixing state for a checksum-locked production migration). Zero prod surface area: the file lives in src/test/resources/db/test-fixtures/ which is never on the prod classpath. Verified locally against pgvector/pgvector:pg17: - V5-style CREATE TABLE + V40-style RENAME+CREATE+ADD-CONSTRAINT reproduces ERROR exactly. - Same sequence with V39_5 inserted between V5 and V40 succeeds; both the legacy table (audit_logs_v5_pkey) and the new partitioned root (audit_logs_pkey) coexist with distinct constraint names. - Idempotency confirmed: re-running V39_5 against a post-V40 state falls through the IF guards and is a no-op. Why not edit V40 in place: three previous in-place V40 fixes (commits 2b00a7b, a30287b, 81101f0) were each reverted (43d0050, 614a94f, f404d12) because V40's checksum is locked in prod flyway_schema_history. Why not a new V62 production migration: V57 already provides the prod-side path (idempotent heap->partitioned conversion + partman setup, fail-soft when partman absent). A V62 would duplicate V57. The CI-side problem is purely the V40 in-place hard failure, addressed surgically by this test fixture. Unblocks the Integration tests check on PR #99 (V61 audit_logs.tenant_id NOT NULL) and #100 (P0/P1 senior-review bundle). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> * chore(gitleaks): bump application-integration.yml line refs after V39_5 yaml edit The previous commit added 6 lines of comment to the spring.flyway section of application-integration.yml, shifting the test-only jwt.secret / totp.enc-key lines from 55,62 to 61,68. Update the .gitleaksignore fingerprints to match so the scan job stays green. Secrets unchanged (both are pre-existing test-only fixtures, allowlisted since the file was added). Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com> --------- Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
ALTER TABLE audit_logs ALTER COLUMN tenant_id SET NOT NULLwith a pre-flight DO block that fails loudly viaRAISE EXCEPTIONif any NULL rows slipped in between V59 (2026-05-11) and V61 application.AuditLog.tenantIdannotated with@Column(nullable = false)so Hibernateddl-auto=validatestays consistent with the new DB constraint.Why
V59 backfilled NULLs (140 -> 0 / 1168 total) and added the writer-side sentinel UUID for anonymous events. V59 deliberately left the column nullable so straggler writers would surface as
audit.publish.failuremetrics rather than 500s during the soak window. That window has now elapsed (one week clean, verified in the 2026-05-11 session notes). V61 closes the gate.The migration does NOT set a column DEFAULT — sentinel writes are the writer's responsibility (
AuditLogAdapter). A DEFAULT would hide writer regressions by silently substituting the sentinel instead of failing fast onNotNullViolation.Senior-review trail
audit_logs.tenant_idstill nullable.Test plan
flyway infoand confirm V61 isPENDING.flyway migrate, confirmsuccess=tfor V61 inflyway_schema_history.\d+ audit_logsin psql — confirmtenant_idcolumn isNOT NULL.UPDATE audit_logs SET tenant_id = NULL WHERE id = (SELECT id FROM audit_logs LIMIT 1);then run V61 — confirm Flyway fails with the RAISE EXCEPTION message and the column stays nullable.ddl-auto=validateagainst a post-V61 DB — confirm no schema-mismatch warnings onAuditLog.tenantId.Operator note
Companion items requiring operator hands are in the parent monorepo PR
OPERATOR_ACTIONS_2026-05-12.md.🤖 Generated with Claude Code