ops(db): unused-index 7-day audit runbook + SQL kit (replaces #61) #62

Merged: ahmetabdullahgultekin merged 1 commit into master from ops/unused-index-audit-runbook-2026-05-12 on May 12, 2026
Conversation

@ahmetabdullahgultekin

Contributor

Summary

Clean re-do of #61. Adds the prod Postgres unused-index 7-day audit runbook + helper SQL — exactly 5 new files + a 1-line ROADMAP bullet rewrite.

`infra/RUNBOOK_UNUSED_INDEX_AUDIT.md` — Day-0 / Day-1-7 / Day-7 / Day-7+ DROP / rollback.
`infra/scripts/unused-index-baseline.sql` — Day-0 snapshot into `public.ops_unused_index_baseline` (sidecar, NOT `pg_stat_reset`).
`infra/scripts/unused-index-delta.sql` — daily monitoring.
`infra/scripts/unused-index-verify.sql` — Day-7 verifier (delta==0 AND size > 10 MB).
`infra/scripts/unused-index-drop-template.sql` — `BEGIN; ... ROLLBACK;` default so a naive `psql < file` is a no-op.
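
To make the sidecar-baseline idea concrete, here is a minimal sketch of what a Day-0 snapshot could look like. It is not the script from this PR: the column layout of `public.ops_unused_index_baseline` is an assumption made for illustration, and only the table name, the truncate-then-insert behaviour, and the use of `pg_stat_user_indexes` come from the description above.

```sql
-- Hypothetical column layout; the real table definition lives in
-- infra/scripts/unused-index-baseline.sql and may differ.
CREATE TABLE IF NOT EXISTS public.ops_unused_index_baseline (
    captured_at  timestamptz NOT NULL,
    schemaname   text        NOT NULL,
    relname      text        NOT NULL,
    indexrelname text        NOT NULL,
    indexrelid   oid         NOT NULL,
    idx_scan     bigint      NOT NULL,
    index_bytes  bigint      NOT NULL
);

-- Re-running replaces the previous baseline (single-snapshot model).
TRUNCATE public.ops_unused_index_baseline;

-- Copy the current counters instead of resetting them, so every other
-- statistics consumer keeps its history.
INSERT INTO public.ops_unused_index_baseline
    (captured_at, schemaname, relname, indexrelname, indexrelid, idx_scan, index_bytes)
SELECT now(),
       s.schemaname,
       s.relname,
       s.indexrelname,
       s.indexrelid,
       COALESCE(s.idx_scan, 0),
       pg_relation_size(s.indexrelid)
FROM pg_stat_user_indexes AS s;
```

Because the counters are only copied, later reports have to subtract the stored `idx_scan` from the live value rather than read the live value directly.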

Key design decisions

  • Forbidden tables (hard-coded in every script): `webauthn_credentials`, `oauth2_clients`, `refresh_tokens`, `audit_logs` — traffic patterns still settling per Senior DB Review.
  • DB-name guard: every script has `DO $$ ... IF current_database() <> 'identity_core' THEN RAISE EXCEPTION ... $$` so running against the wrong DB aborts immediately.
  • Sidecar baseline, not `pg_stat_reset`: keeps all the other useful stats counters intact.
  • Strict >10 MB size gate: only indexes larger than 10 MB are considered worth the soak.
  • Net expected drops post-7-day-soak: just 2 — `idx_api_keys_key_hash` (clean duplicate of UNIQUE) + `idx_voice_embeddings_ivfflat` (928 kB, largest single waste in `identity_core`).
  • All 13 unused `users` indexes: KEEP — table will grow past the seq-scan break-even (~2 k rows).
  • All 3 HNSW indexes on biometric_db: KEEP — pay off above ~10 k rows.
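
The two guards described above (database name and forbidden tables) could be combined in one `DO` block roughly as follows. This is a sketch under stated assumptions, not the code shipped in the PR: the `candidate` variable and the lookup through `pg_stat_user_indexes` are illustrative, while the database name, the exception message, and the four table names are taken from this description.

```sql
DO $$
DECLARE
    -- Forbidden tables as listed in the PR description.
    forbidden   text[] := ARRAY['webauthn_credentials', 'oauth2_clients',
                                'refresh_tokens', 'audit_logs'];
    -- Hypothetical placeholder for the index an operator intends to drop.
    candidate   text := 'todo_index_name_1';
    owner_table text;
BEGIN
    -- Guard 1: refuse to run anywhere except the intended database.
    IF current_database() <> 'identity_core' THEN
        RAISE EXCEPTION 'This script must run against identity_core, not %',
            current_database();
    END IF;

    -- Guard 2: refuse candidates that live on a forbidden table.
    SELECT s.relname::text
      INTO owner_table
      FROM pg_stat_user_indexes AS s
     WHERE s.schemaname = 'public'
       AND s.indexrelname = candidate;

    IF owner_table = ANY (forbidden) THEN
        RAISE EXCEPTION 'Index % belongs to forbidden table %',
            candidate, owner_table;
    END IF;
END
$$;
```

If the candidate index does not exist, `owner_table` stays NULL and the second check passes silently, which matches the no-op behaviour of `DROP INDEX IF EXISTS`.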

Test plan

What this supersedes

PR #61 was opened from a stale worktree base that pre-dated #57/#58/#59/#60, so its diff inadvertently carried 150+ unrelated files (archived docs reappearing at root, `.claude/*` agent leftovers, build artefacts). That branch + PR are now closed.

🤖 Generated with Claude Code

Adds the operator-facing assets for the ROADMAP-tracked unused-index
audit. Execution against prod is operator-controlled — nothing here
runs by itself.

  infra/RUNBOOK_UNUSED_INDEX_AUDIT.md  — main runbook (Day-0 / Day-1-7
    monitoring / Day-7 verification / Day-7+ DROP gate / rollback).
  infra/scripts/unused-index-baseline.sql  — Day-0 snapshot into
    `public.ops_unused_index_baseline` (sidecar table, NOT pg_stat_reset).
  infra/scripts/unused-index-delta.sql  — daily monitoring SQL.
  infra/scripts/unused-index-verify.sql  — Day-7 verifier (idx_scan-delta
    == 0 AND size > 10 MB).
  infra/scripts/unused-index-drop-template.sql  — DROP INDEX template
    with `BEGIN; ... ROLLBACK;` default so a naive `psql ... < file` is
    a no-op. Operator manually flips the tail to COMMIT.

Forbidden tables (hard-coded in every script): `webauthn_credentials`,
`oauth2_clients`, `refresh_tokens`, `audit_logs`.

Database-name guard: every script has
`DO $$ ... IF current_database() <> 'identity_core' THEN RAISE EXCEPTION ...`
so running against the wrong DB aborts immediately. (The earlier ROADMAP
cited `identity_core_db`; the live name is `identity_core`.)

Candidate inventory: ~25 indexes with idx_scan = 0 after 58+ live days,
sourced from `archive/2026-05/reviews/SENIOR_DB_REVIEW_2026-05-04.md`
Appendix C. Net expected drops post-7-day-soak: 2 indexes
(`idx_api_keys_key_hash` — clean duplicate of UNIQUE constraint;
`idx_voice_embeddings_ivfflat` — 928 kB, the largest single waste in
identity_core). All 13 unused `users` indexes are KEEP because the
table will grow past the seq-scan break-even (~2 k rows). All 3 HNSW
indexes on biometric_db are KEEP because they pay off above ~10 k rows.

Supersedes PR #61 which inadvertently carried 150+ unrelated files
from a stale worktree base.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings May 12, 2026 15:19
@ahmetabdullahgultekin ahmetabdullahgultekin merged commit 078a9be into master May 12, 2026
2 checks passed
@ahmetabdullahgultekin ahmetabdullahgultekin deleted the ops/unused-index-audit-runbook-2026-05-12 branch May 12, 2026 15:19

Copilot AI left a comment

Pull request overview

Adds an operator-facing runbook and helper SQL scripts to run a 7-day Postgres unused-index audit against the identity_core prod DB, plus updates the ROADMAP item to point at the new runbook.

Changes:

  • Added infra/RUNBOOK_UNUSED_INDEX_AUDIT.md detailing Day-0 baseline, Days 1–7 delta monitoring, Day-7 verification, and a gated drop/rollback process.
  • Added four SQL scripts (baseline, delta, verify, drop-template) to support the audit workflow with a DB-name guard and forbidden-table list.
  • Updated the ROADMAP unused-index audit bullet to reference the runbook and scripts.
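
For readers who want a feel for the Days 1–7 delta monitoring, a report along these lines would compare live counters with the baseline. Treat it as an illustrative sketch: the baseline columns (`indexrelid`, `relname`, `indexrelname`, `idx_scan`) and the action labels are assumptions, not the contents of `unused-index-delta.sql`.

```sql
SELECT b.relname,
       b.indexrelname,
       s.idx_scan - b.idx_scan                        AS idx_scan_delta,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
       CASE
           WHEN s.idx_scan - b.idx_scan = 0 THEN 'still unused: keep watching'
           WHEN s.idx_scan - b.idx_scan > 0 THEN 'in use: remove from candidates'
           ELSE 'negative delta: stats were reset, re-baseline'
       END                                            AS action  -- hypothetical labels
FROM public.ops_unused_index_baseline AS b
JOIN pg_stat_user_indexes AS s
  ON s.indexrelid = b.indexrelid
WHERE b.relname NOT IN ('webauthn_credentials', 'oauth2_clients',
                        'refresh_tokens', 'audit_logs')
ORDER BY idx_scan_delta, pg_relation_size(s.indexrelid) DESC;
```

The negative-delta branch matters because the cumulative counters can be zeroed by a statistics reset, in which case a comparison against the old baseline is no longer meaningful.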

Reviewed changes

Copilot reviewed 6 out of 6 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| `ROADMAP.md` | Rewrites the unused-index audit bullet to point at the new runbook/scripts. |
| `infra/RUNBOOK_UNUSED_INDEX_AUDIT.md` | New operator runbook for the 7-day audit workflow (baseline/delta/verify/drop/rollback). |
| `infra/scripts/unused-index-baseline.sql` | Creates/truncates a sidecar baseline table and captures a Day-0 snapshot. |
| `infra/scripts/unused-index-delta.sql` | Day-N delta report vs baseline with recommended action labels. |
| `infra/scripts/unused-index-verify.sql` | Day-7 strict verification output + rollback DDL extraction. |
| `infra/scripts/unused-index-drop-template.sql` | Gated DROP INDEX template wrapped in BEGIN/ROLLBACK by default. |
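
As a rough illustration of the Day-7 verification and rollback DDL extraction described for `unused-index-verify.sql`, the query below applies the two stated gates (scan delta of zero, size above 10 MB) and emits the `CREATE INDEX` statement needed to restore each candidate. The baseline column names are assumed, and the exclusion of unique and primary-key indexes is an extra safeguard added here that the PR does not mention.

```sql
SELECT s.schemaname,
       s.relname,
       s.indexrelname,
       pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
       -- Save this output before dropping: it recreates the index.
       pg_get_indexdef(s.indexrelid) || ';'           AS rollback_ddl
FROM public.ops_unused_index_baseline AS b
JOIN pg_stat_user_indexes AS s ON s.indexrelid = b.indexrelid
JOIN pg_index             AS i ON i.indexrelid = s.indexrelid
WHERE s.idx_scan - b.idx_scan = 0                         -- no scans since Day 0
  AND pg_relation_size(s.indexrelid) > 10 * 1024 * 1024   -- strict > 10 MB gate
  AND NOT i.indisunique                                   -- assumption: unique indexes enforce
  AND NOT i.indisprimary                                  -- constraints even with zero scans
  AND s.relname NOT IN ('webauthn_credentials', 'oauth2_clients',
                        'refresh_tokens', 'audit_logs')
ORDER BY pg_relation_size(s.indexrelid) DESC;
```

Capturing `rollback_ddl` before any drop gives the operator a ready-made statement for restoring the index if a query plan regresses afterwards.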


Comment thread ROADMAP.md
## Active wave — Ops + DB hygiene

(Diff context: the first bullet below is the removed line, the second is its replacement.)

- **Unused-index 7-day audit** — reset `pg_stat_user_indexes`, monitor for 7 days, then `DROP INDEX` confirmed-zero-scan ones (25+ candidates per Senior DB Appendix C). Caution: do not drop on `webauthn_credentials`, `oauth2_clients`, `refresh_tokens`, `audit_logs` until traffic patterns settle. **Not yet kicked off in this session.**
- **Unused-index 7-day audit** — runbook ready at `infra/RUNBOOK_UNUSED_INDEX_AUDIT.md` (2026-05-12). Uses a sidecar `public.ops_unused_index_baseline` snapshot (NOT `pg_stat_reset`), monitors deltas over 7 days, then `DROP INDEX` confirmed-zero-scan candidates > 10 MB only. Forbidden tables hard-coded in every script: `webauthn_credentials`, `oauth2_clients`, `refresh_tokens`, `audit_logs`. Day-0 / Day-7 / Drop template SQL in `infra/scripts/unused-index-{baseline,delta,verify,drop-template}.sql`. Operator runs Day-0 from runbook Step 2; agent does not touch prod. Candidate list (~25 per `archive/2026-05/reviews/SENIOR_DB_REVIEW_2026-05-04.md` Appendix C); net expected drops post-soak: `idx_api_keys_key_hash` (clean duplicate-of-UNIQUE) + `idx_voice_embeddings_ivfflat` (928 kB, largest waste), plus whatever Day-7 verification surfaces.
-- Effect: creates public.ops_unused_index_baseline and inserts a single
-- snapshot of pg_stat_user_indexes tagged with now().
-- Idempotency: re-running clears the prior baseline. If you need to keep
-- multiple snapshots, comment out the TRUNCATE on line 28.
IF current_database() <> 'identity_core' THEN
RAISE EXCEPTION 'This script must run against identity_core, not %', current_database();
END IF;
IF NOT EXISTS (SELECT 1 FROM pg_class WHERE relname = 'ops_unused_index_baseline') THEN
Comment on lines +42 to +43
-- have inserted them by mistake. The trigger below would catch it, but the
-- right answer is to NOT type them.
Comment on lines +61 to +67
-- DROP INDEX IF EXISTS public.idx_voice_embeddings_ivfflat;
DROP INDEX IF EXISTS public.TODO_INDEX_NAME_1;
\echo 'Dropped TODO_INDEX_NAME_1 (or no-op if already missing).'

-- 2) Optional next drop candidate. Comment out or duplicate as needed.
-- DROP INDEX IF EXISTS public.TODO_INDEX_NAME_2;
-- \echo 'Dropped TODO_INDEX_NAME_2 (or no-op if already missing).'
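
The dry-run-by-default behaviour of the drop template can be pictured with the psql sketch below. It is a simplified illustration rather than the file under review: the `lock_timeout` setting and `ON_ERROR_STOP` are additions assumed here, and `TODO_INDEX_NAME_1` is the template's own placeholder.

```sql
-- Stop at the first error instead of continuing to the next statement.
\set ON_ERROR_STOP on

BEGIN;

-- Assumption: bound how long the drop may wait for its lock. A plain
-- DROP INDEX takes an ACCESS EXCLUSIVE lock on the parent table.
SET LOCAL lock_timeout = '5s';

DROP INDEX IF EXISTS public.TODO_INDEX_NAME_1;
\echo 'Dropped TODO_INDEX_NAME_1 (or no-op if already missing).'

-- Default tail: nothing is persisted. The operator changes this line to
-- COMMIT only after the Day-7 verification has been reviewed.
ROLLBACK;
```

One consequence of this design is that `DROP INDEX CONCURRENTLY` cannot be used inside the template, because PostgreSQL does not allow it within a transaction block; the safety of the `BEGIN; ... ROLLBACK;` wrapper is traded against a short exclusive lock.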
Comment on lines +36 to +38
- **Skip indexes < 10 MB** in the Day-7 candidate list — the disk savings
aren't worth the rollback risk. Smaller indexes can be left for a future
cycle.
Comment on lines +309 to +310
- `infra/RUNBOOK_FLYWAY_REPAIR.md` — sibling runbook style guide.
- `infra/RUNBOOK_DR.md` — disaster recovery if rollback layer 3 is invoked.