infra(traefik+ops): XFF strip + OPERATOR_ACTIONS 2026-05-12 #67
Merged
P1 hygiene from 2026-05-12 senior reviews (backend, DB, infra, security):
* infra/traefik: vendored copy of /opt/projects/infra/traefik/config/
with forwardedHeaders.trustedIPs: [] on both :80 and :443 entryPoints.
RateLimitInterceptor.getClientIP in identity-core-api consumes
`XFF.split(",")[0]` so the prior config (no forwardedHeaders block)
let an attacker bypass every per-IP bucket (login, MFA, biometric,
qr-generate) by setting their own X-Forwarded-For. Empty trustedIPs
causes Traefik to strip incoming XFF and write its own using the peer
IP. Internal Docker bridge (172.20.0.0/24) is NOT trusted because
external clients never connect from that range — only Docker-network
containers, and those don't set XFF. README.md documents the
vendored-vs-live split and the sync workflow.
* OPERATOR_ACTIONS_2026-05-12.md: 5 items agents shouldn't autonomously
execute. Per-item severity, blast radius, maintenance window,
dependencies, explicit commands:
1. audit_logs partman bootstrap (V57 was a silent no-op; runbook
at infra/RUNBOOK_AUDIT_LOG_PARTMAN.md prepped Option A image)
2. RLS theatre (V25 left FORCE commented; 9 tables relforcerowsecurity=f;
app role is postgres superuser → RLS bypassed)
3. web-app/.env.production still byte-identical to leaked literal
6bdedd2; live bundle is clean but rebuild-from-tree would regress
4. parent main fast-forward: master 220 ahead, main 134 ahead but
all already merged via PR #51 — `git push origin master:main
--force-with-lease` reconciles
5. HS512 kid hs-2026-04 revocation pending Team Auth-Java PR;
rebuild api container after merge
Companion api PR fix/2026-05-12-infra-hygiene ships V61 NOT NULL for
audit_logs.tenant_id (locks down the V59 backfill).
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
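The bypass described in the first bullet can be reproduced offline with a few lines of shell. This is only a sketch: `first_xff_entry` mimics the `XFF.split(",")[0]` logic attributed to `RateLimitInterceptor.getClientIP`, and `header_seen_by_backend` is a simplified, hypothetical model of a proxy that either passes the client-supplied header through (appending the peer IP) or discards it. Neither function is taken from identity-core-api or Traefik, and the peer address is a documentation-range placeholder.

```shell
# Illustrative model only; function names are hypothetical.

# Return everything before the first comma of an X-Forwarded-For value,
# i.e. the same element that XFF.split(",")[0] would select.
first_xff_entry() {
  xff_value="$1"
  printf '%s\n' "${xff_value%%,*}"
}

# Simplified model of the header a backend receives from the proxy.
#   $1 = peer IP of the TCP connection
#   $2 = X-Forwarded-For value supplied by the client (may be empty)
#   $3 = "passthrough" (client header kept, peer appended) or "strip"
header_seen_by_backend() {
  peer_ip="$1"
  client_xff="$2"
  mode="$3"
  if [ "$mode" = "passthrough" ] && [ -n "$client_xff" ]; then
    printf '%s, %s\n' "$client_xff" "$peer_ip"
  else
    printf '%s\n' "$peer_ip"
  fi
}

# Header passed through: the rate-limit key is whatever the client sent.
first_xff_entry "$(header_seen_by_backend 203.0.113.7 '9.9.9.9' passthrough)"  # prints 9.9.9.9

# Header stripped: the rate-limit key is the real peer IP.
first_xff_entry "$(header_seen_by_backend 203.0.113.7 '9.9.9.9' strip)"        # prints 203.0.113.7
```

In the pass-through case an attacker who varies the header per request lands in a fresh bucket each time, which is the behaviour the empty `trustedIPs` list is meant to rule out.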
This was referenced May 12, 2026
- **CRITICAL** — exposes a live, exploitable security or correctness gap.
- **HIGH** — drift between deployed config and committed config; reviewers
  cannot reason about prod from code.
- **MEDIUM** — hygiene + cosmetic; safe to defer but easy to land.
Comment on lines +1 to +5
# OPERATOR ACTIONS — 2026-05-12

Items surfaced by the 2026-05-12 senior reviews (backend, DB, infra, security)
that agents should not autonomously execute. Each is a checklist with explicit
commands, a maintenance-window estimate, and explicit dependencies. Severity
| ``` | ||

**Blast radius.**
A SQL-injection (or a deliberate misuse of `JdbcTemplate.queryForList`)
git merge-base --is-ancestor origin/main origin/master \
  && echo "OK: main is an ancestor of master, fast-forward safe."
# Apply:
git push origin master:main --force-with-lease
Comment on lines +24 to +34
# 2. Validate (Traefik watches dynamic.yml live; traefik.yml requires restart)
docker compose -f /opt/projects/infra/traefik/docker-compose.yml \
  --env-file /opt/projects/infra/traefik/.env config

# 3. Apply
#    dynamic.yml changes: zero-restart, picked up via inotify (`watch: true`)
#    traefik.yml changes: require container restart
docker compose -f /opt/projects/infra/traefik/docker-compose.yml \
  --env-file /opt/projects/infra/traefik/.env restart traefik

# 4. Verify access log writes peer IP, not client-supplied XFF
docker logs traefik 2>&1 | tail -20
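Step 4 can also be checked mechanically rather than by eye. The sketch below assumes the access log is available as a file in Traefik's default common log format, where the first whitespace-separated field is the client address; `count_client_ip` is a hypothetical helper, and a JSON-format log would need a different parser.

```shell
# Hypothetical helper: count access-log lines whose first field (the client
# address in common log format) equals the given IP.
#   $1 = path to the access log file
#   $2 = IP address to look for
count_client_ip() {
  log_file="$1"
  ip="$2"
  awk -v ip="$ip" '$1 == ip { n++ } END { print n + 0 }' "$log_file"
}

# Usage sketch (the path is an assumption; adjust it to the deployment).
# Run after sending a request with a spoofed "X-Forwarded-For: 9.9.9.9".
# A count of 0 for the spoofed address is the expected result.
# count_client_ip /var/log/traefik/access.log 9.9.9.9
```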
Comment on lines +199 to +200
is now `API_KEY_SECRET=fcb06b7…` (verified by the 2026-05-12 security
review). However the on-disk template at
ahmetabdullahgultekin added a commit that referenced this pull request on May 28, 2026
Low-risk doc/config polish for items Copilot flagged on PR #67 (and PR #69 where those files reached master). No behavior change to running services; the only executable change is a more-robust docs-site healthcheck path.
- archive/.../OPERATOR_ACTIONS_2026-05-12.md:
  - redact partial live secret (API_KEY_SECRET=fcb06b7… → <redacted>)
  - main update: normal fast-forward `git push origin master:main`, reserve --force-with-lease for documented recovery only
  - add LOW to the severity legend (items 9-11 use it)
  - make item-count self-reference consistent (states 11; notes five→11 growth)
  - grammar: "a deliberately misuse" → "a deliberate misuse"
- docs-site/html/identity/index.html: fallback copy now says the OpenAPI spec is publicly available at /identity/openapi.json (it ships public)
- landing-website/src/index.css: comment now accurately describes the locale-aware :lang(en) uppercasing; drop the false belt-and-braces / codepoint-forcing claim and the duplicate text-transform line
- docs-site/docker-compose.prod.yml: healthcheck probes /health (the dedicated nginx endpoint) instead of /
- infra/traefik/README.md: add a Traefik-config dry-run validate step (compose config only validates the Compose file) and note access logs go to /var/log/traefik/access.log per accessLog.filePath, not stdout
Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
- `forwardedHeaders.trustedIPs: []` on `:80` and `:443` entryPoints. Traefik is directly internet-facing (no upstream proxy), and `RateLimitInterceptor.getClientIP` in identity-core-api consumes `XFF.split(",")[0]`, so the prior config let an attacker bypass every per-IP bucket (login, MFA, biometric, qr-generate) by setting their own `X-Forwarded-For`. Empty `trustedIPs` causes Traefik to strip incoming XFF and write its own using the peer IP.

Files
- `infra/traefik/config/traefik.yml`: vendored copy of `/opt/projects/infra/traefik/config/traefik.yml` with the XFF hardening applied.
- `infra/traefik/config/dynamic.yml`: vendored copy (no change; mirrored for parity).
- `infra/traefik/README.md`: explains the vendored-vs-live split (live config lives at `/opt/projects/infra/traefik/` in the `/opt/projects/.git` local repo) and the sync workflow.
- `OPERATOR_ACTIONS_2026-05-12.md`: the five-item operator checklist.

OPERATOR_ACTIONS items
1. success=t but the live `pgvector/pgvector:pg17` image lacks `pg_partman`; the migration's first guard `RAISE WARNING` + `RETURN`ed before any work. `audit_logs.relkind='r'`, 1168 rows, no inheritance children. Custom image recipe at `/opt/projects/infra/RUNBOOK_AUDIT_LOG_PARTMAN.md`.
2. `FORCE ROW LEVEL SECURITY` commented out; every policy is `OR current_tenant_id() IS NULL` fail-open; app connects as `postgres` superuser; 9 RLS-enabled tables all have `relforcerowsecurity=f`. Requires non-superuser app role + V62 migration + JDBC URL flip.
3. 6bdedd2. Live key has been rotated and the live bundle does NOT include the literal (audited), but rebuild-from-this-tree would regress. Operator chooses between placeholder-on-disk or history rewrite.
4. `git push origin master:main --force-with-lease` reconciles.
5. `revoked-kids: [hs-2026-04]` to `application-prod.yml`. After their PR merges, rebuild api container.

Companion PR
- `identity-core-api#99`: V61 NOT NULL constraint on `audit_logs.tenant_id`.

Test plan
- `docker compose -f /opt/projects/infra/traefik/docker-compose.yml --env-file /opt/projects/infra/traefik/.env config` exits zero on the new traefik.yml.
- `curl -sS -H "X-Forwarded-For: 9.9.9.9" https://api.fivucsas.com/actuator/health` and confirm Traefik's `/var/log/traefik/access.log` records the real peer IP, not `9.9.9.9`. Cross-check by tailing the identity-core-api container log for the same request; the `clientIp=` field should show the peer IP, not `9.9.9.9`.
- `curl -sS -H "X-Forwarded-For: 9.9.9.9" https://api.fivucsas.com/auth/login` 10x rapidly from one host; confirm the rate limit triggers based on peer IP (it should), not the attacker-supplied 9.9.9.9. Without this fix, the 10 requests would each appear to come from a different IP if the attacker varied XFF per request.
- `traefik.yml` changes are NOT picked up via the file-watcher (`watch: true` applies to dynamic.yml only). Operator must `docker compose ... restart traefik` after syncing the vendored copy to `/opt/projects/infra/traefik/config/traefik.yml`. The README documents this.

🤖 Generated with Claude Code
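
The rate-limit check in the test plan could be scripted along these lines. This is a sketch under stated assumptions: `probe_rate_limit` and `count_429` are hypothetical helper names, the spoofed addresses come from the 198.51.100.0/24 documentation range, and it assumes the limiter answers with HTTP 429 once a bucket is exhausted, which the PR text does not state.

```shell
# Hypothetical helper: send N requests to a URL, each carrying a different
# spoofed X-Forwarded-For value, and print one HTTP status code per line.
#   $1 = target URL
#   $2 = number of requests (defaults to 10)
probe_rate_limit() {
  url="$1"
  total="${2:-10}"
  i=1
  while [ "$i" -le "$total" ]; do
    curl -sS -o /dev/null -w '%{http_code}\n' \
      -H "X-Forwarded-For: 198.51.100.$i" "$url"
    i=$((i + 1))
  done
}

# Hypothetical helper: count status lines on stdin that are exactly 429.
# grep -c exits non-zero when nothing matches, so the status is masked.
count_429() {
  grep -c '^429$' || true
}

# Usage sketch (performs real network requests, so it is left commented out):
# probe_rate_limit https://api.fivucsas.com/auth/login 10 | count_429
```

With the fix in place, varying the spoofed header should not help the client, so the count would be expected to be non-zero whenever the per-IP limit is lower than the number of requests sent. A count of zero under those conditions would suggest the limiter is still keyed on the client-supplied value.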