Skip to content

Normalize identity.birthday to ISO 8601 date (YYYY-MM-DD) across all members #285

@rubin110

Description

@rubin110

Background

The identity.birthday JSONB field is stored in two coexisting formats in production:

Format n %
YYYY-MM-DDTHH:MM:SS (19 ch) 2972 94%
YYYY-MM-DD (10 ch) 158 5%
JSON null 41 1%

The current application code expects birthdays in ISO 8601 date format (YYYY-MM-DD, 10 chars). The 19-char datetime form is legacy data from the Wild Apricot import that has never been normalized.

Code expects YYYY-MM-DD

Silent bugs caused by the legacy format

  1. isOver21() silently returns false for the 2972 legacy members — the "21+" badge in admin UI never shows for them regardless of actual age.
  2. validate_age_18_or_older raises ValueError if a legacy birthday is re-submitted through the admin portal without first being trimmed to 10 chars.
  3. strptime("%Y-%m-%d") anywhere in the codebase will fail on the 19-char form. Any code path that round-trips a legacy birthday hits this.

Migration plan

Step 1 — Confirm candidate count (read-only)

```sql
SELECT count(*) AS will_update
FROM member
WHERE jsonb_typeof(identity->'birthday') = 'string'
AND identity->>'birthday' ~ '^\d{4}-\d{2}-\d{2}T';
```

Expected: ~2972 (as of 2026-05-23 fingerprint).

Step 2 — Take a backup

```bash
docker exec dh_db pg_dump -U dh -d deepharbor --table=member --data-only --column-inserts > /tmp/member_backup_pre_birthday_normalize.sql
```

Step 3 — (Optional) Pause DHDispatcher

The migration will fire `log_member_changes()` per row, creating ~2972 `member_changes` rows. DHADController may then attempt to sync birthday changes to AD/B2C. Consider stopping the dispatcher temporarily:

```bash
docker compose -f /opt/docker/deepharbor/docker-compose.yml stop dh_dispatcher
```

…and restarting after Step 5 verification passes.

Step 4 — Run the migration

```sql
BEGIN;

UPDATE member
SET identity = jsonb_set(
identity,
'{birthday}',
to_jsonb(substring(identity->>'birthday', 1, 10))
)
WHERE jsonb_typeof(identity->'birthday') = 'string'
AND identity->>'birthday' ~ '^\d{4}-\d{2}-\d{2}T';

-- Verify row count matches Step 1 expectation before commit
COMMIT;
```

Step 5 — Verify

```sql
-- Should return 0
SELECT count(*) AS still_dirty
FROM member
WHERE jsonb_typeof(identity->'birthday') = 'string'
AND identity->>'birthday' !~ '^\d{4}-\d{2}-\d{2}$';

-- Format distribution should show only 10-char strings + JSON null
SELECT
CASE
WHEN jsonb_typeof(identity->'birthday') = 'null' THEN 'json_null'
WHEN identity->>'birthday' ~ '^\d{4}-\d{2}-\d{2}$' THEN 'iso_date_10ch'
ELSE 'unexpected'
END AS state,
count(*) AS n
FROM member
WHERE jsonb_typeof(identity) = 'object' AND identity ? 'birthday'
GROUP BY state;
```

Expected: iso_date_10ch = 3130, json_null = 41, unexpected = 0.

Step 6 — Restart dispatcher (if paused)

```bash
docker compose -f /opt/docker/deepharbor/docker-compose.yml start dh_dispatcher
```

Out of scope

  • The 41 JSON-null birthdays — these are legitimate "no birthday on file" cases, leave as-is.
  • Future writes — already enforced as YYYY-MM-DD by FIELD_VALIDATORS and <input type="date">.
  • Synthetic seed data — the simulator should still produce the mixed format until this migration runs in prod, so tests can catch code that doesn't tolerate the legacy form. After migration, update the seed.

Source

Findings from prod fingerprinting session 2026-05-23 (.claude/notes/2026-05-23-prod-fingerprint-findings.md).

Metadata

Metadata

Assignees

Type

No fields configured for Task.

Projects

Status

Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions