-
Notifications
You must be signed in to change notification settings - Fork 4
feat: MinorTraceChemistry backfill job (#600) #640
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Open
kbighorse
wants to merge
27
commits into
staging
Choose a base branch
from
600-minor-trace-chemistry-backfill
base: staging
Could not load branches
Branch not found: {{ refName }}
Loading
Could not load tags
Nothing to show
Loading
Are you sure you want to change the base?
Some commits from the old base branch may be removed from the timeline,
and old review comments may become outdated.
Open
Changes from all commits
Commits
Show all changes
27 commits
Select commit
Hold shift + click to select a range
2491e15
feat: implement Radionuclides backfill job and BDD step definitions
kbighorse 90e87b7
fix: harden chemistry backfill test cleanup
kbighorse 03df1f1
fix: harden backfill robustness and remove misleading batch_size
kbighorse 2cba0ae
fix: make migration idempotent, remove batch-size drift, scope test c…
kbighorse 20cee26
fix: harden migration FK lookup, add row-error tracebacks, scope test…
kbighorse 56fa0a6
Formatting changes
kbighorse c1d1b3f
fix: scope sample assertions, normalize existing_keys case, broaden r…
kbighorse 848524a
Formatting changes
kbighorse 4d867ee
Merge branch 'staging' into 558-radionuclides-backfill
kbighorse ac44f6b
fix(tests): scope analysis method cleanup to only backfill-created rows
kbighorse 10f3e41
fix(tests): clear backfill tracking after scenario cleanup
kbighorse 1f09e20
fix: deterministic volume handling and explicit CLI arg rejection
kbighorse ae60c1f
Formatting changes
kbighorse 243f5cf
Merge remote-tracking branch 'origin/staging' into 558-radionuclides-…
kbighorse f7cef22
Merge branch '558-radionuclides-backfill' of https://github.com/DataI…
kbighorse 09c31ff
fix: make pg_cron optional for local development
kbighorse 72cdf83
fix: make pg_cron optional for local development
kbighorse 0acbfb6
Merge branch '558-radionuclides-backfill' of https://github.com/DataI…
kbighorse c906134
Merge staging into 600-minor-trace-chemistry-backfill
kbighorse 7b92d1f
feat: implement MinorTraceChemistry backfill job and BDD steps
kbighorse 6f6f098
Formatting changes
kbighorse 9bc0e08
fix: use ug/L default_unit for minor trace parameters, warn on NULL A…
kbighorse 5b05b22
fix: address PR review comments from Copilot
kbighorse 2d6cd6e
fix: add missing AnalysisDate to radionuclides BDD scenarios
kbighorse 9fb6905
fix: address second round of Copilot review comments
kbighorse c265328
Formatting changes
kbighorse f96f5e7
chore: move datetime imports to module level, add logger.warning for …
kbighorse File filter
Filter by extension
Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
There are no files selected for viewing
30 changes: 30 additions & 0 deletions
30
alembic/versions/1f1bdf75e0c0_merge_migration_heads_after_staging_.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,30 @@ | ||
| """merge migration heads after staging merge | ||
|
|
||
| Revision ID: 1f1bdf75e0c0 | ||
| Revises: 545a5b77e5e8, t6u7v8w9x0y1 | ||
| Create Date: 2026-04-03 10:12:48.253856 | ||
|
|
||
| """ | ||
|
|
||
| from typing import Sequence, Union | ||
|
|
||
| from alembic import op | ||
| import geoalchemy2 | ||
| import sqlalchemy as sa | ||
| import sqlalchemy_utils | ||
|
|
||
| # revision identifiers, used by Alembic. | ||
| revision: str = "1f1bdf75e0c0" | ||
| down_revision: Union[str, Sequence[str], None] = ("545a5b77e5e8", "t6u7v8w9x0y1") | ||
| branch_labels: Union[str, Sequence[str], None] = None | ||
| depends_on: Union[str, Sequence[str], None] = None | ||
|
|
||
|
|
||
| def upgrade() -> None: | ||
| """Upgrade schema.""" | ||
| pass | ||
|
|
||
|
|
||
| def downgrade() -> None: | ||
| """Downgrade schema.""" | ||
| pass |
204 changes: 204 additions & 0 deletions
204
alembic/versions/545a5b77e5e8_add_chemistry_backfill_columns_to_.py
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,204 @@ | ||
| """add chemistry backfill columns to observation and sample | ||
|
|
||
| Revision ID: 545a5b77e5e8 | ||
| Revises: d5e6f7a8b9c0 | ||
| Create Date: 2026-02-27 11:30:45.380002 | ||
|
|
||
| """ | ||
|
|
||
| from typing import Sequence, Union | ||
|
|
||
| from alembic import op | ||
| import sqlalchemy as sa | ||
|
|
||
| # revision identifiers, used by Alembic. | ||
| revision: str = "545a5b77e5e8" | ||
| down_revision: Union[str, Sequence[str], None] = "d5e6f7a8b9c0" | ||
| branch_labels: Union[str, Sequence[str], None] = None | ||
| depends_on: Union[str, Sequence[str], None] = None | ||
|
|
||
|
|
||
| def upgrade() -> None: | ||
| """Add chemistry backfill columns to observation and sample tables.""" | ||
| # Observation table: 4 new columns | ||
| op.add_column( | ||
| "observation", | ||
| sa.Column( | ||
| "nma_pk_chemistryresults", | ||
| sa.String(), | ||
| nullable=True, | ||
| comment="NM_Aquifer GlobalID for chemistry results — transfer audit and idempotent upsert key", | ||
| ), | ||
| ) | ||
| op.add_column( | ||
| "observation", | ||
| sa.Column( | ||
| "detect_flag", | ||
| sa.Boolean(), | ||
| nullable=True, | ||
| comment="True=detected, False=below detection limit (legacy Symbol '<'), None=no qualifier", | ||
| ), | ||
| ) | ||
| op.add_column( | ||
| "observation", | ||
| sa.Column( | ||
| "uncertainty", | ||
| sa.Float(), | ||
| nullable=True, | ||
| comment="Measurement uncertainty for the observation value", | ||
| ), | ||
| ) | ||
| op.add_column( | ||
| "observation", | ||
| sa.Column( | ||
| "analysis_agency", | ||
| sa.String(), | ||
| nullable=True, | ||
| comment="Agency or lab that performed the analysis", | ||
| ), | ||
| ) | ||
| op.create_unique_constraint( | ||
| "uq_observation_nma_pk_chemistryresults", | ||
| "observation", | ||
| ["nma_pk_chemistryresults"], | ||
| ) | ||
|
|
||
| # Observation version table (sqlalchemy-continuum) | ||
| op.add_column( | ||
| "observation_version", | ||
| sa.Column( | ||
| "nma_pk_chemistryresults", | ||
| sa.String(), | ||
| autoincrement=False, | ||
| nullable=True, | ||
| comment="NM_Aquifer GlobalID for chemistry results — transfer audit and idempotent upsert key", | ||
| ), | ||
| ) | ||
| op.add_column( | ||
| "observation_version", | ||
| sa.Column( | ||
| "detect_flag", | ||
| sa.Boolean(), | ||
| autoincrement=False, | ||
| nullable=True, | ||
| comment="True=detected, False=below detection limit (legacy Symbol '<'), None=no qualifier", | ||
| ), | ||
| ) | ||
| op.add_column( | ||
| "observation_version", | ||
| sa.Column( | ||
| "uncertainty", | ||
| sa.Float(), | ||
| autoincrement=False, | ||
| nullable=True, | ||
| comment="Measurement uncertainty for the observation value", | ||
| ), | ||
| ) | ||
| op.add_column( | ||
| "observation_version", | ||
| sa.Column( | ||
| "analysis_agency", | ||
| sa.String(), | ||
| autoincrement=False, | ||
| nullable=True, | ||
| comment="Agency or lab that performed the analysis", | ||
| ), | ||
| ) | ||
|
|
||
| # Sample table: 3 new columns | ||
| op.add_column( | ||
| "sample", | ||
| sa.Column( | ||
| "nma_pk_chemistrysample", | ||
| sa.String(), | ||
| nullable=True, | ||
| comment="NM_Aquifer SamplePtID for chemistry samples — transfer audit key", | ||
| ), | ||
| ) | ||
| op.add_column( | ||
| "sample", | ||
| sa.Column( | ||
| "volume", | ||
| sa.Float(), | ||
| nullable=True, | ||
| comment="Volume of the sample collected", | ||
| ), | ||
| ) | ||
| op.add_column( | ||
| "sample", | ||
| sa.Column( | ||
| "volume_unit", | ||
| sa.String(), | ||
| nullable=True, | ||
| comment="Unit for the sample volume (e.g. mL, L)", | ||
| ), | ||
| ) | ||
| op.create_unique_constraint( | ||
| "uq_sample_nma_pk_chemistrysample", | ||
| "sample", | ||
| ["nma_pk_chemistrysample"], | ||
| ) | ||
|
|
||
| # Drop stale thing_id column from NMA_Radionuclides (model no longer defines it; | ||
| # relationships go through NMA_Chemistry_SampleInfo.thing_id instead). | ||
| # Guard against environments where the column was already removed. | ||
| conn = op.get_bind() | ||
| has_thing_id = conn.execute( | ||
| sa.text( | ||
| "SELECT 1 FROM information_schema.columns " | ||
| "WHERE table_name = 'NMA_Radionuclides' AND column_name = 'thing_id'" | ||
| ) | ||
| ).scalar() | ||
| if has_thing_id: | ||
| # FK name may differ across environments; look it up dynamically. | ||
| fks = sa.inspect(conn).get_foreign_keys("NMA_Radionuclides") | ||
| for fk in fks: | ||
| if "thing_id" in fk["constrained_columns"] and fk.get("name"): | ||
| op.drop_constraint(fk["name"], "NMA_Radionuclides", type_="foreignkey") | ||
| op.drop_column("NMA_Radionuclides", "thing_id") | ||
|
|
||
|
|
||
| def downgrade() -> None: | ||
| """Remove chemistry backfill columns.""" | ||
| # Restore NMA_Radionuclides.thing_id (add nullable, backfill, then enforce) | ||
| op.add_column( | ||
| "NMA_Radionuclides", | ||
| sa.Column( | ||
| "thing_id", | ||
| sa.Integer(), | ||
| nullable=True, | ||
| ), | ||
| ) | ||
| op.execute( | ||
| 'UPDATE "NMA_Radionuclides" r ' | ||
| "SET thing_id = csi.thing_id " | ||
| 'FROM "NMA_Chemistry_SampleInfo" csi ' | ||
| "WHERE r.chemistry_sample_info_id = csi.id" | ||
| ) | ||
| op.alter_column("NMA_Radionuclides", "thing_id", nullable=False) | ||
| op.create_foreign_key( | ||
| "NMA_Radionuclides_thing_id_fkey", | ||
| "NMA_Radionuclides", | ||
| "thing", | ||
| ["thing_id"], | ||
| ["id"], | ||
| ondelete="CASCADE", | ||
| ) | ||
|
|
||
| op.drop_constraint("uq_sample_nma_pk_chemistrysample", "sample", type_="unique") | ||
| op.drop_column("sample", "volume_unit") | ||
| op.drop_column("sample", "volume") | ||
| op.drop_column("sample", "nma_pk_chemistrysample") | ||
|
|
||
| op.drop_column("observation_version", "analysis_agency") | ||
| op.drop_column("observation_version", "uncertainty") | ||
| op.drop_column("observation_version", "detect_flag") | ||
| op.drop_column("observation_version", "nma_pk_chemistryresults") | ||
|
|
||
| op.drop_constraint( | ||
| "uq_observation_nma_pk_chemistryresults", "observation", type_="unique" | ||
| ) | ||
| op.drop_column("observation", "analysis_agency") | ||
| op.drop_column("observation", "uncertainty") | ||
| op.drop_column("observation", "detect_flag") | ||
| op.drop_column("observation", "nma_pk_chemistryresults") |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Oops, something went wrong.
Oops, something went wrong.
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ug/Lis used as the default unit for MinorTraceChemistry Parameters/Observations, but it is not present incore/lexicon.json(andParameter.default_unit/Observation.unitare lexicon-backed). This will cause FK/validation failures when running the backfill against a freshly initialized database (outside the BDD test seeding, which insertsug/Lmanually). Add aug/Lunit term to the lexicon (or switch to an existing unit term already seeded).