
Kelsey Smuczynski edited this page Mar 27, 2026 · 2 revisions

Legacy Transfer Pipeline

Purpose

The transfers/ package migrates legacy NM_Aquifer and related source data into the current PostgreSQL/PostGIS schema. Despite the name, it is not a one-off historical artifact; it remains an active, operational part of the repository.


Main Entry Point

python -m transfers.transfer

Orchestration Model

transfers/transfer.py runs the pipeline in phases:

  1. Optional schema reset and rebuild
  2. Foundational transfers (parallel)
  3. Well transfer
  4. Non-well location-type transfers (parallel)
  5. Large parallel transfer group for independent domains
  6. Sequential chemistry and sensor-dependent stages
  7. Location cleanup

A separate path that transfers only continuous water levels is controlled by environment flags (see CONTINUOUS_WATER_LEVELS below).


Environment Toggles

The orchestrator reads many TRANSFER_* environment variables:

| Category | Variables |
| --- | --- |
| Data domains | TRANSFER_WELL_SCREENS, TRANSFER_SENSORS, TRANSFER_CONTACTS, TRANSFER_PERMISSIONS, TRANSFER_WATER_LEVELS, TRANSFER_CHEMISTRY_*, TRANSFER_NGWMN, TRANSFER_SURFACE_WATER, TRANSFER_WEATHER, TRANSFER_NON_WELL_THING_TYPES |
| Behavior | DROP_AND_REBUILD_DB, ERASE_AND_REBUILD, CLEANUP_LOCATIONS, CONTINUOUS_WATER_LEVELS |
| Performance | TRANSFER_LIMIT, TRANSFER_TEST_POINTIDS, TRANSFER_PARALLEL_WELLS, TRANSFER_WORKERS |

Note that .env.example does not list every toggle the code currently honors; the orchestrator source in transfers/transfer.py is the authoritative list.
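A minimal sketch of how such toggles are typically read follows. The helper names (env_flag, env_int) and the accepted truthy strings are assumptions for illustration; the exact parsing in transfers/transfer.py may differ.

```python
import os

def env_flag(name, default=False):
    # Treat common truthy strings as True; anything else as False.
    raw = os.environ.get(name)
    if raw is None:
        return default
    return raw.strip().lower() in {"1", "true", "yes", "on"}

def env_int(name, default=None):
    # Numeric toggles such as TRANSFER_LIMIT or TRANSFER_WORKERS.
    raw = os.environ.get(name)
    return int(raw) if raw not in (None, "") else default

os.environ["TRANSFER_LIMIT"] = "500"   # e.g. cap rows per table for a test run
limit = env_int("TRANSFER_LIMIT")
```

Centralizing the parsing in helpers like these keeps truthiness rules consistent across the many TRANSFER_* variables.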


Schema Reset Behavior

If DROP_AND_REBUILD_DB=true is set, the transfer flow runs these steps in order:

  1. Recreates the public schema
  2. Recreates the PostGIS extension
  3. Runs Alembic migrations
  4. Syncs full-text-search triggers
  5. Initializes lexicon data
  6. Initializes parameter data
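Because each step depends on the one before it (migrations need the schema, triggers and seed data need the migrated tables), the sequence must run strictly in order. The sketch below makes that ordering explicit; the step names are descriptive placeholders, and the real implementation issues DDL, Alembic, and seeding calls rather than invoking a passed-in callable.

```python
def run_schema_reset(execute):
    # `execute` stands in for the real per-step work (DDL, Alembic,
    # trigger sync, seed loading); each step must complete before the next.
    steps = [
        "recreate_public_schema",
        "recreate_postgis_extension",
        "run_alembic_migrations",
        "sync_fts_triggers",
        "init_lexicon_data",
        "init_parameter_data",
    ]
    for step in steps:
        execute(step)
    return steps
```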

Outputs

  • Logs: transfers/logs/
  • Metrics: transfers/metrics/
  • Optional upload of logs and metrics to a GCS bucket

Spatial Transformation

During import, coordinates are automatically converted from UTM (NAD83 / SRID 26913) to WGS84 (SRID 4326). Legacy contact records are also normalized via OwnerKey mapping and canonicalization.
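The same reprojection can be reproduced with pyproj, assuming it is installed; whether the pipeline itself uses pyproj or PostGIS's ST_Transform is not stated here, so treat this as an equivalent sketch, not the pipeline's code.

```python
from pyproj import Transformer

# NAD83 / UTM zone 13N (EPSG:26913) -> WGS84 lon/lat (EPSG:4326).
# always_xy=True keeps axis order as (easting, northing) -> (lon, lat).
to_wgs84 = Transformer.from_crs("EPSG:26913", "EPSG:4326", always_xy=True)

# Easting 500000 m sits exactly on zone 13's central meridian (105°W),
# so the result lands near lon -105 in central New Mexico.
lon, lat = to_wgs84.transform(500_000, 3_900_000)
```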


Performance Guidance

  • Avoid ORM-heavy bulk object creation for high-volume tables
  • Prefer SQLAlchemy Core inserts for large row counts
  • Keep data migrations idempotent and safe to re-run

These rules matter because many transfer tables contain very large row counts.
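The bulk-insert and idempotency rules can be illustrated with the pattern below. It uses stdlib sqlite3 so the sketch is self-contained; the real pipeline would use SQLAlchemy Core against PostgreSQL, where INSERT OR IGNORE becomes ON CONFLICT DO NOTHING. The table and row values are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE water_levels ("
    "point_id TEXT, ts TEXT, value REAL, PRIMARY KEY (point_id, ts))"
)

rows = [("MG-030", "2020-01-01", 12.3), ("MG-030", "2020-02-01", 12.1)]

# One bulk statement per batch instead of per-row ORM objects, and a
# conflict-ignoring insert so re-running the migration adds no duplicates.
for _ in range(2):  # simulate running the transfer twice
    conn.executemany("INSERT OR IGNORE INTO water_levels VALUES (?, ?, ?)", rows)

count = conn.execute("SELECT COUNT(*) FROM water_levels").fetchone()[0]
```

Running the load twice still leaves exactly two rows, which is the idempotency property the guidance asks for.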


Operational Caveats

  • The transfer script guards against accidentally targeting the ocotilloapi_test database
  • It does not guard against every other wrong database selection; always inspect .env before running
  • Neither a safe staging target nor source credentials are documented in-repo; confirm the target database with the team before running
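A guard of the kind described can be sketched as follows. The blocklist contents and function name are hypothetical; only the protection of ocotilloapi_test is stated in this page.

```python
from urllib.parse import urlsplit

# Hypothetical blocklist; the real check lives inside the transfer script.
PROTECTED_DBS = {"ocotilloapi_test"}

def check_target(db_url):
    # Extract the database name from a postgresql:// URL and refuse
    # to proceed if it is a protected database.
    dbname = urlsplit(db_url).path.lstrip("/")
    if dbname in PROTECTED_DBS:
        raise SystemExit(f"refusing to run transfer against {dbname}")
    return dbname
```

A check like this is cheap insurance, but it only catches names it knows about, which is why inspecting .env before each run remains necessary.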
