Well Inventory Import Reference.md

Well Inventory Import Reference

This page documents the well inventory CSV ingestion behavior in detail, covering the import workflow, transaction model, field mappings, validation rules, and operational notes.

For a broader overview of all CSV import surfaces, see CSV Ingestion and Validation.

Workflow Overview

1. CSV read

The CLI reads the source file as bytes
UTF-8 with BOM is supported via utf-8-sig decoding so the first header is parsed correctly

2. File validation

File must be a .csv
File cannot be empty and must contain at least one data row
Delimiter must be a comma
Duplicate header names are rejected
Maximum row count: 2000

3. Row parsing and schema validation

Each row is parsed into WellInventoryRow in schemas/well_inventory.py
well_name_point_id supports auto-generation if blank or if a placeholder token such as NM-XXXX or WL-XXXX is provided
Validation is best-effort: valid rows continue, invalid rows are collected into validation_errors, and one bad row does not abort the full file

4. Import execution

The import runs inside one outer DB transaction
Each row is processed inside a nested transaction (savepoint)
A failed row rolls back only its own work; a successful row remains pending until the final commit
Monitoring Plan groups are created lazily — a fully failed project does not leave behind an empty group

5. Final commit and response

The outer transaction commits once after all rows are processed. The response includes:

summary.total_rows_processed
summary.total_rows_imported
summary.validation_errors_or_warnings
validation_errors
wells

6. CLI progress output

Interactive terminal sessions emit progress messages during validation and import (validation progress, project-level and per-row import starts, periodic counts, final completion counts). Progress output is suppressed for non-interactive sessions.

Transaction and Rerun Behavior

The importer is best-effort, single-commit, and idempotent on rerun.

Best-effort row handling — rows are imported independently via savepoints; a row-level failure does not abort other rows.

Single final commit — successful rows are not visible to other DB sessions until session.commit(). Long-running imports appear to "land all at once" from another DB connection.

Idempotent reruns — if a row appears to have been imported already, the importer skips recreating the related records. Rerun behavior is skip existing, not update existing. Re-importing a CSV with new values for already-imported rows does not backfill or overwrite those fields.

Field Mapping Reference

For fields that persist to multiple places, the same source column is repeated on multiple rows so each destination remains easy to scan.

Source CSV Column	Schema Field	DB Table	DB Column	Validation / Normalization	Persistence Behavior
`project`	`project`	`group`	`name`	Required.	Finds an existing `Monitoring Plan` group or creates one lazily when the first row for that project imports successfully.
`well_name_point_id`	`well_name_point_id`	`thing`	`name`	Required unless blank or placeholder token triggers auto-generation. Must be unique per file.	Primary well identifier.
`site_name`	`site_name`	`thing_id_link`	`alternate_id`	Blank → `None`.	Stored as NMBGMR alternate ID with relation `same_as`.
`ose_well_record_id`	`ose_well_record_id`	`thing_id_link`	`alternate_id`	Blank → `None`.	Stored as NMOSE alternate ID with relation `same_as`.
`date_time`	`date_time`	`thing`	`first_visit_date`	Required. Must be timezone-naive ISO datetime. Converted to America/Denver, then stored as UTC-aware.	Sets the well's first visit date.
`date_time`	`date_time`	`field_event`	`event_date`	Required.	Used as the initial well inventory field-event timestamp.
`utm_easting`, `utm_northing`, `utm_zone`	`utm_easting`, `utm_northing`, `utm_zone`	`location`	`point`	Required. `utm_zone` must be `12N` or `13N`. Coordinates must resolve inside New Mexico.	Transformed from UTM to WGS84 and stored as the location geometry.
`elevation_ft`	`elevation_ft`	`location`	`elevation`	Optional. Blank → `None`.	Converted from feet to meters. Missing values currently store as `0.0`.
`public_availability_acknowledgement`	`public_availability_acknowledgement`	`location`	`release_status`	Blank → `None`.	`True → public`, `False → private`, blank/missing → `draft`.
`elevation_method`	`elevation_method`	`data_provenance`	`collection_method`	Lexicon-backed, case-insensitive, whitespace-tolerant. Blank → `None`.	Creates location elevation provenance. Missing value falls back to `Unknown`.
`total_well_depth_ft`	`total_well_depth_ft`	`thing`	`well_depth`	Optional. Blank → `None`.	Passed through `CreateWell`.
`depth_source`	`depth_source`	`data_provenance`	`origin_type`	Lexicon-backed, case-insensitive, whitespace-tolerant. Blank → `None`.	Creates well depth provenance. Also used when building the historic depth note.
`date_drilled`	`date_drilled`	`thing`	`well_completion_date`	Optional. Blank → `None`. Must be past or today.	Passed through `CreateWell`.
`completion_source`	`completion_source`	`data_provenance`	`origin_type`	Lexicon-backed, case-insensitive, whitespace-tolerant. Blank → `None`.	Creates completion-date provenance.
`measuring_point_height_ft`	`measuring_point_height_ft`	`measuring_point_history`	`measuring_point_height`	Optional. Blank → `None`. If both `measuring_point_height_ft` and `mp_height` are present and unequal, the row fails.	Primary well-level measuring point height.
`measuring_point_description`	`measuring_point_description`	`measuring_point_history`	`measuring_point_description`	Blank → `None`.	Stored with the measuring-point history row.
`well_purpose`	`well_purpose`	`well_purpose`	`purpose`	Lexicon-backed. Blank → `None`.	Adds a well purpose row.
`well_purpose_2`	`well_purpose_2`	`well_purpose`	`purpose`	Lexicon-backed. Blank → `None`.	Adds a second well purpose row when present.
`well_status` / `well_hole_status`	`well_status`	`status_history`	`status_value`	Alias supported via `well_hole_status`. Lexicon-backed. Blank → `None`.	Creates `Well Status` history.
`monitoring_status`	`monitoring_status`	`status_history`	`status_value`	Lexicon-backed. Blank → `None`.	Creates `Monitoring Status` history.
`monitoring_frequency`	`monitoring_frequency`	`monitoring_frequency_history`	`monitoring_frequency`	Lexicon-backed. Blank → `None`. Legacy `Complete` normalized to `monitoring_frequency = None` and `monitoring_status = Not currently monitored`.	Creates monitoring-frequency history only when a real frequency remains after normalization.
`is_open`	`is_open`	`status_history`	`status_value`	Blank → `None`.	Creates `Open Status` history as `Open` or `Closed`.
`datalogger_possible`	`datalogger_possible`	`status_history`	`status_value`	Blank → `None`.	Creates `Datalogger Suitability Status` history.
`specific_location_of_well`	`specific_location_of_well`	`note`	`content`	Blank → `None`.	Stored as `Access` note on the well.
`contact_special_requests_notes`	`contact_special_requests_notes`	`note`	`content`	Blank → `None`.	Stored as `General` note on the well and on each created contact.
`sampling_scenario_notes`	`sampling_scenario_notes`	`note`	`content`	Blank → `None`.	Stored as `Sampling Procedure` note on the well.
`well_measuring_notes`	`well_measuring_notes`	`note`	`content`	Blank → `None`.	Stored as `Sampling Procedure` note on the well.
`well_notes`	`well_notes`	`note`	`content`	Blank → `None`.	Stored as `General` note on the well.
`water_notes`	`water_notes`	`note`	`content`	Blank → `None`.	Stored as `Water` note on the well.
`directions_to_site`	`directions_to_site`	`note`	`content`	Blank → `None`.	Stored as `Directions` note on the location.
`historic_depth_to_water_ft`	`historic_depth_to_water_ft`	`note`	`content`	Optional. Blank → `None`.	When present, a formatted `Historical` note is created on the well using `depth_source` as the source text.
`field_staff`	`field_staff`	`contact`	`name`	Required.	Reused or created as NMBGMR `Field Event Participant` contact.
`field_staff`	`field_staff`	`field_event_participant`	`contact_id`	Required.	Linked to field event with participant role `Lead`.
`field_staff_2` / `field_staff_3`	`field_staff_2` / `field_staff_3`	`field_event_participant`	`contact_id`	Blank → `None`.	Linked to field event with participant role `Participant`.
`contact_n_name`, `contact_n_organization`, `contact_n_role`, `contact_n_type`	`contact_n_*`	`contact`	`name`, `organization`, `role`, `contact_type`	If any contact data is present, at least one of name/organization is required, plus role and type. Organization is lexicon-backed.	Creates or reuses a contact, then links it to the well.
`contact_n_phone_i`, `contact_n_phone_i_type`	`contact_n_phone_i`, `contact_n_phone_i_type`	`phone`	`phone_number`, `phone_type`	Phone format validated. Type required when phone is present.	Added under the created contact.
`contact_n_email_i`, `contact_n_email_i_type`	`contact_n_email_i`, `contact_n_email_i_type`	`email`	`email`, `email_type`	Email format validated. Type required when email is present.	Added under the created contact.
`contact_n_address_i_*`	`contact_n_address_i_*`	`address`	address fields	If any address fields are present, line 1, city, state, postal code, and type are all required. State and ZIP are validated.	Added under the created contact.
`result_communication_preference`	`result_communication_preference`	`note`	`content`	Blank → `None`.	Stored as `Communication` note on each created contact.
`repeat_measurement_permission`	`repeat_measurement_permission`	`permission_history`	`permission_allowed`	Blank → `None`. Requires a contact.	Creates `Water Level Sample` permission.
`sampling_permission`	`sampling_permission`	`permission_history`	`permission_allowed`	Blank → `None`. Requires a contact.	Creates `Water Chemistry Sample` permission.
`datalogger_installation_permission`	`datalogger_installation_permission`	`permission_history`	`permission_allowed`	Blank → `None`. Requires a contact.	Creates `Datalogger Installation` permission.
`measurement_date_time` / `water_level_date_time`	`measurement_date_time`	`sample`	`sample_date`	Alias supported via `water_level_date_time`. Required only when `depth_to_water_ft` is provided.	Creates a groundwater-level sample when present.
`measurement_date_time` / `water_level_date_time`	`measurement_date_time`	`observation`	`observation_datetime`	Alias supported.	Creates a groundwater-level observation when present.
`sample_method`	`sample_method`	`sample`	`sample_method`	Lexicon-backed. Blank → `None`.	Missing value falls back to `Unknown`.
`water_level_notes`	`water_level_notes`	`sample`	`notes`	Blank → `None`.	Stored on the sample.
`water_level_notes`	`water_level_notes`	`observation`	`notes`	Blank → `None`.	Stored on the observation.
`depth_to_water_ft`	`depth_to_water_ft`	`observation`	`value`	Blank → `None`. Requires `measurement_date_time` when provided.	Persisted as the groundwater-level measurement. Blank values still allow attempted measurement records when `measurement_date_time` exists.
`mp_height` / `mp_height_ft`	`mp_height`	`observation`	`measuring_point_height`	Alias supported via `mp_height_ft`. Blank → `None`.	Used for groundwater-level observation measuring point height.
`level_status`	`level_status`	`observation`	`groundwater_level_reason`	Lexicon-backed. Blank → `None`.	Stored as the observation reason/status.
`data_quality`	`data_quality`	`observation`	`nma_data_quality`	Lexicon-backed. Blank → `None`.	Stored on the observation when provided.

Additional tables touched indirectly by the importer:

thing_contact_association
group_thing_association
location_thing_association
field_activity (well inventory)
field_activity (groundwater level, when a measurement event exists)

Validation and Normalization Rules

Required fields

project
well_name_point_id (or a supported auto-generation placeholder)
date_time
field_staff
utm_easting, utm_northing, utm_zone

Fields such as site_name, elevation_ft, elevation_method, measuring_point_height_ft, and depth_to_water_ft are optional.

Blank and missing values

Many optional CSV fields normalize blank strings to None
This includes optional lexicon-backed fields, contact organization, well status, and depth-to-water
Whitespace-only lexicon values are treated as blank

Lexicon-backed fields

The following fields are validated against lexicon-backed enums. Matching is case-insensitive and whitespace-tolerant, but strict with respect to the underlying allowed terms:

elevation_method
depth_source
well_pump_type
monitoring_status
well_status
sample_method
data_quality
monitoring_frequency

Legacy monitoring frequency normalization

The legacy source value monitoring_frequency = Complete is special-cased before validation:

monitoring_frequency → None
monitoring_status → Not currently monitored

This prevents Complete from being treated as a real lexicon term.

Contact rules

For each contact block (contact_1_*, contact_2_*):

If any contact data is provided, at least one of contact_n_name or contact_n_organization is required, along with contact_n_role and contact_n_type
If a phone is provided, its phone type is required
If an email is provided, its email type is required
If any address fields are provided, the full required set must be present: line 1, type, city, state, and postal code

Water level rules

measurement_date_time is required only when depth_to_water_ft is provided
Blank depth_to_water_ft is treated as missing, not invalid
A row may still create sample/observation records when the attempted depth_to_water_ft is blank, as long as a measurement event exists

Coordinates

utm_zone must be 12N or 13N
UTM coordinates must resolve to a location within New Mexico

Database and Lexicon Considerations

The source CSV can validate successfully and still fail during persistence if the target DB is missing required lexicon terms. This is especially relevant for contact organizations, status values, and monitoring frequencies. If the DB has not been initialized or updated with the current lexicon set, imports can fail with foreign key errors such as contact_organization_fkey.

For nullable lexicon-backed fields, the importer prefers None over a default value unless the business explicitly wants a specific stored term.

Operational Notes

Test database behavior — BDD tests default to ocotilloapi_test. After schema or lexicon changes, rebuilding the test DB may be necessary to avoid stale state.

Expected summary semantics — the final summary reports processed rows, imported rows, and rows with issues. For reruns, a row may count as successfully handled even when the importer skips recreating already-imported records.

Known Gaps

Reruns do not backfill newly introduced fields onto already-imported rows — existing data is skipped, not updated
Monitoring-status reasons and other future field additions may require one-time backfill logic, read-path updates, and explicit rerun/update policy decisions

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Well Inventory Import Reference.md

Well Inventory Import Reference

Workflow Overview

1. CSV read

2. File validation

3. Row parsing and schema validation

4. Import execution

5. Final commit and response

6. CLI progress output

Transaction and Rerun Behavior

Field Mapping Reference

Validation and Normalization Rules

Required fields

Blank and missing values

Lexicon-backed fields

Legacy monitoring frequency normalization

Contact rules

Water level rules

Coordinates

Database and Lexicon Considerations

Operational Notes

Known Gaps

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Uh oh!

Clone this wiki locally