-
Notifications
You must be signed in to change notification settings - Fork 4
Well Inventory Import Reference.md
This page documents the well inventory CSV ingestion behavior in detail, covering the import workflow, transaction model, field mappings, validation rules, and operational notes.
For a broader overview of all CSV import surfaces, see CSV Ingestion and Validation.
- The CLI reads the source file as bytes
- UTF-8 with BOM is supported via
utf-8-sigdecoding so the first header is parsed correctly
- File must be a
.csv - File cannot be empty and must contain at least one data row
- Delimiter must be a comma
- Duplicate header names are rejected
- Maximum row count: 2000
- Each row is parsed into
WellInventoryRowinschemas/well_inventory.py -
well_name_point_idsupports auto-generation if blank or if a placeholder token such asNM-XXXXorWL-XXXXis provided - Validation is best-effort: valid rows continue, invalid rows are collected into
validation_errors, and one bad row does not abort the full file
- The import runs inside one outer DB transaction
- Each row is processed inside a nested transaction (savepoint)
- A failed row rolls back only its own work; a successful row remains pending until the final commit
-
Monitoring Plangroups are created lazily — a fully failed project does not leave behind an empty group
The outer transaction commits once after all rows are processed. The response includes:
summary.total_rows_processedsummary.total_rows_importedsummary.validation_errors_or_warningsvalidation_errorswells
Interactive terminal sessions emit progress messages during validation and import (validation progress, project-level and per-row import starts, periodic counts, final completion counts). Progress output is suppressed for non-interactive sessions.
The importer is best-effort, single-commit, and idempotent on rerun.
Best-effort row handling — rows are imported independently via savepoints; a row-level failure does not abort other rows.
Single final commit — successful rows are not visible to other DB sessions until session.commit(). Long-running imports appear to "land all at once" from another DB connection.
Idempotent reruns — if a row appears to have been imported already, the importer skips recreating the related records. Rerun behavior is skip existing, not update existing. Re-importing a CSV with new values for already-imported rows does not backfill or overwrite those fields.
For fields that persist to multiple places, the same source column is repeated on multiple rows so each destination remains easy to scan.
| Source CSV Column | Schema Field | DB Table | DB Column | Validation / Normalization | Persistence Behavior |
|---|---|---|---|---|---|
project |
project |
group |
name |
Required. | Finds an existing Monitoring Plan group or creates one lazily when the first row for that project imports successfully. |
well_name_point_id |
well_name_point_id |
thing |
name |
Required unless blank or placeholder token triggers auto-generation. Must be unique per file. | Primary well identifier. |
site_name |
site_name |
thing_id_link |
alternate_id |
Blank → None. |
Stored as NMBGMR alternate ID with relation same_as. |
ose_well_record_id |
ose_well_record_id |
thing_id_link |
alternate_id |
Blank → None. |
Stored as NMOSE alternate ID with relation same_as. |
date_time |
date_time |
thing |
first_visit_date |
Required. Must be timezone-naive ISO datetime. Converted to America/Denver, then stored as UTC-aware. | Sets the well's first visit date. |
date_time |
date_time |
field_event |
event_date |
Required. | Used as the initial well inventory field-event timestamp. |
utm_easting, utm_northing, utm_zone
|
utm_easting, utm_northing, utm_zone
|
location |
point |
Required. utm_zone must be 12N or 13N. Coordinates must resolve inside New Mexico. |
Transformed from UTM to WGS84 and stored as the location geometry. |
elevation_ft |
elevation_ft |
location |
elevation |
Optional. Blank → None. |
Converted from feet to meters. Missing values currently store as 0.0. |
public_availability_acknowledgement |
public_availability_acknowledgement |
location |
release_status |
Blank → None. |
True → public, False → private, blank/missing → draft. |
elevation_method |
elevation_method |
data_provenance |
collection_method |
Lexicon-backed, case-insensitive, whitespace-tolerant. Blank → None. |
Creates location elevation provenance. Missing value falls back to Unknown. |
total_well_depth_ft |
total_well_depth_ft |
thing |
well_depth |
Optional. Blank → None. |
Passed through CreateWell. |
depth_source |
depth_source |
data_provenance |
origin_type |
Lexicon-backed, case-insensitive, whitespace-tolerant. Blank → None. |
Creates well depth provenance. Also used when building the historic depth note. |
date_drilled |
date_drilled |
thing |
well_completion_date |
Optional. Blank → None. Must be past or today. |
Passed through CreateWell. |
completion_source |
completion_source |
data_provenance |
origin_type |
Lexicon-backed, case-insensitive, whitespace-tolerant. Blank → None. |
Creates completion-date provenance. |
measuring_point_height_ft |
measuring_point_height_ft |
measuring_point_history |
measuring_point_height |
Optional. Blank → None. If both measuring_point_height_ft and mp_height are present and unequal, the row fails. |
Primary well-level measuring point height. |
measuring_point_description |
measuring_point_description |
measuring_point_history |
measuring_point_description |
Blank → None. |
Stored with the measuring-point history row. |
well_purpose |
well_purpose |
well_purpose |
purpose |
Lexicon-backed. Blank → None. |
Adds a well purpose row. |
well_purpose_2 |
well_purpose_2 |
well_purpose |
purpose |
Lexicon-backed. Blank → None. |
Adds a second well purpose row when present. |
well_status / well_hole_status
|
well_status |
status_history |
status_value |
Alias supported via well_hole_status. Lexicon-backed. Blank → None. |
Creates Well Status history. |
monitoring_status |
monitoring_status |
status_history |
status_value |
Lexicon-backed. Blank → None. |
Creates Monitoring Status history. |
monitoring_frequency |
monitoring_frequency |
monitoring_frequency_history |
monitoring_frequency |
Lexicon-backed. Blank → None. Legacy Complete normalized to monitoring_frequency = None and monitoring_status = Not currently monitored. |
Creates monitoring-frequency history only when a real frequency remains after normalization. |
is_open |
is_open |
status_history |
status_value |
Blank → None. |
Creates Open Status history as Open or Closed. |
datalogger_possible |
datalogger_possible |
status_history |
status_value |
Blank → None. |
Creates Datalogger Suitability Status history. |
specific_location_of_well |
specific_location_of_well |
note |
content |
Blank → None. |
Stored as Access note on the well. |
contact_special_requests_notes |
contact_special_requests_notes |
note |
content |
Blank → None. |
Stored as General note on the well and on each created contact. |
sampling_scenario_notes |
sampling_scenario_notes |
note |
content |
Blank → None. |
Stored as Sampling Procedure note on the well. |
well_measuring_notes |
well_measuring_notes |
note |
content |
Blank → None. |
Stored as Sampling Procedure note on the well. |
well_notes |
well_notes |
note |
content |
Blank → None. |
Stored as General note on the well. |
water_notes |
water_notes |
note |
content |
Blank → None. |
Stored as Water note on the well. |
directions_to_site |
directions_to_site |
note |
content |
Blank → None. |
Stored as Directions note on the location. |
historic_depth_to_water_ft |
historic_depth_to_water_ft |
note |
content |
Optional. Blank → None. |
When present, a formatted Historical note is created on the well using depth_source as the source text. |
field_staff |
field_staff |
contact |
name |
Required. | Reused or created as NMBGMR Field Event Participant contact. |
field_staff |
field_staff |
field_event_participant |
contact_id |
Required. | Linked to field event with participant role Lead. |
field_staff_2 / field_staff_3
|
field_staff_2 / field_staff_3
|
field_event_participant |
contact_id |
Blank → None. |
Linked to field event with participant role Participant. |
contact_n_name, contact_n_organization, contact_n_role, contact_n_type
|
contact_n_* |
contact |
name, organization, role, contact_type
|
If any contact data is present, at least one of name/organization is required, plus role and type. Organization is lexicon-backed. | Creates or reuses a contact, then links it to the well. |
contact_n_phone_i, contact_n_phone_i_type
|
contact_n_phone_i, contact_n_phone_i_type
|
phone |
phone_number, phone_type
|
Phone format validated. Type required when phone is present. | Added under the created contact. |
contact_n_email_i, contact_n_email_i_type
|
contact_n_email_i, contact_n_email_i_type
|
email |
email, email_type
|
Email format validated. Type required when email is present. | Added under the created contact. |
contact_n_address_i_* |
contact_n_address_i_* |
address |
address fields | If any address fields are present, line 1, city, state, postal code, and type are all required. State and ZIP are validated. | Added under the created contact. |
result_communication_preference |
result_communication_preference |
note |
content |
Blank → None. |
Stored as Communication note on each created contact. |
repeat_measurement_permission |
repeat_measurement_permission |
permission_history |
permission_allowed |
Blank → None. Requires a contact. |
Creates Water Level Sample permission. |
sampling_permission |
sampling_permission |
permission_history |
permission_allowed |
Blank → None. Requires a contact. |
Creates Water Chemistry Sample permission. |
datalogger_installation_permission |
datalogger_installation_permission |
permission_history |
permission_allowed |
Blank → None. Requires a contact. |
Creates Datalogger Installation permission. |
measurement_date_time / water_level_date_time
|
measurement_date_time |
sample |
sample_date |
Alias supported via water_level_date_time. Required only when depth_to_water_ft is provided. |
Creates a groundwater-level sample when present. |
measurement_date_time / water_level_date_time
|
measurement_date_time |
observation |
observation_datetime |
Alias supported. | Creates a groundwater-level observation when present. |
sample_method |
sample_method |
sample |
sample_method |
Lexicon-backed. Blank → None. |
Missing value falls back to Unknown. |
water_level_notes |
water_level_notes |
sample |
notes |
Blank → None. |
Stored on the sample. |
water_level_notes |
water_level_notes |
observation |
notes |
Blank → None. |
Stored on the observation. |
depth_to_water_ft |
depth_to_water_ft |
observation |
value |
Blank → None. Requires measurement_date_time when provided. |
Persisted as the groundwater-level measurement. Blank values still allow attempted measurement records when measurement_date_time exists. |
mp_height / mp_height_ft
|
mp_height |
observation |
measuring_point_height |
Alias supported via mp_height_ft. Blank → None. |
Used for groundwater-level observation measuring point height. |
level_status |
level_status |
observation |
groundwater_level_reason |
Lexicon-backed. Blank → None. |
Stored as the observation reason/status. |
data_quality |
data_quality |
observation |
nma_data_quality |
Lexicon-backed. Blank → None. |
Stored on the observation when provided. |
Additional tables touched indirectly by the importer:
thing_contact_associationgroup_thing_associationlocation_thing_association-
field_activity(well inventory) -
field_activity(groundwater level, when a measurement event exists)
project-
well_name_point_id(or a supported auto-generation placeholder) date_timefield_staff-
utm_easting,utm_northing,utm_zone
Fields such as site_name, elevation_ft, elevation_method, measuring_point_height_ft, and depth_to_water_ft are optional.
- Many optional CSV fields normalize blank strings to
None - This includes optional lexicon-backed fields, contact organization, well status, and depth-to-water
- Whitespace-only lexicon values are treated as blank
The following fields are validated against lexicon-backed enums. Matching is case-insensitive and whitespace-tolerant, but strict with respect to the underlying allowed terms:
elevation_methoddepth_sourcewell_pump_typemonitoring_statuswell_statussample_methoddata_qualitymonitoring_frequency
The legacy source value monitoring_frequency = Complete is special-cased before validation:
-
monitoring_frequency→None -
monitoring_status→Not currently monitored
This prevents Complete from being treated as a real lexicon term.
For each contact block (contact_1_*, contact_2_*):
- If any contact data is provided, at least one of
contact_n_nameorcontact_n_organizationis required, along withcontact_n_roleandcontact_n_type - If a phone is provided, its phone type is required
- If an email is provided, its email type is required
- If any address fields are provided, the full required set must be present: line 1, type, city, state, and postal code
-
measurement_date_timeis required only whendepth_to_water_ftis provided - Blank
depth_to_water_ftis treated as missing, not invalid - A row may still create sample/observation records when the attempted
depth_to_water_ftis blank, as long as a measurement event exists
-
utm_zonemust be12Nor13N - UTM coordinates must resolve to a location within New Mexico
The source CSV can validate successfully and still fail during persistence if the target DB is missing required lexicon terms. This is especially relevant for contact organizations, status values, and monitoring frequencies. If the DB has not been initialized or updated with the current lexicon set, imports can fail with foreign key errors such as contact_organization_fkey.
For nullable lexicon-backed fields, the importer prefers None over a default value unless the business explicitly wants a specific stored term.
Test database behavior — BDD tests default to ocotilloapi_test. After schema or lexicon changes, rebuilding the test DB may be necessary to avoid stale state.
Expected summary semantics — the final summary reports processed rows, imported rows, and rows with issues. For reruns, a row may count as successfully handled even when the importer skips recreating already-imported records.
- Reruns do not backfill newly introduced fields onto already-imported rows — existing data is skipped, not updated
- Monitoring-status reasons and other future field additions may require one-time backfill logic, read-path updates, and explicit rerun/update policy decisions