fix: improve error handling and logging for recording interval estimation feat: add auto-generation prefix extraction for well IDs with new regex support by jirhiker · Pull Request #531 · DataIntegrationGroup/OcotilloAPI

jirhiker · 2026-02-16T19:30:27Z

Summary

This PR improves well inventory auto-generation prefix handling and hardens transfer logging/error behavior for sensors and wells.

Changes

Well inventory auto-generation prefix extraction

Added _extract_autogen_prefix support for:
- direct prefixes like XY-
- placeholder tokens like WL-XXXX, SAC-XXXX
- optional whitespace normalization around hyphen (e.g. ABC -xxxx -> ABC-)
- blank input defaulting to NM-
Expanded tests to cover supported/unsupported forms, including upper-bound rejection for 4-letter prefixes (ABCD- -> None).
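
The rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code in services/well_inventory_csv.py: the function name `extract_autogen_prefix`, the pattern `_AUTOGEN_RE`, and the constant `DEFAULT_PREFIX` are hypothetical, and the pattern assumes the prefix is 2-3 uppercase letters followed by an optional run of X/x placeholder characters.

```python
import re
from typing import Optional

# Hypothetical pattern (an assumption, not the PR's regex): 2-3 uppercase
# letters, optional whitespace around a hyphen, then zero or more X/x
# placeholder characters.
_AUTOGEN_RE = re.compile(r"^([A-Z]{2,3})\s*-\s*[Xx]*$")

# Blank input falls back to this prefix, per the PR description.
DEFAULT_PREFIX = "NM-"


def extract_autogen_prefix(well_id: Optional[str]) -> Optional[str]:
    """Return a normalized prefix such as "WL-", or None if unsupported."""
    if well_id is None or not well_id.strip():
        return DEFAULT_PREFIX
    match = _AUTOGEN_RE.match(well_id.strip())
    if match is None:
        # e.g. a 4-letter prefix like "ABCD-" is rejected
        return None
    return f"{match.group(1)}-"
```

Under these assumptions, `extract_autogen_prefix("ABC -xxxx")` normalizes to `"ABC-"`, while `extract_autogen_prefix("ABCD-")` is rejected.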

ConstructionMethod normalization in well transfer

In transfers/well_transfer.py, row.ConstructionMethod is now stripped before lexicon lookup.
This prevents lookup misses caused by leading/trailing whitespace in legacy CSV values.
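
A small sketch of why the strip matters. The `LU_ConstructionMethod:` key format and the "Unknown" default appear in the reviewed code, but the `LEXICON` dictionary, its entry, and the helper `lookup_construction_method` are hypothetical stand-ins for the real lexicon lookup.

```python
from typing import Optional

# Hypothetical stand-in for the lexicon; the real lookup lives in the transfer class.
LEXICON = {"LU_ConstructionMethod:Hand Dug": "Hand dug"}


def lookup_construction_method(raw: object, default: str = "Unknown") -> Optional[str]:
    """Look up a construction method, ignoring leading/trailing whitespace."""
    if not isinstance(raw, str):
        # Missing values (None, NaN) have no construction method.
        return None
    key = f"LU_ConstructionMethod:{raw.strip()}"
    return LEXICON.get(key, default)
```

Without the `strip()`, a legacy value such as `" Hand Dug "` would build a key that is absent from the lexicon and fall through to the default.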

Sensor recording interval error handling/logging

In transfers/sensor_transfer.py, recording interval estimation failure handling was tightened:
- clearer capture/log message when interval estimation fails
- error context now explicitly states estimation failure and includes estimator error details
This improves observability when RecordingInterval is invalid/missing and fallback estimation cannot derive a value.
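
The intended success/failure split can be sketched as below. This is an illustration only: `resolve_recording_interval`, its parameters, and the `captured` list are hypothetical, and the sketch assumes the estimator yields `None` for the interval, plus an error string, when it fails.

```python
import logging
from typing import List, Optional, Tuple

logger = logging.getLogger(__name__)


def resolve_recording_interval(
    sensor_name: str,
    interval: Optional[int],
    unit: Optional[str],
    estimator_error: Optional[str],
    captured: List[Tuple[str, str]],
) -> Tuple[Optional[int], Optional[str]]:
    """Log and record the outcome of a fallback interval estimate."""
    if interval is not None:
        # Success: keep the estimate but flag it for human review.
        logger.info(
            "name=%s. estimated recording interval: %s %s",
            sensor_name, interval, unit,
        )
        captured.append(
            ("RecordingInterval",
             f"Estimated recording interval={interval} {unit}. Is this correct?")
        )
        return interval, unit

    # Failure: state it explicitly and carry the estimator's own error text.
    message = f"Recording interval estimation failed: {estimator_error}"
    logger.critical("name=%s. %s", sensor_name, message)
    captured.append(("RecordingInterval", message))
    return None, None
```

Note that the success branch is guarded by `interval is not None`; testing for `is None` there would swap the two outcomes.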

Why

Reduce false negatives from messy source data (ConstructionMethod whitespace).
Improve diagnostics for sensor transfer failures.
Make well ID auto-generation rules explicit, testable, and resilient to input variation.

Validation

Added/updated unit coverage in tests/test_well_inventory.py.
A local targeted test run was attempted but was blocked in this environment by a missing utm dependency during test collection.

chatgpt-codex-connector

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3c11d05927

transfers/sensor_transfer.py

transfers/well_transfer.py

services/well_inventory_csv.py

Copilot

Pull request overview

This pull request aims to improve error handling and logging for recording interval estimation in sensor transfers, and add auto-generation prefix extraction for well IDs with new regex support. However, the implementation is incomplete and contains critical bugs.

Changes:

Added error message improvements for recording interval estimation failures in sensor transfers
Introduced new autogen prefix extraction function for well IDs with support for 2-3 letter uppercase prefixes
Added .strip() call for ConstructionMethod to handle whitespace in well transfer processing
Commented out the old _step method in well_transfer.py (145 lines)

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 6 comments.

| File | Description |
| --- | --- |
| transfers/sensor_transfer.py | Enhanced error logging for recording interval estimation with more detailed error messages, but introduced inverted conditional logic |
| services/well_inventory_csv.py | Added incomplete autogen prefix extraction function with new regex patterns and duplicate imports |
| tests/test_well_inventory.py | Added comprehensive test cases for the new prefix extraction feature that will fail due to incomplete implementation |
| transfers/well_transfer.py | Commented out old `_step` method and added whitespace handling for ConstructionMethod field |

Comments suppressed due to low confidence (1)

transfers/sensor_transfer.py:215

The condition logic appears to be inverted. The code logs success and captures an "estimated" error when recording_interval is None, but logs a critical error when recording_interval has a value. This is backwards - when recording_interval is None (the error case), the else block at line 217 should execute, and when recording_interval has a value (success), the if block should execute. The condition should be "if recording_interval is not None:" instead of "if recording_interval is None:".

            if recording_interval is None:
                recording_interval_unit = unit
                logger.info(
                    f"name={sensor.name}, serial_no={sensor.serial_no}. "
                    f"estimated recording interval: {recording_interval} {unit}"
                )
                self._capture_error(
                    pointid,
                    f"Estimated recording interval={recording_interval} {unit}. Is this correct?",
                    "RecordingInterval",
                )

services/well_inventory_csv.py

transfers/well_transfer.py

transfers/sensor_transfer.py

tests/test_well_inventory.py


Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (2)

transfers/well_transfer.py:420

This large block of commented-out code (142 lines) should be removed rather than left in the codebase. Commented-out code creates maintenance burden and confusion. If this code needs to be preserved for reference, it should be retrievable from version control history.

    # def _step(self, session: Session, df: pd.DataFrame, i: int, row: pd.Series):
    #
    #     try:
    #         first_visit_date = get_first_visit_date(row)
    #         well_purposes = (
    #             [] if isna(row.CurrentUse) else self._extract_well_purposes(row)
    #         )
    #         well_casing_materials = (
    #             [] if isna(row.CasingDescription) else extract_casing_materials(row)
    #         )
    #         well_pump_type = extract_well_pump_type(row)
    #
    #         wcm = None
    #         if notna(row.ConstructionMethod):
    #             wcm = self._get_lexicon_value(
    #                 row, f"LU_ConstructionMethod:{row.ConstructionMethod}", "Unknown"
    #             )
    #
    #         mpheight = row.MPHeight
    #         mpheight_description = row.MeasuringPoint
    #         if mpheight is None:
    #             mphs = self._measuring_point_estimator.estimate_measuring_point_height(
    #                 row
    #             )
    #             if mphs:
    #                 try:
    #                     mpheight = mphs[0][0]
    #                     mpheight_description = mphs[1][0]
    #                 except IndexError:
    #                     if self.verbose:
    #                         logger.warning(
    #                             f"Measuring point height estimation failed for well {row.PointID}, {mphs}"
    #                         )
    #
    #         data = CreateWell(
    #             location_id=0,
    #             name=row.PointID,
    #             first_visit_date=first_visit_date,
    #             hole_depth=row.HoleDepth,
    #             well_depth=row.WellDepth,
    #             well_casing_diameter=(
    #                 row.CasingDiameter * 12 if row.CasingDiameter else None
    #             ),
    #             well_casing_depth=row.CasingDepth,
    #             release_status="public" if row.PublicRelease else "private",
    #             measuring_point_height=mpheight,
    #             measuring_point_description=mpheight_description,
    #             notes=(
    #                 [{"content": row.Notes, "note_type": "General"}]
    #                 if row.Notes
    #                 else []
    #             ),
    #             well_completion_date=row.CompletionDate,
    #             well_driller_name=row.DrillerName,
    #             well_construction_method=wcm,
    #             well_pump_type=well_pump_type,
    #         )
    #
    #         CreateWell.model_validate(data)
    #     except ValidationError as e:
    #         self._capture_validation_error(row.PointID, e)
    #         return
    #
    #     well = None
    #     try:
    #         well_data = data.model_dump(exclude=EXCLUDED_FIELDS)
    #         well_data["thing_type"] = "water well"
    #         well_data["nma_pk_welldata"] = row.WellID
    #         well_data["nma_pk_location"] = row.LocationId
    #
    #         well = Thing(**well_data)
    #         session.add(well)
    #
    #         if well_purposes:
    #             for wp in well_purposes:
    #                 # TODO: add validation logic here
    #                 if wp in WellPurposeEnum:
    #                     wp_obj = WellPurpose(thing=well, purpose=wp)
    #                     session.add(wp_obj)
    #                 else:
    #                     logger.critical(f"{well.name}. Invalid well purpose: {wp}")
    #
    #         if well_casing_materials:
    #             for wcm in well_casing_materials:
    #                 # TODO: add validation logic here
    #                 if wcm in WellCasingMaterialEnum:
    #                     wcm_obj = WellCasingMaterial(thing=well, material=wcm)
    #                     session.add(wcm_obj)
    #                 else:
    #                     logger.critical(
    #                         f"{well.name}. Invalid well casing material: {wcm}"
    #                     )
    #     except Exception as e:
    #         if well is not None:
    #             session.expunge(well)
    #
    #         self._capture_error(row.PointID, str(e), "UnknownField")
    #
    #         logger.critical(f"Error creating well for {row.PointID}: {e}")
    #         return
    #
    #     try:
    #         location, elevation_method, notes = make_location(
    #             row, self._cached_elevations
    #         )
    #         session.add(location)
    #         # session.flush()
    #         self._added_locations[row.PointID] = (elevation_method, notes)
    #     except Exception as e:
    #         import traceback
    #
    #         traceback.print_exc()
    #         self._capture_error(row.PointID, str(e), str(e), "Location")
    #         logger.critical(f"Error making location for {row.PointID}: {e}")
    #
    #         return
    #
    def _extract_well_purposes(self, row) -> list[str]:
        cu = row.CurrentUse

        if isna(cu):
            return []
        else:
            purposes = []
            for cui in cu:
                if cui == "A":
                    # skip "Open, unequipped well" as that gets mapped to the status_history table
                    continue
                p = self._get_lexicon_value(row, f"LU_CurrentUse:{cui}")
                if p is not None:
                    purposes.append(p)
            return purposes

    def _add_formation_zone(self, row, well, formations):
        # --- Set Formation Completion (NOT depth-based stratigraphy) ---
        # This simply records which formation the well was completed in.
        # For detailed depth-interval stratigraphy, see stratigraphy_transfer.py

        formation_code = row.FormationZone

transfers/well_transfer.py:649

The PR description states that ConstructionMethod normalization prevents lookup misses caused by whitespace, but the implementation only strips leading/trailing whitespace. It does not handle internal whitespace that might exist in legacy CSV values (e.g., "Hand Dug" vs "HandDug"). Consider whether internal whitespace normalization is also needed for robust matching.

                name=row.PointID,
                first_visit_date=first_visit_date,
                hole_depth=row.HoleDepth,
                well_depth=row.WellDepth,
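
If internal whitespace normalization were wanted, one possible approach is to compare keys with all whitespace removed and case folded. This is a sketch under assumptions: `normalize_key`, `build_index`, and `match_term` are hypothetical helpers, and the term list is made up for illustration.

```python
from typing import Dict, Iterable, Optional


def normalize_key(value: str) -> str:
    """Drop all whitespace and case-fold so "Hand  Dug" and "handdug" compare equal."""
    return "".join(value.split()).casefold()


def build_index(terms: Iterable[str]) -> Dict[str, str]:
    """Map each normalized key back to its canonical lexicon term."""
    return {normalize_key(term): term for term in terms}


def match_term(raw: str, index: Dict[str, str]) -> Optional[str]:
    """Return the canonical term for a messy input, or None if nothing matches."""
    return index.get(normalize_key(raw))
```

The trade-off is that two distinct terms differing only in whitespace or case would collide in the index, so this is only safe when the lexicon has no such pairs.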

transfers/sensor_transfer.py

services/well_inventory_csv.py

transfers/sensor_transfer.py


Copilot

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

Comments suppressed due to low confidence (1)

transfers/sensor_transfer.py:209

The success/failure branches for recording interval estimation are inverted. When estimate_recording_interval fails it appears to return recording_interval=None (with an error), but the current code treats None as a successful estimate and treats non-None as a failure. Swap the condition to check for a non-None interval (and only log/capture the "Estimated recording interval" message on success).

            if recording_interval is not None:
                recording_interval_unit = unit
                logger.info(
                    f"name={sensor.name}, serial_no={sensor.serial_no}. "
                    f"estimated recording interval: {recording_interval} {unit}"

services/well_inventory_csv.py

transfers/well_transfer.py

tests/test_well_inventory.py


jirhiker and others added 30 commits February 14, 2026 23:19

chore: update pydantic and pydantic-core versions, enhance phone number validation, and add CSV feature tests

ee350ea

Formatting changes

1936f9a

chore: update pydantic and pydantic-core versions, enhance phone number validation, and add CSV feature tests

b93b00c

chore: update phone validation output format in CLI tests

40fbe54

Formatting changes

f70ec28

Update schemas/well_inventory.py

4c1156b

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Formatting changes

783a6ab

delete file

9c06f8c

Apply suggestions from code review

70cc08c

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Formatting changes

06c2120

chore: update pydantic and pydantic-core versions, enhance phone number validation, and add CSV feature tests

23ce228

chore: update CSV validation scenarios and improve auto-generation logic for well_name_point_id

d03b553

Formatting changes

87d1315

chore: limit displayed validation errors to 10 and update output formatting

f8496cf

feat: add theme support and improve validation output formatting in CLI commands

0a76f6b

Formatting changes

b822c6f

feat: add validation for missing well_name_point_id column in CSV processing

3b7c561

test: update test for blank well_name_point_id to auto-generate IDs

c9d1305

test: update CSV test to include a valid row with a blank well_name_point_id

6e895ca

feat: enhance CSV processing to handle duplicate contact names and organizations

21ad925

Update services/well_inventory_csv.py

7c081d4

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update tests/features/environment.py

9765313

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update tests/test_cli_commands.py

f5d9013

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

refactor: rename step implementations for clarity and consistency

619f59f

test: add test for handling multiple contacts with null organizations in CSV upload

95f1426

test: remove redundant test for handling multiple null organization contacts in CSV upload

81a324e

test: streamline CSV upload tests for blank well_name_point_id and duplicate contacts

1d6d697

fix: update type hint for well_id parameter in _extract_autogen_prefix function

729faba

fix: enhance error handling and validation reporting in CSV upload process

f8ceb2c

fix: improve error handling and logging for recording interval estimation feat: add auto-generation prefix extraction for well IDs with new regex support

3c11d05

jirhiker requested review from Copilot and marissafichera February 16, 2026 19:30

Copilot started reviewing on behalf of jirhiker February 16, 2026 19:30

chatgpt-codex-connector bot reviewed Feb 16, 2026

transfers/sensor_transfer.py (outdated, resolved)

transfers/well_transfer.py (resolved)

services/well_inventory_csv.py (resolved)

fix: enhance autogen value handling with regex validation

5338013

Copilot AI reviewed Feb 16, 2026


Update services/well_inventory_csv.py

9f66270

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings February 16, 2026 19:34

Copilot started reviewing on behalf of jirhiker February 16, 2026 19:35

jirhiker and others added 2 commits February 16, 2026 12:37

Update services/well_inventory_csv.py

41dd2e2

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update transfers/well_transfer.py

f37a852

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI reviewed Feb 16, 2026

transfers/sensor_transfer.py (outdated, resolved)

services/well_inventory_csv.py (resolved)

transfers/sensor_transfer.py (outdated, resolved)

Update transfers/sensor_transfer.py

0daea1f

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Copilot AI review requested due to automatic review settings February 16, 2026 19:39

Copilot started reviewing on behalf of jirhiker February 16, 2026 19:39

jirhiker added 2 commits February 16, 2026 12:39

fix: correct logic for recording interval check in sensor_transfer.py

ee8d8db

fix: remove unsupported pattern handling in well_inventory_csv.py

d95904b

Copilot AI reviewed Feb 16, 2026

services/well_inventory_csv.py (outdated, resolved)

services/well_inventory_csv.py (resolved)

transfers/well_transfer.py (resolved)

tests/test_well_inventory.py (resolved)

jirhiker and others added 9 commits February 16, 2026 12:44

Merge pull request #521 from DataIntegrationGroup/well-inventory-csv-fix

6599c42

Improve well-inventory CLI feedback and validation handling; add real-user CSV feature coverage

fix: improve error handling and logging for recording interval estimation feat: add auto-generation prefix extraction for well IDs with new regex support

dcd49b4

Update services/well_inventory_csv.py

5b7df1b

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update services/well_inventory_csv.py

19f8de1

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update transfers/well_transfer.py

b363ace

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

Update transfers/sensor_transfer.py

7c6bab5

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>

fix: correct logic for recording interval check in sensor_transfer.py

066ab6a

fix: remove unsupported pattern handling in well_inventory_csv.py

08c4beb

fix: add imports for shapely and sqlalchemy to support database operations

8896bb6

jirhiker merged commit db0dc8f into staging Feb 16, 2026
7 checks passed

jirhiker merged 46 commits into staging from transfer-fix-review-feedback

2 participants
