💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 7a51c38596
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you:
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review"
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
```python
.where(PermissionHistory.target_table == "thing")
.where(
    PermissionHistory.permission_type.in_(
        ("Water Chemistry Sample", "Water Level Sample")
    )
)
```
Scope permission destination query to water wells
The permissions source side is explicitly limited to Thing.thing_type == "water well", but the destination query only filters by target_table and permission_type. If the database has PermissionHistory rows for non-well things using the same permission types, this audit will report false extras and inflate destination counts, so source and destination are no longer compared over the same population.
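One way to scope the destination count, sketched under the assumption that the project's `Thing` model carries the `thing_type` column used on the source side. Model and column names mirror the diff in this PR, but this is an illustrative sketch, not the PR's actual code:

```python
# Hedged sketch: scope the destination count to water wells by joining Thing,
# so source and destination are compared over the same population.
# `Thing` and `PermissionHistory` stand in for the project's real models.
from sqlalchemy import func, select


def count_well_permission_rows(session, PermissionHistory, Thing):
    """Count PermissionHistory rows restricted to water-well Things."""
    stmt = (
        select(func.count())
        .select_from(PermissionHistory)
        # Join Thing so the count can be filtered by thing_type, matching
        # the Thing.thing_type == "water well" filter on the source side.
        .join(Thing, Thing.id == PermissionHistory.target_id)
        .where(PermissionHistory.target_table == "thing")
        .where(Thing.thing_type == "water well")
        .where(
            PermissionHistory.permission_type.in_(
                ("Water Chemistry Sample", "Water Level Sample")
            )
        )
    )
    return int(session.execute(stmt).scalar_one())
```

The join is the only addition over the original query; orphaned `PermissionHistory` rows (no matching `Thing`) drop out of the count as a side effect of the inner join.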
```python
sample_ok = getattr(row, "SampleOK", None)
if sample_ok is not None:
    rows.append(
        f"{_normalize_key(pointid)}|Water Chemistry Sample|{bool(sample_ok)}"
```
Parse permission flags instead of casting with bool()
Casting CSV values with bool(sample_ok) treats any non-empty string as True, so values like "False" or "0" will be interpreted as allowed permissions when these columns are string-typed. That produces incorrect source keys and therefore incorrect missing/extra counts in transfer-results output for those datasets.
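A minimal sketch of explicit flag parsing. The accepted spellings are an assumption about what the CSV export emits, and `parse_permission_flag` is a hypothetical helper name, not something in this PR:

```python
from typing import Optional

# Assumed spellings; extend to match the values the CSV export actually uses.
_TRUE_VALUES = {"true", "t", "1", "yes", "y"}
_FALSE_VALUES = {"false", "f", "0", "no", "n"}


def parse_permission_flag(value) -> Optional[bool]:
    """Parse a CSV permission flag; return None for missing/unrecognized values."""
    if value is None:
        return None
    if isinstance(value, bool):
        return value
    text = str(value).strip().lower()
    if text in _TRUE_VALUES:
        return True
    if text in _FALSE_VALUES:
        return False
    return None  # unrecognized (including empty): surface it rather than guess
```

Returning `None` for unrecognized values lets the caller skip the row, the same way the existing code skips `None` flags, instead of silently coercing `"False"` to `True`.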
Pull request overview
This PR adds support for auditing and orchestrating a “Permissions” transfer, including a new transfer toggle and updated transfer-results reporting.
Changes:
- Adds a new `Permissions` entry to the transfer-results auditing specs/types and implements custom comparison logic for `PermissionHistory`.
- Updates the transfer orchestration to run permissions after parallel group 1 (after contacts are available) and adds an env toggle `TRANSFER_PERMISSIONS`.
- Updates docs and CI environment configuration.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| `transfers/transfer_results_types.py` | Registers `Permissions` as a transfer-results type so it appears in auditing output. |
| `transfers/transfer_results_specs.py` | Adds `Permissions` to `TRANSFER_COMPARISON_SPECS` and imports `PermissionHistory` and the result type. |
| `transfers/transfer_results_builder.py` | Implements permissions-specific source/destination key building and result computation. |
| `transfers/transfer.py` | Adds the `transfer_permissions` option/env toggle and runs permissions after group 1 completes. |
| `transfers/README.md` | Documents the transfer auditing CLI usage and output. |
| `.github/workflows/tests.yml` | Adds `PYGEOAPI_POSTGRES_USER` to the CI env. |
| `.env.example` | Adds the `TRANSFER_PERMISSIONS` toggle. |
```python
try:
    result_name, result, elapsed = _execute_permissions_with_timing(
        "Permissions"
    )
    results_map[result_name] = result
    logger.info(f"Task {result_name} completed in {elapsed:.2f}s")
except Exception as e:
    logger.critical(f"Task Permissions failed: {e}")
```
The sequential permissions transfer failure log only includes str(e) and drops the stack trace, while other parallel-transfer error paths log traceback.format_exc(). Logging the full traceback here would make diagnosing transfer failures significantly easier.
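A sketch of the traceback-preserving variant. `run_permissions_task` is a hypothetical wrapper used here for illustration; only the `traceback.format_exc()` call is the point:

```python
import logging
import traceback

logger = logging.getLogger("transfers.permissions")


def run_permissions_task(execute):
    """Run a transfer callable, logging the full stack trace on failure."""
    try:
        return execute()
    except Exception as e:
        # traceback.format_exc() renders the active exception's stack trace,
        # matching what the parallel-transfer error paths already log.
        logger.critical(f"Task Permissions failed: {e}\n{traceback.format_exc()}")
        return None
```

`logger.exception(...)` inside the `except` block is an equivalent stdlib shortcut (it logs at ERROR level with the traceback attached).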
```python
eligible_pointids = {name for (name,) in eligible_rows if name}
if not eligible_pointids:
    return pd.Series([], dtype=object)

rows: list[str] = []
for row in wdf.itertuples(index=False):
    pointid = getattr(row, "PointID", None)
    if pointid not in eligible_pointids:
        continue

    sample_ok = getattr(row, "SampleOK", None)
    if sample_ok is not None:
        rows.append(
            f"{_normalize_key(pointid)}|Water Chemistry Sample|{bool(sample_ok)}"
        )

    monitor_ok = getattr(row, "MonitorOK", None)
    if monitor_ok is not None:
        rows.append(
            f"{_normalize_key(pointid)}|Water Level Sample|{bool(monitor_ok)}"
```
In _permissions_source_series, membership is checked against eligible_pointids using the raw PointID value from the CSV, but the produced keys are normalized/lowercased via _normalize_key. This can incorrectly drop eligible wells when casing/whitespace differs between CSV and DB values. Normalize both sides (e.g., build eligible_pointids as normalized keys and compare with _normalize_key(pointid)).
Suggested change:

```diff
-eligible_pointids = {name for (name,) in eligible_rows if name}
-if not eligible_pointids:
-    return pd.Series([], dtype=object)
-rows: list[str] = []
-for row in wdf.itertuples(index=False):
-    pointid = getattr(row, "PointID", None)
-    if pointid not in eligible_pointids:
-        continue
-    sample_ok = getattr(row, "SampleOK", None)
-    if sample_ok is not None:
-        rows.append(
-            f"{_normalize_key(pointid)}|Water Chemistry Sample|{bool(sample_ok)}"
-        )
-    monitor_ok = getattr(row, "MonitorOK", None)
-    if monitor_ok is not None:
-        rows.append(
-            f"{_normalize_key(pointid)}|Water Level Sample|{bool(monitor_ok)}"
+eligible_pointids = {
+    _normalize_key(name) for (name,) in eligible_rows if _normalize_key(name)
+}
+if not eligible_pointids:
+    return pd.Series([], dtype=object)
+rows: list[str] = []
+for row in wdf.itertuples(index=False):
+    pointid = getattr(row, "PointID", None)
+    normalized_pointid = _normalize_key(pointid)
+    if not normalized_pointid or normalized_pointid not in eligible_pointids:
+        continue
+    sample_ok = getattr(row, "SampleOK", None)
+    if sample_ok is not None:
+        rows.append(
+            f"{normalized_pointid}|Water Chemistry Sample|{bool(sample_ok)}"
+        )
+    monitor_ok = getattr(row, "MonitorOK", None)
+    if monitor_ok is not None:
+        rows.append(
+            f"{normalized_pointid}|Water Level Sample|{bool(monitor_ok)}"
```
```python
rows: list[str] = []
for row in wdf.itertuples(index=False):
    pointid = getattr(row, "PointID", None)
    if pointid not in eligible_pointids:
        continue

    sample_ok = getattr(row, "SampleOK", None)
    if sample_ok is not None:
        rows.append(
            f"{_normalize_key(pointid)}|Water Chemistry Sample|{bool(sample_ok)}"
        )
```
_permissions_source_series appends permission rows for every matching WellData record, which can inflate counts and introduce duplicate keys if WellData contains duplicate PointID rows. The actual permissions transfer de-duplicates by well (visited) and effectively uses the first matching WellData value per well. Consider de-duping WellData by PointID (stable) before iterating, or aggregating per PointID to mirror transfer behavior.
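A minimal sketch of the stable de-duplication, assuming `wdf` is the `WellData` DataFrame with a `PointID` column as shown in the snippets above (`dedupe_well_data` is an illustrative helper, not code from this PR):

```python
import pandas as pd


def dedupe_well_data(wdf: pd.DataFrame) -> pd.DataFrame:
    """Keep the first WellData row per PointID, preserving input order."""
    # keep="first" mirrors the transfer's visited-set behavior, which
    # effectively uses the first matching WellData value per well.
    return wdf.drop_duplicates(subset="PointID", keep="first")
```

Running this once before the `itertuples` loop keeps the source key series one-row-per-well without changing which value wins.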
```python
with session_ctx() as session:
    source_series = (
        _permissions_source_series(session)
        if enabled
        else pd.Series([], dtype=object)
    )
    source_keys = set(source_series.unique().tolist())
    source_keyed_row_count = int(source_series.shape[0])
    source_duplicate_key_row_count = source_keyed_row_count - len(source_keys)
    agreed_transfer_row_count = source_keyed_row_count
```
_build_permissions reads WellData into source_df to count rows, but _permissions_source_series re-reads WellData internally. This duplicates I/O and parsing work for large CSVs. Consider passing the already-loaded source_df (or a pre-cleaned version) into _permissions_source_series so the file is only read once.
Suggested change:

```diff
-with session_ctx() as session:
-    source_series = (
-        _permissions_source_series(session)
-        if enabled
-        else pd.Series([], dtype=object)
-    )
-    source_keys = set(source_series.unique().tolist())
-    source_keyed_row_count = int(source_series.shape[0])
-    source_duplicate_key_row_count = source_keyed_row_count - len(source_keys)
-    agreed_transfer_row_count = source_keyed_row_count
+if enabled:
+    # Derive the source key series directly from the already-loaded CSV
+    if spec.source_key_column in source_df.columns:
+        source_series = source_df[spec.source_key_column].astype(str)
+    else:
+        source_series = pd.Series([], dtype=object)
+else:
+    source_series = pd.Series([], dtype=object)
+source_keys = set(source_series.unique().tolist())
+source_keyed_row_count = int(source_series.shape[0])
+source_duplicate_key_row_count = source_keyed_row_count - len(source_keys)
+agreed_transfer_row_count = source_keyed_row_count
+with session_ctx() as session:
```
```python
destination_row_count = int(
    session.execute(
        select(func.count())
        .select_from(PermissionHistory)
        .where(PermissionHistory.target_table == "thing")
        .where(
            PermissionHistory.permission_type.in_(
                ("Water Chemistry Sample", "Water Level Sample")
            )
        )
    ).scalar_one()
)
```
destination_row_count for permissions is computed without the same constraints used to build destination_series (inner join to Thing, Thing.name non-null). This can make the reported destination row count inconsistent with the key series (e.g., orphaned PermissionHistory rows or unnamed Things). Align the count query with _permissions_destination_series filters (and any additional filters like end_date is NULL).
Suggested change:

```diff
-destination_row_count = int(
-    session.execute(
-        select(func.count())
-        .select_from(PermissionHistory)
-        .where(PermissionHistory.target_table == "thing")
-        .where(
-            PermissionHistory.permission_type.in_(
-                ("Water Chemistry Sample", "Water Level Sample")
-            )
-        )
-    ).scalar_one()
-)
+destination_row_count = int(destination_series.shape[0])
```
```python
sql = (
    select(
        Thing.name.label("point_id"),
        PermissionHistory.permission_type.label("permission_type"),
        PermissionHistory.permission_allowed.label("permission_allowed"),
    )
    .select_from(PermissionHistory)
    .join(Thing, Thing.id == PermissionHistory.target_id)
    .where(PermissionHistory.target_table == "thing")
    .where(
        PermissionHistory.permission_type.in_(
            ("Water Chemistry Sample", "Water Level Sample")
        )
    )
```
_permissions_destination_series pulls all PermissionHistory rows for the permission types, including historical/ended records. This can create false matches (a well may match a historical value even if the current active permission differs) and inflate extra counts over time. Consider restricting to the active/current records (e.g., end_date IS NULL) and, if multiples exist, selecting the latest start_date per (target_id, permission_type) to match how the API surfaces permissions.
Suggested change:

```diff
-sql = (
-    select(
-        Thing.name.label("point_id"),
-        PermissionHistory.permission_type.label("permission_type"),
-        PermissionHistory.permission_allowed.label("permission_allowed"),
-    )
-    .select_from(PermissionHistory)
-    .join(Thing, Thing.id == PermissionHistory.target_id)
-    .where(PermissionHistory.target_table == "thing")
-    .where(
-        PermissionHistory.permission_type.in_(
-            ("Water Chemistry Sample", "Water Level Sample")
-        )
-    )
+# Select only active permission records (end_date IS NULL) and, for each
+# (target_id, permission_type), keep the record with the latest start_date.
+latest_active_subq = (
+    select(
+        PermissionHistory.target_id.label("target_id"),
+        PermissionHistory.permission_type.label("permission_type"),
+        func.max(PermissionHistory.start_date).label("max_start_date"),
+    )
+    .where(PermissionHistory.target_table == "thing")
+    .where(
+        PermissionHistory.permission_type.in_(
+            ("Water Chemistry Sample", "Water Level Sample")
+        )
+    )
+    .where(PermissionHistory.end_date.is_(None))
+    .group_by(
+        PermissionHistory.target_id,
+        PermissionHistory.permission_type,
+    )
+    .subquery()
+)
+sql = (
+    select(
+        Thing.name.label("point_id"),
+        PermissionHistory.permission_type.label("permission_type"),
+        PermissionHistory.permission_allowed.label("permission_allowed"),
+    )
+    .select_from(PermissionHistory)
+    .join(
+        latest_active_subq,
+        (
+            (PermissionHistory.target_id == latest_active_subq.c.target_id)
+            & (
+                PermissionHistory.permission_type
+                == latest_active_subq.c.permission_type
+            )
+            & (
+                PermissionHistory.start_date
+                == latest_active_subq.c.max_start_date
+            )
+        ),
+    )
+    .join(Thing, Thing.id == PermissionHistory.target_id)
```
Summary
This PR fixes permissions transfer ordering and adds first-class permissions coverage to transfer options and transfer validation reporting.
Problem
Permissions were silently missing because:
- The `Permissions` transfer ran in parallel with `Contacts`, but permissions require contact associations to exist first.
- `transfer-results` did not include any permissions comparison spec, so missing `PermissionHistory` rows were not reported.

Changes
Sequencing fix:
- The permissions transfer now runs after parallel group 1 completes, once contact associations exist.

New transfer option:
- `TransferOptions.transfer_permissions`, controlled by the env toggle `TRANSFER_PERMISSIONS` (default `True`).
- Added `TRANSFER_PERMISSIONS=True` to `.env.example`.

Transfer-results permissions validation:
- New `Permissions` transfer-result type.
- Added a `Permissions` entry to `TRANSFER_COMPARISON_SPECS` with `option_field="transfer_permissions"`.
- Source keys are built from `WellData` (`SampleOK`, `MonitorOK`) for wells that have contact associations, destination keys from `PermissionHistory`, and both sides are compared as `pointid|permission_type|permission_allowed`.
- The check is skipped when `TRANSFER_PERMISSIONS=false`.

Why this works
Permissions transfer now executes only after its prerequisites are present.
Validation now explicitly checks permissions, so zero/missing permissions are surfaced in `oco transfer-results` output instead of being invisible.

Files changed
- `transfers/transfer.py`
- `.env.example`
- `transfers/transfer_results_types.py`
- `transfers/transfer_results_specs.py`
- `transfers/transfer_results_builder.py`

Validation
- `python -m py_compile` on the updated transfer-results modules and `transfer.py` passed.
- `pytest` could not be fully run in the sandbox due to a blocked DB connection to `localhost:5432`.