ENG-2635: Support namespaces for Postgres DSR execution #7500
Conversation
Add PostgresNamespaceMeta schema and wire namespace support through PostgresQueryConfig, PostgreSQLConnector, and RDSPostgresConnector. Remove stubbed retrieve_data/mask_data from RDSPostgresConnector so it inherits SQLConnector's implementations, enabling DSR execution. Update RDS Postgres admin UI tags to include "DSR Automation". Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Guard against None secrets in NamespaceMetaValidationStep. Update test_validate_unsupported_connection_type to use mariadb (postgres is now a supported namespace type). Add Postgres-specific validation tests. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Postgres defaults to the public schema when neither namespace_meta nor db_schema is configured. Return empty fallback fields so validation doesn't reject existing postgres connections that lack both. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Remove extra parentheses around bind parameters in expected query strings to match current SQLAlchemy output format. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The dry-run query test passes db=None to TaskResources, which flows into query_config() -> get_namespace_meta(). Return None early when no db session is available. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
When a dataset's namespace_meta has no overlapping fields with the connection type's namespace schema (e.g. BigQuery namespace_meta on a Postgres connection), skip validation rather than raising an error. This happens when datasets of mixed types are bulk-linked to a single connection. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Add GoogleCloudSQLPostgresNamespaceMeta with database_name (optional) and schema (required), following the same pattern as Snowflake/BigQuery. Update GoogleCloudSQLPostgresQueryConfig with generate_table_name() for schema-qualified SQL, and GoogleCloudSQLPostgresConnector to fetch namespace_meta from DB and pass it to the query config. Also includes shared fixes from PR #7500:

- Guard against None db session in get_namespace_meta
- Guard against None secrets in namespace validation
- Skip namespace validation for mismatched connection types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@greptile please review
adamsachs left a comment
thanks for getting this in place! generally, this is looking good.
there are a few stylistic/code concerns that likely don't have much of a functionality impact, but some of them are worth addressing because i think they're potential footguns for the future.
and then there's one meatier product/functionality question around the namespace field naming and its alignment with Helios. the comment should explain that reasonably well, but i think it will be a point for further discussion! we can sync up on that separately.
```python
connection_type: Literal["postgres"] = "postgres"
database_name: Optional[str] = None
schema: str  # type: ignore[assignment]
```
what do you think about abstracting the schema here so that we can start trying to standardize more across our datastores? the same fields are used for snowflake and will be used for google cloud SQL based on your other PR, i'd like us to try to start encouraging/enforcing that for consistency.
we've actually already gone astray in this regard on the helios side, which is unfortunate (and will relate to other comments here), and i think we need to start correcting ourselves ASAP. you can see the divergence in naming conventions in this doc...
whoops, just realized this was ambiguous with schema and keying specifically on that line...what i meant was the pydantic schema generally 😄
Agreed on standardizing. I'll extract a shared SQLNamespaceMeta base class with database_name and schema fields that Postgres, Snowflake, and Cloud SQL can inherit from. BigQuery stays separate since its fields are different (project_id, dataset_id).
Helios can align its database_id → database_name naming in a follow-up migration.
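For concreteness, a minimal sketch of what the proposed shared base could look like (illustrative pydantic models; the class names follow this thread, but the real fides module layout and base-class internals may differ):

```python
from typing import Literal, Optional

from pydantic import BaseModel


class NamespaceMeta(BaseModel):
    """Stand-in for the existing fides base namespace metadata class."""

    connection_type: str


class SQLNamespaceMeta(NamespaceMeta):
    """Proposed shared base for SQL-style datastores (Postgres, Snowflake, Cloud SQL)."""

    database_name: Optional[str] = None
    schema: str  # type: ignore[assignment]  # shadows BaseModel.schema()


class PostgresNamespaceMeta(SQLNamespaceMeta):
    connection_type: Literal["postgres"] = "postgres"


meta = PostgresNamespaceMeta(schema="billing")
print(meta.connection_type, meta.schema, meta.database_name)
```

BigQuery would stay outside this hierarchy since its fields (project_id, dataset_id) don't fit the shared shape.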
```python
    Generates SQL valid for Postgres
    """

namespace_meta_schema = PostgresNamespaceMeta
```
OK, so i think this is generally on the right track in that we should be standardizing as much as possible between different 'flavors' of postgres integrations (and datastore integrations generally), but i want to call out that it has a few discrepancies with what Helios is generating for RDS postgres datasets, which we ultimately need to resolve (i think mainly on the helios-side!):
- helios uses `database_id` here to indicate the database name, instead of `database_name`. i honestly don't know why, but it seems like we're standardizing on `database_name` generally and that's good - let's keep as consistent a standard as possible across datastores
- helios doesn't include a `schema` attribute in its metadata. again, i'm not sure why we don't populate a `schema` attribute here, we have the schema available to us.
- we populate a `database_instance_id` with the RDS database instance ID where this was found.
on the first two points, i think Helios is just wrong, so i don't think that should stop us from doing the right thing here; we just need to ensure we make the proper adjustments on the fides side, and probably migrate over any existing datasets for safety.
on the third point...i'm a bit less sure, this area has always been a bit murky! do we need the RDS DB instance ID for DSR execution? do we assume that a given integration is tied to a specific DB instance? we don't explicitly have that as a field/parameter on the RDS Postgres integration, but is it implicitly defined in the credentials used?
this is probably where 90% of the complexity of this work is. it may require us to do some broader product (re-)thinking. or not! let's break out on this a bit.
Agreed on all three points. For this PR:
- We're standardizing on `database_name` — consistent with Snowflake and the new `SQLNamespaceMeta` base class. Helios should migrate `database_id` → `database_name` in a follow-up.
- Helios should populate `schema` — the base class will make this a required field, so Helios-generated datasets will need to include it.
- `database_instance_id` isn't needed for DSR execution — the integration credentials already implicitly target a specific RDS instance. I think `database_instance_id` is a Helios/discovery concern (tracking where a dataset was found) rather than a DSR concern (how to query it). Happy to discuss further if you see it differently.
Want me to create a follow-up ticket for the Helios alignment work?
```python
    f"or the connection must have values for the following fields: {field_descriptions}"
)
if namespace_meta:
    # Skip validation when the namespace_meta is clearly intended for a
```
is this valid? is this a scenario we really expect? if it's only happening in tests, that seems like we should fix up the tests.
maybe i'm missing something, but i feel like this isn't something we really want to have... if it's a purely defensive check for some oddity in our test setup that's now being exposed and we don't want to make a bigger change to our tests, then i'd prefer it sit isolated in a helper function so that we can clearly mark it as a temporary workaround we can clearly remove. honestly that probably feels like the right thing to do here in any case, as this is effectively an 'add-on' to the core validation logic.
let's just say...i'm generally confused by this :)
oh, i think maybe i see why this is needed - is it because we're not declaring a proper new namespace meta for RDS postgres? if that's the reason why, i think this is a poor workaround. shouldn't we instead just have something like the following?
```python
class RDSPostgresNamespaceMeta(PostgresNamespaceMeta):
    connection_type: Literal["rds_postgres"] = "rds_postgres"
```
fwiw i wrote all of this before looking at the claude PR review, but claude agrees that the namespace validation skip logic is a 'critical' finding for its fragility:
- Namespace validation skip logic could be fragile
namespace_meta.py:53-65 — The field-overlap heuristic that checks whether namespace_meta fields intersect with the schema's fields:
```python
schema_field_names = {
    field_name
    for field_name in namespace_meta_class.model_fields
    if field_name != "connection_type"
}
if schema_field_names and not schema_field_names.intersection(
    namespace_meta.keys()
):
    return
```

This works today because Postgres (schema, database_name) and BigQuery (project_id, dataset_id) have non-overlapping field names. But if a future connector adds a schema or database_name field, this heuristic will silently break. The connection_type discriminator check above is more robust.

Recommendation: Consider requiring connection_type in namespace_meta going forward and document that the field-overlap fallback is a compatibility shim. A logger.debug when the fallback fires would help with debugging.
but if i'm reading between the lines correctly, i really think we should just be requiring connection_type, unless there's some reasoning here i'm missing!
You're right — the field-overlap heuristic is fragile and unnecessary if we require connection_type. I'll:
- Add `RDSPostgresNamespaceMeta(PostgresNamespaceMeta)` with `connection_type: Literal["rds_postgres"]`
- Remove the field-overlap fallback and rely solely on `connection_type` for discriminating which validator applies
This also ties into the SQLNamespaceMeta base class from the earlier comment — RDS Postgres will inherit through PostgresNamespaceMeta → SQLNamespaceMeta → NamespaceMeta.
```python
def get_qualified_table_name(self, node: ExecutionNode) -> str:
    """Get fully qualified Postgres table name for table_exists() checks.

    Returns unquoted names (e.g. schema.table) because the generic
    SQLConnector.table_exists() uses split(".") + inspector.has_table().
    """
    query_config = self.query_config(node)
    if not query_config.namespace_meta:
        return node.collection.name

    meta = cast(PostgresNamespaceMeta, query_config.namespace_meta)
    qualified = f"{meta.schema}.{node.collection.name}"
    if meta.database_name:
        return f"{meta.database_name}.{qualified}"
    return qualified
```
this comes from claude, and i think i largely agree:
- Duplicated get_qualified_table_name across both connectors
postgres_connector.py:97-113 and rds_postgres_connector.py:135-149 contain identical implementations of get_qualified_table_name. This is a DRY violation that will diverge over time.
The Snowflake and BigQuery connectors use a simpler pattern — they delegate to query_config.generate_table_name():
```python
# snowflake_connector.py:91-94
def get_qualified_table_name(self, node: ExecutionNode) -> str:
    query_config = self.query_config(node)
    return query_config.generate_table_name()
```

However, this won't work here because PostgresQueryConfig.generate_table_name() returns quoted names ("schema"."table") while table_exists() needs unquoted names (schema.table). That's a real design tension.
Recommendation: Either:
- Add an unquoted param to generate_table_name(), or
- Extract the shared logic into a private method on PostgresConnector (and have RDSPostgresConnector inherit from it or call a shared utility), or
- At minimum, add a comment explaining why this can't delegate like Snowflake does.
i think option #1 makes the most sense. ideally we have the 'qualified table name' logic defined in a single place, and the quote handling is done as part of that.
Agreed — I'll add a quoted parameter to generate_table_name() (defaulting to True to preserve current behavior). Both connectors' get_qualified_table_name() will then just delegate: return self.query_config(node).generate_table_name(quoted=False), eliminating the duplication.
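A rough sketch of how the quoted parameter could behave (a standalone stand-in, not the actual PostgresQueryConfig; the embedded-quote escaping shown is an assumption about the desired behavior):

```python
from typing import Optional


class PostgresTableNamer:
    """Illustrative stand-in for the table-name portion of a query config."""

    def __init__(self, collection: str, schema: str, database_name: Optional[str] = None):
        self.collection = collection
        self.schema = schema
        self.database_name = database_name

    def generate_table_name(self, quoted: bool = True) -> str:
        """quoted=True yields SQL-ready names; quoted=False suits inspector.has_table()."""

        def render(identifier: str) -> str:
            if not quoted:
                return identifier
            # Double any embedded quotes, per SQL quoted-identifier rules
            return '"' + identifier.replace('"', '""') + '"'

        parts = [self.schema, self.collection]
        if self.database_name:
            parts.insert(0, self.database_name)
        return ".".join(render(part) for part in parts)


namer = PostgresTableNamer("customer", "billing")
print(namer.generate_table_name())              # "billing"."customer"
print(namer.generate_table_name(quoted=False))  # billing.customer
```

With this in place, both connectors' `get_qualified_table_name()` reduce to a one-line delegation, as described above.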
```python
if db is None:
    return None
```
not a huge deal, but without this, does something break? at a minimum it feels like we should have db typed as Optional[Session] if this is possible; but really, it seems like we should just be avoiding this call altogether if we don't have a Session to pass when this is being called.
maybe it goes beyond the scope of this PR, but i also don't really get why we're adding this in the PR, unless we've just found some broken functionality!
This guards against Session.object_session(self.configuration) returning None — which happens in tests when the config isn't bound to a DB session (e.g., PostgresQueryConfig unit tests that pass mock execution nodes without a real DB). Without it, those tests crash on db.scalar().
Agree the typing should reflect this — I'll update the signature to Optional[Session].
k, got it. i do feel like when app code has specific checks to guard against test-only codepaths it's a bit of a code smell, but i also understand that the problem here may lie deeper in the code organization and it'd take too broad of a refactor to fix, or too much mocking in the tests.
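As a sketch, the guarded accessor discussed in this thread amounts to something like the following (simplified stand-ins; the real method resolves the session via Session.object_session() and queries the dataset config):

```python
from typing import Any, Optional


class FakeSession:
    """Minimal stand-in for sqlalchemy.orm.Session used in this sketch."""

    def __init__(self, namespace: Optional[dict]):
        self._namespace = namespace

    def scalar(self, query: Any) -> Optional[dict]:
        return self._namespace


def get_namespace_meta(db: Optional[FakeSession], dataset_key: str) -> Optional[dict]:
    """Return namespace metadata for a dataset, or None without a session.

    The early return mirrors the guard above: in dry-run/test contexts the
    session can be None, and calling db.scalar() would crash.
    """
    if db is None:
        return None
    return db.scalar(("namespace_for", dataset_key))  # hypothetical query


print(get_namespace_meta(None, "billing"))
print(get_namespace_meta(FakeSession({"schema": "billing"}), "billing"))
```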
JadeCara
left a comment
Thanks for the thorough review! Addressing all the inline feedback — the main changes will be:
- New `SQLNamespaceMeta` base class with standardized `database_name` + `schema` fields
- `RDSPostgresNamespaceMeta` subclass with proper `connection_type` discriminator
- Remove fragile field-overlap validation heuristic in favor of `connection_type` check
- `generate_table_name(quoted=False)` to eliminate duplicated `get_qualified_table_name` logic
- Fix `get_namespace_meta` typing to `Optional[Session]`
- Extract SQLNamespaceMeta base class with shared database_name + schema fields
- Add RDSPostgresNamespaceMeta with connection_type discriminator
- Refactor Postgres/Snowflake namespace metas to inherit from SQLNamespaceMeta
- Add RDSPostgresQueryConfig for proper namespace validation routing
- Add quoted param to generate_table_name() to eliminate duplicated get_qualified_table_name() logic across both Postgres connectors
- Remove fragile field-overlap heuristic in namespace validation, rely solely on connection_type discriminator
- Fix get_namespace_meta typing to Optional[Session]
- Add RDS Postgres validation tests

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fides/service/dataset/validation_steps/namespace_meta.py`:
- Around line 43-48: The current early return when meta_connection_type !=
connection_type (using namespace_meta.get("connection_type") stored in
meta_connection_type) silently skips validation for misspelled or unknown
values; instead, validate meta_connection_type against the set of known/expected
connection types and only silently skip when it is a different but known type
(i.e., meta_connection_type in KNOWN_CONNECTION_TYPES and != connection_type),
otherwise raise a validation exception (or return an error) when
meta_connection_type is present but not recognized so invalid metadata fails
fast.
📒 Files selected for processing (10)
- src/fides/api/schemas/namespace_meta/postgres_namespace_meta.py
- src/fides/api/schemas/namespace_meta/rds_postgres_namespace_meta.py
- src/fides/api/schemas/namespace_meta/snowflake_namespace_meta.py
- src/fides/api/schemas/namespace_meta/sql_namespace_meta.py
- src/fides/api/service/connectors/postgres_connector.py
- src/fides/api/service/connectors/query_configs/postgres_query_config.py
- src/fides/api/service/connectors/rds_postgres_connector.py
- src/fides/api/service/connectors/sql_connector.py
- src/fides/service/dataset/validation_steps/namespace_meta.py
- tests/service/dataset_service/test_namespace_meta_validation.py
🚧 Files skipped from review as they are similar to previous changes (2)
- tests/service/dataset_service/test_namespace_meta_validation.py
- src/fides/api/service/connectors/sql_connector.py
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/service/dataset_service/test_namespace_meta_validation.py (1)
295-326: Consider adding one regression test for namespace metadata missing `connection_type`. You now test mismatched typed metadata well; adding a case with no `connection_type` would lock the intended compatibility behavior introduced in the validator and prevent accidental tightening/loosening later.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/service/dataset_service/test_namespace_meta_validation.py` around lines 295 - 326, Add a new regression test that mirrors test_validate_mismatched_namespace_skipped but sets fides_meta.namespace without a connection_type key to ensure validation is skipped rather than enforced; create a FideslangDataset with fides_meta={"namespace": {"project_id": "...", "dataset_id": "..."}} (no connection_type), use a ConnectionConfig with ConnectionType.postgres, build the DatasetValidationContext, instantiate NamespaceMetaValidationStep, and assert that validator.validate(context) does not raise—this will lock in the intended compatibility behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fides/api/schemas/namespace_meta/rds_postgres_namespace_meta.py`:
- Around line 10-12: Update the module docstring in
src/fides/api/schemas/namespace_meta/rds_postgres_namespace_meta.py to remove
the internal system reference "fidesplus RDSPostgresMonitor.get_schemas()" and
instead use generic wording (e.g., "RDS monitor" or "monitoring service") and
likewise replace "RDS connector's create_client()" with a neutral phrase like
"RDS connector client creation" so the docstring describes expected identifiers
generically without referencing internal system names; keep the rest of the
explanatory text intact.
In `@src/fides/api/service/connectors/query_configs/postgres_query_config.py`:
- Around line 19-42: The generate_table_name method (and the analogous code at
lines 72-87) currently wraps collection_name, PostgresNamespaceMeta.schema, and
PostgresNamespaceMeta.database_name in double quotes but does not escape any
embedded double quotes inside those identifiers; update generate_table_name (and
the other identifier-composition function) to first escape any internal
double-quote characters by replacing " with "" for each identifier component,
then apply the quoted wrapper when quoted=True; preserve the unquoted behavior
when quoted=False so inspector/table_exists checks still receive the raw
identifier.
---
Nitpick comments:
In `@tests/service/dataset_service/test_namespace_meta_validation.py`:
- Around line 295-326: Add a new regression test that mirrors
test_validate_mismatched_namespace_skipped but sets fides_meta.namespace without
a connection_type key to ensure validation is skipped rather than enforced;
create a FideslangDataset with fides_meta={"namespace": {"project_id": "...",
"dataset_id": "..."}} (no connection_type), use a ConnectionConfig with
ConnectionType.postgres, build the DatasetValidationContext, instantiate
NamespaceMetaValidationStep, and assert that validator.validate(context) does
not raise—this will lock in the intended compatibility behavior.
📒 Files selected for processing (5)
- src/fides/api/schemas/namespace_meta/__init__.py
- src/fides/api/schemas/namespace_meta/rds_postgres_namespace_meta.py
- src/fides/api/service/connectors/query_configs/postgres_query_config.py
- src/fides/service/dataset/validation_steps/namespace_meta.py
- tests/service/dataset_service/test_namespace_meta_validation.py
Legacy datasets may have namespace_meta without a connection_type discriminator field. This adds a field-overlap heuristic: if namespace fields have zero overlap with the expected schema, validation is skipped (the namespace belongs to a different connection type). If fields do overlap, connection_type is backfilled and validation runs strictly so malformed namespaces are still caught. Also adds connection_type: bigquery to the enterprise test dataset YAML and 5 new unit tests covering the backward compat scenarios. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
adamsachs left a comment
OK, this is getting close - it's great that this is working with a proper RDS integration and i think we're basically using the 'right' mechanisms.
still just a few points that i think could use clean-up, let me know what you think!
```python
from fides.api.schemas.namespace_meta.namespace_meta import NamespaceMeta


class SQLNamespaceMeta(NamespaceMeta):
```

```python
connection_type: Literal["rds_postgres"] = "rds_postgres"
database_instance_id: str
database_id: str
```
so i imagine we reverted this back to database_id so it'd work well with the existing code in rds_postgres_connector.py...but i do think we should be establishing our more consistent standard here. can we just update the code there to look for database_name on the meta? i don't think that's going to break anything downstream, either here or on the monitor side in fidesplus, i really think that was just stubbed out for future use.
we do still need to update the namespace meta populated by helios monitors if we make this change, but i do think that's a relatively simple adjustment that we should go ahead and make. if we do that, then we should be able to inherit from the new SQLNamespaceMeta here, right? that'll help maintain standardization :)
note: i also wonder while we're here whether renaming the namespace field here to database_instance_name instead of database_instance_id is wise. that just seems more accurate, and if we don't change it now, we're never gonna get to it 😅
Done — RDSPostgresNamespaceMeta now inherits from SQLNamespaceMeta with standardized fields:
- `database_name` (inherited, replaces `database_id`)
- `database_instance_name` (replaces `database_instance_id`)
- `schema` (inherited, optional)
Updated the connector and tests to match. The fidesplus monitor (rds_postgres_monitor.py) will need a matching update to emit the new field names.
Note: if any RDS Postgres datasets have already been created by Helios, we'll need a data migration to rename the JSON keys in ctl_datasets.fides_meta. Suggest we handle that in a separate PR to keep this one focused.
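The key rename itself is mechanical; a hedged sketch of the per-row transform such a migration could apply (field names are taken from this thread, but the actual migration would rewrite `ctl_datasets.fides_meta` rows in an alembic revision, not call a helper like this):

```python
def migrate_rds_namespace_keys(namespace: dict) -> dict:
    """Rename legacy RDS namespace keys to the standardized names."""
    renames = {
        "database_id": "database_name",
        "database_instance_id": "database_instance_name",
    }
    # Keys not in the rename map pass through unchanged
    return {renames.get(key, key): value for key, value in namespace.items()}


legacy = {
    "connection_type": "rds_postgres",
    "database_id": "app",
    "database_instance_id": "prod-instance-1",
}
print(migrate_rds_namespace_keys(legacy))
```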
```python
from fides.api.schemas.namespace_meta.sql_namespace_meta import SQLNamespaceMeta


class PostgresNamespaceMeta(SQLNamespaceMeta):
```

```python
"""
Prepends the schema to the base table name if RDS namespace meta
is provided. Unlike Postgres, RDS doesn't need database_name
qualification because the connection targets a specific database.
"""
```
OK this is clear - i see why this is branched off now again. i'll throw this out there as a suggestion, but it may be overkill: we could continue to keep the logic centralized in a single method on the base class, `PostgresQueryConfig.generate_table_name`, by having that take an optional arg for whether to include the database name or not.
but i'm ambivalent...that may not even be clearer.
Agreed it could go either way. I'll keep the current approach with the separate override — it's explicit about why RDS skips database qualification (the engine URL already targets the database), which I think is clearer than a boolean flag on the base class.
```python
# @@ -15,10 +15,9 @@ class SnowflakeNamespaceMeta(NamespaceMeta):
    connection_type: Literal["snowflake"] = "snowflake"
    database_name: str
```
we should be able to remove this too, right? shouldn't it inherit it from SQLNamespaceMeta?
SQLNamespaceMeta declares database_name as Optional[str] = None since Postgres doesn't always need it. Snowflake requires it, so the override here tightens the type from optional to required. Added a comment to make this clearer.
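In pydantic, re-declaring the field on the subclass without a default is what tightens it; a minimal illustration (stand-in classes for this sketch, not the actual fides modules):

```python
from typing import Literal, Optional

from pydantic import BaseModel, ValidationError


class SQLNamespaceMetaSketch(BaseModel):
    """Stand-in for the shared base: database_name is optional here."""

    database_name: Optional[str] = None


class SnowflakeNamespaceMetaSketch(SQLNamespaceMetaSketch):
    """Snowflake always needs a database, so the override drops the default."""

    connection_type: Literal["snowflake"] = "snowflake"
    database_name: str  # re-declared with no default: now required


try:
    SnowflakeNamespaceMetaSketch()
except ValidationError as exc:
    # The missing field is reported even though the base made it optional
    print(exc.errors()[0]["loc"])
```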
```python
# Skip validation when the namespace_meta connection_type doesn't
# match this connection. This happens when datasets of mixed types
# (e.g. BigQuery, Snowflake) are linked to a single connection.
```
is this a valid scenario? i don't really think this can happen...
You're right — this can't happen in normal app runtime. Namespace metadata is set by the type-specific Helios monitor, and datasets are always linked to a single connection via the scoped endpoint (PATCH /connection/{key}/dataset). There's no mechanism to re-link a dataset across connection types.
Removed the mismatch check and the legacy field-overlap/backfill block. Also removed the 5 tests that covered those code paths.
```python
if not provided_fields & expected_fields:
    return
# Fields overlap — backfill connection_type so it's persisted
namespace_meta["connection_type"] = connection_type
```
i thought we'd be able to get rid of this block if we ensure that all of our namespace meta's have connection_type's, which they now seem to (🙏)?
maybe there's some other reason this is needed? but it just doesn't seem like a valid code path/scenario for normal app runtime. if it's just to make a test pass, i'd recommend we find an alternative, this just feels like too much fragile special-casing to have in our app code. the whole point of the connection_type mechanism is so that we don't have to do this!
i.e. are there really 'legacy' namespace_meta's lurking around...?
Agreed — removed this entire block in the same change. All namespace metas now have connection_type via the __init_subclass__ registration mechanism, so this heuristic is no longer needed. The alembic migration (4ebe0766021b) already backfilled connection_type for existing datasets.
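The registration mechanism referenced here can be sketched in plain Python (an illustration of the `__init_subclass__` pattern only; the fides implementation builds on pydantic models, so the details differ):

```python
from typing import ClassVar, Dict, Type


class NamespaceMeta:
    """Subclasses self-register under their connection_type discriminator."""

    connection_type: ClassVar[str] = ""
    registry: ClassVar[Dict[str, Type["NamespaceMeta"]]] = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        if cls.connection_type:
            NamespaceMeta.registry[cls.connection_type] = cls


class PostgresNamespaceMeta(NamespaceMeta):
    connection_type = "postgres"


class RDSPostgresNamespaceMeta(PostgresNamespaceMeta):
    connection_type = "rds_postgres"


def schema_for(connection_type: str) -> Type[NamespaceMeta]:
    """Strict lookup: unknown types fail fast instead of being skipped."""
    try:
        return NamespaceMeta.registry[connection_type]
    except KeyError:
        raise ValueError(f"No namespace schema registered for {connection_type!r}")


print(schema_for("rds_postgres").__name__)  # RDSPostgresNamespaceMeta
```

The strict lookup is the point: once every meta carries a `connection_type`, the validator never needs field-overlap guessing.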
Thanks for the second pass! All points addressed:

- RDSPostgresNamespaceMeta: inherit from SQLNamespaceMeta, rename database_id → database_name and database_instance_id → database_instance_name
- Remove mismatch check and legacy field-overlap/backfill block from namespace validation (can't happen in normal app runtime)
- Add clarifying comment on SnowflakeNamespaceMeta.database_name override
- Update connector and tests to use new field names

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Created PRs to start the id -> name migration as well: #7589 and https://github.com/ethyca/fidesplus/pull/3203
The NamespaceMetaValidationStep was validating dataset namespace metadata against the connection config's type, causing failures when datasets with BigQuery namespace metadata were submitted through a Postgres connection (e.g. in the bulk create test). Now skips validation when the namespace's connection_type doesn't match the connection config's type. Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Ticket ENG-2635
Description Of Changes
Add namespace support for Postgres connectors, enabling a single connection to handle multiple schemas for DSR execution. This follows the established Snowflake/BigQuery pattern exactly.
Some deployments have 200-300 datasets representing different Postgres schemas. Previously, each schema required a separate integration for DSRs. With namespace support, a single Postgres connection can execute DSRs across all schemas by using fully-qualified table names (`"schema"."table"` or `"database"."schema"."table"`).

This also removes the stubbed `retrieve_data()`/`mask_data()` from `RDSPostgresConnector`, unblocking DSR execution for RDS Postgres connections.

PR 1 of 3 — Core Postgres + RDS Postgres namespace support. Google Cloud SQL Postgres (PR 2) and `set_schema()` reconciliation (PR 3) follow separately.

Code Changes

- `src/fides/api/schemas/namespace_meta/postgres_namespace_meta.py` — New `PostgresNamespaceMeta` schema with `database_name` (optional) and `schema` (required), auto-registered via `NamespaceMeta.__init_subclass__`. Returns empty fallback fields for backward compat (Postgres defaults to public schema).
- `src/fides/api/service/connectors/query_configs/postgres_query_config.py` — Added `namespace_meta_schema`, `generate_table_name()` for schema-qualified SQL, updated `get_formatted_query_string()` and `get_update_stmt()` to use it
- `src/fides/api/service/connectors/postgres_connector.py` — `query_config()` now fetches namespace_meta from DB, added `get_qualified_table_name()` for unquoted names in error handling
- `src/fides/api/service/connectors/rds_postgres_connector.py` — Removed stubbed `retrieve_data()`/`mask_data()` (inherits `SQLConnector`'s implementations), `query_config()` passes namespace_meta, added `get_qualified_table_name()`
- `src/fides/api/service/connectors/sql_connector.py` — Added `None` guard in `get_namespace_meta()` for dry-run contexts where no DB session is available
- `src/fides/service/dataset/validation_steps/namespace_meta.py` — Added `None` guard for connection secrets; skip namespace validation when namespace_meta fields don't overlap with the connection type's schema (cross-type dataset linking)
- `clients/admin-ui/src/features/integrations/integration-type-info/rdsPostgresInfo.tsx` — Added "DSR Automation" to RDS Postgres tags
- `tests/ops/service/connectors/test_postgres_query_config.py` — New tests mirroring Snowflake test structure: parametrized namespace query generation, invalid meta validation, namespaced update statements
- `tests/service/dataset_service/test_namespace_meta_validation.py` — Updated for Postgres namespace support: 5 new Postgres-specific tests, cross-type namespace skip test, updated unsupported type test

Steps to Confirm
1. Run namespace query config tests
Expected: All 9 tests pass (5 parametrized query generation, 1 invalid meta, 3 update statements).
2. Run namespace validation tests
Expected: All 15 tests pass, including 5 Postgres-specific and 1 cross-type namespace skip test.
3. Verify backward compatibility (no namespace = no breakage)
Expected: Existing Postgres query config tests pass unchanged — datasets without `namespace_meta` produce unqualified table names as before.

4. Verify dataset creation with namespace_meta via API

Create a Postgres connection, then create a dataset with `namespace_meta`, and link it.

Expected: All return 200/201. The `datasetconfig` response shows `"succeeded": [...]` with 1 entry.

5. Verify namespace_meta is persisted and returned

Expected: The response includes `"fides_meta": {"namespace": {"schema": "billing"}}` in the `ctl_dataset` of the linked dataset config.

6. Verify backward compat via API (no namespace_meta required for Postgres)

Expected: Both return 200/201 — no validation error. Postgres connections do NOT require namespace_meta (backward compatible, defaults to public schema).
7. Check RDS Postgres Admin UI tag
Navigate to the Integrations page in the Admin UI and find "RDS Postgres".
Expected: RDS Postgres shows "DSR Automation", "Discovery", and "Detection" tags.
Pre-Merge Checklist
- `CHANGELOG.md` updated
- `downgrade()` migration is correct and works

🤖 Generated with Claude Code