ENG-2635: Add namespace support for Google Cloud SQL Postgres #7507
JadeCara wants to merge 19 commits into ENG-2635-postgres-namespace-support
Conversation
Greptile Summary

This PR adds namespace metadata support to the Google Cloud SQL Postgres connector, enabling schema-qualified and optionally database-qualified table names in generated SQL queries. The implementation closely follows the existing Snowflake connector pattern.

Confidence Score: 4/5

Last reviewed commit: 03e297e
📝 Walkthrough

Adds namespace metadata support for Google Cloud SQL Postgres and Postgres/RDS: a new SQL namespace base class and connector/query-config changes to generate qualified table names, validation guards, admin UI tag updates, changelogs, and expanded tests for query generation and namespace validation.
Sequence Diagram

```mermaid
sequenceDiagram
    participant Client as Client Request
    participant Connector as Connector
    participant DB as Database Session
    participant QueryConfig as Query Config
    Client->>Connector: Run operation with ExecutionNode
    Connector->>DB: Fetch namespace_meta (optional)
    DB-->>Connector: namespace_meta or None
    Connector->>QueryConfig: Construct with node + namespace_meta
    QueryConfig->>QueryConfig: generate_table_name()
    alt namespace_meta present
        QueryConfig->>QueryConfig: Build qualified name (database.schema.table)
    else namespace_meta absent
        QueryConfig->>QueryConfig: Use base table name (schema.table or table)
    end
    QueryConfig-->>Connector: Formatted SQL (SELECT/UPDATE)
    Connector->>DB: Execute SQL
    DB-->>Client: Return results
```
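The fetch-and-construct step in the diagram can be sketched as follows. This is a minimal illustration only: the dict-based session stand-in and the `build_query_config` helper are hypothetical, not the actual Fides signatures.

```python
from typing import Any, Dict, Optional


def get_namespace_meta(db: Optional[Any], dataset_key: str) -> Optional[Dict[str, Any]]:
    """Look up fides_meta.namespace for a dataset, guarding against a None
    session (one of the shared fixes this PR carries over from PR #7500)."""
    if db is None:
        return None
    # Stand-in for the real DB lookup keyed by dataset.
    return db.get(dataset_key)


def build_query_config(db: Optional[Any], dataset_key: str, table: str) -> Dict[str, Any]:
    """Mirror the diagram: fetch namespace_meta first, then hand both the
    node (represented here by just the table name) to the query config."""
    meta = get_namespace_meta(db, dataset_key)
    return {"table": table, "namespace_meta": meta}


# Simulated session: one dataset with a configured namespace.
fake_db = {"gcs_pg_dataset": {"schema": "billing", "database_name": "prod_db"}}
print(build_query_config(fake_db, "gcs_pg_dataset", "customer"))
print(build_query_config(None, "gcs_pg_dataset", "customer"))
```

When the session is `None` or the dataset has no namespace metadata, the config simply receives `None` and falls back to the unqualified table name.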
Estimated Code Review Effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks: ✅ 3 passed
Actionable comments posted: 2
🧹 Nitpick comments (1)
tests/ops/service/connectors/test_google_cloud_postgres_query_config.py (1)
70-72: Avoid hardcoded dataset keys in traversal lookups

These fixtures depend on a fixed dataset key string, which makes the tests brittle if `example_datasets[0]` changes. Use `dataset_config.fides_key` to bind assertions to the fixture data instead of relying on fixture ordering.

♻️ Proposed change

```diff
-    yield traversal.traversal_node_dict[
-        CollectionAddress("postgres_example_test_dataset", "customer")
-    ].to_mock_execution_node()
+    yield traversal.traversal_node_dict[
+        CollectionAddress(dataset_config.fides_key, "customer")
+    ].to_mock_execution_node()
@@
-    yield traversal.traversal_node_dict[
-        CollectionAddress("postgres_example_test_dataset", "address")
-    ].to_mock_execution_node()
+    yield traversal.traversal_node_dict[
+        CollectionAddress(dataset_config.fides_key, "address")
+    ].to_mock_execution_node()
```

Also applies to: 87-89
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `tests/ops/service/connectors/test_google_cloud_postgres_query_config.py` around lines 70-72: the test uses a hardcoded dataset key "postgres_example_test_dataset" when indexing `traversal.traversal_node_dict`; replace that literal with the fixture's `dataset_config.fides_key` so the lookup becomes `CollectionAddress(dataset_config.fides_key, "customer")` and then call `.to_mock_execution_node()` as before; make the same replacement for the other occurrence, which looks up the "address" collection, so the tests no longer rely on `example_datasets` ordering.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/fides/service/dataset/validation_steps/namespace_meta.py`:
- Around lines 43-63: The code assumes namespace_meta is a dict (used via namespace_meta.get, namespace_meta.keys(), and iteration), which raises AttributeError for non-dict inputs; update the beginning of the validation in the same block (before meta_connection_type = namespace_meta.get("connection_type")) to first assert namespace_meta is a mapping (e.g., isinstance(namespace_meta, Mapping) or dict) and, if not, raise the module's ValidationError with a clear message; ensure subsequent uses of namespace_meta (meta_connection_type, namespace_meta.keys(), and membership checks against namespace_meta_class.model_fields) occur only after this type check so malformed metadata goes through the ValidationError path.
In `@tests/ops/service/connectors/test_google_cloud_postgres_query_config.py`:
- Around lines 133-141: The test test_generate_query_with_invalid_namespace_meta constructs a GoogleCloudSQLPostgresNamespaceMeta instance, which raises ValidationError before GoogleCloudSQLPostgresQueryConfig is exercised; change the test to pass a raw invalid dict (e.g., {"database_name": None} or one missing required keys) as the namespace argument to GoogleCloudSQLPostgresQueryConfig so the config's validation path runs, keeping the assertion that a ValidationError is raised when instantiating GoogleCloudSQLPostgresQueryConfig with that invalid namespace.
---
Nitpick comments:
In `@tests/ops/service/connectors/test_google_cloud_postgres_query_config.py`:
- Around lines 70-72: The test uses a hardcoded dataset key "postgres_example_test_dataset" when indexing traversal.traversal_node_dict; replace that literal with the fixture's dataset_config.fides_key so the lookup becomes CollectionAddress(dataset_config.fides_key, "customer") and then call .to_mock_execution_node() as before; make the same replacement for the other occurrence, which looks up the "address" collection (the block around the later assertion), so the tests no longer rely on example_datasets ordering.
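As a rough illustration of the guard the first inline comment describes — the error class and function shape below are stand-ins, not the Fides module's actual API:

```python
from collections.abc import Mapping


class ValidationError(Exception):
    """Stand-in for the validation error raised by the real module."""


def validate_namespace_meta(namespace_meta, expected_connection_type: str) -> None:
    # Reject non-mapping input up front so malformed metadata takes the
    # ValidationError path instead of raising AttributeError later on
    # namespace_meta.get(...) or namespace_meta.keys().
    if not isinstance(namespace_meta, Mapping):
        raise ValidationError(
            f"namespace metadata must be a mapping, got {type(namespace_meta).__name__}"
        )
    # Cross-type skip: metadata targeting another connector type is ignored.
    meta_connection_type = namespace_meta.get("connection_type")
    if meta_connection_type and meta_connection_type != expected_connection_type:
        return
    if "schema" not in namespace_meta:
        raise ValidationError("namespace metadata requires a 'schema' field")


validate_namespace_meta({"schema": "billing"}, "google_cloud_sql_postgres")  # passes
try:
    validate_namespace_meta("not-a-dict", "google_cloud_sql_postgres")
except ValidationError as exc:
    print(exc)
```

Because the type check runs before any dict-style access, every malformed shape surfaces as a single, predictable error type.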
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 8e2838fa-2ac3-4224-a8a1-cd78fb1ca5df
📒 Files selected for processing (9)

- changelog/7507-gcp-postgres-namespace-support.yaml
- clients/admin-ui/src/features/integrations/integration-type-info/googleCloudSQLPostgresInfo.tsx
- src/fides/api/schemas/namespace_meta/google_cloud_sql_postgres_namespace_meta.py
- src/fides/api/service/connectors/google_cloud_postgres_connector.py
- src/fides/api/service/connectors/query_configs/google_cloud_postgres_query_config.py
- src/fides/api/service/connectors/sql_connector.py
- src/fides/service/dataset/validation_steps/namespace_meta.py
- tests/ops/service/connectors/test_google_cloud_postgres_query_config.py
- tests/service/dataset_service/test_namespace_meta_validation.py
Addressed the nitpick from CodeRabbit about hardcoded dataset keys in the test fixtures — switched both fixtures to `dataset_config.fides_key`.
Add GoogleCloudSQLPostgresNamespaceMeta with database_name (optional) and schema (required), following the same pattern as Snowflake/BigQuery. Update GoogleCloudSQLPostgresQueryConfig with generate_table_name() for schema-qualified SQL, and GoogleCloudSQLPostgresConnector to fetch namespace_meta from DB and pass it to the query config.

Also includes shared fixes from PR #7500:
- Guard against None db session in get_namespace_meta
- Guard against None secrets in namespace validation
- Skip namespace validation for mismatched connection types

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Add isinstance guard for namespace_meta before dict-style access
- Fix invalid namespace test to exercise QueryConfig validation path
- Replace hardcoded dataset key with dataset_config.fides_key in fixtures
- Remove unused ValidationError import

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
- Inherit GoogleCloudSQLPostgresNamespaceMeta from SQLNamespaceMeta
- Add quoted param to generate_table_name() for inspector checks
- Simplify get_qualified_table_name() to delegate to query config

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
…postgres-namespace-support
Ticket ENG-2635
Description Of Changes
Add namespace metadata support to the Google Cloud SQL for Postgres connector, enabling schema-qualified and optionally database-qualified table names in generated SQL queries. This is PR 2 of the ENG-2635 work — PR 1 (#7500) added the same capability for standard Postgres.
When a dataset has `fides_meta.namespace` configured with a `schema` (and optionally a `database_name`), queries are generated with qualified table names like `"billing"."customer"` or `"prod_db"."billing"."customer"` instead of just `"customer"`.

The `set_schema` connection secret (`db_schema`) path remains as the legacy approach. The two mechanisms are mutually exclusive in practice — users configure one or the other.

This is PR 2 of 3.
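A minimal sketch of this qualification rule — the helper below is hypothetical; the real logic lives in the query config's `generate_table_name()` and follows the Snowflake pattern:

```python
from typing import Optional


def qualify_table_name(
    table: str,
    schema: Optional[str] = None,
    database_name: Optional[str] = None,
) -> str:
    """Build a quoted, qualified Postgres table name.

    Falls back to the bare table name when no namespace is configured;
    database_name is only used when a schema is also present.
    """
    parts = []
    if schema:
        if database_name:
            parts.append(database_name)
        parts.append(schema)
    parts.append(table)
    # Double-quote each part so mixed-case identifiers survive in Postgres.
    return ".".join(f'"{part}"' for part in parts)


print(qualify_table_name("customer"))                        # "customer"
print(qualify_table_name("customer", schema="billing"))      # "billing"."customer"
print(qualify_table_name("customer", "billing", "prod_db"))  # "prod_db"."billing"."customer"
```

Quoting every part is the conservative choice: unquoted identifiers are folded to lowercase by Postgres, so a schema created as `"Billing"` would otherwise be unreachable.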
This PR also includes the cross-type namespace validation fixes from PR 1 (#7500), needed on this branch since it diverges from main before PR 1 merges.
Code Changes
- `src/fides/api/schemas/namespace_meta/google_cloud_sql_postgres_namespace_meta.py` - New namespace meta schema with required `schema` field, optional `database_name`, and empty fallback secret fields (GCS Postgres defaults to the public schema)
- `src/fides/api/service/connectors/query_configs/google_cloud_postgres_query_config.py` - Added `namespace_meta_schema`, `generate_table_name()`, `get_formatted_query_string()`, and `get_update_stmt()` to generate schema-qualified SQL
- `src/fides/api/service/connectors/google_cloud_postgres_connector.py` - Added `query_config()` override to fetch namespace meta from DB, `get_qualified_table_name()` for table existence checks, clarifying docstring on `set_schema`
- `src/fides/api/service/connectors/sql_connector.py` - Added `None` guard in `get_namespace_meta()` for cases where `db` is `None`
- `src/fides/service/dataset/validation_steps/namespace_meta.py` - Added cross-type namespace skip logic (explicit `connection_type` check + field overlap check), `None` secrets guard
- `clients/admin-ui/src/features/integrations/integration-type-info/googleCloudSQLPostgresInfo.tsx` - Added "DSR Automation" tag
- `tests/ops/service/connectors/test_google_cloud_postgres_query_config.py` - New integration tests: 5 parametrized query generation tests, 1 invalid meta test, 3 update statement tests
- `tests/service/dataset_service/test_namespace_meta_validation.py` - Added 5 GCS Postgres validation tests + cross-type skip test, changed the unsupported type test to use `mariadb`

Steps to Confirm
1. Run the namespace meta validation unit tests (no Docker required):
`nox -s "pytest(misc-unit)" -- tests/service/dataset_service/test_namespace_meta_validation.py -v`

Expected: All 19 tests pass, including the new GCS Postgres tests:
- `test_validate_gcs_postgres_without_namespace_or_schema` — PASSED
- `test_validate_gcs_postgres_with_valid_namespace` — PASSED
- `test_validate_gcs_postgres_with_connection_defaults` — PASSED
- `test_validate_gcs_postgres_with_invalid_namespace` — PASSED
- `test_validate_mismatched_namespace_skipped` — PASSED

2. Run the query config integration tests (requires Postgres via Docker):
`nox -s "pytest(ops-integration)" -- tests/ops/service/connectors/test_google_cloud_postgres_query_config.py -v`

Expected: All 10 tests pass:
- `test_generate_query_with_namespace_meta` (5 parametrized cases) — queries produce:
  - `schema="billing"` → `SELECT ... FROM "billing"."customer" WHERE (email = :email)`
  - `database_name="prod_db", schema="billing"` → `SELECT ... FROM "prod_db"."billing"."customer" WHERE (email = :email)`
  - `{"schema": "billing"}` → same as first case
  - `connection_type` → same as first case
  - `None` → `SELECT ... FROM "customer" WHERE (email = :email)`
- `test_generate_query_with_invalid_namespace_meta` — raises `ValidationError`
- `test_generate_update_stmt` → `UPDATE "address" SET ...`
- `test_generate_namespaced_update_stmt` → `UPDATE "billing"."address" SET ...`
- `test_generate_namespaced_update_stmt_with_database` → `UPDATE "prod_db"."billing"."address" SET ...`

3. Quick local verification (no Docker, no nox):
Expected output:
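The UPDATE cases in step 2 apply the same qualification rule as the SELECT cases; a rough sketch of that statement shape, using an illustrative helper rather than the real `get_update_stmt()` (which works from the execution node and privacy request):

```python
from typing import List, Optional


def generate_update_stmt(
    table: str,
    set_fields: List[str],
    where_field: str,
    schema: Optional[str] = None,
    database_name: Optional[str] = None,
) -> str:
    # database_name is assumed to accompany a schema, as in the namespace meta.
    parts = [p for p in (database_name, schema) if p] + [table]
    qualified = ".".join(f'"{p}"' for p in parts)
    # Named bind parameters (:field) match the style of the expected queries.
    assignments = ", ".join(f"{f} = :{f}" for f in set_fields)
    return f"UPDATE {qualified} SET {assignments} WHERE ({where_field} = :{where_field})"


print(generate_update_stmt("address", ["city", "state"], "id", schema="billing"))
# UPDATE "billing"."address" SET city = :city, state = :state WHERE (id = :id)
```

With both `database_name` and `schema` set, the same helper yields the three-part `UPDATE "prod_db"."billing"."address" SET ...` form from the last test case.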
Pre-Merge Checklist
- `CHANGELOG.md` updated
- main
- `downgrade()` migration is correct and works

Summary by CodeRabbit
New Features
Bug Fixes
UI
Tests
Chores