diff --git a/README.md b/README.md index 1c66db11..ac63cadd 100644 --- a/README.md +++ b/README.md @@ -62,9 +62,11 @@ The folder `examples` contains the following Terraform implementation examples : | AWS | [aws-databricks-uc-bootstrap](examples/aws-databricks-uc-bootstrap/) | AWS UC | | AWS | [aws-remote-backend-infra](examples/aws-remote-backend-infra/) | Simple example on remote backend | | AWS | [aws-workspace-config](examples/aws-workspace-config/) | Configure workspace objects | -| GCP | [gcp-sa-provisionning](examples/gcp-sa-provisionning/) | Provisionning of the identity with the permissions required to deploy on GCP. | -| GCP | [gcp-basic](examples/gcp-basic/) | Workspace Deployment with managed vpc | -| GCP | [gcp-byovpc](examples/gcp-byovpc/) | Workspace Deployment with customer-managed vpc | +| GCP | [gcp-sa-provisioning](examples/gcp-sa-provisioning/) | Provisioning the identity (service account) with permissions required to deploy on GCP | +| GCP | [gcp-basic](examples/gcp-basic/) | Workspace deployment with Databricks-managed VPC | +| GCP | [gcp-byovpc](examples/gcp-byovpc/) | Workspace deployment with customer-managed VPC (Terraform creates the VPC) | +| GCP | [gcp-existing-vpc](examples/gcp-existing-vpc/) | Workspace deployment into a pre-existing VPC | +| GCP | [gcp-with-psc-exfiltration-protection](examples/gcp-with-psc-exfiltration-protection/) | Workspace with PrivateLink (PSC), private DNS, and restricted egress (hub-and-spoke topology) | ### Modules The folder `modules` contains the following Terraform modules : @@ -89,9 +91,13 @@ The folder `modules` contains the following Terraform modules : | AWS | [aws-workspace-with-firewall](modules/aws-workspace-with-firewall/) | Provisioning AWS Databricks E2 with an AWS Firewall | | AWS | [aws-exfiltration-protection](modules/aws-exfiltration-protection/) | An implementation of [Data Exfiltration Protection on 
AWS](https://www.databricks.com/blog/2021/02/02/data-exfiltration-protection-with-databricks-on-aws.html) | | AWS | aws-workspace-with-private-link | Coming soon | -| GCP | [gcp-sa-provisionning](modules/gcp-sa-provisionning/) | Provisions the identity (SA) with the correct permissions | -| GCP | [gcp-workspace-basic](modules/gcp-workspace-basic/) | Provisions a workspace with managed VPC | -| GCP | [gcp-workspace-byovpc](modules/gcp-workspace-byovpc/) | Workspace with customer-managed VPC. | +| GCP | [gcp/databricks-workspace](modules/gcp/databricks-workspace/) | Composer that orchestrates network, PSC, account, and DNS submodules based on scenario flags | +| GCP | [gcp/network](modules/gcp/network/) | VPC, subnet, router, NAT, peering, and shared-VPC binding (create or data-source lookup) | +| GCP | [gcp/private-connectivity](modules/gcp/private-connectivity/) | PSC endpoints (frontend, backend, hub-transit) and restricted-egress firewall rules | +| GCP | [gcp/account](modules/gcp/account/) | All databricks_mws_* resources: networks, workspaces, vpc_endpoint, private_access_settings | +| GCP | [gcp/dns](modules/gcp/dns/) | Private DNS zones (gcp.databricks.com, gcr.io, googleapis.com, pkg.dev) for restricted-egress workspaces | +| GCP | [gcp/service-account](modules/gcp/service-account/) | Service account with the IAM permissions required to provision Databricks workspaces | +| GCP | [gcp/unity-catalog](modules/gcp/unity-catalog/) | Metastore, GCS bucket, storage credential, external location, and default catalog | ### CI/CD pipelines The `cicd-pipelines` folder contains the following implementation examples of pipeline: diff --git a/docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md b/docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md new file mode 100644 index 00000000..9c31deeb --- /dev/null +++ b/docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md @@ -0,0 +1,3822 @@ +# GCP Modules Refactor Implementation Plan + +> **For agentic workers:** 
REQUIRED SUB-SKILL: Use superpowers:subagent-driven-development (recommended) or superpowers:executing-plans to implement this plan task-by-task. Steps use checkbox (`- [ ]`) syntax for tracking. + +**Goal:** Replace three duplicated GCP workspace modules with one composer (`modules/gcp/databricks-workspace`) that orchestrates four focused submodules (`network`, `private-connectivity`, `account`, `dns`), plus the relocated `service-account` and `unity-catalog` modules. Migrate existing GCP examples one at a time onto the composer and add a new "existing VPC" example. + +**Architecture:** Composer reads orthogonal feature flags (`vpc_source`, `private_link_frontend`, `private_link_backend`, `private_access_only`, `restricted_egress`) and conditionally instantiates submodules via `count`. Dependency graph is linear: `network → private-connectivity → account → dns`. All `databricks_mws_*` resources live in `account`; all GCP-side PSC resources live in `private-connectivity`; DNS is split out because it depends on `account.workspace_url`. + +**Tech Stack:** Terraform >= 1.5, `hashicorp/google` provider, `databricks/databricks` provider, `terraform-docs`, `pre-commit`. No new tooling. + +**Spec reference:** `docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md` + +**Branch:** `feature/gcp-modules-refactor` (already created; spec committed as `2bfd9bd`). 
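As a rough illustration of the conditional-instantiation pattern described under **Architecture**, the composer's `main.tf` might gate a submodule as sketched here. This is not the final composer code: the relative `source` path, the `needs_psc` local, the assumption that `network` is itself gated by `count`, and the mapping of composer flags onto the submodule's `enable_*` / `restrict_egress` inputs are all assumptions made for illustration.

```hcl
# Hypothetical excerpt of modules/gcp/databricks-workspace/main.tf.
# Variable declarations, random_string.suffix, and the network module are omitted.

locals {
  # Assumption: the PSC submodule is needed when either PrivateLink flag is set.
  needs_psc = var.private_link_frontend || var.private_link_backend
}

module "private_connectivity" {
  source = "../private-connectivity" # assumed relative path
  count  = local.needs_psc ? 1 : 0   # zero instances when no PSC flag is set

  prefix        = var.prefix
  suffix        = random_string.suffix.result
  google_region = var.google_region

  # Assumption: network is also gated by count, hence the [0] index.
  spoke_vpc_id             = module.network[0].spoke_vpc_id
  spoke_vpc_self_link      = module.network[0].spoke_vpc_self_link
  spoke_vpc_google_project = var.spoke_vpc_google_project
  spoke_vpc_cidr           = var.spoke_vpc_cidr
  psc_subnet_cidr          = var.psc_subnet_cidr

  # Assumed mapping of composer flags onto the submodule's inputs.
  enable_frontend = var.private_link_frontend
  enable_backend  = var.private_link_backend
  restrict_egress = var.restricted_egress
}
```

Downstream references to a `count`-gated module need an index, for example `one(module.private_connectivity[*].psc_subnet_self_link)`, which resolves to `null` when the module has zero instances.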
+ +--- + +## File Structure + +This plan creates the following new tree (incremental — each task creates one slice): + +``` +docs/superpowers/ # already exists + ├── specs/2026-05-14-gcp-modules-refactor-design.md # already committed + └── plans/2026-05-14-gcp-modules-refactor.md # this file + +modules/gcp/ + ├── Makefile # Task 1 — recursive docs/test_docs + ├── databricks-workspace/ # Task 14–17 — composer + │ ├── main.tf + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── README.md # terraform-docs generates + │ ├── Makefile + │ └── tests/ # plan-time validation fixtures + │ ├── basic/main.tf + │ ├── byovpc/main.tf + │ ├── existing-vpc/main.tf + │ ├── psc-isolated/main.tf + │ └── negative-*/main.tf # expect plan failure + ├── network/ # Task 3–5 + │ ├── main.tf + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── README.md + │ ├── Makefile + │ └── tests/ + │ ├── create/main.tf + │ ├── existing/main.tf + │ └── create-with-hub/main.tf + ├── private-connectivity/ # Task 6–8 + │ ├── psc.tf + │ ├── firewall.tf + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── locals.tf # regional PSC + hive metastore maps + │ ├── README.md + │ ├── Makefile + │ └── tests/ + │ ├── frontend-only/main.tf + │ ├── full-isolated/main.tf + │ └── no-egress/main.tf + ├── account/ # Task 9–13 + │ ├── main.tf # mws_networks + mws_workspaces + │ ├── vpc-endpoints.tf # mws_vpc_endpoint + │ ├── pas.tf # mws_private_access_settings + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── README.md + │ ├── Makefile + │ └── tests/ + │ ├── databricks-managed/main.tf + │ ├── byovpc/main.tf + │ └── psc-with-pas/main.tf + ├── dns/ # Task 18–19 + │ ├── hub.tf + │ ├── spoke.tf + │ ├── variables.tf + │ ├── outputs.tf + │ ├── versions.tf + │ ├── README.md + │ ├── Makefile + │ └── tests/hub-and-spoke/main.tf + ├── service-account/ # Task 20 (git mv from modules/gcp-sa-provisioning) + └── unity-catalog/ # Task 21 (git mv from modules/gcp-unity-catalog) 
+ +modules/gcp-sa-provisioning/ # Task 20 — replaced with deprecation README + └── README.md + +modules/gcp-unity-catalog/ # Task 21 — replaced with deprecation README + └── README.md + +examples/gcp-basic/ # Task 24 — migrated +examples/gcp-byovpc/ # Task 25 — migrated +examples/gcp-with-psc-exfiltration-protection/ # Task 26 — migrated +examples/gcp-existing-vpc/ # Task 27 — NEW +examples/gcp-sa-provisioning/ # Task 28 — repoint to relocated module + +# Deletions (Task 29 onward, PR 6) +modules/gcp-workspace-basic/ # DELETE +modules/gcp-workspace-byovpc/ # DELETE +modules/gcp-with-psc-exfiltration-protection/ # DELETE +modules/gcp-sa-provisioning/ # DELETE (stub) +modules/gcp-unity-catalog/ # DELETE (stub) +examples/gcp-sa-provisionning/ # DELETE (typo dir) +examples/gcp-test-modules/ # DELETE (state-only) +``` + +**Testing approach for each module task:** Each submodule gets `tests//main.tf` fixtures that call the module with mock vars. The "test" is `terraform init -backend=false && terraform validate && terraform plan -refresh=false` against the fixture. We don't apply — we verify the configuration is valid and the planned resource graph matches expectations. + +**Conventions to follow** (observed in existing repo): +- `versions.tf` declares required_providers and required terraform version +- `Makefile` per module has `docs:` and `test_docs:` targets calling `terraform-docs -c ../../.terraform-docs.yml .` (note: for nested `modules/gcp//`, the path becomes `../../../.terraform-docs.yml`) +- README sections between `` and `` are managed by `terraform-docs` +- Resource names use `${var.prefix}--${random_string.suffix.result}` pattern +- `random_string.suffix` is declared **only in the composer**, then passed to submodules via `suffix` input + +--- + +## PR 1 — Foundation + +This PR adds all new modules under `modules/gcp/` and relocates `service-account` + `unity-catalog`. No example is touched. 
The deliverable at the end of PR 1 is: a complete new module tree that passes `terraform validate` for every fixture, with no example consuming it yet. + +### Task 1: Repo scaffolding — `modules/gcp/Makefile` and `tests/` convention + +**Files:** +- Create: `modules/gcp/Makefile` + +- [ ] **Step 1: Inspect existing Makefile pattern** + +Read `modules/Makefile` and `modules/gcp-workspace-basic/Makefile` to confirm conventions. + +Run: `cat modules/Makefile modules/gcp-workspace-basic/Makefile` + +Expected: top-level discovers projects via `*/README.md`, each module Makefile invokes `terraform-docs -c ../../.terraform-docs.yml .`. + +- [ ] **Step 2: Create `modules/gcp/Makefile`** + +Write: + +```makefile +PROJECTS := $(dir $(wildcard */README.md)) + +docs: $(PROJECTS) + +$(PROJECTS): + $(MAKE) -C $@ docs + +.PHONY: $(PROJECTS) docs +``` + +- [ ] **Step 3: Update top-level `modules/Makefile` to recurse into `gcp/`** + +Read current `modules/Makefile`. It only iterates `*/README.md`. Since `modules/gcp/` has no README of its own, we add an explicit recursion. + +Edit `modules/Makefile`: + +```makefile +PROJECTS := $(dir $(wildcard */README.md)) + +docs: $(PROJECTS) gcp-recursive + +$(PROJECTS): + $(MAKE) -C $@ docs + +gcp-recursive: + $(MAKE) -C gcp docs + +.PHONY: $(PROJECTS) docs gcp-recursive +``` + +- [ ] **Step 4: Commit** + +```bash +git add modules/gcp/Makefile modules/Makefile +git commit -m "$(cat <<'EOF' +build: add Makefile recursion for modules/gcp/ submodules + +Adds modules/gcp/Makefile mirroring the modules/ pattern (discover +sub-projects via */README.md) and updates modules/Makefile to recurse +into the gcp/ subdir for terraform-docs generation. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 2: `modules/gcp/network` — skeleton + variables + versions + +**Files:** +- Create: `modules/gcp/network/variables.tf` +- Create: `modules/gcp/network/main.tf` +- Create: `modules/gcp/network/outputs.tf` +- Create: `modules/gcp/network/versions.tf` +- Create: `modules/gcp/network/Makefile` +- Create: `modules/gcp/network/README.md` (placeholder, terraform-docs fills it) + +- [ ] **Step 1: Write `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} +``` + +- [ ] **Step 2: Write `variables.tf`** + +```hcl +variable "prefix" { + type = string + description = "Prefix for generated resource names" +} + +variable "suffix" { + type = string + description = "Random suffix passed by the composer for uniqueness" +} + +variable "google_region" { + type = string + description = "GCP region for all network resources" +} + +variable "vpc_source" { + type = string + description = "Either 'create' (Terraform creates a VPC) or 'existing' (data-source lookup)" + validation { + condition = contains(["create", "existing"], var.vpc_source) + error_message = "vpc_source must be 'create' or 'existing'." 
+ } +} + +# Spoke project always required +variable "spoke_vpc_google_project" { + type = string + description = "GCP project hosting the spoke VPC" +} + +# === Used when vpc_source = "create" ==================================== +variable "spoke_vpc_cidr" { + type = string + default = null + description = "CIDR for the spoke subnet primary range (required when vpc_source=create)" +} + +variable "subnet_cidr" { + type = string + default = null + description = "CIDR for the spoke subnet (required when vpc_source=create)" +} + +variable "subnet_name" { + type = string + default = null + # "$${" escapes interpolation; a variable description must be a literal string + description = "Override for spoke subnet name (default: \"$${prefix}-subnet-$${suffix}\")" +} + +variable "pod_cidr" { + type = string + default = null + description = "GKE secondary range for pods (optional)" +} + +variable "svc_cidr" { + type = string + default = null + description = "GKE secondary range for services (optional)" +} + +# === Used when vpc_source = "existing" ================================== +variable "existing_vpc_name" { + type = string + default = null + description = "Name of pre-existing VPC (required when vpc_source=existing)" +} + +variable "existing_subnet_name" { + type = string + default = null + description = "Name of pre-existing subnet (required when vpc_source=existing)" +} + +# === Hub configuration (only when create_hub = true) ==================== +variable "create_hub" { + type = bool + default = false + description = "Create a hub VPC + subnet + peering with the spoke. Composer passes restricted_egress here." 
+} + +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the hub VPC (required when create_hub=true)" +} + +variable "hub_vpc_cidr" { + type = string + default = null + description = "CIDR for the hub subnet (required when create_hub=true)" +} + +variable "is_spoke_vpc_shared" { + type = bool + default = false + description = "If true, bind the spoke VPC's project as a Shared-VPC host and the workspace project as a service project" +} + +variable "workspace_google_project" { + type = string + default = null + description = "Workspace project (used for Shared-VPC service binding)" +} +``` + +- [ ] **Step 3: Write empty `main.tf` and `outputs.tf`** + +`main.tf`: + +```hcl +# Resources added in Tasks 3, 4, 5 +``` + +`outputs.tf`: + +```hcl +output "spoke_vpc_id" { + value = null + description = "ID of the spoke VPC" +} + +output "spoke_vpc_name" { + value = null + description = "Name of the spoke VPC" +} + +output "spoke_vpc_self_link" { + value = null + description = "Self-link of the spoke VPC" +} + +output "spoke_subnet_id" { + value = null + description = "ID of the spoke subnet" +} + +output "spoke_subnet_name" { + value = null + description = "Name of the spoke subnet" +} + +output "spoke_subnet_self_link" { + value = null + description = "Self-link of the spoke subnet" +} + +output "hub_vpc_id" { + value = null + description = "ID of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_name" { + value = null + description = "Name of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_self_link" { + value = null + description = "Self-link of the hub VPC (null when create_hub=false)" +} + +output "hub_subnet_name" { + value = null + description = "Name of the hub subnet (null when create_hub=false)" +} + +output "nat_id" { + value = null + description = "ID of the Cloud NAT (null when vpc_source=existing)" +} +``` + +(Outputs are wired to real resources in Tasks 3–5.) 
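The descriptions in Step 2 mark several inputs as required only for one `vpc_source` value (or only when `create_hub` is true), but the declarations alone do not enforce that, because a variable `validation` block cannot reference other variables on the Terraform 1.5 baseline this plan targets. One possible way to fail at plan time is a `precondition` on an output. The sketch below assumes a hypothetical extra output named `inputs_valid`; it is not part of the plan's `outputs.tf`.

```hcl
# Hypothetical addition to modules/gcp/network/outputs.tf (the output name is illustrative).
output "inputs_valid" {
  value       = true
  description = "Always true; exists only to carry cross-variable input checks"

  precondition {
    condition     = var.vpc_source != "create" || var.subnet_cidr != null
    error_message = "subnet_cidr is required when vpc_source is 'create'."
  }

  precondition {
    condition = var.vpc_source != "existing" || (
      var.existing_vpc_name != null && var.existing_subnet_name != null
    )
    error_message = "existing_vpc_name and existing_subnet_name are required when vpc_source is 'existing'."
  }

  precondition {
    condition = !var.create_hub || (
      var.hub_vpc_google_project != null && var.hub_vpc_cidr != null
    )
    error_message = "hub_vpc_google_project and hub_vpc_cidr are required when create_hub is true."
  }
}
```

An output precondition adds no resource, so the `Plan: N to add` counts that the fixtures expect would stay the same.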
+ +- [ ] **Step 4: Write `Makefile`** + +```makefile +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . +``` + +- [ ] **Step 5: Write `README.md` placeholder** + +```markdown +# modules/gcp/network + +VPC, subnet, router, NAT, peering, and Shared-VPC binding for the Databricks GCP composer. + + + +``` + +- [ ] **Step 6: Validate** + +Run: +```bash +cd modules/gcp/network && terraform init -backend=false && terraform validate +``` + +Expected: `Success! The configuration is valid.` + +- [ ] **Step 7: Commit** + +```bash +git add modules/gcp/network/ +git commit -m "$(cat <<'EOF' +feat(gcp/network): scaffold module with variables and outputs + +Adds modules/gcp/network with variable declarations, empty outputs, +versions.tf, Makefile, and README placeholder. Resources to be added +in subsequent tasks. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 3: `modules/gcp/network` — create-vpc path + fixture + +**Files:** +- Modify: `modules/gcp/network/main.tf` +- Modify: `modules/gcp/network/outputs.tf` +- Create: `modules/gcp/network/tests/create/main.tf` + +- [ ] **Step 1: Write the test fixture `tests/create/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "fixture-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} +``` + +- [ ] **Step 2: Run the fixture, expect plan to show no resources (no implementation yet)** + +```bash +cd modules/gcp/network/tests/create +terraform init -backend=false +terraform validate +terraform plan -refresh=false +``` + +Expected: validate passes, plan shows `No changes. 
Your infrastructure matches the configuration.` (no resources defined in module yet). + +- [ ] **Step 3: Implement the create-vpc path in `modules/gcp/network/main.tf`** + +```hcl +locals { + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + subnet_name = coalesce(var.subnet_name, "${var.prefix}-subnet-${var.suffix}") +} + +# === Spoke VPC (created) ================================================ +resource "google_compute_network" "spoke_vpc" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-spoke-vpc-${var.suffix}" + project = var.spoke_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} + +resource "google_compute_subnetwork" "spoke_subnet" { + count = local.create_vpc ? 1 : 0 + + name = local.subnet_name + project = var.spoke_vpc_google_project + network = google_compute_network.spoke_vpc[0].id + region = var.google_region + ip_cidr_range = var.subnet_cidr + private_ip_google_access = true + + dynamic "secondary_ip_range" { + for_each = var.pod_cidr != null ? [1] : [] + content { + range_name = "pods" + ip_cidr_range = var.pod_cidr + } + } + + dynamic "secondary_ip_range" { + for_each = var.svc_cidr != null ? [1] : [] + content { + range_name = "services" + ip_cidr_range = var.svc_cidr + } + } +} + +resource "google_compute_router" "router" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-router-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = google_compute_network.spoke_vpc[0].id +} + +resource "google_compute_router_nat" "nat" { + count = local.create_vpc ? 
1 : 0 + + name = "${var.prefix}-nat-${var.suffix}" + project = var.spoke_vpc_google_project + router = google_compute_router.router[0].name + region = var.google_region + nat_ip_allocate_option = "AUTO_ONLY" + source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" +} +``` + +- [ ] **Step 4: Wire outputs in `outputs.tf`** + +Replace the `null` placeholders: + +```hcl +output "spoke_vpc_id" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].id : null + description = "ID of the spoke VPC" +} + +output "spoke_vpc_name" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].name : null + description = "Name of the spoke VPC" +} + +output "spoke_vpc_self_link" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : null + description = "Self-link of the spoke VPC" +} + +output "spoke_subnet_id" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].id : null + description = "ID of the spoke subnet" +} + +output "spoke_subnet_name" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].name : null + description = "Name of the spoke subnet" +} + +output "spoke_subnet_self_link" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].self_link : null + description = "Self-link of the spoke subnet" +} + +output "nat_id" { + value = local.create_vpc ? google_compute_router_nat.nat[0].id : null + description = "ID of the Cloud NAT (null when vpc_source=existing)" +} + +# hub_* outputs still null at this point; updated in Task 5. 
+output "hub_vpc_id" { + value = null + description = "ID of the hub VPC (null when create_hub=false)" +} +output "hub_vpc_name" { + value = null + description = "Name of the hub VPC (null when create_hub=false)" +} +output "hub_vpc_self_link" { + value = null + description = "Self-link of the hub VPC (null when create_hub=false)" +} +output "hub_subnet_name" { + value = null + description = "Name of the hub subnet (null when create_hub=false)" +} +``` + +- [ ] **Step 5: Re-run fixture and verify resource count** + +```bash +cd modules/gcp/network/tests/create +terraform plan -refresh=false +``` + +Expected: `Plan: 4 to add, 0 to change, 0 to destroy.` (network + subnet + router + nat). + +- [ ] **Step 6: Commit** + +```bash +git add modules/gcp/network/ +git commit -m "$(cat <<'EOF' +feat(gcp/network): implement create-vpc path + +Adds google_compute_network/subnetwork/router/router_nat resources +gated on vpc_source="create". Outputs wired to real resources. +Fixture in tests/create/ asserts 4 resources are planned. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 4: `modules/gcp/network` — existing-vpc path + fixture + +**Files:** +- Modify: `modules/gcp/network/main.tf` +- Modify: `modules/gcp/network/outputs.tf` +- Create: `modules/gcp/network/tests/existing/main.tf` + +- [ ] **Step 1: Write fixture `tests/existing/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "existing" + spoke_vpc_google_project = "fixture-project" + existing_vpc_name = "preexisting-vpc" + existing_subnet_name = "preexisting-subnet" +} +``` + +- [ ] **Step 2: Add data sources to `main.tf`** + +Append to `modules/gcp/network/main.tf`: + +```hcl +# === Spoke VPC (data lookup) ============================================ +data "google_compute_network" "existing_spoke" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_vpc_name + project = var.spoke_vpc_google_project +} + +data "google_compute_subnetwork" "existing_spoke_subnet" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_subnet_name + project = var.spoke_vpc_google_project + region = var.google_region +} +``` + +- [ ] **Step 3: Update outputs to merge create and existing paths** + +In `outputs.tf` replace the six spoke outputs. Each multi-line conditional is wrapped in parentheses, because HCL ends an expression at a newline unless it sits inside brackets: + +```hcl +output "spoke_vpc_id" { + value = ( + local.create_vpc ? google_compute_network.spoke_vpc[0].id : + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].id : null + ) + description = "ID of the spoke VPC" +} + +output "spoke_vpc_name" { + value = ( + local.create_vpc ? google_compute_network.spoke_vpc[0].name : + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].name : null + ) + description = "Name of the spoke VPC" +} + +output "spoke_vpc_self_link" { + value = ( + local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].self_link : null + ) + description = "Self-link of the spoke VPC" +} + +output "spoke_subnet_id" { + value = ( + local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].id : + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].id : null + ) + description = "ID of the spoke subnet" +} + +output "spoke_subnet_name" { + value = ( + local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].name : + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].name : null + ) + description = "Name of the spoke subnet" +} + +output "spoke_subnet_self_link" { + value = ( + local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].self_link : + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].self_link : null + ) + description = "Self-link of the spoke subnet" +} +``` + +- [ ] **Step 4: Run fixture, expect plan with zero resources (data sources only)** + +```bash +cd modules/gcp/network/tests/existing +terraform init -backend=false +terraform validate +terraform plan -refresh=false +``` + +Expected: validate passes; plan shows `No changes` (data sources don't appear as planned resources without applying; we accept this, since the test is `validate` passing without errors). + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/network/ +git commit -m "$(cat <<'EOF' +feat(gcp/network): implement existing-vpc path + +Adds data.google_compute_network and data.google_compute_subnetwork +lookups gated on vpc_source="existing". Spoke outputs now resolve +from either created resources or data sources. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 5: `modules/gcp/network` — hub & peering & shared-VPC + fixture + +**Files:** +- Modify: `modules/gcp/network/main.tf` +- Modify: `modules/gcp/network/outputs.tf` +- Create: `modules/gcp/network/tests/create-with-hub/main.tf` + +- [ ] **Step 1: Write fixture `tests/create-with-hub/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "fixture-spoke-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + + create_hub = true + hub_vpc_google_project = "fixture-hub-project" + hub_vpc_cidr = "10.1.0.0/24" + is_spoke_vpc_shared = true + workspace_google_project = "fixture-workspace-project" +} +``` + +- [ ] **Step 2: Append hub + peering + shared-VPC resources to `main.tf`** + +```hcl +# === Hub VPC ============================================================ +resource "google_compute_network" "hub_vpc" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-vpc-${var.suffix}" + project = var.hub_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} + +resource "google_compute_subnetwork" "hub_subnet" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-subnet-${var.suffix}" + project = var.hub_vpc_google_project + network = google_compute_network.hub_vpc[0].id + region = var.google_region + ip_cidr_range = var.hub_vpc_cidr + private_ip_google_access = true +} + +# === Peering ============================================================ +resource "google_compute_network_peering" "hub_to_spoke" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-spoke-${var.suffix}" + network = google_compute_network.hub_vpc[0].self_link + peer_network = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link +} + +resource "google_compute_network_peering" "spoke_to_hub" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-spoke-hub-${var.suffix}" + network = local.create_vpc ? 
google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link + peer_network = google_compute_network.hub_vpc[0].self_link +} + +# === Shared VPC ========================================================= +resource "google_compute_shared_vpc_host_project" "host" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + project = var.spoke_vpc_google_project +} + +resource "google_compute_shared_vpc_service_project" "service" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + host_project = google_compute_shared_vpc_host_project.host[0].project + service_project = var.workspace_google_project +} +``` + +- [ ] **Step 3: Wire hub outputs in `outputs.tf`** + +```hcl +output "hub_vpc_id" { + value = var.create_hub ? google_compute_network.hub_vpc[0].id : null + description = "ID of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_name" { + value = var.create_hub ? google_compute_network.hub_vpc[0].name : null + description = "Name of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_self_link" { + value = var.create_hub ? google_compute_network.hub_vpc[0].self_link : null + description = "Self-link of the hub VPC (null when create_hub=false)" +} + +output "hub_subnet_name" { + value = var.create_hub ? google_compute_subnetwork.hub_subnet[0].name : null + description = "Name of the hub subnet (null when create_hub=false)" +} +``` + +- [ ] **Step 4: Validate and plan** + +```bash +cd modules/gcp/network/tests/create-with-hub +terraform init -backend=false +terraform validate +terraform plan -refresh=false +``` + +Expected: `Plan: 10 to add, 0 to change, 0 to destroy.` (4 spoke: spoke_vpc, spoke_subnet, router, nat; 2 hub: hub_vpc, hub_subnet; 2 peering: hub_to_spoke, spoke_to_hub; 2 Shared-VPC: shared_vpc_host, shared_vpc_service). 
+ +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/network/ +git commit -m "$(cat <<'EOF' +feat(gcp/network): add hub VPC, peering, and Shared-VPC binding + +Adds hub VPC + subnet + bidirectional peering with spoke + optional +shared-VPC host/service binding, all gated on create_hub. Composer +passes restricted_egress -> create_hub. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 6: `modules/gcp/private-connectivity` — scaffold + locals (regional maps) + +**Files:** +- Create: `modules/gcp/private-connectivity/versions.tf` +- Create: `modules/gcp/private-connectivity/variables.tf` +- Create: `modules/gcp/private-connectivity/locals.tf` +- Create: `modules/gcp/private-connectivity/outputs.tf` +- Create: `modules/gcp/private-connectivity/psc.tf` (empty) +- Create: `modules/gcp/private-connectivity/firewall.tf` (empty) +- Create: `modules/gcp/private-connectivity/Makefile` +- Create: `modules/gcp/private-connectivity/README.md` placeholder + +- [ ] **Step 1: `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} +``` + +- [ ] **Step 2: `variables.tf`** + +```hcl +variable "prefix" { type = string } +variable "suffix" { type = string } +variable "google_region" { type = string } + +# Spoke network refs +variable "spoke_vpc_id" { type = string } +variable "spoke_vpc_self_link" { type = string } +variable "spoke_vpc_google_project" { type = string } +variable "spoke_vpc_cidr" { type = string } + +# Hub network refs (nullable when no hub) +# A single-line block may hold only one argument, so these use the multi-line form. +variable "hub_vpc_id" { + type = string + default = null +} +variable "hub_vpc_self_link" { + type = string + default = null +} +variable "hub_vpc_google_project" { + type = string + default = null +} +variable "hub_subnet_name" { + type = string + default = null +} +variable "hub_vpc_cidr" { + type = string + default = null +} + +# Feature flags +variable "enable_frontend" { + type = bool + default = false +} +variable "enable_backend" { + type = bool + default = false +} +variable "restrict_egress" { + type = bool + default = false +} + +# PSC subnet CIDR (always required when this module is invoked because +# the composer only instantiates it when at least one PSC flag is true) +variable "psc_subnet_cidr" { + type = string + description = "CIDR for the dedicated PSC subnet in the spoke VPC" +} + +# Optional hive metastore IP override; falls back to regional map +variable "hive_metastore_ip" { + type = string + default = null + description = "Regional Hive metastore IP (looked up via internal map if null)" +} +``` + +- [ ] **Step 3: `locals.tf` — regional PSC service-attachment + hive metastore maps** + +Copy the maps verbatim from `modules/gcp-with-psc-exfiltration-protection/main.tf`. The plan reproduces them in full because future region additions will land in one place going forward. + +```hcl +locals { + google_frontend_psc_targets = { + "asia-northeast1" = "projects/general-prod-asianortheast1-01/regions/asia-northeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "asia-south1" = "projects/gen-prod-asias1-01/regions/asia-south1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "asia-southeast1" = "projects/general-prod-asiasoutheast1-01/regions/asia-southeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "australia-southeast1" = "projects/general-prod-ausoutheast1-01/regions/australia-southeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "europe-west1" = "projects/general-prod-europewest1-01/regions/europe-west1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "europe-west2" = "projects/general-prod-europewest2-01/regions/europe-west2/serviceAttachments/plproxy-psc-endpoint-all-ports" + "europe-west3" = "projects/general-prod-europewest3-01/regions/europe-west3/serviceAttachments/plproxy-psc-endpoint-all-ports" + 
"northamerica-northeast1" = "projects/general-prod-nanortheast1-01/regions/northamerica-northeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "southamerica-east1" = "projects/gen-prod-saeast1-01/regions/southamerica-east1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-central1" = "projects/gcp-prod-general/regions/us-central1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-east1" = "projects/general-prod-useast1-01/regions/us-east1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-east4" = "projects/general-prod-useast4-01/regions/us-east4/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-west1" = "projects/general-prod-uswest1-01/regions/us-west1/serviceAttachments/plproxy-psc-endpoint-all-ports" + "us-west4" = "projects/general-prod-uswest4-01/regions/us-west4/serviceAttachments/plproxy-psc-endpoint-all-ports" + } + + google_backend_psc_targets = { + "asia-northeast1" = "projects/prod-gcp-asia-northeast1/regions/asia-northeast1/serviceAttachments/ngrok-psc-endpoint" + "asia-south1" = "projects/prod-gcp-asia-south1/regions/asia-south1/serviceAttachments/ngrok-psc-endpoint" + "asia-southeast1" = "projects/prod-gcp-asia-southeast1/regions/asia-southeast1/serviceAttachments/ngrok-psc-endpoint" + "australia-southeast1" = "projects/prod-gcp-australia-southeast1/regions/australia-southeast1/serviceAttachments/ngrok-psc-endpoint" + "europe-west1" = "projects/prod-gcp-europe-west1/regions/europe-west1/serviceAttachments/ngrok-psc-endpoint" + "europe-west2" = "projects/prod-gcp-europe-west2/regions/europe-west2/serviceAttachments/ngrok-psc-endpoint" + "europe-west3" = "projects/prod-gcp-europe-west3/regions/europe-west3/serviceAttachments/ngrok-psc-endpoint" + "northamerica-northeast1" = "projects/prod-gcp-na-northeast1/regions/northamerica-northeast1/serviceAttachments/ngrok-psc-endpoint" + "southamerica-east1" = "projects/gen-prod-saeast1-01/regions/southamerica-east1/serviceAttachments/ngrok-psc-endpoint" + "us-central1" = 
"projects/prod-gcp-us-central1/regions/us-central1/serviceAttachments/ngrok-psc-endpoint" + "us-east1" = "projects/prod-gcp-us-east1/regions/us-east1/serviceAttachments/ngrok-psc-endpoint" + "us-east4" = "projects/prod-gcp-us-east4/regions/us-east4/serviceAttachments/ngrok-psc-endpoint" + "us-west1" = "projects/prod-gcp-us-west1/regions/us-west1/serviceAttachments/ngrok-psc-endpoint" + "us-west4" = "projects/prod-gcp-us-west4/regions/us-west4/serviceAttachments/ngrok-psc-endpoint" + } + + # Regional default Hive Metastore IPs per Databricks docs: + # https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore + # NOTE: keep this list curated. When null, the firewall rule omits the + # managed-hive allowance (acceptable when customers run their own metastore). + default_hive_metastore_ips = { + # Filled in by ops; leave empty initially. Override via var.hive_metastore_ip. + } + + # coalesce() raises an error when every argument is null or "", so use an explicit + # conditional that yields null when neither the override nor the map has a value. + hive_metastore_ip = var.hive_metastore_ip != null ? var.hive_metastore_ip : try(local.default_hive_metastore_ips[var.google_region], null) + + hub_present = var.hub_vpc_id != null +} +``` + +- [ ] **Step 4: Empty `psc.tf`, `firewall.tf`, `Makefile`, `README.md`, `outputs.tf`** + +`outputs.tf`: + +```hcl +output "psc_subnet_self_link" { + value = null + description = "Self-link of the PSC subnet" +} +output "frontend_psc_fr_id" { + value = null + description = "Name of the frontend PSC forwarding rule (null when enable_frontend=false)" +} +output "backend_psc_fr_id" { + value = null + description = "Name of the backend (SCC) PSC forwarding rule (null when enable_backend=false)" +} +output "hub_frontend_psc_fr_id" { + value = null + description = "Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend)" +} +output "frontend_psc_ip_spoke" { + value = null + description = "IP address of the spoke-side frontend PSC endpoint" +} +output "backend_psc_ip_spoke" { + value = null + description = "IP address of the spoke-side backend PSC endpoint" +} +output "frontend_psc_ip_hub" { value = 
null description = "IP address of the hub-side frontend PSC endpoint (null when no hub)" } +``` + +`psc.tf` and `firewall.tf` are empty for now (filled in next tasks). + +`Makefile`: + +```makefile +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . +``` + +`README.md`: + +```markdown +# modules/gcp/private-connectivity + +GCP-side PSC endpoints + restricted-egress firewall for the Databricks GCP composer. + + + +``` + +- [ ] **Step 5: Validate** + +```bash +cd modules/gcp/private-connectivity && terraform init -backend=false && terraform validate +``` + +Expected: `Success! The configuration is valid.` + +- [ ] **Step 6: Commit** + +```bash +git add modules/gcp/private-connectivity/ +git commit -m "$(cat <<'EOF' +feat(gcp/private-connectivity): scaffold with regional PSC maps + +Adds modules/gcp/private-connectivity with variables, regional PSC +service-attachment + hive metastore maps in locals.tf, empty psc.tf +and firewall.tf, null outputs. Resources added in follow-up tasks. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 7: `modules/gcp/private-connectivity` — PSC subnet + endpoints + fixture + +**Files:** +- Modify: `modules/gcp/private-connectivity/psc.tf` +- Modify: `modules/gcp/private-connectivity/outputs.tf` +- Create: `modules/gcp/private-connectivity/tests/full-isolated/main.tf` + +- [ ] **Step 1: Write fixture `tests/full-isolated/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "pc" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + spoke_vpc_cidr = "10.0.0.0/16" + + hub_vpc_id = "projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_google_project = "fixture-hub" + hub_subnet_name = "fixture-hub-subnet-abc123" + hub_vpc_cidr = "10.1.0.0/24" + + enable_frontend = true + enable_backend = true + restrict_egress = true + psc_subnet_cidr = "10.0.255.0/28" +} +``` + +- [ ] **Step 2: Implement `psc.tf`** + +```hcl +# === PSC Subnet (spoke) ================================================= +resource "google_compute_subnetwork" "psc_subnet" { + name = "${var.prefix}-psc-subnet-${var.suffix}" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_id + region = var.google_region + ip_cidr_range = var.psc_subnet_cidr + private_ip_google_access = true +} + +# === Backend (SCC) PSC endpoint — spoke ================================= +resource "google_compute_address" "backend_address" { + count = var.enable_backend ? 1 : 0 + + name = "${var.prefix}-psc-scc-ip-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + subnetwork = google_compute_subnetwork.psc_subnet.name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "backend_fr" { + count = var.enable_backend ? 
1 : 0 + + name = "${var.prefix}-psc-scc-ep-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = var.spoke_vpc_id + ip_address = google_compute_address.backend_address[0].id + target = local.google_backend_psc_targets[var.google_region] + load_balancing_scheme = "" +} + +# === Frontend PSC endpoint — spoke ====================================== +resource "google_compute_address" "frontend_address_spoke" { + count = var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-psc-ws-ip-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + subnetwork = google_compute_subnetwork.psc_subnet.name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "frontend_fr_spoke" { + count = var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-psc-ws-ep-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = var.spoke_vpc_id + ip_address = google_compute_address.frontend_address_spoke[0].id + target = local.google_frontend_psc_targets[var.google_region] + load_balancing_scheme = "" +} + +# === Frontend PSC endpoint — hub (transit) ============================== +resource "google_compute_address" "frontend_address_hub" { + count = local.hub_present && var.enable_frontend ? 1 : 0 + + name = "${var.prefix}-hub-psc-ws-ip-${var.suffix}" + project = var.hub_vpc_google_project + region = var.google_region + subnetwork = var.hub_subnet_name + address_type = "INTERNAL" +} + +resource "google_compute_forwarding_rule" "frontend_fr_hub" { + count = local.hub_present && var.enable_frontend ? 
1 : 0 + + name = "${var.prefix}-hub-psc-ws-ep-${var.suffix}" + project = var.hub_vpc_google_project + region = var.google_region + network = var.hub_vpc_id + ip_address = google_compute_address.frontend_address_hub[0].id + target = local.google_frontend_psc_targets[var.google_region] + load_balancing_scheme = "" +} +``` + +- [ ] **Step 3: Wire PSC outputs in `outputs.tf`** + +```hcl +output "psc_subnet_self_link" { + value = google_compute_subnetwork.psc_subnet.self_link + description = "Self-link of the PSC subnet" +} + +output "frontend_psc_fr_id" { + value = var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_spoke[0].name : null + description = "Name of the frontend PSC forwarding rule (null when enable_frontend=false)" +} + +output "backend_psc_fr_id" { + value = var.enable_backend ? google_compute_forwarding_rule.backend_fr[0].name : null + description = "Name of the backend (SCC) PSC forwarding rule (null when enable_backend=false)" +} + +output "hub_frontend_psc_fr_id" { + value = local.hub_present && var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_hub[0].name : null + description = "Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend)" +} + +output "frontend_psc_ip_spoke" { + value = var.enable_frontend ? google_compute_address.frontend_address_spoke[0].address : null + description = "IP address of the spoke-side frontend PSC endpoint" +} + +output "backend_psc_ip_spoke" { + value = var.enable_backend ? google_compute_address.backend_address[0].address : null + description = "IP address of the spoke-side backend PSC endpoint" +} + +output "frontend_psc_ip_hub" { + value = local.hub_present && var.enable_frontend ? 
google_compute_address.frontend_address_hub[0].address : null + description = "IP address of the hub-side frontend PSC endpoint (null when no hub)" +} +``` + +- [ ] **Step 4: Validate fixture** + +```bash +cd modules/gcp/private-connectivity/tests/full-isolated +terraform init -backend=false +terraform validate +terraform plan -refresh=false +``` + +Expected: `Plan: 7 to add` (PSC subnet + 2 addresses + 2 forwarding rules for spoke + 1 address + 1 forwarding rule for hub). + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/private-connectivity/ +git commit -m "$(cat <<'EOF' +feat(gcp/private-connectivity): add PSC subnet, addresses, forwarding rules + +PSC subnet (spoke); backend (SCC) endpoint gated on enable_backend; +frontend endpoint (spoke) gated on enable_frontend; frontend endpoint +(hub) gated on hub_present AND enable_frontend. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 8: `modules/gcp/private-connectivity` — egress firewall rules + fixture + +**Files:** +- Modify: `modules/gcp/private-connectivity/firewall.tf` +- Modify: `modules/gcp/private-connectivity/tests/full-isolated/main.tf` (no change to assertions; the plan resource count grows) +- Create: `modules/gcp/private-connectivity/tests/no-egress/main.tf` (variant with `restrict_egress = false`) + +- [ ] **Step 1: Write `firewall.tf`** + +```hcl +# Egress firewall stack — only emitted when restrict_egress = true. +# Names follow the existing pattern from modules/gcp-with-psc-exfiltration-protection/firewall-spoke.tf +# and firewall-hub.tf to keep operator familiarity. + +# === Spoke deny-egress ================================================== +resource "google_compute_firewall" "spoke_default_deny_egress" { + count = var.restrict_egress ? 
1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-default-deny-egress" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1100 + destination_ranges = ["0.0.0.0/0"] + source_ranges = [] + + deny { + protocol = "all" + } +} + +# === Spoke allow Google APIs ============================================ +resource "google_compute_firewall" "spoke_allow_google_apis" { + count = var.restrict_egress ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-google-apis" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = [ + "199.36.153.4/30", + "199.36.153.8/30", + "34.126.0.0/18" + ] + + allow { + protocol = "all" + } +} + +# === Spoke allow Databricks control plane (to PSC IPs) ================== +resource "google_compute_firewall" "spoke_allow_ctl_plane" { + count = var.restrict_egress && var.enable_frontend && var.enable_backend ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-databricks-control-plane" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = [ + "${google_compute_forwarding_rule.backend_fr[0].ip_address}/32", + "${google_compute_forwarding_rule.frontend_fr_spoke[0].ip_address}/32" + ] + + allow { + protocol = "tcp" + ports = ["443"] + } +} + +# === Spoke allow managed Hive (conditional on hive_metastore_ip) ======== +resource "google_compute_firewall" "spoke_allow_hive" { + count = var.restrict_egress && local.hive_metastore_ip != "" ? 
1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-${var.google_region}-managed-hive" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = ["${local.hive_metastore_ip}/32"] + + allow { + protocol = "tcp" + ports = ["3306"] + } +} + +# === Hub ingress from spoke ============================================= +resource "google_compute_firewall" "hub_ingress" { + count = var.restrict_egress && local.hub_present ? 1 : 0 + + name = "${var.prefix}-hub-${var.suffix}-ingress" + project = var.hub_vpc_google_project + network = var.hub_vpc_self_link + + direction = "INGRESS" + priority = 1000 + destination_ranges = [] + source_ranges = [var.spoke_vpc_cidr] + + allow { + protocol = "all" + } +} +``` + +- [ ] **Step 2: Write `tests/no-egress/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "pc" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + spoke_vpc_cidr = "10.0.0.0/16" + + enable_frontend = true + enable_backend = false + restrict_egress = false + psc_subnet_cidr = "10.0.255.0/28" +} +``` + +- [ ] **Step 3: Validate both fixtures** + +```bash +cd modules/gcp/private-connectivity/tests/full-isolated && terraform init -backend=false && terraform validate && terraform plan -refresh=false +``` + +Expected: `Plan: 11 to add` (7 from Task 7 + 4 firewall rules: deny + google-apis + ctl-plane + hub-ingress; the hive rule is created only when `hive_metastore_ip` is set, which it is not in this fixture). If `hive_metastore_ip` is set in the fixture, expect `Plan: 12 to add`. 
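
The count for the `full-isolated` fixture can also be checked mechanically. The following is a minimal sketch, assuming `terraform plan` prints its usual `Plan: N to add, ...` summary line; `plan_add_count` is a hypothetical helper and not part of the repository.

```bash
# Hypothetical helper: read plan output on stdin and print N from the
# "Plan: N to add, ..." summary line (prints nothing if the line is absent).
plan_add_count() {
  sed -n 's/^Plan: \([0-9][0-9]*\) to add.*/\1/p'
}

# full-isolated fixture: 7 PSC resources from Task 7 plus 4 firewall rules
# (deny, google-apis, ctl-plane, hub-ingress); the hive rule is skipped
# while no metastore IP is configured.
expected=$((7 + 4))

# Usage, from the fixture directory after `terraform init -backend=false`:
#   actual=$(terraform plan -refresh=false -no-color | plan_add_count)
#   [ "$actual" -eq "$expected" ] || echo "unexpected plan size: $actual" >&2
```

Keeping the expected value as a sum of its parts makes it easier to adjust when a fixture later sets `hive_metastore_ip`.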
 + +Note: the fixture leaves `hive_metastore_ip` unset and the `default_hive_metastore_ips` map is empty, so `local.hive_metastore_ip = ""` and the hive firewall rule is not emitted. The fixture should therefore plan 7 PSC + 4 firewall = 11 resources. + +```bash +cd ../no-egress && terraform init -backend=false && terraform validate && terraform plan -refresh=false +``` + +Expected: `Plan: 3 to add` (PSC subnet + 1 frontend address + 1 frontend forwarding rule). + +- [ ] **Step 4: Commit** + +```bash +git add modules/gcp/private-connectivity/ +git commit -m "$(cat <<'EOF' +feat(gcp/private-connectivity): add egress firewall stack + +Spoke deny-egress (priority 1100), allow-google-apis, allow control +plane (to PSC IPs), allow managed-hive (conditional on metastore IP), +and hub ingress from spoke CIDR. All gated on restrict_egress. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 9: `modules/gcp/account` — scaffold + variables + versions + +**Files:** +- Create: `modules/gcp/account/versions.tf` +- Create: `modules/gcp/account/variables.tf` +- Create: `modules/gcp/account/main.tf` +- Create: `modules/gcp/account/vpc-endpoints.tf` +- Create: `modules/gcp/account/pas.tf` +- Create: `modules/gcp/account/outputs.tf` +- Create: `modules/gcp/account/Makefile` +- Create: `modules/gcp/account/README.md` + +- [ ] **Step 1: `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + version = ">= 1.0" + } + } +} +``` + +- [ ] **Step 2: `variables.tf`** + +```hcl +variable "prefix" { type = string } +variable "suffix" { type = string } +# Blocks with more than one argument must use the multi-line form. +variable "workspace_name" { + type = string + default = null +} +variable "databricks_account_id" { type = string } +variable "google_project" { type = string } +variable "google_region" { type = string } + +variable "vpc_source" { + type = string + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one 
of: databricks_managed, create, existing." + } +} + +variable "spoke_vpc_name" { + type = string + default = null +} +variable "spoke_subnet_name" { + type = string + default = null +} +variable "spoke_vpc_google_project" { + type = string + default = null +} +variable "hub_vpc_google_project" { + type = string + default = null +} + +# Forwarding-rule names from private-connectivity module (gate vpc_endpoint creation) +variable "frontend_psc_fr_id" { + type = string + default = null +} +variable "backend_psc_fr_id" { + type = string + default = null +} +variable "hub_frontend_psc_fr_id" { + type = string + default = null +} + +variable "enable_frontend" { + type = bool + default = false +} +variable "enable_backend" { + type = bool + default = false +} +variable "private_access_only" { + type = bool + default = false +} + +variable "nat_dependency" { + type = any + default = null + description = "Opaque value used as depends_on for the workspace to ensure NAT readiness" +} +``` + +- [ ] **Step 3: Empty `main.tf`, `vpc-endpoints.tf`, `pas.tf`, `outputs.tf` placeholders** + +`main.tf`: + +```hcl +locals { + workspace_name = coalesce(var.workspace_name, "${var.prefix}-ws-${var.suffix}") + emit_mws_networks = var.vpc_source != "databricks_managed" + emit_vpc_endpoints = var.frontend_psc_fr_id != null && var.backend_psc_fr_id != null + emit_pas = var.private_access_only +} +``` + +`outputs.tf`: + +```hcl +output "workspace_id" { + value = null + description = "Databricks workspace ID" +} +output "workspace_url" { + value = null + description = "Databricks workspace URL" +} +output "network_id" { + value = null + description = "mws_networks ID (null when databricks_managed)" +} +output "frontend_endpoint_id" { + value = null + description = "Frontend mws_vpc_endpoint ID (null when no PSC)" +} +output "backend_endpoint_id" { + value = null + description = "Backend mws_vpc_endpoint ID (null when no PSC)" +} +output "transit_endpoint_id" { + value = null + description = "Hub-side mws_vpc_endpoint ID (null when no hub)" +} +``` + +`Makefile`: + 
+```makefile +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . +``` + +`README.md`: + +```markdown +# modules/gcp/account + +All `databricks_mws_*` resources for the GCP composer: `mws_networks`, `mws_workspaces`, `mws_vpc_endpoint`, `mws_private_access_settings`. + + + +``` + +- [ ] **Step 4: Validate** + +```bash +cd modules/gcp/account && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): scaffold module + +Adds modules/gcp/account with variable declarations, locals for +derived flags, empty main.tf/vpc-endpoints.tf/pas.tf, null outputs. +Resources added in follow-up tasks. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 10: `modules/gcp/account` — databricks-managed workspace shape + fixture + +**Files:** +- Modify: `modules/gcp/account/main.tf` +- Modify: `modules/gcp/account/outputs.tf` +- Create: `modules/gcp/account/tests/databricks-managed/main.tf` + +- [ ] **Step 1: Write fixture** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "databricks_managed" +} +``` + +- [ ] **Step 2: Add `databricks_mws_workspaces` to `main.tf`** + +Append to `modules/gcp/account/main.tf`: + +```hcl +resource "databricks_mws_workspaces" "this" { + account_id = var.databricks_account_id + workspace_name = local.workspace_name + location = var.google_region + + cloud_resource_container { + gcp { + project_id = var.google_project + } + } + + network_id = local.emit_mws_networks ? databricks_mws_networks.this[0].network_id : null + private_access_settings_id = local.emit_pas ? databricks_mws_private_access_settings.this[0].private_access_settings_id : null + + token { + comment = "Terraform" + } + + depends_on = [var.nat_dependency] +} +``` + +- [ ] **Step 3: Wire workspace outputs** + +```hcl +output "workspace_id" { + value = databricks_mws_workspaces.this.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = databricks_mws_workspaces.this.workspace_url + description = "Databricks workspace URL" +} +``` + +- [ ] **Step 4: Validate fixture** + +```bash +cd modules/gcp/account/tests/databricks-managed +terraform init -backend=false +terraform validate +``` + +Expected: validate passes. (Plan cannot run without real Databricks credentials; validate is sufficient for this fixture.) + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): add databricks_mws_workspaces resource + +Workspace resource with conditional network_id and +private_access_settings_id (both null when databricks_managed). 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 11: `modules/gcp/account` — mws_networks (customer VPC) + fixture + +**Files:** +- Modify: `modules/gcp/account/main.tf` +- Modify: `modules/gcp/account/outputs.tf` +- Create: `modules/gcp/account/tests/byovpc/main.tf` + +- [ ] **Step 1: Write fixture** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_name = "fixture-spoke-vpc-abc123" + spoke_subnet_name = "fixture-subnet-abc123" + spoke_vpc_google_project = "fixture-spoke" +} +``` + +- [ ] **Step 2: Append `databricks_mws_networks` to `main.tf`** + +```hcl +resource "databricks_mws_networks" "this" { + count = local.emit_mws_networks ? 1 : 0 + + account_id = var.databricks_account_id + network_name = "${var.prefix}-ntw-${var.suffix}" + + gcp_network_info { + network_project_id = var.spoke_vpc_google_project + vpc_id = var.spoke_vpc_name + subnet_id = var.spoke_subnet_name + subnet_region = var.google_region + } + + dynamic "vpc_endpoints" { + for_each = local.emit_vpc_endpoints ? [1] : [] + content { + dataplane_relay = [databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id] + rest_api = [databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id] + } + } +} +``` + +- [ ] **Step 3: Wire `network_id` output** + +```hcl +output "network_id" { + value = local.emit_mws_networks ? 
databricks_mws_networks.this[0].network_id : null + description = "mws_networks ID (null when databricks_managed)" +} +``` + +- [ ] **Step 4: Validate fixture** + +```bash +cd modules/gcp/account/tests/byovpc +terraform init -backend=false +terraform validate +``` + +Expected: validate passes once the `databricks_mws_vpc_endpoint` resources from Task 12 are declared. The `for_each` guard on the `dynamic` block skips the `backend[0]` and `frontend[0]` index expressions when `emit_vpc_endpoints` is false, but `terraform validate` still requires every referenced resource to be declared in the module. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): add databricks_mws_networks for customer VPC + +mws_networks emitted when vpc_source != databricks_managed; the +vpc_endpoints block is conditionally populated via dynamic when both +frontend and backend forwarding-rule IDs are provided. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 12: `modules/gcp/account` — mws_vpc_endpoint resources + fixture + +**Files:** +- Modify: `modules/gcp/account/vpc-endpoints.tf` +- Modify: `modules/gcp/account/outputs.tf` +- Create: `modules/gcp/account/tests/psc-with-pas/main.tf` + +- [ ] **Step 1: Write `vpc-endpoints.tf`** + +```hcl +resource "databricks_mws_vpc_endpoint" "frontend" { + count = var.enable_frontend && var.frontend_psc_fr_id != null ? 1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-ws-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.spoke_vpc_google_project + psc_endpoint_name = var.frontend_psc_fr_id + endpoint_region = var.google_region + } +} + +resource "databricks_mws_vpc_endpoint" "backend" { + count = var.enable_backend && var.backend_psc_fr_id != null ? 
1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-scc-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.spoke_vpc_google_project + psc_endpoint_name = var.backend_psc_fr_id + endpoint_region = var.google_region + } +} + +resource "databricks_mws_vpc_endpoint" "transit" { + count = var.enable_frontend && var.hub_frontend_psc_fr_id != null ? 1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-hub-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.hub_vpc_google_project + psc_endpoint_name = var.hub_frontend_psc_fr_id + endpoint_region = var.google_region + } +} +``` + +- [ ] **Step 2: Wire endpoint outputs** + +```hcl +output "frontend_endpoint_id" { + value = var.enable_frontend && var.frontend_psc_fr_id != null ? databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id : null + description = "Frontend mws_vpc_endpoint ID (null when no PSC)" +} + +output "backend_endpoint_id" { + value = var.enable_backend && var.backend_psc_fr_id != null ? databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id : null + description = "Backend mws_vpc_endpoint ID (null when no PSC)" +} + +output "transit_endpoint_id" { + value = var.enable_frontend && var.hub_frontend_psc_fr_id != null ? databricks_mws_vpc_endpoint.transit[0].vpc_endpoint_id : null + description = "Hub-side mws_vpc_endpoint ID (null when no hub)" +} +``` + +- [ ] **Step 3: Write fixture `tests/psc-with-pas/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_name = "fixture-spoke-vpc-abc123" + spoke_subnet_name = "fixture-subnet-abc123" + spoke_vpc_google_project = "fixture-spoke" + hub_vpc_google_project = "fixture-hub" + + frontend_psc_fr_id = "fixture-psc-ws-ep-abc123" + backend_psc_fr_id = "fixture-psc-scc-ep-abc123" + hub_frontend_psc_fr_id = "fixture-hub-psc-ws-ep-abc123" + + enable_frontend = true + enable_backend = true + private_access_only = true +} +``` + +- [ ] **Step 4: Validate** + +```bash +cd modules/gcp/account/tests/psc-with-pas && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): add databricks_mws_vpc_endpoint resources + +Frontend, backend (SCC), and transit (hub) mws_vpc_endpoints, each +gated on its enable_* flag and the presence of the corresponding +forwarding-rule name from private-connectivity. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 13: `modules/gcp/account` — private access settings + +**Files:** +- Modify: `modules/gcp/account/pas.tf` + +- [ ] **Step 1: Write `pas.tf`** + +```hcl +resource "databricks_mws_private_access_settings" "this" { + count = local.emit_pas ? 1 : 0 + + account_id = var.databricks_account_id + private_access_settings_name = "${var.prefix}-pas-${var.suffix}" + region = var.google_region + public_access_enabled = false + private_access_level = "ACCOUNT" +} +``` + +- [ ] **Step 2: Validate (reuse `tests/psc-with-pas` fixture)** + +```bash +cd modules/gcp/account/tests/psc-with-pas && terraform validate +``` + +Expected: validate passes. 
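
With `pas.tf` in place, every resource referenced from `main.tf` is declared, so the three fixtures from Tasks 10 to 12 should all validate. The following is a sketch of an optional sweep over them, assuming it is run from the repository root; `account_fixtures` and `validate_account_fixtures` are hypothetical helper names, not part of the repository.

```bash
# Fixture directories created in Tasks 10-12, relative to the repository root.
account_fixtures() {
  for name in databricks-managed byovpc psc-with-pas; do
    printf 'modules/gcp/account/tests/%s\n' "$name"
  done
}

# Hypothetical helper: validate each fixture in a subshell so the caller's
# working directory is unchanged; stops at the first failure.
validate_account_fixtures() {
  for dir in $(account_fixtures); do
    (cd "$dir" && terraform init -backend=false && terraform validate) || return 1
  done
}
```

Calling `validate_account_fixtures` before the commit step would catch a fixture that an earlier task left broken.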
 + +- [ ] **Step 3: Commit** + +```bash +git add modules/gcp/account/ +git commit -m "$(cat <<'EOF' +feat(gcp/account): add mws_private_access_settings + +Emitted when private_access_only=true; public_access_enabled=false, +private_access_level=ACCOUNT. Workspace references via +private_access_settings_id. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 14: `modules/gcp/dns` — scaffold + variables + +**Files:** +- Create: `modules/gcp/dns/versions.tf` +- Create: `modules/gcp/dns/variables.tf` +- Create: `modules/gcp/dns/hub.tf` (empty) +- Create: `modules/gcp/dns/spoke.tf` (empty) +- Create: `modules/gcp/dns/outputs.tf` +- Create: `modules/gcp/dns/Makefile` +- Create: `modules/gcp/dns/README.md` + +- [ ] **Step 1: `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} +``` + +- [ ] **Step 2: `variables.tf`** + +```hcl +variable "prefix" { type = string } +variable "google_region" { type = string } + +# Hub +variable "hub_vpc_id" { type = string } +variable "hub_vpc_self_link" { type = string } +variable "hub_vpc_google_project" { type = string } + +# Spoke +variable "spoke_vpc_id" { type = string } +variable "spoke_vpc_self_link" { type = string } +variable "spoke_vpc_google_project" { type = string } + +# Workspace +variable "workspace_url" { type = string } + +# PSC IPs +variable "frontend_psc_ip_spoke" { type = string } +# Blocks with more than one argument must use the multi-line form. +variable "frontend_psc_ip_hub" { + type = string + default = null +} +variable "backend_psc_ip_spoke" { type = string } +``` + +- [ ] **Step 3: Empty `outputs.tf`, `Makefile`, `README.md`** + +`outputs.tf`: + +```hcl +# This module has no outputs; DNS records are terminal. +``` + +`Makefile`: same template as Task 6. + +`README.md`: + +```markdown +# modules/gcp/dns + +Private DNS zones (hub + spoke) used with restricted-egress workspaces. 
+ + + +``` + +- [ ] **Step 4: Validate** + +```bash +cd modules/gcp/dns && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/dns/ +git commit -m "$(cat <<'EOF' +feat(gcp/dns): scaffold module with variables + +Variable declarations for hub + spoke DNS zones. Resources added in +follow-up task. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 15: `modules/gcp/dns` — hub + spoke zones and records + fixture + +**Files:** +- Modify: `modules/gcp/dns/hub.tf` +- Modify: `modules/gcp/dns/spoke.tf` +- Create: `modules/gcp/dns/tests/hub-and-spoke/main.tf` + +- [ ] **Step 1: Write fixture** + +```hcl +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "dns" { + source = "../.." + + prefix = "fixture" + google_region = "us-central1" + + hub_vpc_id = "projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_google_project = "fixture-hub" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + + workspace_url = "https://1234567890123456.7.gcp.databricks.com" + + frontend_psc_ip_spoke = "10.0.255.4" + frontend_psc_ip_hub = "10.1.0.10" + backend_psc_ip_spoke = "10.0.255.5" +} +``` + +- [ ] **Step 2: Write `hub.tf`** + +```hcl +locals { + # Regex extracts the workspace DNS id (numeric.numeric) from the URL. + # Matches the behavior of the legacy gcp-with-psc-exfiltration-protection module. 
+ workspace_dns_id = regex("[0-9]+\\.[0-9]+", var.workspace_url) +} + +# === gcp.databricks.com (hub) ============================================ +resource "google_dns_managed_zone" "hub_dbx" { + name = "${var.prefix}-hub-gcp-databricks-com" + project = var.hub_vpc_google_project + dns_name = "gcp.databricks.com." + description = "Private DNS zone for Databricks PSC management" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "hub_workspace_url" { + name = "${local.workspace_dns_id}.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +resource "google_dns_record_set" "hub_psc_auth" { + name = "${var.google_region}.psc-auth.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +resource "google_dns_record_set" "hub_dp" { + name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +# === gcr.io ============================================================== +resource "google_dns_managed_zone" "gcr" { + name = "${var.prefix}-gcr-io" + project = var.hub_vpc_google_project + dns_name = "gcr.io." 
+ description = "Private DNS zone for GCR private resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "gcr_cname" { + name = "*.${google_dns_managed_zone.gcr.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.gcr.name + type = "CNAME" + ttl = 300 + rrdatas = ["gcr.io."] +} + +resource "google_dns_record_set" "gcr_a" { + name = google_dns_managed_zone.gcr.dns_name + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.gcr.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"] +} + +# === googleapis.com ====================================================== +resource "google_dns_managed_zone" "google_apis" { + name = "${var.prefix}-google-apis" + project = var.hub_vpc_google_project + dns_name = "googleapis.com." + description = "Private DNS zone for Google APIs resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "google_apis_cname" { + name = "*.${google_dns_managed_zone.google_apis.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.google_apis.name + type = "CNAME" + ttl = 300 + rrdatas = ["restricted.googleapis.com."] +} + +resource "google_dns_record_set" "google_apis_a" { + name = "restricted.${google_dns_managed_zone.google_apis.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.google_apis.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.4", "199.36.153.5", "199.36.153.6", "199.36.153.7"] +} + +# === pkg.dev ============================================================= +resource "google_dns_managed_zone" "pkg_dev" { + name = "${var.prefix}-pkg-dev" + project = var.hub_vpc_google_project + dns_name = "pkg.dev." 
+ description = "Private DNS zone for Artifact Registry (pkg.dev) resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "pkg_dev_cname" { + name = "*.${google_dns_managed_zone.pkg_dev.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.pkg_dev.name + type = "CNAME" + ttl = 300 + rrdatas = ["pkg.dev."] +} + +resource "google_dns_record_set" "pkg_dev_a" { + name = google_dns_managed_zone.pkg_dev.dns_name + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.pkg_dev.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"] +} +``` + +- [ ] **Step 3: Write `spoke.tf`** + +```hcl +# === gcp.databricks.com (spoke) ========================================== +resource "google_dns_managed_zone" "spoke_dbx" { + name = "${var.prefix}-spoke-gcp-databricks-com" + project = var.spoke_vpc_google_project + dns_name = "gcp.databricks.com." 
+ description = "Private DNS zone for Databricks PSC management" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.spoke_vpc_id + } + } +} + +resource "google_dns_record_set" "spoke_workspace_url" { + name = "${local.workspace_dns_id}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_spoke] +} + +resource "google_dns_record_set" "spoke_dp" { + name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_spoke] +} + +resource "google_dns_record_set" "spoke_tunnel" { + name = "tunnel.${var.google_region}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.backend_psc_ip_spoke] +} +``` + +- [ ] **Step 4: Validate fixture** + +```bash +cd modules/gcp/dns/tests/hub-and-spoke && terraform init -backend=false && terraform validate && terraform plan -refresh=false +``` + +Expected: `Plan: 17 to add` (5 zones + 12 record sets: hub_dbx zone + 3 records = 4; gcr zone + 2 records = 3; google_apis zone + 2 records = 3; pkg_dev zone + 2 records = 3; spoke_dbx zone + 3 records = 4). + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/dns/ +git commit -m "$(cat <<'EOF' +feat(gcp/dns): add hub and spoke private DNS zones + +Hub: gcp.databricks.com, gcr.io, googleapis.com, pkg.dev. +Spoke: gcp.databricks.com with workspace/dp/tunnel records. +workspace_dns_id is regex-extracted from workspace_url. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 16: `modules/gcp/databricks-workspace` — composer scaffold + variables + +**Files:** +- Create: `modules/gcp/databricks-workspace/versions.tf` +- Create: `modules/gcp/databricks-workspace/variables.tf` +- Create: `modules/gcp/databricks-workspace/main.tf` +- Create: `modules/gcp/databricks-workspace/outputs.tf` +- Create: `modules/gcp/databricks-workspace/Makefile` +- Create: `modules/gcp/databricks-workspace/README.md` + +- [ ] **Step 1: `versions.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + databricks = { + source = "databricks/databricks" + version = ">= 1.0" + } + random = { + source = "hashicorp/random" + version = ">= 3.0" + } + } +} +``` + +- [ ] **Step 2: `variables.tf` — full composer API as specified** + +```hcl +# === Identity =========================================================== +variable "prefix" { type = string } +variable "databricks_account_id" { type = string } +variable "google_project" { type = string } +variable "google_region" { type = string } +# Blocks with more than one argument must be written multi-line in HCL. +variable "workspace_name" { + type = string + default = null +} +variable "tags" { + type = map(string) + default = {} +} + +# === VPC source ========================================================= +variable "vpc_source" { + type = string + default = "databricks_managed" + description = "One of: databricks_managed, create, existing" + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one of: databricks_managed, create, existing." 
+ } +} + +# When vpc_source = "create" +variable "spoke_vpc_cidr" { + type = string + default = null +} +variable "subnet_cidr" { + type = string + default = null +} +variable "pod_cidr" { + type = string + default = null +} +variable "svc_cidr" { + type = string + default = null +} + +# When vpc_source = "existing" +variable "existing_vpc_name" { + type = string + default = null +} +variable "existing_subnet_name" { + type = string + default = null +} + +# === Connectivity feature flags ========================================= +variable "private_link_frontend" { + type = bool + default = false +} +variable "private_link_backend" { + type = bool + default = false +} +variable "private_access_only" { + type = bool + default = false +} +variable "restricted_egress" { + type = bool + default = false +} + +# === Required when restricted_egress = true ============================= +variable "hub_vpc_google_project" { + type = string + default = null +} +variable "spoke_vpc_google_project" { + type = string + default = null +} +variable "is_spoke_vpc_shared" { + type = bool + default = false +} +variable "hub_vpc_cidr" { + type = string + default = null +} +variable "psc_subnet_cidr" { + type = string + default = null +} +variable "hive_metastore_ip" { + type = string + default = null +} +``` + +- [ ] **Step 3: `main.tf` — locals, random suffix, preconditions (no submodule wiring yet)** + +```hcl +locals { + databricks_managed = var.vpc_source == "databricks_managed" + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + any_private_link = var.private_link_frontend || var.private_link_backend + spoke_project = coalesce(var.spoke_vpc_google_project, var.google_project) +} + +resource "random_string" "suffix" { + length = 6 + special = false + upper = false + + lifecycle { + ignore_changes = [special, upper] + } +} + +# Cross-variable preconditions. Terraform releases before 1.9 don't support +# cross-variable validation in variable blocks (this module allows >= 1.5), +# so we use a null_resource lifecycle.precondition stack instead. 
+resource "null_resource" "preconditions" { + lifecycle { + precondition { + condition = !var.restricted_egress || local.create_vpc + error_message = "restricted_egress=true requires vpc_source=\"create\" (hub-spoke topology needs us to own both VPCs)." + } + precondition { + condition = !var.restricted_egress || local.any_private_link + error_message = "restricted_egress=true requires at least one of private_link_frontend or private_link_backend." + } + precondition { + condition = !var.restricted_egress || (var.hub_vpc_google_project != null && var.hub_vpc_cidr != null && var.psc_subnet_cidr != null) + error_message = "restricted_egress=true requires hub_vpc_google_project, hub_vpc_cidr, and psc_subnet_cidr." + } + precondition { + condition = !local.create_vpc || (var.spoke_vpc_cidr != null && var.subnet_cidr != null) + error_message = "vpc_source=\"create\" requires spoke_vpc_cidr and subnet_cidr." + } + precondition { + condition = !local.use_existing_vpc || (var.existing_vpc_name != null && var.existing_subnet_name != null) + error_message = "vpc_source=\"existing\" requires existing_vpc_name and existing_subnet_name." + } + precondition { + condition = !local.databricks_managed || (!var.private_link_frontend && !var.private_link_backend && !var.restricted_egress) + error_message = "vpc_source=\"databricks_managed\" forbids private_link_frontend, private_link_backend, and restricted_egress." + } + } +} +``` + +Note: `null_resource` requires the `hashicorp/null` provider; add it to `versions.tf`. 
Add this entry to the `required_providers` block in `versions.tf`: + +```hcl + null = { + source = "hashicorp/null" + version = ">= 3.0" + } +``` + +- [ ] **Step 4: Placeholder `outputs.tf`** + +```hcl +# Placeholder values; real wiring replaces these in Task 17. +output "workspace_id" { + value = null + description = "Databricks workspace ID" +} +output "workspace_url" { + value = null + description = "Databricks workspace URL" +} +output "network_id" { + value = null + description = "mws_networks ID (null when databricks_managed)" +} +output "vpc_id" { + value = null + description = "Spoke VPC ID (null when databricks_managed)" +} +output "spoke_vpc_id" { + value = null + description = "Spoke VPC ID (null when databricks_managed)" +} +output "hub_vpc_id" { + value = null + description = "Hub VPC ID (null when not restricted_egress)" +} +output "suffix" { + value = random_string.suffix.result + description = "Random suffix used in resource names" +} +``` + +- [ ] **Step 5: Makefile + README placeholder** (same template as previous modules) + +- [ ] **Step 6: Validate** + +```bash +cd modules/gcp/databricks-workspace && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 7: Commit** + +```bash +git add modules/gcp/databricks-workspace/ +git commit -m "$(cat <<'EOF' +feat(gcp/databricks-workspace): scaffold composer with preconditions + +Composer module with full variable API, random_string suffix, locals +for derived flags, and null_resource.preconditions stack enforcing all +cross-variable rules from the spec. Submodule wiring follows in next +tasks. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 17: Composer — wire `network`, `private-connectivity`, `account`, `dns` submodules + +**Files:** +- Modify: `modules/gcp/databricks-workspace/main.tf` +- Modify: `modules/gcp/databricks-workspace/outputs.tf` + +- [ ] **Step 1: Append submodule blocks to `main.tf`** + +```hcl +module "network" { + source = "../network" + count = local.databricks_managed ? 
0 : 1 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + vpc_source = var.vpc_source + spoke_vpc_google_project = local.spoke_project + + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr + + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name + + create_hub = var.restricted_egress + hub_vpc_google_project = var.hub_vpc_google_project + hub_vpc_cidr = var.hub_vpc_cidr + is_spoke_vpc_shared = var.is_spoke_vpc_shared + workspace_google_project = var.google_project +} + +module "private_connectivity" { + source = "../private-connectivity" + count = local.any_private_link ? 1 : 0 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + spoke_vpc_cidr = var.spoke_vpc_cidr + + hub_vpc_id = var.restricted_egress ? module.network[0].hub_vpc_id : null + hub_vpc_self_link = var.restricted_egress ? module.network[0].hub_vpc_self_link : null + hub_vpc_google_project = var.hub_vpc_google_project + hub_subnet_name = var.restricted_egress ? module.network[0].hub_subnet_name : null + hub_vpc_cidr = var.hub_vpc_cidr + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + restrict_egress = var.restricted_egress + psc_subnet_cidr = var.psc_subnet_cidr + + hive_metastore_ip = var.hive_metastore_ip +} + +module "account" { + source = "../account" + + prefix = var.prefix + suffix = random_string.suffix.result + workspace_name = var.workspace_name + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + vpc_source = var.vpc_source + + spoke_vpc_name = local.databricks_managed ? 
null : module.network[0].spoke_vpc_name + spoke_subnet_name = local.databricks_managed ? null : module.network[0].spoke_subnet_name + spoke_vpc_google_project = local.spoke_project + hub_vpc_google_project = var.hub_vpc_google_project + + frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].frontend_psc_fr_id : null + backend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].backend_psc_fr_id : null + hub_frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].hub_frontend_psc_fr_id : null + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + private_access_only = var.private_access_only + + nat_dependency = local.databricks_managed ? null : module.network[0].nat_id +} + +module "dns" { + source = "../dns" + count = var.restricted_egress ? 1 : 0 + + prefix = var.prefix + google_region = var.google_region + + hub_vpc_id = module.network[0].hub_vpc_id + hub_vpc_self_link = module.network[0].hub_vpc_self_link + hub_vpc_google_project = var.hub_vpc_google_project + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + + workspace_url = module.account.workspace_url + + frontend_psc_ip_spoke = module.private_connectivity[0].frontend_psc_ip_spoke + frontend_psc_ip_hub = module.private_connectivity[0].frontend_psc_ip_hub + backend_psc_ip_spoke = module.private_connectivity[0].backend_psc_ip_spoke +} +``` + +- [ ] **Step 2: Wire composer outputs** + +Replace `outputs.tf`: + +```hcl +output "workspace_id" { + value = module.account.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = module.account.workspace_url + description = "Databricks workspace URL" +} + +output "network_id" { + value = module.account.network_id + description = "mws_networks ID (null when databricks_managed)" +} + +output "vpc_id" { + value = 
try(module.network[0].spoke_vpc_id, null) + description = "Spoke VPC ID (null when databricks_managed)" +} + +output "spoke_vpc_id" { + value = try(module.network[0].spoke_vpc_id, null) + description = "Spoke VPC ID (null when databricks_managed)" +} + +output "hub_vpc_id" { + value = try(module.network[0].hub_vpc_id, null) + description = "Hub VPC ID (null when not restricted_egress)" +} + +output "suffix" { + value = random_string.suffix.result + description = "Random suffix used in resource names" +} +``` + +- [ ] **Step 3: Validate** + +```bash +cd modules/gcp/databricks-workspace && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 4: Commit** + +```bash +git add modules/gcp/databricks-workspace/ +git commit -m "$(cat <<'EOF' +feat(gcp/databricks-workspace): wire submodules in composer + +Conditional module blocks for network, private-connectivity, dns +(each gated by appropriate flags) and always-on account module. +Composer outputs wired to module outputs with try() for nullable +network outputs. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 18: Composer — positive fixtures (basic / byovpc / existing / psc-isolated) + +**Files:** +- Create: `modules/gcp/databricks-workspace/tests/basic/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/byovpc/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/existing-vpc/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/psc-isolated/main.tf` + +Each fixture follows this pattern: + +- [ ] **Step 1: Write `tests/basic/main.tf`** + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "databricks_managed" +} +``` + +- [ ] **Step 2: Write `tests/byovpc/main.tf`** + +Same provider block, then: + +```hcl +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} +``` + +- [ ] **Step 3: Write `tests/existing-vpc/main.tf`** + +```hcl +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "existing" + existing_vpc_name = "preexisting-vpc" + existing_subnet_name = "preexisting-subnet" +} +``` + +- [ ] **Step 4: Write `tests/psc-isolated/main.tf`** + +```hcl +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = "fixture-spoke" + hub_vpc_google_project = "fixture-hub" + is_spoke_vpc_shared = true + hub_vpc_cidr = "10.1.0.0/24" + psc_subnet_cidr = "10.0.255.0/28" +} +``` + +- [ ] **Step 5: Validate every fixture** + +```bash +# Each fixture runs in a subshell so a failure cannot leave the loop in the wrong directory. +for d in basic byovpc existing-vpc psc-isolated; do + echo "=== $d ===" + (cd "modules/gcp/databricks-workspace/tests/$d" && \ + terraform init -backend=false && terraform validate) || break +done +``` + +Expected: all four fixtures pass validation. + +- [ ] **Step 6: Commit** + +```bash +git add modules/gcp/databricks-workspace/tests/ +git commit -m "$(cat <<'EOF' +test(gcp/databricks-workspace): positive fixtures for 4 scenarios + +basic (databricks_managed), byovpc (create), existing-vpc (existing), +and psc-isolated (create + all PSC flags + restricted_egress). +Each fixture validates the full module graph. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 19: Composer — negative fixtures (precondition failures) + +**Files:** +- Create: `modules/gcp/databricks-workspace/tests/negative-restricted-egress-managed/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/negative-restricted-egress-missing-hub/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/negative-existing-missing-name/main.tf` +- Create: `modules/gcp/databricks-workspace/tests/negative-managed-with-psc/main.tf` + +- [ ] **Step 1: Write each fixture** + +`negative-restricted-egress-managed/main.tf` (expect: precondition error "restricted_egress=true requires vpc_source=create"): + +```hcl +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +# Provider blocks carry two arguments each, so they must be multi-line. +provider "google" { + project = "f" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "f" + google_region = "us-central1" + + vpc_source = "databricks_managed" + restricted_egress = true +} +``` + +`negative-restricted-egress-missing-hub/main.tf`: + +```hcl +# same provider/header +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "f" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + private_link_frontend = true + private_link_backend = true + restricted_egress = true + # hub_vpc_google_project, hub_vpc_cidr, psc_subnet_cidr all null -> precondition fail +} +``` + +`negative-existing-missing-name/main.tf`: + +```hcl +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "f" + google_region = "us-central1" + + vpc_source = "existing" + # existing_vpc_name / existing_subnet_name null -> precondition fail +} +``` + +`negative-managed-with-psc/main.tf`: + +```hcl +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "f" + google_region = "us-central1" + + vpc_source = "databricks_managed" + private_link_frontend = true # forbidden with databricks_managed +} +``` + +- [ ] **Step 2: Verify each fixture fails at plan time** + +```bash +for d in negative-restricted-egress-managed negative-restricted-egress-missing-hub negative-existing-missing-name negative-managed-with-psc; do + echo "=== $d ===" && cd modules/gcp/databricks-workspace/tests/$d && \ + terraform init -backend=false && \ + if terraform plan -refresh=false; then + echo "FAIL: $d should have failed plan"; exit 1 + else + echo "OK: $d failed plan as expected" + fi && cd - +done +``` + +Expected: each fixture fails at plan time with a precondition error message matching the spec table. + +- [ ] **Step 3: Commit** + +```bash +git add modules/gcp/databricks-workspace/tests/ +git commit -m "$(cat <<'EOF' +test(gcp/databricks-workspace): negative fixtures for preconditions + +Four fixtures, each violating one precondition rule from the spec. +Each fixture must fail `terraform plan` with a clear error message. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 20: Relocate `modules/gcp-sa-provisioning` → `modules/gcp/service-account` + +**Files:** +- Move: `modules/gcp-sa-provisioning/` → `modules/gcp/service-account/` +- Create: `modules/gcp-sa-provisioning/README.md` (deprecation stub) + +- [ ] **Step 1: `git mv` the directory** + +```bash +git mv modules/gcp-sa-provisioning modules/gcp/service-account +``` + +- [ ] **Step 2: Update the Makefile path inside the relocated module** + +Read `modules/gcp/service-account/Makefile`. The relative path to `.terraform-docs.yml` needs to deepen by one level: `../../.terraform-docs.yml` → `../../../.terraform-docs.yml`. Edit: + +```makefile +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . +``` + +- [ ] **Step 3: Create deprecation stub at the old path** + +```bash +mkdir -p modules/gcp-sa-provisioning +``` + +Write `modules/gcp-sa-provisioning/README.md`: + +````markdown +# DEPRECATED — moved to `modules/gcp/service-account/` + +This module has been relocated to [`../gcp/service-account/`](../gcp/service-account/). + +All variables, outputs, and resource addresses are unchanged. Update your +module `source` from: + +```hcl +source = "github.com/databricks/terraform-databricks-examples/modules/gcp-sa-provisioning" +``` + +to: + +```hcl +source = "github.com/databricks/terraform-databricks-examples/modules/gcp/service-account" +``` + +This stub will be removed in PR 6 of the GCP modules refactor. +```` + +- [ ] **Step 4: Validate the relocated module** + +```bash +cd modules/gcp/service-account && terraform init -backend=false && terraform validate +``` + +Expected: validate passes (no functional changes). 
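
Validating the relocated module only proves the new path works. It does not show whether any caller still points at the old one, where only a README stub remains. A minimal sketch of such a check follows. The function name `find_stale_refs` is hypothetical, and it assumes callers reference the module by a path string inside `*.tf` files, so the deprecation README is deliberately not scanned.

```bash
#!/usr/bin/env bash
# Sketch (assumption: callers name the module by its path in *.tf files).

# find_stale_refs OLD_PATH [SEARCH_ROOT]
# Prints "file:line:text" for every *.tf line that still mentions OLD_PATH.
# Returns 1 when at least one reference is found, 0 when there are none.
find_stale_refs() {
  local old_path="$1" root="${2:-.}" hits
  # -F: fixed string, -r: recurse, -n: line numbers.
  # .terraform and .git are skipped because they hold caches, not source.
  hits="$(grep -rnF --include='*.tf' \
    --exclude-dir=.terraform --exclude-dir=.git \
    -- "$old_path" "$root" || true)"
  if [ -n "$hits" ]; then
    printf '%s\n' "$hits"
    return 1
  fi
  return 0
}
```

From the repository root this could be run as `find_stale_refs modules/gcp-sa-provisioning .` before committing. Any hit is a caller that needs its `source` updated, or a deliberate leftover to call out in the PR description.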
+ +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/service-account/ modules/gcp-sa-provisioning/README.md +git commit -m "$(cat <<'EOF' +refactor(gcp/service-account): relocate from modules/gcp-sa-provisioning + +git mv only; no functional changes. Old path has a deprecation README +pointing to the new location. Makefile updated for new depth. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 21: Relocate `modules/gcp-unity-catalog` → `modules/gcp/unity-catalog` + +Same pattern as Task 20. + +- [ ] **Step 1: `git mv`** + +```bash +git mv modules/gcp-unity-catalog modules/gcp/unity-catalog +``` + +- [ ] **Step 2: Update `modules/gcp/unity-catalog/Makefile`** to use `../../../.terraform-docs.yml`. + +- [ ] **Step 3: Write `modules/gcp-unity-catalog/README.md`** (deprecation stub, same template as Task 20). + +- [ ] **Step 4: Validate** + +```bash +cd modules/gcp/unity-catalog && terraform init -backend=false && terraform validate +``` + +- [ ] **Step 5: Commit** + +```bash +git add modules/gcp/unity-catalog/ modules/gcp-unity-catalog/README.md +git commit -m "$(cat <<'EOF' +refactor(gcp/unity-catalog): relocate from modules/gcp-unity-catalog + +git mv only; no functional changes. Old path has a deprecation README. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 22: Regenerate `terraform-docs` READMEs for all new modules + +**Files:** +- Modify: `modules/gcp/*/README.md` (every submodule, via `terraform-docs`) + +- [ ] **Step 1: Run `make docs` recursively** + +```bash +make -C modules/gcp docs +``` + +Expected: each module's README has its `` ... `` block populated with inputs/outputs tables. + +- [ ] **Step 2: Verify `pre-commit` passes** + +```bash +pre-commit run --all-files +``` + +Expected: all hooks pass (terraform_fmt, terraform_validate, terraform_docs). 
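
`make -C modules/gcp docs` relies on an aggregate Makefile at `modules/gcp`, which the tasks in this section do not create. If none exists, a per-module loop is one possible fallback. The sketch below is only an illustration: `list_module_dirs` and `run_docs` are hypothetical helper names, and it assumes each module keeps its own `Makefile` with a `docs` target, as in the Task 20 template.

```bash
#!/usr/bin/env bash
# Sketch (assumption: every module directory has its own Makefile with a docs target).

# list_module_dirs ROOT
# Prints each immediate subdirectory of ROOT that contains a Makefile.
list_module_dirs() {
  local root="$1" dir
  for dir in "$root"/*/; do
    if [ -f "${dir}Makefile" ]; then
      printf '%s\n' "${dir%/}"
    fi
  done
  return 0
}

# run_docs ROOT [TARGET]
# Runs the make target (default: docs) in every module directory and stops
# at the first failure, returning a non-zero status.
run_docs() {
  local root="$1" target="${2:-docs}" dir
  list_module_dirs "$root" | while IFS= read -r dir; do
    echo "=== $dir ==="
    make -C "$dir" "$target" || exit 1  # leaves the pipeline subshell only
  done
}
```

`run_docs modules/gcp` would regenerate every README, and `run_docs modules/gcp test_docs` would run the `--output-check` variant instead.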
+ +- [ ] **Step 3: Commit** + +```bash +git add modules/gcp/ +git commit -m "$(cat <<'EOF' +docs(gcp): regenerate terraform-docs for all new submodules + +Generated README content for network, private-connectivity, account, +dns, and databricks-workspace via `make -C modules/gcp docs`. + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 23: Open PR 1 (draft) + +- [ ] **Step 1: Push branch** + +```bash +git push -u origin feature/gcp-modules-refactor +``` + +- [ ] **Step 2: Open draft PR** + +```bash +gh pr create --draft --title "feat(gcp): add modules/gcp/ composer + submodules (PR 1 of 6)" --body "$(cat <<'EOF' +## Summary + +First PR of the GCP modules refactor described in `docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md`. Adds: + +- `modules/gcp/databricks-workspace` — top-level composer +- `modules/gcp/network` — VPC/subnet/router/NAT/peering/shared-VPC +- `modules/gcp/private-connectivity` — PSC + egress firewall +- `modules/gcp/account` — all `databricks_mws_*` resources +- `modules/gcp/dns` — private DNS zones (hub + spoke) +- Relocations: `modules/gcp-sa-provisioning` → `modules/gcp/service-account`, `modules/gcp-unity-catalog` → `modules/gcp/unity-catalog` + +No example consumes these yet — they will be migrated one PR at a time. + +## Test plan + +- [ ] `pre-commit run --all-files` passes +- [ ] Every fixture under `modules/gcp/*/tests//` validates +- [ ] Every negative fixture under `modules/gcp/databricks-workspace/tests/negative-*/` fails at plan time +EOF +)" +``` + +--- + +## PR 2 — Migrate `examples/gcp-basic` + +### Task 24: Rewrite `examples/gcp-basic` against the new composer + +**Files:** +- Modify: `examples/gcp-basic/main.tf` +- Modify: `examples/gcp-basic/variables.tf` +- Modify: `examples/gcp-basic/outputs.tf` +- Modify: `examples/gcp-basic/README.md` +- Modify: `examples/gcp-basic/terraform.tfvars` +- (Leave `init.tf` and `Makefile` unchanged.) 
+ +- [ ] **Step 1: Rewrite `main.tf`** + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + workspace_name = var.workspace_name + + vpc_source = "databricks_managed" +} +``` + +- [ ] **Step 2: Trim `variables.tf` to only what this example needs** + +```hcl +variable "databricks_account_id" { + type = string + description = "Databricks Account ID" +} + +variable "databricks_google_service_account" { + type = string + description = "Service account email used for Databricks provider authentication" +} + +variable "google_project" { + type = string + description = "GCP project where the workspace will be created" +} + +variable "google_region" { + type = string + description = "GCP region for workspace deployment" +} + +variable "google_zone" { + type = string + description = "GCP zone (used by the google provider)" +} + +variable "prefix" { + type = string + description = "Prefix used to name generated resources" +} + +variable "workspace_name" { + type = string + description = "Workspace name" +} +``` + +(Drop `delegate_from` — that variable belongs to SA-provisioning, not basic.) + +- [ ] **Step 3: Rewrite `outputs.tf`** + +```hcl +output "workspace_id" { + value = module.workspace.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = module.workspace.workspace_url + description = "Databricks workspace URL" +} +``` + +- [ ] **Step 4: Update `terraform.tfvars` skeleton** + +```hcl +databricks_account_id = "" +databricks_google_service_account = "" +google_project = "" +google_region = "" +google_zone = "" +prefix = "" +workspace_name = "" +``` + +- [ ] **Step 5: Rewrite `README.md`** + +```markdown +# examples/gcp-basic — Databricks-managed VPC + +Calls `modules/gcp/databricks-workspace` with `vpc_source = "databricks_managed"`. 
+The Databricks platform provisions the workspace VPC; you provide only the GCP +project, region, and prefix. + +## Prerequisites + +- A GCP project with the Databricks platform onboarded +- A service account with workspace-creator role (see `examples/gcp-sa-provisioning`) +- Databricks account ID + +## Apply + +```bash +terraform init +terraform apply +``` + +## Migrating from the old example + +This example previously called `modules/gcp-workspace-basic`. State from the +old apply does **not** migrate cleanly to the new composer because the +`databricks_mws_workspaces` resource address differs. Re-apply on clean state. + + + +``` + +- [ ] **Step 6: Regenerate docs and validate** + +```bash +cd examples/gcp-basic && make docs && terraform init -backend=false && terraform validate +``` + +Expected: validate passes. + +- [ ] **Step 7: Sandbox apply (manual)** + +The author runs `terraform apply` against a sandbox project and confirms the workspace is reachable. Capture plan output as a PR comment. Run `terraform destroy` after. + +- [ ] **Step 8: Commit + open PR** + +```bash +git add examples/gcp-basic/ +git commit -m "$(cat <<'EOF' +refactor(examples/gcp-basic): migrate to modules/gcp/databricks-workspace + +Replaces the call to modules/gcp-workspace-basic with the new composer +using vpc_source="databricks_managed". Variables trimmed to scenario +inputs; README documents the migration caveat. + +Co-authored-by: Isaac +EOF +)" + +gh pr create --draft --title "refactor(examples/gcp-basic): migrate to new composer (PR 2 of 6)" --body "$(cat <<'EOF' +## Summary + +Migrates `examples/gcp-basic` to call `modules/gcp/databricks-workspace`. Old +`modules/gcp-workspace-basic` remains untouched (deleted in PR 6). 
+ +## Test plan + +- [ ] Sandbox `terraform apply` succeeds; workspace reachable +- [ ] Fresh `terraform plan` shows zero drift after apply +- [ ] `terraform destroy` cleans up without orphans +EOF +)" +``` + +--- + +## PR 3 — Migrate `examples/gcp-byovpc` + +### Task 25: Rewrite `examples/gcp-byovpc` against the new composer + +Same pattern as Task 24, with these differences: + +- [ ] **Step 1: `main.tf`** + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + workspace_name = var.workspace_name + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr +} +``` + +- [ ] **Step 2: `variables.tf`** + +```hcl +variable "databricks_account_id" { type = string } +variable "databricks_google_service_account" { type = string } +variable "google_project" { type = string } +variable "google_region" { type = string } +variable "google_zone" { type = string } +variable "prefix" { type = string } +variable "workspace_name" { type = string } + +variable "spoke_vpc_cidr" { type = string } +variable "subnet_cidr" { type = string } + +# Multi-line form: an HCL single-line block accepts only one argument. +variable "pod_cidr" { + type = string + default = null +} + +variable "svc_cidr" { + type = string + default = null +} +``` + +- [ ] **Steps 3–8:** Same as Task 24 (outputs, tfvars, README, docs, validate, sandbox apply, commit + PR). + +Note: variable names changed — `subnet_ip_cidr_range` → `subnet_cidr`, `pod_ip_cidr_range` → `pod_cidr`, `svc_ip_cidr_range` → `svc_cidr`, etc. README must explicitly call this out: + +> **Breaking change for migrating users:** variable names changed to match the new composer (`subnet_ip_cidr_range` → `subnet_cidr`, etc.). Update your tfvars accordingly.
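For a migrating user, the rename could look like the `terraform.tfvars` sketch below. The CIDR values are hypothetical placeholders, not values taken from this repo, and only the three renames named in the note are shown; anything covered by "etc." is left out because the plan does not list it.

```hcl
# terraform.tfvars after migration (CIDR values are illustrative placeholders)

# was: subnet_ip_cidr_range = "10.0.0.0/16"
subnet_cidr = "10.0.0.0/16"

# was: pod_ip_cidr_range = "10.1.0.0/16"
pod_cidr = "10.1.0.0/16"

# was: svc_ip_cidr_range = "10.2.0.0/20"
svc_cidr = "10.2.0.0/20"
```

Keeping the old name in a comment next to each new assignment makes the README's breaking-change note easy to check against a real tfvars file.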
+ +Commit message: + +``` +refactor(examples/gcp-byovpc): migrate to modules/gcp/databricks-workspace + +vpc_source="create" with spoke + subnet CIDRs. Variable names changed +to match the composer; README documents the migration. +``` + +--- + +## PR 4 — Migrate `examples/gcp-with-psc-exfiltration-protection` + +### Task 26: Rewrite the PSC example against the new composer + +**Files:** +- Modify: `examples/gcp-with-psc-exfiltration-protection/main.tf` +- Modify: `examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf` +- Modify: `examples/gcp-with-psc-exfiltration-protection/variables.tf` +- Modify: `examples/gcp-with-psc-exfiltration-protection/outputs.tf` +- Modify: `examples/gcp-with-psc-exfiltration-protection/README.md` +- Modify: `examples/gcp-with-psc-exfiltration-protection/terraform.tfvars` + +- [ ] **Step 1: Rewrite `main.tf`** + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.workspace_google_project + google_region = var.google_region + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = var.spoke_vpc_google_project + hub_vpc_google_project = var.hub_vpc_google_project + is_spoke_vpc_shared = var.is_spoke_vpc_shared + hub_vpc_cidr = var.hub_vpc_cidr + psc_subnet_cidr = var.psc_subnet_cidr + hive_metastore_ip = var.hive_metastore_ip + + tags = var.tags +} +``` + +- [ ] **Step 2: Rewrite `unity-catalog.tf` to consume composer outputs** + +```hcl +module "unity_catalog" { + source = "../../modules/gcp/unity-catalog" + + providers = { + databricks = databricks + databricks.workspace = databricks.workspace + } + + databricks_workspace_id = module.workspace.workspace_id + databricks_workspace_url = module.workspace.workspace_url + 
 google_project = var.workspace_google_project + google_region = var.google_region + prefix = var.prefix + metastore_name = var.metastore_name + catalog_name = var.catalog_name +} +``` + +- [ ] **Step 3: Trim `variables.tf`** + +Nothing needs to be dropped: every current variable still maps to a composer input. Rename any `subnet_cidr_var` references. + +The composer needs two CIDR inputs here: `spoke_vpc_cidr` (already declared under that name) and `subnet_cidr` (new; split it out of the existing single CIDR variable if needed). Refer to the current `terraform.tfvars` to confirm whether `subnet_cidr` is already exposed or needs to be added. + +Check the current variables file: + +```bash +cat examples/gcp-with-psc-exfiltration-protection/variables.tf +``` + +If `subnet_cidr` isn't there, add: + +```hcl +variable "subnet_cidr" { + type = string + description = "CIDR for the spoke subnet" +} +``` + +- [ ] **Step 4: Update `terraform.tfvars`** to include `subnet_cidr` and remove any orphaned vars. + +- [ ] **Step 5: Update README** + +Document the migration. Note that `private_link_frontend`, `private_link_backend`, `private_access_only`, and `restricted_egress` are now the explicit feature flags; the example sets all four to `true`.
+ +- [ ] **Step 6: Regenerate docs and validate** + +```bash +cd examples/gcp-with-psc-exfiltration-protection && make docs && terraform init -backend=false && terraform validate +``` + +- [ ] **Step 7: Sandbox apply (manual, with extra care)** + +- Snapshot state before apply +- Apply against sandbox +- Verify workspace reachable through PSC +- Verify UC catalog accessible +- Fresh plan — confirm zero drift +- `terraform destroy` — confirm PSC + DNS teardown is clean (no orphans) +- Capture all plan/apply/destroy output in the PR description + +- [ ] **Step 8: Commit + open PR** + +```bash +git add examples/gcp-with-psc-exfiltration-protection/ +git commit -m "$(cat <<'EOF' +refactor(examples/gcp-with-psc): migrate to new composer + +Single module call to modules/gcp/databricks-workspace with all four +connectivity flags enabled (including restricted_egress). Unity Catalog +wired separately via modules/gcp/unity-catalog. + +Co-authored-by: Isaac +EOF +)" + +gh pr create --draft --title "refactor(examples/gcp-with-psc): migrate to new composer (PR 4 of 6)" --body "$(cat <<'EOF' +## Summary + +Migrates the PSC + exfiltration-protection example to the new composer. +The most complex of the migration PRs.
+ +## Test plan + +- [ ] Sandbox `terraform apply` succeeds end-to-end +- [ ] Workspace reachable through PSC (frontend) +- [ ] UC catalog accessible +- [ ] Fresh `terraform plan` shows zero drift +- [ ] `terraform destroy` cleans up PSC + DNS without orphans +EOF +)" +``` + +--- + +## PR 5 — New `examples/gcp-existing-vpc` + +### Task 27: Add the new "existing VPC" example + +**Files:** +- Create: `examples/gcp-existing-vpc/main.tf` +- Create: `examples/gcp-existing-vpc/init.tf` +- Create: `examples/gcp-existing-vpc/variables.tf` +- Create: `examples/gcp-existing-vpc/outputs.tf` +- Create: `examples/gcp-existing-vpc/terraform.tfvars` +- Create: `examples/gcp-existing-vpc/README.md` +- Create: `examples/gcp-existing-vpc/Makefile` + +- [ ] **Step 1: Copy `init.tf` and `Makefile` from `examples/gcp-basic`** (identical provider setup). + +- [ ] **Step 2: Write `main.tf`** + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + workspace_name = var.workspace_name + + vpc_source = "existing" + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name +} +``` + +- [ ] **Step 3: Write `variables.tf`** + +```hcl +variable "databricks_account_id" { type = string } +variable "databricks_google_service_account" { type = string } +variable "google_project" { type = string } +variable "google_region" { type = string } +variable "google_zone" { type = string } +variable "prefix" { type = string } +variable "workspace_name" { type = string } +variable "existing_vpc_name" { type = string } +variable "existing_subnet_name" { type = string } +``` + +- [ ] **Step 4: Write `outputs.tf`, `terraform.tfvars`, `README.md`** (same templates as Task 24, scenario-appropriate). 
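Applying the Task 24 tfvars template to the variables declared in Step 3 would give roughly the skeleton below. It is a sketch derived from that variable list; the comment wording is illustrative.

```hcl
# terraform.tfvars skeleton for examples/gcp-existing-vpc
databricks_account_id             = ""
databricks_google_service_account = ""
google_project                    = ""
google_region                     = ""
google_zone                       = ""
prefix                            = ""
workspace_name                    = ""

# Names of the pre-existing VPC and subnet the workspace is deployed into
existing_vpc_name    = ""
existing_subnet_name = ""
```

The only difference from the `gcp-basic` skeleton is the pair of `existing_*` names, which matches the plan's intent that an example varies only the inputs relevant to its scenario.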
+ +- [ ] **Step 5: Validate** + +```bash +cd examples/gcp-existing-vpc && make docs && terraform init -backend=false && terraform validate +``` + +- [ ] **Step 6: Sandbox apply (requires a pre-existing VPC and subnet in the sandbox project)** + +- [ ] **Step 7: Commit + PR** + +```bash +git add examples/gcp-existing-vpc/ +git commit -m "$(cat <<'EOF' +feat(examples/gcp-existing-vpc): new example using existing VPC + +New scenario unsupported by the legacy modules. Calls the composer +with vpc_source="existing" and looks up the pre-existing VPC and +subnet via the network submodule's data sources. + +Co-authored-by: Isaac +EOF +)" + +gh pr create --draft --title "feat(examples/gcp-existing-vpc): new example (PR 5 of 6)" --body "..." +``` + +--- + +## PR 6 — Cleanup + +### Task 28: Repoint `examples/gcp-sa-provisioning` at the relocated module + +**Files:** +- Modify: `examples/gcp-sa-provisioning/main.tf` + +- [ ] **Step 1: Update the `source` line** + +In `examples/gcp-sa-provisioning/main.tf` change: + +```hcl +source = "github.com/databricks/terraform-databricks-examples/modules/gcp-sa-provisioning" +``` + +to: + +```hcl +source = "github.com/databricks/terraform-databricks-examples/modules/gcp/service-account" +``` + +(Or the relative path `../../modules/gcp/service-account` if the example uses relative sources — match existing convention.) 
+ +- [ ] **Step 2: Validate** + +```bash +cd examples/gcp-sa-provisioning && terraform init -backend=false && terraform validate +``` + +- [ ] **Step 3: Commit** + +```bash +git add examples/gcp-sa-provisioning/ +git commit -m "$(cat <<'EOF' +refactor(examples/gcp-sa-provisioning): repoint to modules/gcp/service-account + +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 29: Delete deprecated modules + +**Files:** +- Delete: `modules/gcp-workspace-basic/` +- Delete: `modules/gcp-workspace-byovpc/` +- Delete: `modules/gcp-with-psc-exfiltration-protection/` +- Delete: `modules/gcp-sa-provisioning/` (deprecation stub from Task 20) +- Delete: `modules/gcp-unity-catalog/` (deprecation stub from Task 21) + +- [ ] **Step 1: Confirm no example still references the old paths** + +```bash +grep -rn "modules/gcp-workspace-basic\|modules/gcp-workspace-byovpc\|modules/gcp-with-psc-exfiltration-protection\|modules/gcp-sa-provisioning\|modules/gcp-unity-catalog" examples/ modules/ +``` + +Expected: no matches (every reference should already point to `modules/gcp/...`). + +- [ ] **Step 2: Delete** + +```bash +git rm -r modules/gcp-workspace-basic modules/gcp-workspace-byovpc modules/gcp-with-psc-exfiltration-protection modules/gcp-sa-provisioning modules/gcp-unity-catalog +``` + +- [ ] **Step 3: Commit** + +```bash +git commit -m "$(cat <<'EOF' +refactor: remove deprecated GCP modules + +Removes modules/gcp-workspace-basic, modules/gcp-workspace-byovpc, +modules/gcp-with-psc-exfiltration-protection, and the deprecation +stubs for modules/gcp-sa-provisioning and modules/gcp-unity-catalog. +All examples now point at modules/gcp/*.
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 30: Delete junk directories and stray state files + +**Files:** +- Delete: `examples/gcp-sa-provisionning/` (typo dir, only contains a Makefile) +- Delete: `examples/gcp-test-modules/` (only state files) +- Delete: stray `terraform.tfstate*` files under `examples/gcp-*/` (verify `.gitignore` first) + +- [ ] **Step 1: Check `.gitignore`** + +```bash +grep -n "tfstate" .gitignore +``` + +Expected: tfstate files should already be gitignored. If not, add patterns and stage that change. + +- [ ] **Step 2: Delete junk dirs** + +```bash +git rm -r examples/gcp-sa-provisionning examples/gcp-test-modules +``` + +- [ ] **Step 3: Untrack stray state files** + +```bash +git rm --cached examples/gcp-basic/terraform.tfstate* 2>/dev/null || true +git rm --cached examples/gcp-byovpc/terraform.tfstate* 2>/dev/null || true +git rm --cached examples/gcp-with-psc-exfiltration-protection/terraform.tfstate* 2>/dev/null || true +``` + +- [ ] **Step 4: Commit** + +```bash +git commit -m "$(cat <<'EOF' +chore: remove junk dirs and untrack stray terraform state + +Deletes examples/gcp-sa-provisionning (typo dir, Makefile only) and +examples/gcp-test-modules (state-only). Untracks accidentally-committed +terraform.tfstate files under examples/gcp-*. 
+ +Co-authored-by: Isaac +EOF +)" +``` + +--- + +### Task 31: Update top-level README + +**Files:** +- Modify: `README.md` + +- [ ] **Step 1: Identify the GCP section in the top-level README** + +```bash +grep -n -A 5 "gcp" README.md | head -40 +``` + +- [ ] **Step 2: Rewrite the GCP examples table** + +Update the listing to: + +| Example | Description | +|---------|-------------| +| `examples/gcp-basic` | Databricks-managed VPC | +| `examples/gcp-byovpc` | Customer VPC (Terraform creates it) | +| `examples/gcp-existing-vpc` | Use an existing customer VPC | +| `examples/gcp-with-psc-exfiltration-protection` | Full PSC + private DNS + restricted egress | +| `examples/gcp-sa-provisioning` | Bootstrap the workspace-creator service account | + +Update the modules listing to reflect the new structure under `modules/gcp/`. + +- [ ] **Step 3: Commit and open final PR** + +```bash +git add README.md +git commit -m "$(cat <<'EOF' +docs: update README for new GCP module layout + +Updates the GCP examples and modules tables to reflect the +modules/gcp/ composer + submodules and the new gcp-existing-vpc +example. + +Co-authored-by: Isaac +EOF +)" + +gh pr create --draft --title "chore: delete legacy GCP modules and dirs (PR 6 of 6)" --body "..." +``` + +--- + +## Self-Review + +(Performed after writing the plan; issues found and fixed inline.) + +**1. 
Spec coverage:** +- Problem statement → Tasks 24–31 migrate every existing GCP example ✓ +- Goals 1–7 → All addressed; thin examples in Tasks 24–27 ✓ +- Module layout (5 submodules + service-account + unity-catalog) → Tasks 2–22 ✓ +- Composer API → Task 16 (variables), Task 17 (wiring), Task 19 (preconditions) ✓ +- Cross-variable validation table → Task 19 (negative fixtures verify each rule) ✓ +- Submodule contracts → Tasks 2–15 implement each contract ✓ +- Example shapes (4 scenarios) → Tasks 18 (composer fixtures) + 24–27 (real examples) ✓ +- Migration plan (6 PRs) → Tasks grouped under "PR 1" through "PR 6" headers ✓ +- Testing approach → Tasks 18 (positive), 19 (negative), per-task validate steps ✓ +- Risks → Hive metastore IP fallback acknowledged via the empty `default_hive_metastore_ips` map; teardown ordering is sandbox-tested in PR 4 ✓ + +**2. Placeholders:** Scanned for "TBD", "TODO", "implement later", vague "add error handling". Found none. The Hive metastore IP map is intentionally empty initially (variable falls back to "" and gates the hive firewall rule); this is a documented behavior, not a placeholder. + +**3. Type consistency:** +- `frontend_psc_fr_id` / `backend_psc_fr_id` / `hub_frontend_psc_fr_id` used consistently across `private-connectivity` outputs (Task 7), `account` inputs (Task 9), and composer wiring (Task 17) ✓ +- `spoke_vpc_self_link`, `hub_vpc_self_link` consistent between `network` outputs (Tasks 3, 5), `private-connectivity` inputs (Task 6), and `dns` inputs (Task 14) ✓ +- `workspace_url` flows from `account` (Task 10) → `dns` (Task 14, 15) → composer outputs (Task 17) ✓ +- `nat_dependency` is `type = any` in `account` (Task 9), wired to `module.network[0].nat_id` (Task 17) — matches ✓ + +**4. Spec requirements with no task:** None found. + +--- + +## Execution Handoff + +Plan complete and saved to `docs/superpowers/plans/2026-05-14-gcp-modules-refactor.md`. + +Two execution options: + +**1. 
Subagent-Driven (recommended)** — fresh subagent per task, review between tasks, fast iteration. + +**2. Inline Execution** — execute tasks in this session using executing-plans, batch execution with checkpoints. + +Which approach? diff --git a/docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md b/docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md new file mode 100644 index 00000000..bf70ad3a --- /dev/null +++ b/docs/superpowers/specs/2026-05-14-gcp-modules-refactor-design.md @@ -0,0 +1,355 @@ +# GCP Modules Refactor — Design Spec + +**Date:** 2026-05-14 +**Author:** Michele Daddetta +**Status:** Approved (pending implementation plan) + +## Problem + +Today the repo ships six GCP modules and six GCP example directories. Each example wraps its own dedicated module: + +- `examples/gcp-basic` → `modules/gcp-workspace-basic` +- `examples/gcp-byovpc` → `modules/gcp-workspace-byovpc` +- `examples/gcp-with-psc-exfiltration-protection` → `modules/gcp-with-psc-exfiltration-protection` +- `examples/gcp-sa-provisioning` → `modules/gcp-sa-provisioning` +- `examples/gcp-test-modules` (orphan, contains only state files) +- `examples/gcp-sa-provisionning` (typo dir, contains only a Makefile) + +The three workspace modules duplicate `databricks_mws_workspaces`, `databricks_mws_networks`, `google_compute_network` + subnet + router + NAT, and `random_string.suffix`. A change to any shared piece (e.g. a new GCP region added to the regional PSC service-attachment map, a workspace argument added by the Databricks provider) needs to land in 2–3 places. + +The user's northstar: an example provides only the **basic information about the desired scenario** — does a VPC already exist, what's its name, is the workspace using frontend PrivateLink, is private access enforced — and the module figures out the rest. + +There is no "existing VPC" example today; we add one as part of this refactor. + +## Goals + +1. 
Eliminate cross-module duplication for GCP workspace deployment. +2. Single top-level composer that takes scenario inputs and conditionally instantiates submodules. +3. Submodules are organized by concern (network, private connectivity, Databricks account resources) and consumed only by the composer. +4. Each example becomes a thin caller — main.tf is ~20 lines and varies only the inputs that matter for that scenario. +5. Variable names describe what they do, not what they protect against. No marketing language. +6. New scenario "existing VPC" is supported on day one. +7. Old modules and examples remain functional during migration; we ship the new modules alongside and migrate one example per PR. + +## Non-Goals + +- No Unity Catalog redesign. UC remains a separate module that the example wires up directly. The user has separate work in flight for this area. +- No service-account-provisioning redesign. SA-provisioning is a one-time bootstrap with a different lifecycle than the workspace; it remains a separate module called by its own example. +- No state migration tooling for existing applies of old examples. Users re-apply on clean state. +- No new CI test harness (terratest, GitHub Actions matrix). We rely on the existing `pre-commit` config plus per-PR manual sandbox apply. +- No changes to AWS or Azure modules. 
+ +## Architecture + +### Module layout + +``` +modules/gcp/ +├── databricks-workspace/ # top-level composer; the one examples call +├── network/ # all google_compute_network/subnet/router/nat/peering +│ # for both hub & spoke; shared-VPC host/service binding +├── private-connectivity/ # GCP-side: PSC subnet + addresses + forwarding rules +│ # + egress firewall rules (deny-egress, google-apis, ctl-plane, hive) +├── account/ # ALL databricks_mws_* resources: +│ # mws_networks + mws_workspaces + mws_vpc_endpoint +│ # + mws_private_access_settings +├── dns/ # private DNS zones + records (hub + spoke) +│ # split from private-connectivity because DNS needs workspace_url +│ # which is only available after account creates the workspace +├── service-account/ # relocated from modules/gcp-sa-provisioning (git mv) +└── unity-catalog/ # relocated from modules/gcp-unity-catalog (git mv) +``` + +The five-submodule split (rather than the three-submodule grouping originally discussed) is required to keep the dependency graph acyclic. `account` cannot live before `private-connectivity` because `databricks_mws_vpc_endpoint` references the PSC forwarding rules created in GCP. `dns` cannot live before `account` because DNS records embed the `workspace_dns_id` regex-extracted from `databricks_mws_workspaces.workspace_url`. Keeping all `databricks_mws_*` resources together in `account` (the user's chosen concern-based grouping) requires DNS to be its own submodule. + +### Data flow + +``` +example + └── modules/gcp/databricks-workspace (composer) + ├── modules/gcp/network (count = vpc_source != "databricks_managed" ? 1 : 0) + │ outputs: spoke_vpc_*, spoke_subnet_*, hub_vpc_* (nullable), nat_id (nullable) + │ + ├── modules/gcp/private-connectivity (count = any private_link_* flag is true ? 
1 : 0) + │ consumes: network outputs + │ outputs: frontend_psc_fr_id, backend_psc_fr_id, hub_frontend_psc_fr_id (nullable), + │ frontend_psc_ip_spoke, backend_psc_ip_spoke, frontend_psc_ip_hub (nullable), + │ psc_subnet_self_link + │ + ├── modules/gcp/account (always) + │ consumes: network outputs + private-connectivity outputs + │ outputs: workspace_id, workspace_url, network_id (nullable), + │ frontend_endpoint_id, backend_endpoint_id, transit_endpoint_id (nullable) + │ + └── modules/gcp/dns (count = restricted_egress ? 1 : 0) + consumes: network outputs + private-connectivity PSC IPs + account.workspace_url + outputs: none + +example optionally also calls: + └── modules/gcp/unity-catalog (wired with workspace_id + workspace_url) +``` + +The composer declares `random_string.suffix` once and passes it to each submodule, eliminating the per-module duplication that exists today. + +The dependency graph is linear: `network → private-connectivity → account → dns`. No back-references between modules. `databricks_mws_vpc_endpoint` is created inside `account` (rather than `private-connectivity`) so that `account` owns every `databricks_mws_*` resource and so that the cycle "`account` needs endpoint IDs / DNS needs workspace_url" is decomposed into a linear chain. + +## Composer API + +```hcl +# === Identity ============================================================= +prefix : string required +databricks_account_id : string required +google_project : string required # workspace google project +google_region : string required +workspace_name : string default = null # default "${prefix}-ws-${suffix}" +tags : map default = {} + +# === Where does the VPC come from? 
======================================= +vpc_source : string default = "databricks_managed" + # one of: "databricks_managed", "create", "existing" + +# Used when vpc_source = "create" +spoke_vpc_cidr : string default = null +subnet_cidr : string default = null +pod_cidr : string default = null # GKE secondary range +svc_cidr : string default = null # GKE secondary range + +# Used when vpc_source = "existing" +existing_vpc_name : string default = null +existing_subnet_name : string default = null + +# === Connectivity (orthogonal flags, each defaults false) ================ +private_link_frontend : bool default = false # frontend PSC endpoint + frontend mws_vpc_endpoint +private_link_backend : bool default = false # SCC PSC endpoint + backend mws_vpc_endpoint +private_access_only : bool default = false # mws_private_access_settings; public_access_enabled = false +restricted_egress : bool default = false # hub VPC + deny-egress firewall + private DNS + +# === Required when restricted_egress = true ============================== +hub_vpc_google_project : string default = null +spoke_vpc_google_project : string default = null # falls back to google_project +is_spoke_vpc_shared : bool default = false +hub_vpc_cidr : string default = null +psc_subnet_cidr : string default = null +hive_metastore_ip : string default = null # else looked up via internal regional map +``` + +### Composer outputs + +```hcl +workspace_id = module.account.workspace_id +workspace_url = module.account.workspace_url +network_id = module.account.network_id # null when vpc_source = "databricks_managed" +vpc_id = try(module.network[0].spoke_vpc_id, null) +spoke_vpc_id = try(module.network[0].spoke_vpc_id, null) +hub_vpc_id = try(module.network[0].hub_vpc_id, null) +suffix = random_string.suffix.result # useful for downstream modules (UC, etc.) 
+``` + +### Cross-variable validation (preconditions in composer's `main.tf`) + +| Rule | Reason | +|------|--------| +| `restricted_egress = true` ⇒ `vpc_source = "create"` | Hub-spoke + egress firewall + private DNS require the module to own both VPCs | +| `restricted_egress = true` ⇒ `private_link_frontend OR private_link_backend = true` | Egress-restricted workspace without PSC is unreachable | +| `restricted_egress = true` ⇒ `hub_vpc_google_project`, `hub_vpc_cidr`, `psc_subnet_cidr` set | Hub topology needs these | +| `vpc_source = "create"` ⇒ `spoke_vpc_cidr`, `subnet_cidr` set | Need CIDRs | +| `vpc_source = "existing"` ⇒ `existing_vpc_name`, `existing_subnet_name` set | Need names to look up | +| `vpc_source = "databricks_managed"` ⇒ `private_link_frontend`, `private_link_backend`, `restricted_egress` all false | Cannot attach PSC or firewalls to a VPC we don't own | + +## Submodule contracts + +### `modules/gcp/network` + +**Inputs:** `prefix`, `suffix`, `google_region`, `vpc_source`, `spoke_vpc_google_project`, `spoke_vpc_cidr`, `subnet_cidr`, `subnet_name`, `pod_cidr`, `svc_cidr`, `existing_vpc_name`, `existing_subnet_name`, `create_hub` (bool — composer passes `restricted_egress`), `hub_vpc_google_project`, `hub_vpc_cidr`, `is_spoke_vpc_shared`, workspace project. + +**Behavior:** Spoke VPC + subnet + router + NAT (when `vpc_source = "create"`) or `data` lookups (when `"existing"`). Optional hub VPC + subnet + bidirectional peering + optional shared-VPC host/service binding (when `create_hub`). + +**Outputs:** `spoke_vpc_id`, `spoke_vpc_name`, `spoke_vpc_self_link`, `spoke_subnet_id`, `spoke_subnet_name`, `spoke_subnet_self_link`, `hub_vpc_id` (nullable), `hub_vpc_name` (nullable), `hub_vpc_self_link` (nullable), `hub_subnet_name` (nullable), `nat_id` (nullable). 
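The create-or-lookup behavior described for `modules/gcp/network` could be expressed roughly as follows. This is a minimal sketch, not the module's actual code: the `spoke` resource labels and the VPC naming scheme are assumptions, only the variable names come from the Inputs list, and `one()` assumes Terraform 0.15 or later.

```hcl
# Hypothetical sketch of the spoke VPC handling in modules/gcp/network.
# Resource labels and the name format are assumptions, not the real module.

variable "prefix" { type = string }
variable "suffix" { type = string }
variable "vpc_source" { type = string }
variable "spoke_vpc_google_project" { type = string }

variable "existing_vpc_name" {
  type    = string
  default = null
}

# Created only when the module owns the VPC.
resource "google_compute_network" "spoke" {
  count                   = var.vpc_source == "create" ? 1 : 0
  project                 = var.spoke_vpc_google_project
  name                    = "${var.prefix}-spoke-${var.suffix}"
  auto_create_subnetworks = false
}

# Looked up only when the caller points at a pre-existing VPC.
data "google_compute_network" "spoke" {
  count   = var.vpc_source == "existing" ? 1 : 0
  project = var.spoke_vpc_google_project
  name    = var.existing_vpc_name
}

locals {
  # At most one of the two lists is non-empty; one() returns its single
  # element, or null when both are empty.
  spoke_vpc_self_link = one(concat(
    google_compute_network.spoke[*].self_link,
    data.google_compute_network.spoke[*].self_link,
  ))
}

output "spoke_vpc_self_link" {
  value = local.spoke_vpc_self_link
}
```

Funnelling both paths through a single local keeps the outputs identical for the `"create"` and `"existing"` cases, so downstream submodules do not need to know where the VPC came from.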
+ +### `modules/gcp/private-connectivity` + +**Inputs:** `prefix`, `suffix`, `google_region`, spoke VPC refs + project, hub VPC refs + project (nullable), `enable_frontend`, `enable_backend`, `restrict_egress`, `psc_subnet_cidr`, spoke CIDR (for firewall source ranges), hub CIDR (for hub ingress firewall), `hive_metastore_ip` (nullable; falls back to regional map keyed by `google_region`). + +**Behavior, file-organized:** +- `psc.tf`: PSC subnet (in spoke); frontend address + forwarding rule when `enable_frontend`; backend address + forwarding rule when `enable_backend`; hub-side frontend address + forwarding rule when hub exists AND `enable_frontend`. Owns the regional PSC service-attachment maps (`google_frontend_psc_targets` and `google_backend_psc_targets`). +- `firewall.tf`: when `restrict_egress`, creates spoke deny-egress (priority 1100) + allow-google-apis + allow-databricks-control-plane (targeting PSC IPs) + allow-managed-hive (using regional `hive_metastore_ip`); hub ingress from spoke CIDR. + +`databricks_mws_vpc_endpoint` resources are NOT created here — they live in `account` so that all `databricks_mws_*` resources are colocated and so the dependency graph stays linear. + +**Outputs:** `psc_subnet_self_link`, `frontend_psc_fr_id` (forwarding-rule name; nullable), `backend_psc_fr_id` (nullable), `hub_frontend_psc_fr_id` (nullable), `frontend_psc_ip_spoke`, `backend_psc_ip_spoke`, `frontend_psc_ip_hub` (nullable). + +### `modules/gcp/account` + +**Inputs:** `prefix`, `suffix`, `workspace_name`, `databricks_account_id`, `google_project`, `google_region`, `vpc_source`, spoke VPC name, spoke subnet name, spoke project, hub project (nullable), `frontend_psc_fr_id` (nullable), `backend_psc_fr_id` (nullable), `hub_frontend_psc_fr_id` (nullable), `enable_frontend`, `enable_backend`, `private_access_only`, `nat_dependency` (passes through `module.network[0].nat_id`). 
+ +**Behavior:** +- `databricks_mws_vpc_endpoint` resources (frontend, backend, hub-transit) emitted with `count = 1` gated by the corresponding `enable_*` and forwarding-rule-id inputs. Each references the GCP forwarding rule by name and project. +- `databricks_mws_networks` emitted when `vpc_source != "databricks_managed"`. The `vpc_endpoints` block is populated only when both frontend and backend endpoints exist. +- `databricks_mws_workspaces` always emitted. Single resource with conditional attributes: + - `network_id` = `databricks_mws_networks.this.network_id` when `vpc_source != "databricks_managed"`, else null + - `private_access_settings_id` = `databricks_mws_private_access_settings.this.id` when `private_access_only`, else null + - `depends_on = [nat_dependency]` to make sure NAT is ready before workspace creation +- `databricks_mws_private_access_settings` emitted with `count = 1` when `private_access_only`; sets `public_access_enabled = false` and `private_access_level = "ACCOUNT"`. + +**Outputs:** `workspace_id`, `workspace_url`, `network_id` (nullable), `frontend_endpoint_id` (nullable), `backend_endpoint_id` (nullable), `transit_endpoint_id` (nullable). + +### `modules/gcp/dns` + +**Inputs:** `prefix`, `google_region`, hub VPC refs + project, spoke VPC refs + project, `workspace_url` (from `module.account`), `frontend_psc_ip_spoke`, `frontend_psc_ip_hub` (nullable), `backend_psc_ip_spoke`. + +**Behavior:** +- Hub-side: `gcp.databricks.com` zone with `workspace`, `psc-auth`, `dp` records; `gcr.io` zone (wildcard CNAME + A); `googleapis.com` zone (wildcard CNAME to `restricted.googleapis.com` + A); `pkg.dev` zone (wildcard CNAME + A). +- Spoke-side: `gcp.databricks.com` zone with `workspace`, `dp`, `tunnel` records. +- `workspace_dns_id` is the regex-extracted ID from `workspace_url` (matches today's behavior in `gcp-with-psc-exfiltration-protection`). + +**Outputs:** none. 
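The `workspace_dns_id` extraction could look like the sketch below. The spec does not quote the pattern used in `gcp-with-psc-exfiltration-protection`, so both the regular expression and the assumed URL shape (`https://<digits>.<digits>.gcp.databricks.com`) are assumptions to verify against the existing module.

```hcl
variable "workspace_url" {
  type        = string
  description = "Workspace URL passed in from module.account"
}

locals {
  # Assumed URL shape: https://1234567890123456.7.gcp.databricks.com
  # regex() without capture groups returns the first matched substring,
  # here the "<digits>.<digits>" label ("1234567890123456.7" for this URL).
  workspace_dns_id = regex("[0-9]+\\.[0-9]+", var.workspace_url)
}
```

Because the ID is derived from `workspace_url`, the DNS records cannot be planned until `account` has created the workspace, which is the ordering constraint that motivates the separate `dns` submodule.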
+ +### `modules/gcp/service-account` and `modules/gcp/unity-catalog` + +Relocated from `modules/gcp-sa-provisioning` and `modules/gcp-unity-catalog` via `git mv`. Variables, outputs, and resource addresses unchanged. Old paths get a deprecation README pointing to the new location. + +## Example shapes + +Each example dir contains: `init.tf` (providers), `main.tf` (single `module "workspace"` call, optionally plus `module "unity_catalog"`), `variables.tf` (only the variables relevant to that scenario), `terraform.tfvars` (skeleton with empty values + comments), `outputs.tf` (re-exports `workspace_id`/`workspace_url`), `README.md`, `Makefile`. + +### `examples/gcp-basic` — Databricks-managed VPC + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + + vpc_source = "databricks_managed" +} +``` + +### `examples/gcp-byovpc` — Terraform creates the VPC + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr +} +``` + +### `examples/gcp-existing-vpc` — NEW, fulfills the northstar + +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + + vpc_source = "existing" + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name +} +``` + +### `examples/gcp-with-psc-exfiltration-protection` — PSC + restricted egress + +Name kept for backward-compatibility with external links. 
+ +```hcl +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = var.spoke_vpc_google_project + hub_vpc_google_project = var.hub_vpc_google_project + is_spoke_vpc_shared = var.is_spoke_vpc_shared + hub_vpc_cidr = var.hub_vpc_cidr + psc_subnet_cidr = var.psc_subnet_cidr +} +``` + +Plus an optional `module "unity_catalog"` block using `module.workspace.workspace_id` and `module.workspace.workspace_url`. + +### `examples/gcp-sa-provisioning` + +Points at the relocated `modules/gcp/service-account`. Variables and outputs identical to today. + +## Migration plan + +Build the new modules alongside the old ones; migrate examples one PR at a time. + +| PR | Scope | Risk | +|----|-------|------| +| 1 | Add all new modules under `modules/gcp/`. Relocate `service-account` and `unity-catalog` via `git mv` with deprecation stubs at old paths. No example touched. | Low — no example references new code yet | +| 2 | Migrate `examples/gcp-basic` to the new composer. Old `modules/gcp-workspace-basic` stays. | Low — basic case, no PSC/DNS to coordinate | +| 3 | Migrate `examples/gcp-byovpc`. | Low | +| 4 | Migrate `examples/gcp-with-psc-exfiltration-protection`. Sandbox apply + reachability check required. | Medium — PSC + DNS + firewall coordination | +| 5 | Add new `examples/gcp-existing-vpc`. | Low — net-new | +| 6 | Delete `modules/gcp-workspace-basic`, `modules/gcp-workspace-byovpc`, `modules/gcp-with-psc-exfiltration-protection`. Delete deprecation stubs. Delete `examples/gcp-sa-provisionning` (typo dir) and `examples/gcp-test-modules` (state-only). 
Clean stray `terraform.tfstate*` files from `examples/gcp-*` (verify `.gitignore` first). Update top-level README. | Low | + +Each PR is drafted, sandbox-applied by the author, then sent for review. No state migration support — applies of old examples don't transition to the new examples; users re-apply on clean state. Example READMEs document this in PRs 2–4. + +## Testing approach + +Scoped to what the repo already supports. + +**Static (pre-commit, every PR):** +- `terraform fmt -recursive` +- `terraform validate` per module and per example +- `terraform-docs` regeneration check + +**Module-level plan smoke (PR 1):** +For each new submodule, a `tests/` subdir with minimal-fixture `terraform plan` invocations using mock vars (e.g. `databricks_account_id = "00000000-0000-0000-0000-000000000000"`). Run with `terraform init -backend=false && terraform validate && terraform plan -refresh=false`. Wrapped in a Makefile target. Catches missing required inputs and broken preconditions before any sandbox apply. + +Negative cases that must fail at plan time (one fixture each): +- `restricted_egress = true` + `vpc_source = "databricks_managed"` +- `restricted_egress = true` + `hub_vpc_cidr = null` +- `vpc_source = "existing"` + `existing_vpc_name = null` +- `private_link_frontend = true` + `vpc_source = "databricks_managed"` + +**Example-level apply (manual, before each migration PR merges):** +- Apply against sandbox GCP project + Databricks account +- Verify workspace reachable; UC accessible where applicable +- Fresh `terraform plan` against applied state — expect zero drift +- `terraform destroy` and confirm clean teardown (PSC + DNS ordering) +- Capture plan/apply output in the PR description + +**What we don't test:** terratest, GitHub Actions matrix, automated cost guards, upgrade-from-old-state. Out of scope. 
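+
+As a sketch of how the four negative cases above could be enforced, the fragment below attaches one cross-variable `precondition` per case to a `terraform_data` resource. The variable names are taken from the example calls in this plan. The host resource (`terraform_data.input_checks`, which needs Terraform 1.4 or later), the defaults, and the error messages are assumptions, not the composer's actual implementation.
+
+```hcl
+# Hypothetical sketch of the composer's input checks.
+variable "vpc_source" {
+  type = string
+
+  validation {
+    condition     = contains(["databricks_managed", "create", "existing"], var.vpc_source)
+    error_message = "vpc_source must be one of: databricks_managed, create, existing."
+  }
+}
+
+variable "restricted_egress" {
+  type    = bool
+  default = false
+}
+
+variable "private_link_frontend" {
+  type    = bool
+  default = false
+}
+
+variable "hub_vpc_cidr" {
+  type    = string
+  default = null
+}
+
+variable "existing_vpc_name" {
+  type    = string
+  default = null
+}
+
+# Carries no infrastructure; it exists only to host the preconditions,
+# which are evaluated during `terraform plan`.
+resource "terraform_data" "input_checks" {
+  lifecycle {
+    precondition {
+      condition     = !(var.restricted_egress && var.vpc_source == "databricks_managed")
+      error_message = "restricted_egress requires a customer-managed VPC."
+    }
+
+    precondition {
+      condition     = !(var.restricted_egress && var.hub_vpc_cidr == null)
+      error_message = "restricted_egress requires hub_vpc_cidr."
+    }
+
+    precondition {
+      condition     = !(var.vpc_source == "existing" && var.existing_vpc_name == null)
+      error_message = "vpc_source = \"existing\" requires existing_vpc_name."
+    }
+
+    precondition {
+      condition     = !(var.private_link_frontend && var.vpc_source == "databricks_managed")
+      error_message = "private_link_frontend requires a customer-managed VPC."
+    }
+  }
+}
+```
+
+Each negative fixture would then set exactly one offending pair of values and expect `terraform plan -refresh=false` to exit non-zero with the matching message.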
+ +## Risks & mitigations + +| Risk | Mitigation | +|------|-----------| +| Regional PSC service-attachment map drift between old and new modules during transition | Both reference the same Databricks-published list; copy verbatim to new module, delete old in PR 6 | +| Cross-variable `precondition` failures only surface at plan time, not at `validate` | Module-level plan-smoke fixtures in `tests/` exercise every precondition | +| `databricks_mws_workspaces` resource address changes (module path differs) | Acknowledged: examples are throwaway, customer state is unaffected. Documented in migration PRs | +| Empty `modules/gcp/network/` dir already exists | Becomes the home for the new `network` submodule — no conflict | +| User's separate UC work conflicts with the relocation `git mv` | Relocate but do not modify UC contents in PR 1; user's UC work can land before or after relocation as desired | +| PSC + DNS teardown ordering issues during `terraform destroy` | Add explicit `depends_on` between DNS records and the PSC forwarding rules they reference; verify during PR 4 sandbox test | + +## Open questions + +None at this time. All design choices ratified during the brainstorming session on 2026-05-14. diff --git a/examples/gcp-basic/README.md b/examples/gcp-basic/README.md index 894b51e4..b6df4554 100644 --- a/examples/gcp-basic/README.md +++ b/examples/gcp-basic/README.md @@ -1,25 +1,27 @@ -# Provisioning Databricks workspace on GCP with managed VPC -========================= +# examples/gcp-basic — Databricks-managed VPC -In this template, we show how to deploy a workspace with managed VPC. +Calls `modules/gcp/databricks-workspace` with `vpc_source = "databricks_managed"`. +The Databricks platform provisions the workspace VPC; you provide only the GCP +project, region, and prefix. +## Prerequisites -## Requirements - -- You need to have run gcp-sa-provisionning and have a service account to fill in the variables. 
-- If you want to deploy to a new project, you will need to grant the custom role generated in that template to the service acount in the new project. -- The Service Account needs to be added as Databricks Admin in the account console - -## Run as an SA +- A GCP project with the Databricks platform onboarded +- A service account with workspace-creator role (see `examples/gcp-sa-provisioning`) +- Databricks account ID -You can do the same thing by provisionning a service account that will have the same permissions - and associate the key associated to it. +## Apply +```bash +terraform init +terraform apply +``` -## Run the tempalte +## Migrating from the old example -- You need to fill in the `variables.tf` -- run `terraform init` -- run `teraform apply` +This example previously called `modules/gcp-workspace-basic`. State from the +old apply does **not** migrate cleanly to the new composer because the +`databricks_mws_workspaces` resource address differs. Re-apply on clean state. ## Requirements @@ -34,7 +36,7 @@ No providers. | Name | Source | Version | |------|--------|---------| -| [gcp-basic](#module\_gcp-basic) | github.com/databricks/terraform-databricks-examples/modules/gcp-workspace-basic | n/a | +| [workspace](#module\_workspace) | ../../modules/gcp/databricks-workspace | n/a | ## Resources @@ -45,18 +47,17 @@ No resources. 
| Name | Description | Type | Default | Required | |------|-------------|------|---------|:--------:| | [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Email of the service account used for deployment | `string` | n/a | yes | -| [delegate\_from](#input\_delegate\_from) | Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com) | `list(string)` | n/a | yes | -| [google\_project](#input\_google\_project) | Google project for VCP/workspace deployment | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google region for VCP/workspace deployment | `string` | n/a | yes | -| [google\_zone](#input\_google\_zone) | Zone in GCP region | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated VPC name | `string` | n/a | yes | -| [workspace\_name](#input\_workspace\_name) | Name of the workspace to create | `string` | n/a | yes | +| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Service account email used for Databricks provider authentication | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | GCP project where the workspace will be created | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | GCP region for workspace deployment | `string` | n/a | yes | +| [google\_zone](#input\_google\_zone) | GCP zone (used by the google provider) | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name generated resources | `string` | n/a | yes | +| [workspace\_name](#input\_workspace\_name) | Workspace name | `string` | n/a | yes | ## Outputs | Name | Description | |------|-------------| -| [databricks\_host](#output\_databricks\_host) | n/a | -| 
[databricks\_token](#output\_databricks\_token) | n/a | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | diff --git a/examples/gcp-basic/main.tf b/examples/gcp-basic/main.tf index 372bbf9f..32655699 100644 --- a/examples/gcp-basic/main.tf +++ b/examples/gcp-basic/main.tf @@ -1,9 +1,11 @@ -module "gcp-basic" { - source = "github.com/databricks/terraform-databricks-examples/modules/gcp-workspace-basic" +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix databricks_account_id = var.databricks_account_id google_project = var.google_project google_region = var.google_region - prefix = var.prefix workspace_name = var.workspace_name - delegate_from = var.delegate_from + + vpc_source = "databricks_managed" } diff --git a/examples/gcp-basic/outputs.tf b/examples/gcp-basic/outputs.tf index d6b170a9..81a92ab3 100644 --- a/examples/gcp-basic/outputs.tf +++ b/examples/gcp-basic/outputs.tf @@ -1,9 +1,9 @@ - -output "databricks_host" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url +output "workspace_id" { + value = module.workspace.workspace_id + description = "Databricks workspace ID" } -output "databricks_token" { - value = databricks_mws_workspaces.databricks_workspace.token[0].token_value - sensitive = true +output "workspace_url" { + value = module.workspace.workspace_url + description = "Databricks workspace URL" } diff --git a/examples/gcp-basic/terraform.tfvars b/examples/gcp-basic/terraform.tfvars new file mode 100644 index 00000000..8405cca5 --- /dev/null +++ b/examples/gcp-basic/terraform.tfvars @@ -0,0 +1,7 @@ +databricks_account_id = "" +databricks_google_service_account = "" +google_project = "" +google_region = "" +google_zone = "" +prefix = "" +workspace_name = "" diff --git a/examples/gcp-basic/variables.tf b/examples/gcp-basic/variables.tf index 9805c04b..4b02c043 100644 --- 
a/examples/gcp-basic/variables.tf +++ b/examples/gcp-basic/variables.tf @@ -4,38 +4,32 @@ variable "databricks_account_id" { } variable "databricks_google_service_account" { - description = "Email of the service account used for deployment" type = string + description = "Service account email used for Databricks provider authentication" } variable "google_project" { type = string - description = "Google project for VCP/workspace deployment" + description = "GCP project where the workspace will be created" } variable "google_region" { type = string - description = "Google region for VCP/workspace deployment" + description = "GCP region for workspace deployment" } variable "google_zone" { - description = "Zone in GCP region" type = string + description = "GCP zone (used by the google provider)" } variable "prefix" { type = string - description = "Prefix to use in generated VPC name" + description = "Prefix used to name generated resources" } variable "workspace_name" { - description = "Name of the workspace to create" type = string + description = "Workspace name" } -variable "delegate_from" { - description = "Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com)" - type = list(string) -} - - diff --git a/examples/gcp-byovpc/README.md b/examples/gcp-byovpc/README.md index 8dc13eba..34587829 100644 --- a/examples/gcp-byovpc/README.md +++ b/examples/gcp-byovpc/README.md @@ -1,25 +1,39 @@ -# Provisioning Databricks workspace on GCP with a custom VPC -========================= +# examples/gcp-byovpc — Customer-managed VPC -In this template, we show how to deploy a workspace with a custom vpc. +Calls `modules/gcp/databricks-workspace` with `vpc_source = "create"`. Terraform +creates the spoke VPC + subnet + Cloud Router + NAT, then registers the network +with the Databricks account and provisions a workspace inside it. 
+## Prerequisites -## Requirements +- A GCP project with the Databricks platform onboarded +- A service account with workspace-creator role (see `examples/gcp-sa-provisioning`) +- Databricks account ID +- CIDR ranges for the spoke VPC and subnet that don't overlap with existing networks -- You need to have run gcp-sa-provisionning and have a service account to fill in the variables. -- If you want to deploy to a new project, you will need to grant the custom role generated in that template to the service acount in the new project. -- The sizing of the custom vpc subnets needs to be appropriate for the usage of the workspace. [This documentation covers it](https://docs.gcp.databricks.com/administration-guide/cloud-configurations/gcp/network-sizing.html) +## Apply -## Run as an SA +```bash +terraform init +terraform apply +``` -You can do the same thing by provisionning a service account that will have the same permissions - and associate the key associated to it. +## Migrating from the old example +This example previously called `modules/gcp-workspace-byovpc`. Several variable +names changed to match the new composer API: -## Run the tempalte +| Old name | New name | +|----------|----------| +| `subnet_ip_cidr_range` | `subnet_cidr` | +| `pod_ip_cidr_range` | `pod_cidr` | +| `svc_ip_cidr_range` | `svc_cidr` | +| `subnet_name`, `router_name`, `nat_name` | (removed — composer derives from `prefix` + random suffix) | +| `delegate_from` | (removed — handled by `examples/gcp-sa-provisioning`) | +| _(new)_ | `spoke_vpc_cidr` (VPC primary CIDR, distinct from subnet CIDR) | -- You need to fill in the variables.tf -- run `terraform init` -- run `teraform apply` +State from the old apply does **not** migrate cleanly to the new composer +because resource addresses differ. Re-apply on clean state. ## Requirements @@ -30,13 +44,13 @@ No requirements. 
| Name | Version | |------|---------| -| [google](#provider\_google) | 4.63.1 | +| [google](#provider\_google) | 6.46.0 | ## Modules | Name | Source | Version | |------|--------|---------| -| [gcp-byovpc](#module\_gcp-byovpc) | github.com/databricks/terraform-databricks-examples/modules/gcp-workspace-byovpc | n/a | +| [workspace](#module\_workspace) | ../../modules/gcp/databricks-workspace | n/a | ## Resources @@ -50,23 +64,23 @@ No requirements. | Name | Description | Type | Default | Required | |------|-------------|------|---------|:--------:| | [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Email of the service account used for deployment | `string` | n/a | yes | -| [delegate\_from](#input\_delegate\_from) | Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com) | `list(string)` | n/a | yes | -| [google\_project](#input\_google\_project) | Google project for VCP/workspace deployment | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google region for VCP/workspace deployment | `string` | n/a | yes | -| [google\_zone](#input\_google\_zone) | Zone in GCP region | `string` | n/a | yes | -| [nat\_name](#input\_nat\_name) | Name of the NAT service in compute router | `string` | n/a | yes | -| [pod\_ip\_cidr\_range](#input\_pod\_ip\_cidr\_range) | IP Range for Pods subnet (secondary) | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated VPC name | `string` | n/a | yes | -| [router\_name](#input\_router\_name) | Name of the compute router to create | `string` | n/a | yes | -| [subnet\_ip\_cidr\_range](#input\_subnet\_ip\_cidr\_range) | IP Range for Nodes subnet (primary) | `string` | n/a | yes | -| [subnet\_name](#input\_subnet\_name) | Name of the 
subnet to create | `string` | n/a | yes | -| [svc\_ip\_cidr\_range](#input\_svc\_ip\_cidr\_range) | IP Range for Services subnet (secondary) | `string` | n/a | yes | +| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Service account email used for Databricks provider authentication | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | GCP project where the workspace VPC and resources will be created | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | GCP region for workspace deployment | `string` | n/a | yes | +| [google\_zone](#input\_google\_zone) | GCP zone (used by the google provider) | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name generated resources | `string` | n/a | yes | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR for the spoke VPC (e.g. 10.0.0.0/16) | `string` | n/a | yes | +| [subnet\_cidr](#input\_subnet\_cidr) | CIDR for the GKE nodes subnet primary range (e.g. 
10.0.0.0/22) | `string` | n/a | yes | +| [workspace\_name](#input\_workspace\_name) | Workspace name | `string` | n/a | yes | +| [pod\_cidr](#input\_pod\_cidr) | Optional secondary range for GKE pods | `string` | `null` | no | +| [svc\_cidr](#input\_svc\_cidr) | Optional secondary range for GKE services | `string` | `null` | no | ## Outputs | Name | Description | |------|-------------| -| [databricks\_host](#output\_databricks\_host) | n/a | -| [databricks\_token](#output\_databricks\_token) | n/a | +| [network\_id](#output\_network\_id) | databricks\_mws\_networks ID | +| [vpc\_id](#output\_vpc\_id) | ID of the spoke VPC created by the module | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | diff --git a/examples/gcp-byovpc/main.tf b/examples/gcp-byovpc/main.tf index c1e82a06..5c9d7ec0 100644 --- a/examples/gcp-byovpc/main.tf +++ b/examples/gcp-byovpc/main.tf @@ -1,15 +1,15 @@ -module "gcp-byovpc" { - source = "github.com/databricks/terraform-databricks-examples/modules/gcp-workspace-byovpc" +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix databricks_account_id = var.databricks_account_id google_project = var.google_project google_region = var.google_region - prefix = var.prefix - subnet_ip_cidr_range = var.subnet_ip_cidr_range - pod_ip_cidr_range = var.pod_ip_cidr_range - svc_ip_cidr_range = var.svc_ip_cidr_range - subnet_name = var.subnet_name - router_name = var.router_name - nat_name = var.nat_name workspace_name = var.workspace_name - delegate_from = var.delegate_from + + vpc_source = "create" + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr } diff --git a/examples/gcp-byovpc/outputs.tf b/examples/gcp-byovpc/outputs.tf index f544b3ba..3df898c0 100644 --- a/examples/gcp-byovpc/outputs.tf +++ b/examples/gcp-byovpc/outputs.tf @@ -1,8 +1,19 @@ 
-output "databricks_host" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url +output "workspace_id" { + value = module.workspace.workspace_id + description = "Databricks workspace ID" } -output "databricks_token" { - value = databricks_mws_workspaces.databricks_workspace.token[0].token_value - sensitive = true +output "workspace_url" { + value = module.workspace.workspace_url + description = "Databricks workspace URL" +} + +output "vpc_id" { + value = module.workspace.vpc_id + description = "ID of the spoke VPC created by the module" +} + +output "network_id" { + value = module.workspace.network_id + description = "databricks_mws_networks ID" } \ No newline at end of file diff --git a/examples/gcp-byovpc/terraform.tfvars b/examples/gcp-byovpc/terraform.tfvars new file mode 100644 index 00000000..6029b385 --- /dev/null +++ b/examples/gcp-byovpc/terraform.tfvars @@ -0,0 +1,11 @@ +databricks_account_id = "" +databricks_google_service_account = "" +google_project = "" +google_region = "" +google_zone = "" +prefix = "" +workspace_name = "" +spoke_vpc_cidr = "" +subnet_cidr = "" +pod_cidr = null +svc_cidr = null diff --git a/examples/gcp-byovpc/variables.tf b/examples/gcp-byovpc/variables.tf index e1c91f2d..f9d9aefd 100644 --- a/examples/gcp-byovpc/variables.tf +++ b/examples/gcp-byovpc/variables.tf @@ -4,61 +4,53 @@ variable "databricks_account_id" { } variable "databricks_google_service_account" { - description = "Email of the service account used for deployment" type = string + description = "Service account email used for Databricks provider authentication" } variable "google_project" { type = string - description = "Google project for VCP/workspace deployment" + description = "GCP project where the workspace VPC and resources will be created" } variable "google_region" { type = string - description = "Google region for VCP/workspace deployment" + description = "GCP region for workspace deployment" } variable "google_zone" { - description = 
"Zone in GCP region" type = string + description = "GCP zone (used by the google provider)" } variable "prefix" { type = string - description = "Prefix to use in generated VPC name" + description = "Prefix used to name generated resources" } -variable "subnet_ip_cidr_range" { +variable "workspace_name" { type = string - description = "IP Range for Nodes subnet (primary)" + description = "Workspace name" } -variable "pod_ip_cidr_range" { +variable "spoke_vpc_cidr" { type = string - description = "IP Range for Pods subnet (secondary)" + description = "CIDR for the spoke VPC (e.g. 10.0.0.0/16)" } -variable "svc_ip_cidr_range" { +variable "subnet_cidr" { type = string - description = "IP Range for Services subnet (secondary)" + description = "CIDR for the GKE nodes subnet primary range (e.g. 10.0.0.0/22)" } -variable "subnet_name" { +variable "pod_cidr" { type = string - description = "Name of the subnet to create" + default = null + description = "Optional secondary range for GKE pods" } -variable "router_name" { +variable "svc_cidr" { type = string - description = "Name of the compute router to create" -} - -variable "nat_name" { - type = string - description = "Name of the NAT service in compute router" -} - -variable "delegate_from" { - description = "Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com)" - type = list(string) + default = null + description = "Optional secondary range for GKE services" } diff --git a/examples/gcp-sa-provisionning/Makefile b/examples/gcp-existing-vpc/Makefile similarity index 100% rename from examples/gcp-sa-provisionning/Makefile rename to examples/gcp-existing-vpc/Makefile diff --git a/examples/gcp-existing-vpc/README.md b/examples/gcp-existing-vpc/README.md new file mode 100644 index 00000000..48affa71 --- /dev/null +++ b/examples/gcp-existing-vpc/README.md @@ -0,0 +1,74 @@ +# examples/gcp-existing-vpc — 
Use a pre-existing VPC + +Calls `modules/gcp/databricks-workspace` with `vpc_source = "existing"`. Instead +of creating a VPC, the composer looks up the named VPC + subnet via Terraform +data sources and registers them with the Databricks account. + +This is the scenario for organizations that manage GCP networking out-of-band +(e.g. via a platform team) and just want Databricks to consume an existing +network. + +## Prerequisites + +- A GCP project with the Databricks platform onboarded +- A pre-existing VPC and subnet in that project. The subnet must be in `google_region`. +- A service account with workspace-creator role (see `examples/gcp-sa-provisioning`) +- Databricks account ID + +## Apply + +```bash +terraform init +terraform apply +``` + +## What the composer does NOT do in this mode + +- Does not create the VPC, subnet, router, or NAT — those must already exist +- Does not enforce that the subnet has Private Google Access enabled — verify in the console +- Does not configure egress firewalls or PrivateLink (those require `vpc_source = "create"`) + +To layer PrivateLink onto an existing network, the current composer requires +`vpc_source = "create"`. Future work may relax this. + + +## Requirements + +No requirements. + +## Providers + +No providers. + +## Modules + +| Name | Source | Version | +|------|--------|---------| +| [workspace](#module\_workspace) | ../../modules/gcp/databricks-workspace | n/a | + +## Resources + +No resources. 
+ +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | +| [databricks\_google\_service\_account](#input\_databricks\_google\_service\_account) | Service account email used for Databricks provider authentication | `string` | n/a | yes | +| [existing\_subnet\_name](#input\_existing\_subnet\_name) | Name of the pre-existing subnet inside the VPC (must be in google\_region) | `string` | n/a | yes | +| [existing\_vpc\_name](#input\_existing\_vpc\_name) | Name of the pre-existing GCP VPC to deploy the workspace into | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | GCP project hosting the existing VPC and subnet (also the workspace project) | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | GCP region for workspace deployment (must match the existing subnet's region) | `string` | n/a | yes | +| [google\_zone](#input\_google\_zone) | GCP zone (used by the google provider) | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix used to name Databricks-side resources (mws\_networks, mws\_workspaces) | `string` | n/a | yes | +| [workspace\_name](#input\_workspace\_name) | Workspace name | `string` | n/a | yes | + +## Outputs + +| Name | Description | +|------|-------------| +| [network\_id](#output\_network\_id) | databricks\_mws\_networks ID | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | + diff --git a/examples/gcp-existing-vpc/init.tf b/examples/gcp-existing-vpc/init.tf new file mode 100644 index 00000000..8fc08e76 --- /dev/null +++ b/examples/gcp-existing-vpc/init.tf @@ -0,0 +1,22 @@ +terraform { + required_providers { + databricks = { + source = "databricks/databricks" + } + google = { + source = "hashicorp/google" + } + } +} + +provider "google" { + 
project = var.google_project + region = var.google_region + zone = var.google_zone +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + google_service_account = var.databricks_google_service_account + account_id = var.databricks_account_id +} diff --git a/examples/gcp-existing-vpc/main.tf b/examples/gcp-existing-vpc/main.tf new file mode 100644 index 00000000..6b4173e3 --- /dev/null +++ b/examples/gcp-existing-vpc/main.tf @@ -0,0 +1,13 @@ +module "workspace" { + source = "../../modules/gcp/databricks-workspace" + + prefix = var.prefix + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + workspace_name = var.workspace_name + + vpc_source = "existing" + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name +} diff --git a/examples/gcp-existing-vpc/outputs.tf b/examples/gcp-existing-vpc/outputs.tf new file mode 100644 index 00000000..469a66e6 --- /dev/null +++ b/examples/gcp-existing-vpc/outputs.tf @@ -0,0 +1,14 @@ +output "workspace_id" { + value = module.workspace.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = module.workspace.workspace_url + description = "Databricks workspace URL" +} + +output "network_id" { + value = module.workspace.network_id + description = "databricks_mws_networks ID" +} diff --git a/examples/gcp-existing-vpc/terraform.tfvars b/examples/gcp-existing-vpc/terraform.tfvars new file mode 100644 index 00000000..a541640e --- /dev/null +++ b/examples/gcp-existing-vpc/terraform.tfvars @@ -0,0 +1,9 @@ +databricks_account_id = "" +databricks_google_service_account = "" +google_project = "" +google_region = "" +google_zone = "" +prefix = "" +workspace_name = "" +existing_vpc_name = "" +existing_subnet_name = "" diff --git a/examples/gcp-existing-vpc/variables.tf b/examples/gcp-existing-vpc/variables.tf new file mode 100644 index 00000000..f0518c88 --- /dev/null +++ 
b/examples/gcp-existing-vpc/variables.tf @@ -0,0 +1,44 @@ +variable "databricks_account_id" { + type = string + description = "Databricks Account ID" +} + +variable "databricks_google_service_account" { + type = string + description = "Service account email used for Databricks provider authentication" +} + +variable "google_project" { + type = string + description = "GCP project hosting the existing VPC and subnet (also the workspace project)" +} + +variable "google_region" { + type = string + description = "GCP region for workspace deployment (must match the existing subnet's region)" +} + +variable "google_zone" { + type = string + description = "GCP zone (used by the google provider)" +} + +variable "prefix" { + type = string + description = "Prefix used to name Databricks-side resources (mws_networks, mws_workspaces)" +} + +variable "workspace_name" { + type = string + description = "Workspace name" +} + +variable "existing_vpc_name" { + type = string + description = "Name of the pre-existing GCP VPC to deploy the workspace into" +} + +variable "existing_subnet_name" { + type = string + description = "Name of the pre-existing subnet inside the VPC (must be in google_region)" +} diff --git a/examples/gcp-sa-provisioning/main.tf b/examples/gcp-sa-provisioning/main.tf index 7b596530..109fd493 100644 --- a/examples/gcp-sa-provisioning/main.tf +++ b/examples/gcp-sa-provisioning/main.tf @@ -1,5 +1,5 @@ module "gcp-sa-provisioning" { - source = "github.com/databricks/terraform-databricks-examples/modules/gcp-sa-provisioning" + source = "../../modules/gcp/service-account" google_project = var.google_project prefix = var.prefix delegate_from = var.delegate_from diff --git a/examples/gcp-with-psc-exfiltration-protection/README.md b/examples/gcp-with-psc-exfiltration-protection/README.md index 64d676a9..265c560f 100644 --- a/examples/gcp-with-psc-exfiltration-protection/README.md +++ b/examples/gcp-with-psc-exfiltration-protection/README.md @@ -1,37 +1,42 @@ -# 
Provisioning Databricks on GCP workspace with a Hub & Spoke network architecture for data exfiltration protection +# examples/gcp-with-psc-exfiltration-protection — Workspace with PSC + private DNS + restricted egress -This example is using the [gcp-with-psc-exfiltration-protection](../../modules/gcp-with-psc-exfiltration-protection) module. +Calls `modules/gcp/databricks-workspace` with all PrivateLink and egress-control flags enabled: -This template provides an example deployment of: Hub-Spoke networking with egress firewall to control all outbound traffic from Databricks subnets. +- `vpc_source = "create"` — composer creates the spoke VPC + hub VPC + peering +- `private_link_frontend = true` — frontend PSC endpoint (workspace UI/API) +- `private_link_backend = true` — backend (SCC) PSC endpoint (data plane) +- `private_access_only = true` — `mws_private_access_settings.public_access_enabled = false` +- `restricted_egress = true` — hub VPC + deny-egress firewall + private DNS zones -With this setup, you can setup firewall rules to block / allow egress traffic from your Databricks clusters. You can also use firewall to block all access to storage accounts, and use private endpoint connection to bypass this firewall, such that you allow access only to specific storage accounts. +Optionally pairs with the `modules/gcp/unity-catalog` module to create a metastore, GCS bucket, storage credential, external location, and default catalog. 
+## Prerequisites
-To find IP and FQDN for your deployment, go to: https://docs.gcp.databricks.com/en/resources/ip-domain-region.html
+- Two (or three) GCP projects: workspace project, spoke VPC project, hub VPC project (can be the same)
+- Service account with workspace-creator role (see `examples/gcp-sa-provisioning`)
+- Databricks account ID
+- CIDR ranges that don't overlap: `spoke_vpc_cidr`, `subnet_cidr` (subset of spoke), `hub_vpc_cidr`, `psc_subnet_cidr`
+- Regional default Hive Metastore IP from [Databricks docs](https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore)
-## Overall Architecture
+## Apply
-![alt text](../../modules/gcp-with-psc-exfiltration-protection/images/architecture.png)
+```bash
+terraform init
+terraform apply
+```
-Resources to be created:
-* Hub VPC and its subnet
-* Spoke VPC and its subnets
-* Peering between Hub and Spoke VPC
-* Private Service Connect (PSC) endpoints
-* DNS private and peering zones
-* Firewall rules for Hub and Spoke VPCs
-* Databricks workspace with private link to control plane, user to webapp and private link to DBFS
+## Migrating from the old example
-## How to use
+This example previously called `modules/gcp-with-psc-exfiltration-protection` and `modules/gcp-unity-catalog`. Key changes:
-1. Reference this module using one of the different [module source types](https://developer.hashicorp.com/terraform/language/modules/sources)
-2. Add `terraform.tfvars` with the information about service principals to be provisioned at account level.
+| Old | New |
+|-----|-----|
+| `module.gcp_with_data_exfiltration_protection` | `module.workspace` |
+| `modules/gcp-with-psc-exfiltration-protection` | `modules/gcp/databricks-workspace` with `vpc_source=create` + 4 PSC/egress flags |
+| `modules/gcp-unity-catalog` | `modules/gcp/unity-catalog` (relocated, same interface) |
+| `spoke_vpc_cidr` (legacy: was used as subnet CIDR AND firewall source ranges) | Split into `subnet_cidr` (subnet CIDR) and `spoke_vpc_cidr` (broader VPC CIDR for firewall source) |
-## How to fill in variable values
-
-Variables have no default values in order to avoid misconfiguration
-
-Most values are related to resources managed by Databricks. The required values can be found at: https://docs.gcp.databricks.com/en/resources/ip-domain-region.html
+State from the old apply does **not** migrate cleanly to the new composer because resource addresses differ. Re-apply on clean state.
 ## Requirements
@@ -49,8 +54,8 @@ No providers.
 | Name | Source | Version |
 |------|--------|---------|
-| [gcp\_with\_data\_exfiltration\_protection](#module\_gcp\_with\_data\_exfiltration\_protection) | ../../modules/gcp-with-psc-exfiltration-protection | n/a |
-| [unity\_catalog](#module\_unity\_catalog) | ../../modules/gcp-unity-catalog | n/a |
+| [unity\_catalog](#module\_unity\_catalog) | ../../modules/gcp/unity-catalog | n/a |
+| [workspace](#module\_workspace) | ../../modules/gcp/databricks-workspace | n/a |
 ## Resources
@@ -60,25 +65,29 @@ No resources.
 | Name | Description | Type | Default | Required |
 |------|-------------|------|---------|:--------:|
-| [catalog\_name](#input\_catalog\_name) | Name to assign to default catalog | `string` | n/a | yes |
+| [catalog\_name](#input\_catalog\_name) | Name to assign to default Unity Catalog catalog | `string` | n/a | yes |
 | [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes |
 | [google\_region](#input\_google\_region) | Google Cloud region where the resources will be created | `string` | n/a | yes |
-| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Value of regional default Hive Metastore IP | `string` | n/a | yes |
-| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR for Hub VPC | `string` | n/a | yes |
-| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | Google Cloud project ID related to Hub VPC | `string` | n/a | yes |
-| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | Whether the Spoke VPC is a Shared or a dedicated VPC | `bool` | n/a | yes |
-| [metastore\_name](#input\_metastore\_name) | Name to assign to regional metastore | `string` | n/a | yes |
-| [prefix](#input\_prefix) | Prefix to use in generated resources name | `string` | n/a | yes |
-| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR for Spoke VPC | `string` | n/a | yes |
-| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR for Spoke VPC | `string` | n/a | yes |
-| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | Google Cloud project ID related to Spoke VPC | `string` | n/a | yes |
-| [workspace\_google\_project](#input\_workspace\_google\_project) | Google Cloud project ID related to Databricks workspace | `string` | n/a | yes |
-| [tags](#input\_tags) | Map of tags to add to all resources | `map(string)` | `{}` | no |
+| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Regional default Hive Metastore IP (used by the spoke egress firewall to allow MySQL/3306) | `string` | n/a | yes |
+| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR for the hub subnet | `string` | n/a | yes |
+| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | Google Cloud project ID hosting the hub VPC | `string` | n/a | yes |
+| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | Whether the spoke VPC project hosts a Shared VPC and the workspace project is bound as a service project | `bool` | n/a | yes |
+| [metastore\_name](#input\_metastore\_name) | Name to assign to regional Unity Catalog metastore | `string` | n/a | yes |
+| [prefix](#input\_prefix) | Prefix used to name generated resources | `string` | n/a | yes |
+| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR for the dedicated PSC subnet in the spoke VPC | `string` | n/a | yes |
+| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR of the spoke VPC address space (used as source\_ranges for the hub ingress firewall) | `string` | n/a | yes |
+| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | Google Cloud project ID hosting the spoke VPC (often the same as workspace project) | `string` | n/a | yes |
+| [subnet\_cidr](#input\_subnet\_cidr) | CIDR for the spoke subnet (must be within spoke\_vpc\_cidr) | `string` | n/a | yes |
+| [workspace\_google\_project](#input\_workspace\_google\_project) | Google Cloud project ID where the Databricks workspace lives | `string` | n/a | yes |
+| [tags](#input\_tags) | Map of tags applied to the composer (the composer accepts this but does not currently propagate to all submodules) | `map(string)` | `{}` | no |
 ## Outputs
 | Name | Description |
 |------|-------------|
+| [hub\_vpc\_id](#output\_hub\_vpc\_id) | ID of the hub VPC |
+| [network\_id](#output\_network\_id) | databricks\_mws\_networks ID |
+| [vpc\_id](#output\_vpc\_id) | ID of the spoke VPC |
 | [workspace\_id](#output\_workspace\_id) | The Databricks workspace ID |
 | [workspace\_url](#output\_workspace\_url) | The workspace URL which is of the format
'{workspaceId}.{random}.gcp.databricks.com' |
diff --git a/examples/gcp-with-psc-exfiltration-protection/main.tf b/examples/gcp-with-psc-exfiltration-protection/main.tf
index c0b7fa91..54eff6a6 100644
--- a/examples/gcp-with-psc-exfiltration-protection/main.tf
+++ b/examples/gcp-with-psc-exfiltration-protection/main.tf
@@ -1,16 +1,26 @@
-module "gcp_with_data_exfiltration_protection" {
-  source = "../../modules/gcp-with-psc-exfiltration-protection"
+module "workspace" {
+  source = "../../modules/gcp/databricks-workspace"
-  databricks_account_id = var.databricks_account_id
+  prefix = var.prefix
+  databricks_account_id = var.databricks_account_id
+  google_project = var.workspace_google_project
+  google_region = var.google_region
+
+  vpc_source = "create"
+  spoke_vpc_cidr = var.spoke_vpc_cidr
+  subnet_cidr = var.subnet_cidr
+
+  private_link_frontend = true
+  private_link_backend = true
+  private_access_only = true
+  restricted_egress = true
+
+  spoke_vpc_google_project = var.spoke_vpc_google_project
   hub_vpc_google_project = var.hub_vpc_google_project
   is_spoke_vpc_shared = var.is_spoke_vpc_shared
-  prefix = var.prefix
-  spoke_vpc_google_project = var.spoke_vpc_google_project
-  workspace_google_project = var.workspace_google_project
-  google_region = var.google_region
-  hive_metastore_ip = var.hive_metastore_ip
   hub_vpc_cidr = var.hub_vpc_cidr
   psc_subnet_cidr = var.psc_subnet_cidr
-  spoke_vpc_cidr = var.spoke_vpc_cidr
-  tags = var.tags
+  hive_metastore_ip = var.hive_metastore_ip
+
+  tags = var.tags
 }
\ No newline at end of file
diff --git a/examples/gcp-with-psc-exfiltration-protection/outputs.tf b/examples/gcp-with-psc-exfiltration-protection/outputs.tf
index 681fe5d0..0fae6ec6 100644
--- a/examples/gcp-with-psc-exfiltration-protection/outputs.tf
+++ b/examples/gcp-with-psc-exfiltration-protection/outputs.tf
@@ -1,10 +1,24 @@
-
 output "workspace_url" {
-  value = module.gcp_with_data_exfiltration_protection.workspace_url
+  value = module.workspace.workspace_url
   description = "The workspace URL which is of the format '{workspaceId}.{random}.gcp.databricks.com'"
 }
 output "workspace_id" {
+  value = module.workspace.workspace_id
   description = "The Databricks workspace ID"
-  value = module.gcp_with_data_exfiltration_protection.workspace_id
+}
+
+output "vpc_id" {
+  value = module.workspace.vpc_id
+  description = "ID of the spoke VPC"
+}
+
+output "hub_vpc_id" {
+  value = module.workspace.hub_vpc_id
+  description = "ID of the hub VPC"
+}
+
+output "network_id" {
+  value = module.workspace.network_id
+  description = "databricks_mws_networks ID"
 }
\ No newline at end of file
diff --git a/examples/gcp-with-psc-exfiltration-protection/providers.tf b/examples/gcp-with-psc-exfiltration-protection/providers.tf
index 489bf1e9..f2881ffd 100644
--- a/examples/gcp-with-psc-exfiltration-protection/providers.tf
+++ b/examples/gcp-with-psc-exfiltration-protection/providers.tf
@@ -6,7 +6,7 @@ provider "databricks" {
 provider "databricks" {
   alias = "workspace"
-  host = module.gcp_with_data_exfiltration_protection.workspace_url
+  host = module.workspace.workspace_url
 }
 provider "google" {
diff --git a/examples/gcp-with-psc-exfiltration-protection/terraform.tfvars b/examples/gcp-with-psc-exfiltration-protection/terraform.tfvars
index 8f095727..c9a603e9 100644
--- a/examples/gcp-with-psc-exfiltration-protection/terraform.tfvars
+++ b/examples/gcp-with-psc-exfiltration-protection/terraform.tfvars
@@ -13,8 +13,10 @@ prefix = ""
 hive_metastore_ip = ""
 hub_vpc_cidr = ""
 spoke_vpc_cidr = ""
+subnet_cidr = ""
 psc_subnet_cidr = ""
 metastore_name = ""
 catalog_name = ""
+tags = {}
diff --git a/examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf b/examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf
index c6c0628c..792862d8 100644
--- a/examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf
+++ b/examples/gcp-with-psc-exfiltration-protection/unity-catalog.tf
@@ -1,15 +1,16 @@
 module "unity_catalog" {
-  source =
"../../modules/gcp-unity-catalog" + source = "../../modules/gcp/unity-catalog" providers = { - databricks = databricks, + databricks = databricks databricks.workspace = databricks.workspace } - databricks_workspace_id = module.gcp_with_data_exfiltration_protection.workspace_id - databricks_workspace_url = module.gcp_with_data_exfiltration_protection.workspace_url + + databricks_workspace_id = module.workspace.workspace_id + databricks_workspace_url = module.workspace.workspace_url google_project = var.workspace_google_project google_region = var.google_region + prefix = var.prefix metastore_name = var.metastore_name catalog_name = var.catalog_name - prefix = var.prefix } \ No newline at end of file diff --git a/examples/gcp-with-psc-exfiltration-protection/variables.tf b/examples/gcp-with-psc-exfiltration-protection/variables.tf index 15365ccf..48578319 100644 --- a/examples/gcp-with-psc-exfiltration-protection/variables.tf +++ b/examples/gcp-with-psc-exfiltration-protection/variables.tf @@ -10,64 +10,69 @@ variable "google_region" { variable "workspace_google_project" { type = string - description = "Google Cloud project ID related to Databricks workspace" + description = "Google Cloud project ID where the Databricks workspace lives" } variable "spoke_vpc_google_project" { type = string - description = "Google Cloud project ID related to Spoke VPC" + description = "Google Cloud project ID hosting the spoke VPC (often the same as workspace project)" } variable "hub_vpc_google_project" { type = string - description = "Google Cloud project ID related to Hub VPC" + description = "Google Cloud project ID hosting the hub VPC" } variable "is_spoke_vpc_shared" { type = bool - description = "Whether the Spoke VPC is a Shared or a dedicated VPC" + description = "Whether the spoke VPC project hosts a Shared VPC and the workspace project is bound as a service project" } variable "prefix" { type = string - description = "Prefix to use in generated resources name" + description 
= "Prefix used to name generated resources"
 }
 # For the value of the regional Hive Metastore IP, refer to the Databricks documentation
-# Here - https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore
+# https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore
 variable "hive_metastore_ip" {
   type = string
-  description = "Value of regional default Hive Metastore IP"
+  description = "Regional default Hive Metastore IP (used by the spoke egress firewall to allow MySQL/3306)"
 }
 variable "hub_vpc_cidr" {
   type = string
-  description = "CIDR for Hub VPC"
+  description = "CIDR for the hub subnet"
 }
 variable "spoke_vpc_cidr" {
   type = string
-  description = "CIDR for Spoke VPC"
+  description = "CIDR of the spoke VPC address space (used as source_ranges for the hub ingress firewall)"
+}
+
+variable "subnet_cidr" {
+  type = string
+  description = "CIDR for the spoke subnet (must be within spoke_vpc_cidr)"
 }
 variable "psc_subnet_cidr" {
   type = string
-  description = "CIDR for Spoke VPC"
+  description = "CIDR for the dedicated PSC subnet in the spoke VPC"
 }
 variable "tags" {
   type = map(string)
-  description = "Map of tags to add to all resources"
+  description = "Map of tags applied to the composer (the composer accepts this but does not currently propagate to all submodules)"
   default = {}
 }
 variable "metastore_name" {
   type = string
-  description = "Name to assign to regional metastore"
+  description = "Name to assign to regional Unity Catalog metastore"
 }
 variable "catalog_name" {
   type = string
-  description = "Name to assign to default catalog"
+  description = "Name to assign to default Unity Catalog catalog"
 }
\ No newline at end of file
diff --git a/modules/Makefile b/modules/Makefile
index 98c80a85..a23fed05 100644
--- a/modules/Makefile
+++ b/modules/Makefile
@@ -1,8 +1,11 @@
 PROJECTS := $(dir $(wildcard */README.md))
-docs: $(PROJECTS)
+docs: $(PROJECTS) gcp-recursive
 $(PROJECTS):
 	$(MAKE)
-C $@ docs
-.PHONY: $(PROJECTS)
+gcp-recursive:
+	$(MAKE) -C gcp docs
+
+.PHONY: $(PROJECTS) docs gcp-recursive
diff --git a/modules/gcp-sa-provisioning/Makefile b/modules/gcp-sa-provisioning/Makefile
deleted file mode 100644
index 653039d8..00000000
--- a/modules/gcp-sa-provisioning/Makefile
+++ /dev/null
@@ -1,7 +0,0 @@
-.PHONY: docs test_docs
-
-docs:
-	terraform-docs -c ../../.terraform-docs.yml .
-
-test_docs:
-	terraform-docs -c ../../.terraform-docs.yml --output-check .
diff --git a/modules/gcp-unity-catalog/Makefile b/modules/gcp-unity-catalog/Makefile
deleted file mode 100644
index 653039d8..00000000
--- a/modules/gcp-unity-catalog/Makefile
+++ /dev/null
@@ -1,7 +0,0 @@
-.PHONY: docs test_docs
-
-docs:
-	terraform-docs -c ../../.terraform-docs.yml .
-
-test_docs:
-	terraform-docs -c ../../.terraform-docs.yml --output-check .
diff --git a/modules/gcp-with-psc-exfiltration-protection/Makefile b/modules/gcp-with-psc-exfiltration-protection/Makefile
deleted file mode 100644
index 653039d8..00000000
--- a/modules/gcp-with-psc-exfiltration-protection/Makefile
+++ /dev/null
@@ -1,7 +0,0 @@
-.PHONY: docs test_docs
-
-docs:
-	terraform-docs -c ../../.terraform-docs.yml .
-
-test_docs:
-	terraform-docs -c ../../.terraform-docs.yml --output-check .
diff --git a/modules/gcp-with-psc-exfiltration-protection/README.md b/modules/gcp-with-psc-exfiltration-protection/README.md
deleted file mode 100644
index 6f9650de..00000000
--- a/modules/gcp-with-psc-exfiltration-protection/README.md
+++ /dev/null
@@ -1,139 +0,0 @@
-# Databricks on Google Cloud with Private Service Connect and Hub-Spoke network structure (data exfiltration protection).
-
-## ⚠️ Prerequisites
-To **enable Private Service Connect for your Databricks workspace** on Google Cloud, you must contact your Databricks account team and provide:
-- Databricks account ID
-- VPC Host Project ID of the **compute plane VPC** for enabling back-end Private Service Connect
-- VPC Host Project ID of the **transit VPC** for enabling front-end Private Service Connect
-- Workspace region
-
-This configuration **cannot be completed independently** and requires coordination with your Databricks account team.
-
-## Overview
-
-The module includes:
-1. Hub-Spoke networking with egress firewall to control all outbound traffic, e.g. to pypi.org.
-2. Private Service Connect connection for backend traffic from data plane to control plane.
-3. Private Service Connect connection from user client to webapp service.
-4. Private Google Access from data plane to DBFS storage.
-5. Private Service Connect connection for web-auth traffic.
-
-## Overall Architecture
-
-![alt text](images/architecture.png)
-
-With this deployment, traffic from user client to webapp (notebook UI), backend traffic from data plane to control plane will be through PSC endpoints. This terraform sample will create:
-* Hub VPC and its subnet
-* Spoke VPC and its subnets
-* Peering between Hub and Spoke VPC
-* Private Service Connect (PSC) endpoints
-* DNS private and peering zones
-* Firewall rules for Hub and Spoke VPCs
-* Databricks workspace with private link to control plane, user to webapp and private link to DBFS
-
-
-**Note that** the module does not contain the VPC SC implementation. This can be added to increase the security level in the Databricks deployment, providing detailed access level for ingress and egress traffic.
-## How to use
-
-> **Note**
-> You can customize this module by adding, deleting or updating the Google Cloud resources to adapt the module to your requirements.
-> A deployment example using this module can be found in [examples/gcp-with-psc-exfiltration-protection](../../examples/gcp-with-psc-exfiltration-protection)
-
-1. Reference this module using one of the different [module source types](https://developer.hashicorp.com/terraform/language/modules/sources)
-2. Add `terraform.tfvars` with the information about service principals to be provisioned at account level.
-
-
-## Requirements
-
-No requirements.
-
-## Providers
-
-| Name | Version |
-|------|---------|
-| [databricks](#provider\_databricks) | n/a |
-| [google](#provider\_google) | n/a |
-| [random](#provider\_random) | n/a |
-
-## Modules
-
-No modules.
-
-## Resources
-
-| Name | Type |
-|------|------|
-| [databricks_mws_networks.databricks_network](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource |
-| [databricks_mws_private_access_settings.pas](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_private_access_settings) | resource |
-| [databricks_mws_vpc_endpoint.backend_endpoint](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource |
-| [databricks_mws_vpc_endpoint.frontend_endpoint](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource |
-| [databricks_mws_vpc_endpoint.transit_endpoint](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource |
-| [databricks_mws_workspaces.databricks_workspace](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource |
-| [google_compute_address.backend_pe_ip_address](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource |
-|
[google_compute_address.hub_frontend_pe_ip_address](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource |
-| [google_compute_address.spoke_frontend_pe_ip_address](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource |
-| [google_compute_firewall.databricks_workspace_traffic](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
-| [google_compute_firewall.default_deny_egress](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
-| [google_compute_firewall.hub_net_traffic](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
-| [google_compute_firewall.to_databricks_compute_plane](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
-| [google_compute_firewall.to_databricks_control_plane](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
-| [google_compute_firewall.to_google_apis](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
-| [google_compute_firewall.to_managed_hive](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource |
-| [google_compute_forwarding_rule.backend_psc_ep](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource |
-| [google_compute_forwarding_rule.hub_frontend_psc_ep](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource |
-| [google_compute_forwarding_rule.spoke_frontend_psc_ep](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource |
-|
[google_compute_network.hub_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource |
-| [google_compute_network.spoke_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource |
-| [google_compute_network_peering.hub_spoke_peering](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering) | resource |
-| [google_compute_network_peering.spoke_hub_peering](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering) | resource |
-| [google_compute_shared_vpc_host_project.host](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_host_project) | resource |
-| [google_compute_shared_vpc_service_project.service](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_service_project) | resource |
-| [google_compute_subnetwork.hub_subnetwork](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource |
-| [google_compute_subnetwork.psc_subnetwork](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource |
-| [google_compute_subnetwork.spoke_subnetwork](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource |
-| [google_dns_managed_zone.gcr_peering_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource |
-| [google_dns_managed_zone.gcr_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource |
-| [google_dns_managed_zone.google_apis_peering_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource |
-|
[google_dns_managed_zone.google_apis_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource |
-| [google_dns_managed_zone.hub_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource |
-| [google_dns_managed_zone.pkg_dev_peering_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource |
-| [google_dns_managed_zone.pkg_dev_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource |
-| [google_dns_managed_zone.spoke_private_zone](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource |
-| [google_dns_record_set.gcr_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.gcr_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.hub_workspace_dp](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.hub_workspace_psc_auth](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.hub_workspace_url](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.pkg_dev_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.pkg_dev_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.restricted_apis_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-|
[google_dns_record_set.restricted_apis_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.spoke_relay](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.spoke_workspace_dp](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [google_dns_record_set.spoke_workspace_url](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource |
-| [random_string.suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource |
-
-## Inputs
-
-| Name | Description | Type | Default | Required |
-|------|-------------|------|---------|:--------:|
-| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes |
-| [google\_region](#input\_google\_region) | Google Cloud region where the resources will be created | `string` | n/a | yes |
-| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Value of regional default Hive Metastore IP | `string` | n/a | yes |
-| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR for Hub VPC | `string` | n/a | yes |
-| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | Google Cloud project ID related to Hub VPC | `string` | n/a | yes |
-| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | Whether the Spoke VPC is a Shared or a dedicated VPC | `bool` | n/a | yes |
-| [prefix](#input\_prefix) | Prefix to use in generated resources name | `string` | n/a | yes |
-| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR for Spoke VPC | `string` | n/a | yes |
-| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR for Spoke VPC | `string` | n/a | yes |
-| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | Google Cloud project ID related to Spoke VPC | `string` |
n/a | yes |
-| [tags](#input\_tags) | Map of tags to add to all resources | `map(string)` | n/a | yes |
-| [workspace\_google\_project](#input\_workspace\_google\_project) | Google Cloud project ID related to Databricks workspace | `string` | n/a | yes |
-
-## Outputs
-
-| Name | Description |
-|------|-------------|
-| [workspace\_id](#output\_workspace\_id) | The Databricks workspace ID |
-| [workspace\_url](#output\_workspace\_url) | The workspace URL which is of the format '{workspaceId}.{random}.gcp.databricks.com' |
-
\ No newline at end of file
diff --git a/modules/gcp-with-psc-exfiltration-protection/databricks-cloud-resources.tf b/modules/gcp-with-psc-exfiltration-protection/databricks-cloud-resources.tf
deleted file mode 100644
index f0a355e0..00000000
--- a/modules/gcp-with-psc-exfiltration-protection/databricks-cloud-resources.tf
+++ /dev/null
@@ -1,86 +0,0 @@
-###################################################
-# Databricks VPC Endpoints & Network Configuration
-###################################################
-
-# ================================================
-# Private Service Connect Endpoint Configurations
-# ================================================
-
-# Registers a transit VPC endpoint for hub network connectivity
-resource "databricks_mws_vpc_endpoint" "transit_endpoint" {
-  depends_on = [google_compute_forwarding_rule.backend_psc_ep]
-
-  vpc_endpoint_name = "${var.prefix}-hub-ep-${random_string.suffix.result}"
-  account_id = var.databricks_account_id
-
-  # GCP-specific PSC configuration for hub network
-  gcp_vpc_endpoint_info {
-    project_id = var.hub_vpc_google_project
-    psc_endpoint_name = google_compute_forwarding_rule.hub_frontend_psc_ep.name
-    endpoint_region = var.google_region
-  }
-}
-
-# Registers frontend workspace VPC endpoint for user-facing access
-resource "databricks_mws_vpc_endpoint" "frontend_endpoint" {
-  depends_on = [google_compute_forwarding_rule.backend_psc_ep]
-
-  vpc_endpoint_name =
"${var.prefix}-ws-ep-${random_string.suffix.result}" - account_id = var.databricks_account_id - - # GCP-specific PSC configuration for spoke workspace - gcp_vpc_endpoint_info { - project_id = var.spoke_vpc_google_project - psc_endpoint_name = google_compute_forwarding_rule.spoke_frontend_psc_ep.name - endpoint_region = var.google_region - } -} - -# Registers backend SCC (Secure Cluster Connectivity) endpoint -resource "databricks_mws_vpc_endpoint" "backend_endpoint" { - depends_on = [google_compute_forwarding_rule.spoke_frontend_psc_ep] - - vpc_endpoint_name = "${var.prefix}-scc-ep-${random_string.suffix.result}" - account_id = var.databricks_account_id - - # GCP-specific PSC configuration for backend connectivity - gcp_vpc_endpoint_info { - project_id = var.spoke_vpc_google_project - psc_endpoint_name = google_compute_forwarding_rule.backend_psc_ep.name - endpoint_region = var.google_region - } -} - -# ================================================ -# Network Configuration for Databricks Workspace -# ================================================ - -resource "databricks_mws_networks" "databricks_network" { - network_name = "${var.prefix}-ntw-${random_string.suffix.result}" - account_id = var.databricks_account_id - - # GCP network infrastructure details - gcp_network_info { - network_project_id = var.spoke_vpc_google_project - vpc_id = google_compute_network.spoke_vpc.name - subnet_id = google_compute_subnetwork.spoke_subnetwork.name - subnet_region = var.google_region - } - - # PrivateLink endpoint associations - vpc_endpoints { - dataplane_relay = [databricks_mws_vpc_endpoint.backend_endpoint.vpc_endpoint_id] # SCC connectivity - rest_api = [databricks_mws_vpc_endpoint.frontend_endpoint.vpc_endpoint_id] # Workspace API access - } -} - -# ================================================ -# Private Access Configuration -# ================================================ - -resource "databricks_mws_private_access_settings" "pas" { - private_access_settings_name 
= "${var.prefix}-pas-${random_string.suffix.result}" - region = var.google_region - public_access_enabled = false # Block public internet access - private_access_level = "ACCOUNT" # Apply to entire Databricks account -} diff --git a/modules/gcp-with-psc-exfiltration-protection/dns-hub.tf b/modules/gcp-with-psc-exfiltration-protection/dns-hub.tf deleted file mode 100644 index b0eca334..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/dns-hub.tf +++ /dev/null @@ -1,214 +0,0 @@ -######################################### -# Databricks Private DNS Configuration # -######################################### - -# Create a private DNS zone for Databricks PSC management -resource "google_dns_managed_zone" "hub_private_zone" { - name = "${var.prefix}-hub-gcp-databricks-com" - project = var.hub_vpc_google_project - dns_name = "gcp.databricks.com." - description = "Private DNS zone for Databricks PSC management" - visibility = "private" - - # Restrict visibility to the hub VPC network - private_visibility_config { - networks { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# DNS A record for the Databricks workspace URL -resource "google_dns_record_set" "hub_workspace_url" { - name = "${local.workspace_dns_id}.${google_dns_managed_zone.hub_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.hub_private_zone.name - type = "A" - ttl = 300 - - # Points to the Databricks frontend Private Endpoint IP address - rrdatas = [ - google_compute_address.hub_frontend_pe_ip_address.address - ] -} - -# DNS A record for the Databricks PSC authentication endpoint -resource "google_dns_record_set" "hub_workspace_psc_auth" { - name = "${var.google_region}.psc-auth.${google_dns_managed_zone.hub_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.hub_private_zone.name - type = "A" - ttl = 300 - - # Points to the same frontend Private Endpoint IP - rrdatas = [ - 
google_compute_address.hub_frontend_pe_ip_address.address - ] -} - -# DNS A record for the Databricks dataplane endpoint -resource "google_dns_record_set" "hub_workspace_dp" { - name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.hub_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.hub_private_zone.name - type = "A" - ttl = 300 - - # Points to the same frontend Private Endpoint IP - rrdatas = [ - google_compute_address.hub_frontend_pe_ip_address.address - ] -} - -############################################# -# Google Container Registry Private DNS Zone # -############################################# - -# Create a private DNS zone for GCR (gcr.io) -resource "google_dns_managed_zone" "gcr_private_zone" { - name = "${var.prefix}-gcr-io" - project = var.hub_vpc_google_project - dns_name = "gcr.io." - description = "Private DNS zone for GCR private resolution" - visibility = "private" - - # Restrict visibility to the hub VPC network - private_visibility_config { - networks { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Wildcard CNAME record for all subdomains of gcr.io -resource "google_dns_record_set" "gcr_cname" { - name = "*.${google_dns_managed_zone.gcr_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.gcr_private_zone.name - type = "CNAME" - ttl = 300 - - # All subdomains point to gcr.io - rrdatas = [ - "gcr.io." 
- ] -} - -# A record for gcr.io pointing to Google IPs for private access -resource "google_dns_record_set" "gcr_a" { - name = google_dns_managed_zone.gcr_private_zone.dns_name - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.gcr_private_zone.name - type = "A" - ttl = 300 - - # Official Google IPs for gcr.io - rrdatas = [ - "199.36.153.8", - "199.36.153.9", - "199.36.153.10", - "199.36.153.11" - ] -} - -################################## -# Google APIs Private DNS Zone # -################################## - -# Create a private DNS zone for Google APIs (googleapis.com) -resource "google_dns_managed_zone" "google_apis_private_zone" { - name = "${var.prefix}-google-apis" - project = var.hub_vpc_google_project - dns_name = "googleapis.com." - description = "Private DNS zone for Google APIs resolution" - visibility = "private" - - # Restrict visibility to the hub VPC network - private_visibility_config { - networks { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Wildcard CNAME record for all subdomains of googleapis.com -resource "google_dns_record_set" "restricted_apis_cname" { - name = "*.${google_dns_managed_zone.google_apis_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.google_apis_private_zone.name - type = "CNAME" - ttl = 300 - - # All subdomains point to restricted.googleapis.com - rrdatas = [ - "restricted.googleapis.com." 
- ] -} - -# A record for restricted.googleapis.com pointing to Google IPs for private access -resource "google_dns_record_set" "restricted_apis_a" { - name = "restricted.${google_dns_managed_zone.google_apis_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.google_apis_private_zone.name - type = "A" - ttl = 300 - - # Official Google IPs for restricted.googleapis.com - rrdatas = [ - "199.36.153.4", - "199.36.153.5", - "199.36.153.6", - "199.36.153.7" - ] -} - -################################## -# Go Packages Private DNS Zone # -################################## - -# Create a private DNS zone for Go Packages (pkg.dev) -resource "google_dns_managed_zone" "pkg_dev_private_zone" { - name = "${var.prefix}-pkg-dev" - project = var.hub_vpc_google_project - dns_name = "pkg.dev." - description = "Private DNS zone for Go Packages resolution" - visibility = "private" - - # Restrict visibility to the hub VPC network - private_visibility_config { - networks { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Wildcard CNAME record for all subdomains of pkg.dev -resource "google_dns_record_set" "pkg_dev_cname" { - name = "*.${google_dns_managed_zone.pkg_dev_private_zone.dns_name}" - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.pkg_dev_private_zone.name - type = "CNAME" - ttl = 300 - - # All subdomains point to pkg.dev - rrdatas = [ - "pkg.dev." 
- ] -} - -# A record for pkg.dev pointing to Google IPs for private access -resource "google_dns_record_set" "pkg_dev_a" { - name = google_dns_managed_zone.pkg_dev_private_zone.dns_name - project = var.hub_vpc_google_project - managed_zone = google_dns_managed_zone.pkg_dev_private_zone.name - type = "A" - ttl = 300 - - # Official Google IPs for pkg.dev - rrdatas = [ - "199.36.153.8", - "199.36.153.9", - "199.36.153.10", - "199.36.153.11" - ] -} diff --git a/modules/gcp-with-psc-exfiltration-protection/dns-spoke.tf b/modules/gcp-with-psc-exfiltration-protection/dns-spoke.tf deleted file mode 100644 index 799cd81a..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/dns-spoke.tf +++ /dev/null @@ -1,135 +0,0 @@ -############################################# -# Databricks Private DNS Zone (Spoke VPC) # -############################################# - -# Creates a private DNS managed zone for Databricks PSC endpoints -# This zone is only visible within the spoke VPC network -resource "google_dns_managed_zone" "spoke_private_zone" { - name = "${var.prefix}-spoke-gcp-databricks-com" - project = var.spoke_vpc_google_project - dns_name = "gcp.databricks.com." 
- description = "Private DNS zone for Databricks PSC management" - visibility = "private" - - # Restricts DNS zone visibility to the spoke VPC - private_visibility_config { - networks { - network_url = google_compute_network.spoke_vpc.id - } - } -} - -# Creates an A record for the Databricks workspace endpoint in the spoke VPC -resource "google_dns_record_set" "spoke_workspace_url" { - name = "${local.workspace_dns_id}.${google_dns_managed_zone.spoke_private_zone.dns_name}" - project = var.spoke_vpc_google_project - managed_zone = google_dns_managed_zone.spoke_private_zone.name - type = "A" - ttl = 300 - - # Points to the Databricks frontend Private Endpoint IP in the spoke VPC - rrdatas = [ - google_compute_address.spoke_frontend_pe_ip_address.address - ] -} - -# Creates an A record for the Databricks dataplane endpoint in the spoke VPC -resource "google_dns_record_set" "spoke_workspace_dp" { - name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.spoke_private_zone.dns_name}" - project = var.spoke_vpc_google_project - managed_zone = google_dns_managed_zone.spoke_private_zone.name - type = "A" - ttl = 300 - - # Points to the Databricks frontend Private Endpoint IP in the spoke VPC - rrdatas = [ - google_compute_address.spoke_frontend_pe_ip_address.address - ] -} - -# Creates an A record for the Databricks relay/tunnel endpoint in the spoke VPC -resource "google_dns_record_set" "spoke_relay" { - name = "tunnel.${var.google_region}.${google_dns_managed_zone.spoke_private_zone.dns_name}" - project = var.spoke_vpc_google_project - managed_zone = google_dns_managed_zone.spoke_private_zone.name - type = "A" - ttl = 300 - - # Points to the backend Private Endpoint IP (used for relay/tunnel) - rrdatas = [ - google_compute_address.backend_pe_ip_address.address - ] -} - -########################################################## -# Peering DNS Zones for Hub-Spoke Shared Service Access # -########################################################## - -# The following 
managed zones provide private DNS for Google services (GCR, Google APIs, Go Packages) -# and are peered to the hub VPC for shared DNS resolution across VPCs. - -# Google Container Registry (GCR) private peering zone -resource "google_dns_managed_zone" "gcr_peering_zone" { - name = "${var.prefix}-peering-gcr" - project = var.spoke_vpc_google_project - dns_name = "gcr.io." - description = "Peering DNS zone for GCR private resolution" - visibility = "private" - - private_visibility_config { - networks { - network_url = google_compute_network.spoke_vpc.id - } - } - - # Peers this DNS zone with the hub VPC to allow DNS resolution from the hub - peering_config { - target_network { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Google APIs private peering zone -resource "google_dns_managed_zone" "google_apis_peering_zone" { - name = "${var.prefix}-peering-google-apis" - project = var.spoke_vpc_google_project - dns_name = "googleapis.com." - description = "Private DNS zone for Google APIs resolution" - visibility = "private" - - private_visibility_config { - networks { - network_url = google_compute_network.spoke_vpc.id - } - } - - # Peers this DNS zone with the hub VPC to allow DNS resolution from the hub - peering_config { - target_network { - network_url = google_compute_network.hub_vpc.id - } - } -} - -# Go Packages (pkg.dev) private peering zone -resource "google_dns_managed_zone" "pkg_dev_peering_zone" { - name = "${var.prefix}-peering-pkg-dev" - project = var.spoke_vpc_google_project - dns_name = "pkg.dev." 
- description = "Private DNS zone for Go Packages resolution" - visibility = "private" - - private_visibility_config { - networks { - network_url = google_compute_network.spoke_vpc.id - } - } - - # Peers this DNS zone with the hub VPC to allow DNS resolution from the hub - peering_config { - target_network { - network_url = google_compute_network.hub_vpc.id - } - } -} diff --git a/modules/gcp-with-psc-exfiltration-protection/firewall-hub.tf b/modules/gcp-with-psc-exfiltration-protection/firewall-hub.tf deleted file mode 100644 index a1563a2e..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/firewall-hub.tf +++ /dev/null @@ -1,21 +0,0 @@ -# ========================================================== -# Google Cloud VPC Firewall Rule: Hub Network Ingress Traffic -# ========================================================== - -resource "google_compute_firewall" "hub_net_traffic" { - name = "${google_compute_network.hub_vpc.name}-ingress" - - project = var.hub_vpc_google_project - network = google_compute_network.hub_vpc.self_link - - direction = "INGRESS" - priority = 1000 - destination_ranges = [] - # The source IP range(s) allowed by this rule (CIDR format) - # Only traffic originating from the spoke VPC's CIDR block will be allowed - source_ranges = [var.spoke_vpc_cidr] - - allow { - protocol = "all" - } -} diff --git a/modules/gcp-with-psc-exfiltration-protection/firewall-spoke.tf b/modules/gcp-with-psc-exfiltration-protection/firewall-spoke.tf deleted file mode 100644 index a44c69a6..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/firewall-spoke.tf +++ /dev/null @@ -1,112 +0,0 @@ -############################################################# -# Google Cloud Firewall Rules for Databricks Spoke Network # -############################################################# - -# ========================================================== -# Default Egress Deny Rule (Catch-All Block) -# ========================================================== - 
-resource "google_compute_firewall" "default_deny_egress" { - name = "${google_compute_network.spoke_vpc.name}-default-deny-egress" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1100 # Higher priority than allow rules - destination_ranges = ["0.0.0.0/0"] # Block all external destinations - source_ranges = [] - - deny { protocol = "all" } # Explicit deny all outbound traffic -} - -# ========================================================== -# Essential Service Allow Rules -# ========================================================== - -# Allows outbound traffic to Google APIs and services -resource "google_compute_firewall" "to_google_apis" { - name = "${google_compute_network.spoke_vpc.name}-to-google-apis" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1000 # Lower priority than deny rule - destination_ranges = [ - "199.36.153.4/30", # Restricted Google APIs - "199.36.153.8/30", # GCR/GCS endpoints - "34.126.0.0/18" # Additional Google service IPs - ] - - allow { protocol = "all" } # Full protocol access to these IPs -} - -# Allows control plane communication for Databricks -resource "google_compute_firewall" "to_databricks_control_plane" { - name = "${google_compute_network.spoke_vpc.name}-to-databricks-control-plane" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1000 - destination_ranges = [ - "${google_compute_forwarding_rule.backend_psc_ep.ip_address}/32", # SCC endpoint - "${google_compute_forwarding_rule.spoke_frontend_psc_ep.ip_address}/32" # Frontend endpoint - ] - - allow { - protocol = "tcp" - ports = ["443"] # HTTPS only - } -} - -# ========================================================== -# Managed Hive Metastore Access (Conditional) -# 
========================================================== - -resource "google_compute_firewall" "to_managed_hive" { - name = "${google_compute_network.spoke_vpc.name}-to-${var.google_region}-managed-hive" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1000 - destination_ranges = ["${var.hive_metastore_ip}/32"] # Metastore-specific IP - - allow { - protocol = "tcp" - ports = ["3306"] # MySQL port - } -} - -# ========================================================== -# Internal Workspace Communication -# ========================================================== - -resource "google_compute_firewall" "databricks_workspace_traffic" { - name = "${google_compute_network.spoke_vpc.name}-${databricks_mws_workspaces.databricks_workspace.workspace_id}-ingress" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "INGRESS" - priority = 1000 - source_ranges = [var.spoke_vpc_cidr] # Internal VPC traffic - target_tags = ["databricks-${databricks_mws_workspaces.databricks_workspace.workspace_id}"] # Workspace-specific instances - - allow { protocol = "all" } # Full internal access -} - -resource "google_compute_firewall" "to_databricks_compute_plane" { - name = "${google_compute_network.spoke_vpc.name}-to-databricks-compute-plane" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.self_link - - direction = "EGRESS" - priority = 1000 - destination_ranges = [ - var.spoke_vpc_cidr - ] - - allow { - protocol = "all" - } -} \ No newline at end of file diff --git a/modules/gcp-with-psc-exfiltration-protection/images/architecture.png b/modules/gcp-with-psc-exfiltration-protection/images/architecture.png deleted file mode 100644 index 9b245904..00000000 Binary files a/modules/gcp-with-psc-exfiltration-protection/images/architecture.png and /dev/null differ diff --git 
a/modules/gcp-with-psc-exfiltration-protection/outputs.tf b/modules/gcp-with-psc-exfiltration-protection/outputs.tf deleted file mode 100644 index 27983846..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/outputs.tf +++ /dev/null @@ -1,10 +0,0 @@ - -output "workspace_url" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url - description = "The workspace URL which is of the format '{workspaceId}.{random}.gcp.databricks.com'" -} - -output "workspace_id" { - description = "The Databricks workspace ID" - value = databricks_mws_workspaces.databricks_workspace.workspace_id -} \ No newline at end of file diff --git a/modules/gcp-with-psc-exfiltration-protection/psc.tf b/modules/gcp-with-psc-exfiltration-protection/psc.tf deleted file mode 100644 index e64b5436..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/psc.tf +++ /dev/null @@ -1,75 +0,0 @@ -######################################################### -# Private Service Connect (PSC) Internal Endpoints Setup -######################################################### - -# ---------------------------------------------------------------- -# Secure Cluster Connectivity (SCC) PSC Endpoint (Spoke VPC) -# ---------------------------------------------------------------- - -# Reserves an internal IP address for the backend (SCC) PSC endpoint in the spoke VPC -resource "google_compute_address" "backend_pe_ip_address" { - name = "${var.prefix}-psc-scc-ip-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - region = var.google_region - subnetwork = google_compute_subnetwork.psc_subnetwork.name - address_type = "INTERNAL" -} - -# Creates a forwarding rule to map the reserved IP to the SCC PSC service attachment -resource "google_compute_forwarding_rule" "backend_psc_ep" { - name = "${var.prefix}-psc-scc-ep-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - region = var.google_region - network = google_compute_network.spoke_vpc.id - ip_address 
= google_compute_address.backend_pe_ip_address.id - target = local.google_backend_psc_targets[var.google_region] - load_balancing_scheme = "" # Must be set to "" for service attachment targets -} - -# ---------------------------------------------------------------- -# Workspace Frontend PSC Endpoint (Spoke VPC) -# ---------------------------------------------------------------- - -# Reserves an internal IP address for the workspace frontend PSC endpoint in the spoke VPC -resource "google_compute_address" "spoke_frontend_pe_ip_address" { - name = "${var.prefix}-psc-ws-ip-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - region = var.google_region - subnetwork = google_compute_subnetwork.psc_subnetwork.name - address_type = "INTERNAL" -} - -# Creates a forwarding rule to map the reserved IP to the workspace frontend PSC service attachment -resource "google_compute_forwarding_rule" "spoke_frontend_psc_ep" { - name = "${var.prefix}-psc-ws-ep-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - region = var.google_region - network = google_compute_network.spoke_vpc.id - ip_address = google_compute_address.spoke_frontend_pe_ip_address.id - target = local.google_frontend_psc_targets[var.google_region] - load_balancing_scheme = "" # Must be set to "" for service attachment targets -} - -# ---------------------------------------------------------------- -# Workspace Frontend PSC Endpoint (Hub VPC) -# ---------------------------------------------------------------- - -# Reserves an internal IP address for the workspace frontend PSC endpoint in the hub VPC -resource "google_compute_address" "hub_frontend_pe_ip_address" { - name = "${var.prefix}-hub-psc-ws-ip-${random_string.suffix.result}" - project = var.hub_vpc_google_project - region = var.google_region - subnetwork = google_compute_subnetwork.hub_subnetwork.name - address_type = "INTERNAL" -} - -# Creates a forwarding rule to map the reserved IP to the workspace frontend PSC 
service attachment in the hub VPC -resource "google_compute_forwarding_rule" "hub_frontend_psc_ep" { - name = "${var.prefix}-hub-psc-ws-ep-${random_string.suffix.result}" - project = var.hub_vpc_google_project - region = var.google_region - network = google_compute_network.hub_vpc.id - ip_address = google_compute_address.hub_frontend_pe_ip_address.id - target = local.google_frontend_psc_targets[var.google_region] - load_balancing_scheme = "" # Must be set to "" for service attachment targets -} diff --git a/modules/gcp-with-psc-exfiltration-protection/terraform.tf b/modules/gcp-with-psc-exfiltration-protection/terraform.tf deleted file mode 100644 index 688f0fbd..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/terraform.tf +++ /dev/null @@ -1,13 +0,0 @@ -terraform { - required_providers { - databricks = { - source = "databricks/databricks" - } - google = { - source = "hashicorp/google" - } - random = { - source = "hashicorp/random" - } - } -} \ No newline at end of file diff --git a/modules/gcp-with-psc-exfiltration-protection/variables.tf b/modules/gcp-with-psc-exfiltration-protection/variables.tf deleted file mode 100644 index cd96e520..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/variables.tf +++ /dev/null @@ -1,62 +0,0 @@ -variable "databricks_account_id" { - type = string - description = "Databricks Account ID" -} - -variable "google_region" { - type = string - description = "Google Cloud region where the resources will be created" -} - -variable "workspace_google_project" { - type = string - description = "Google Cloud project ID related to Databricks workspace" -} - -variable "spoke_vpc_google_project" { - type = string - description = "Google Cloud project ID related to Spoke VPC" -} - -variable "hub_vpc_google_project" { - type = string - description = "Google Cloud project ID related to Hub VPC" -} - -variable "is_spoke_vpc_shared" { - type = bool - description = "Whether the Spoke VPC is a Shared or a dedicated VPC" -} - 
-variable "prefix" { - type = string - description = "Prefix to use in generated resources name" -} - -# For the value of the regional Hive Metastore IP, refer to the Databricks documentation -# Here - https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore -variable "hive_metastore_ip" { - type = string - description = "Value of regional default Hive Metastore IP" -} - -variable "hub_vpc_cidr" { - type = string - description = "CIDR for Hub VPC" -} - -variable "spoke_vpc_cidr" { - type = string - description = "CIDR for Spoke VPC" -} - -variable "psc_subnet_cidr" { - type = string - description = "CIDR for Spoke VPC" -} - -variable "tags" { - description = "Map of tags to add to all resources" - type = map(string) -} - diff --git a/modules/gcp-with-psc-exfiltration-protection/vpc.tf b/modules/gcp-with-psc-exfiltration-protection/vpc.tf deleted file mode 100644 index 113c873d..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/vpc.tf +++ /dev/null @@ -1,90 +0,0 @@ -######################################################### -# Hub & Spoke Network Infrastructure Configuration -######################################################### - -# ======================================================= -# VPC Networks -# ======================================================= - -# Spoke VPC for Databricks workspace and workloads -resource "google_compute_network" "spoke_vpc" { - name = "${var.prefix}-spoke-vpc-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - auto_create_subnetworks = false # Manual subnet configuration - routing_mode = "GLOBAL" # Global routing for hybrid connectivity - bgp_best_path_selection_mode = "STANDARD" -} - -# Hub VPC for centralized networking services -resource "google_compute_network" "hub_vpc" { - name = "${var.prefix}-hub-vpc-${random_string.suffix.result}" - project = var.hub_vpc_google_project - auto_create_subnetworks = false - routing_mode = "GLOBAL" -} - -# 
======================================================= -# Subnetwork Configuration -# ======================================================= - -# Primary spoke subnet for general workloads -resource "google_compute_subnetwork" "spoke_subnetwork" { - name = "${var.prefix}-spoke-subnet-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.id - region = var.google_region - ip_cidr_range = var.spoke_vpc_cidr - private_ip_google_access = true # Enables Private Google Access -} - -# Dedicated PSC subnet for Private Service Connect endpoints -resource "google_compute_subnetwork" "psc_subnetwork" { - name = "${var.prefix}-spoke-psc-subnet-${random_string.suffix.result}" - project = var.spoke_vpc_google_project - network = google_compute_network.spoke_vpc.id - region = var.google_region - ip_cidr_range = var.psc_subnet_cidr - private_ip_google_access = true -} - -# Hub subnet for shared services -resource "google_compute_subnetwork" "hub_subnetwork" { - name = "${var.prefix}-hub-subnet-${random_string.suffix.result}" - project = var.hub_vpc_google_project - network = google_compute_network.hub_vpc.id - region = var.google_region - ip_cidr_range = var.hub_vpc_cidr - private_ip_google_access = true -} - -# ======================================================= -# Network Peering Configuration -# ======================================================= - -# Bidirectional peering between hub and spoke VPCs -resource "google_compute_network_peering" "hub_spoke_peering" { - name = "${var.prefix}-hub-spoke-peering-${random_string.suffix.result}" - network = google_compute_network.hub_vpc.self_link - peer_network = google_compute_network.spoke_vpc.self_link -} - -resource "google_compute_network_peering" "spoke_hub_peering" { - name = "${var.prefix}-spoke-hub-peering-${random_string.suffix.result}" - network = google_compute_network.spoke_vpc.self_link - peer_network = google_compute_network.hub_vpc.self_link -} - -# 
======================================================= -# Shared VPC Configuration (Conditional) -# ======================================================= - -resource "google_compute_shared_vpc_host_project" "host" { - count = var.workspace_google_project != var.spoke_vpc_google_project && var.is_spoke_vpc_shared ? 1 : 0 - project = var.spoke_vpc_google_project -} - -resource "google_compute_shared_vpc_service_project" "service" { - count = var.workspace_google_project != var.spoke_vpc_google_project && var.is_spoke_vpc_shared ? 1 : 0 - host_project = google_compute_shared_vpc_host_project.host[0].project - service_project = var.workspace_google_project -} diff --git a/modules/gcp-with-psc-exfiltration-protection/workspace.tf b/modules/gcp-with-psc-exfiltration-protection/workspace.tf deleted file mode 100644 index 44dc510e..00000000 --- a/modules/gcp-with-psc-exfiltration-protection/workspace.tf +++ /dev/null @@ -1,22 +0,0 @@ -######################################################### -# Databricks Workspace Configuration -######################################################### - -resource "databricks_mws_workspaces" "databricks_workspace" { - workspace_name = "${var.prefix}-ws-${random_string.suffix.result}" - - # Databricks account and cloud provider details - account_id = var.databricks_account_id - location = var.google_region # GCP region for workspace deployment - - # GCP project hosting workspace resources - cloud_resource_container { - gcp { - project_id = var.workspace_google_project - } - } - - # Network and security configurations - private_access_settings_id = databricks_mws_private_access_settings.pas.private_access_settings_id # Private access enforcement - network_id = databricks_mws_networks.databricks_network.network_id # Associated VPC network -} diff --git a/modules/gcp-workspace-basic/Makefile b/modules/gcp-workspace-basic/Makefile deleted file mode 100644 index 653039d8..00000000 --- a/modules/gcp-workspace-basic/Makefile +++ /dev/null @@ 
-1,7 +0,0 @@ -.PHONY: docs test_docs - -docs: - terraform-docs -c ../../.terraform-docs.yml . - -test_docs: - terraform-docs -c ../../.terraform-docs.yml --output-check . diff --git a/modules/gcp-workspace-basic/README.md b/modules/gcp-workspace-basic/README.md deleted file mode 100644 index cc6d6b12..00000000 --- a/modules/gcp-workspace-basic/README.md +++ /dev/null @@ -1,67 +0,0 @@ -gcp basic -========================= - -In this template, we show how to deploy a workspace with managed vpc. - - -## Requirements - -- You need to have run gcp-sa-provisionning and have a service account to fill in the variables. -- If you want to deploy to a new project, you will need to grant the custom role generated in that template to the service acount in the new project. -- The Service Account needs to be added as Databricks Admin in the account console - -## Run as an SA - -You can do the same thing by provisionning a service account that will have the same permissions - and associate the key associated to it. - - -## Run the tempalte - -- You need to fill in the variables.tf -- run `terraform init` -- run `teraform apply` - - -## Requirements - -No requirements. - -## Providers - -| Name | Version | -|------|---------| -| [databricks](#provider\_databricks) | n/a | -| [google](#provider\_google) | n/a | -| [random](#provider\_random) | n/a | - -## Modules - -No modules. 
- -## Resources - -| Name | Type | -|------|------| -| [databricks_mws_workspaces.databricks_workspace](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource | -| [random_string.suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource | -| [google_client_config.current](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_config) | data source | -| [google_client_openid_userinfo.me](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_openid_userinfo) | data source | - -## Inputs - -| Name | Description | Type | Default | Required | -|------|-------------|------|---------|:--------:| -| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [delegate\_from](#input\_delegate\_from) | Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com) | `list(string)` | n/a | yes | -| [google\_project](#input\_google\_project) | Google project for VCP/workspace deployment | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google region for VCP/workspace deployment | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated VPC name | `string` | n/a | yes | -| [workspace\_name](#input\_workspace\_name) | Name of the workspace to create | `string` | n/a | yes | - -## Outputs - -| Name | Description | -|------|-------------| -| [databricks\_host](#output\_databricks\_host) | n/a | -| [databricks\_token](#output\_databricks\_token) | n/a | - diff --git a/modules/gcp-workspace-basic/init.tf b/modules/gcp-workspace-basic/init.tf deleted file mode 100644 index b07d3474..00000000 --- a/modules/gcp-workspace-basic/init.tf +++ /dev/null @@ -1,24 +0,0 @@ -terraform { - required_providers { - 
databricks = { - source = "databricks/databricks" - } - google = { - source = "hashicorp/google" - } - } -} - -data "google_client_openid_userinfo" "me" { -} - - -data "google_client_config" "current" { -} - - -resource "random_string" "suffix" { - special = false - upper = false - length = 6 -} diff --git a/modules/gcp-workspace-basic/outputs.tf b/modules/gcp-workspace-basic/outputs.tf deleted file mode 100644 index d6b170a9..00000000 --- a/modules/gcp-workspace-basic/outputs.tf +++ /dev/null @@ -1,9 +0,0 @@ - -output "databricks_host" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url -} - -output "databricks_token" { - value = databricks_mws_workspaces.databricks_workspace.token[0].token_value - sensitive = true -} diff --git a/modules/gcp-workspace-basic/variables.tf b/modules/gcp-workspace-basic/variables.tf deleted file mode 100644 index 5e94a563..00000000 --- a/modules/gcp-workspace-basic/variables.tf +++ /dev/null @@ -1,29 +0,0 @@ -variable "databricks_account_id" { - type = string - description = "Databricks Account ID" -} - -variable "google_project" { - type = string - description = "Google project for VCP/workspace deployment" -} - -variable "google_region" { - type = string - description = "Google region for VCP/workspace deployment" -} - -variable "prefix" { - type = string - description = "Prefix to use in generated VPC name" -} - -variable "workspace_name" { - type = string - description = "Name of the workspace to create" -} - -variable "delegate_from" { - description = "Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com)" - type = list(string) -} diff --git a/modules/gcp-workspace-basic/workspace.tf b/modules/gcp-workspace-basic/workspace.tf deleted file mode 100644 index 262d8a06..00000000 --- a/modules/gcp-workspace-basic/workspace.tf +++ /dev/null @@ -1,14 +0,0 @@ -resource 
"databricks_mws_workspaces" "databricks_workspace" { - account_id = var.databricks_account_id - workspace_name = var.workspace_name - - location = var.google_region - cloud_resource_container { - gcp { - project_id = var.google_project - } - } - token { - comment = "Terraform token" - } -} diff --git a/modules/gcp-workspace-byovpc/Makefile b/modules/gcp-workspace-byovpc/Makefile deleted file mode 100644 index 653039d8..00000000 --- a/modules/gcp-workspace-byovpc/Makefile +++ /dev/null @@ -1,7 +0,0 @@ -.PHONY: docs test_docs - -docs: - terraform-docs -c ../../.terraform-docs.yml . - -test_docs: - terraform-docs -c ../../.terraform-docs.yml --output-check . diff --git a/modules/gcp-workspace-byovpc/README.md b/modules/gcp-workspace-byovpc/README.md deleted file mode 100644 index 0fbaf403..00000000 --- a/modules/gcp-workspace-byovpc/README.md +++ /dev/null @@ -1,75 +0,0 @@ -gcp byovpc -========================= - -In this template, we show how to deploy a workspace with a custom VPC. - - -## Requirements - -- You need to have run `gcp-sa-provisionning` module and have a service account to fill in the variables. -- If you want to deploy to a new project, you will need to grant the custom role generated in that template to the service acount in the new project. -- The sizing of the custom vpc subnets needs to be appropriate for the usage of the workspace. [This documentation covers it](https://docs.gcp.databricks.com/administration-guide/cloud-configurations/gcp/network-sizing.html) - -## Run as an SA - -You can do the same thing by provisionning a service account that will have the same permissions - and associate the key associated to it. - - -## Run the tempalte - -- You need to fill in the `variables.tf` -- run `terraform init` -- run `teraform apply` - - -## Requirements - -No requirements. 
- -## Providers - -| Name | Version | -|------|---------| -| [databricks](#provider\_databricks) | n/a | -| [google](#provider\_google) | n/a | -| [random](#provider\_random) | n/a | - -## Modules - -No modules. - -## Resources - -| Name | Type | -|------|------| -| [databricks_mws_networks.databricks_network](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource | -| [databricks_mws_workspaces.databricks_workspace](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource | -| [google_compute_network.dbx_private_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource | -| [google_compute_router.router](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router) | resource | -| [google_compute_router_nat.nat](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router_nat) | resource | -| [google_compute_subnetwork.network-with-private-secondary-ip-ranges](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | -| [random_string.suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource | -| [google_client_config.current](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_config) | data source | -| [google_client_openid_userinfo.me](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/client_openid_userinfo) | data source | - -## Inputs - -| Name | Description | Type | Default | Required | -|------|-------------|------|---------|:--------:| -| [databricks\_account\_id](#input\_databricks\_account\_id) | Databricks Account ID | `string` | n/a | yes | -| [delegate\_from](#input\_delegate\_from) | Identities to allow to impersonate created service account (in form of 
user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com) | `list(string)` | n/a | yes | -| [google\_project](#input\_google\_project) | Google project for VCP/workspace deployment | `string` | n/a | yes | -| [google\_region](#input\_google\_region) | Google region for VCP/workspace deployment | `string` | n/a | yes | -| [nat\_name](#input\_nat\_name) | Name of the NAT service in compute router | `string` | n/a | yes | -| [prefix](#input\_prefix) | Prefix to use in generated VPC name | `string` | n/a | yes | -| [router\_name](#input\_router\_name) | Name of the compute router to create | `string` | n/a | yes | -| [subnet\_ip\_cidr\_range](#input\_subnet\_ip\_cidr\_range) | IP Range for Nodes subnet (primary) | `string` | n/a | yes | -| [subnet\_name](#input\_subnet\_name) | Name of the subnet to create | `string` | n/a | yes | - -## Outputs - -| Name | Description | -|------|-------------| -| [databricks\_host](#output\_databricks\_host) | n/a | -| [databricks\_token](#output\_databricks\_token) | n/a | - \ No newline at end of file diff --git a/modules/gcp-workspace-byovpc/init.tf b/modules/gcp-workspace-byovpc/init.tf deleted file mode 100644 index 103a33ee..00000000 --- a/modules/gcp-workspace-byovpc/init.tf +++ /dev/null @@ -1,22 +0,0 @@ -terraform { - required_providers { - databricks = { - source = "databricks/databricks" - } - google = { - source = "hashicorp/google" - } - } -} - -data "google_client_openid_userinfo" "me" { -} - -data "google_client_config" "current" { -} - -resource "random_string" "suffix" { - special = false - upper = false - length = 6 -} diff --git a/modules/gcp-workspace-byovpc/outputs.tf b/modules/gcp-workspace-byovpc/outputs.tf deleted file mode 100644 index f544b3ba..00000000 --- a/modules/gcp-workspace-byovpc/outputs.tf +++ /dev/null @@ -1,8 +0,0 @@ -output "databricks_host" { - value = databricks_mws_workspaces.databricks_workspace.workspace_url -} - -output "databricks_token" { 
- value = databricks_mws_workspaces.databricks_workspace.token[0].token_value - sensitive = true -} \ No newline at end of file diff --git a/modules/gcp-workspace-byovpc/variables.tf b/modules/gcp-workspace-byovpc/variables.tf deleted file mode 100644 index 45650087..00000000 --- a/modules/gcp-workspace-byovpc/variables.tf +++ /dev/null @@ -1,45 +0,0 @@ -variable "databricks_account_id" { - type = string - description = "Databricks Account ID" -} - -variable "google_project" { - type = string - description = "Google project for VCP/workspace deployment" -} - -variable "google_region" { - type = string - description = "Google region for VCP/workspace deployment" -} - -variable "prefix" { - type = string - description = "Prefix to use in generated VPC name" -} - -# These three ranges need to be computed based on the workspace size (cf documentation) -variable "subnet_ip_cidr_range" { - type = string - description = "IP Range for Nodes subnet (primary)" -} - -variable "subnet_name" { - type = string - description = "Name of the subnet to create" -} - -variable "router_name" { - type = string - description = "Name of the compute router to create" -} - -variable "nat_name" { - type = string - description = "Name of the NAT service in compute router" -} - -variable "delegate_from" { - description = "Identities to allow to impersonate created service account (in form of user:user.name@example.com, group:deployers@example.com or serviceAccount:sa1@project.iam.gserviceaccount.com)" - type = list(string) -} diff --git a/modules/gcp-workspace-byovpc/vpc.tf b/modules/gcp-workspace-byovpc/vpc.tf deleted file mode 100644 index 31e8e808..00000000 --- a/modules/gcp-workspace-byovpc/vpc.tf +++ /dev/null @@ -1,40 +0,0 @@ -resource "google_compute_network" "dbx_private_vpc" { - project = var.google_project - name = "${var.prefix}-${random_string.suffix.result}" - auto_create_subnetworks = false -} - -resource "google_compute_subnetwork" "network-with-private-secondary-ip-ranges" { - 
name = var.subnet_name - ip_cidr_range = var.subnet_ip_cidr_range - region = var.google_region - network = google_compute_network.dbx_private_vpc.id - private_ip_google_access = true -} - -resource "google_compute_router" "router" { - name = var.router_name - region = google_compute_subnetwork.network-with-private-secondary-ip-ranges.region - network = google_compute_network.dbx_private_vpc.id -} - -resource "google_compute_router_nat" "nat" { - name = var.nat_name - router = google_compute_router.router.name - region = google_compute_router.router.region - nat_ip_allocate_option = "AUTO_ONLY" - source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" -} - -resource "databricks_mws_networks" "databricks_network" { - account_id = var.databricks_account_id - - network_name = "${var.prefix}-${random_string.suffix.result}" - - gcp_network_info { - network_project_id = var.google_project - vpc_id = google_compute_network.dbx_private_vpc.name - subnet_id = google_compute_subnetwork.network-with-private-secondary-ip-ranges.name - subnet_region = google_compute_subnetwork.network-with-private-secondary-ip-ranges.region - } -} diff --git a/modules/gcp-workspace-byovpc/workspace.tf b/modules/gcp-workspace-byovpc/workspace.tf deleted file mode 100644 index 0f7c7a0a..00000000 --- a/modules/gcp-workspace-byovpc/workspace.tf +++ /dev/null @@ -1,20 +0,0 @@ -resource "databricks_mws_workspaces" "databricks_workspace" { - account_id = var.databricks_account_id - workspace_name = "dbx-example-tf-deploy-${random_string.suffix.result}" - - location = var.google_region - cloud_resource_container { - gcp { - project_id = var.google_project - } - } - - network_id = databricks_mws_networks.databricks_network.network_id - - token { - comment = "Terraform token" - } - - # this makes sure that the NAT is created for outbound traffic before creating the workspace - depends_on = [google_compute_router_nat.nat] -} diff --git a/modules/gcp/Makefile b/modules/gcp/Makefile new file 
mode 100644 index 00000000..30b525d1 --- /dev/null +++ b/modules/gcp/Makefile @@ -0,0 +1,8 @@ +PROJECTS := $(dir $(wildcard */README.md)) + +docs: $(PROJECTS) + +$(PROJECTS): + $(MAKE) -C $@ docs + +.PHONY: $(PROJECTS) docs diff --git a/modules/gcp/account/Makefile b/modules/gcp/account/Makefile new file mode 100644 index 00000000..17b32ec8 --- /dev/null +++ b/modules/gcp/account/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp/account/README.md b/modules/gcp/account/README.md new file mode 100644 index 00000000..a4d7408e --- /dev/null +++ b/modules/gcp/account/README.md @@ -0,0 +1,67 @@ +# modules/gcp/account + +All `databricks_mws_*` resources for the GCP composer: `mws_networks`, `mws_workspaces`, `mws_vpc_endpoint`, `mws_private_access_settings`. + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [databricks](#requirement\_databricks) | >= 1.0 | + +## Providers + +| Name | Version | +|------|---------| +| [databricks](#provider\_databricks) | 1.114.2 | + +## Modules + +No modules. 
+ +## Resources + +| Name | Type | +|------|------| +| [databricks_mws_networks.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_networks) | resource | +| [databricks_mws_private_access_settings.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_private_access_settings) | resource | +| [databricks_mws_vpc_endpoint.backend](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | +| [databricks_mws_vpc_endpoint.frontend](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | +| [databricks_mws_vpc_endpoint.transit](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_vpc_endpoint) | resource | +| [databricks_mws_workspaces.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/mws_workspaces) | resource | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [databricks\_account\_id](#input\_databricks\_account\_id) | n/a | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | n/a | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | n/a | `string` | n/a | yes | +| [prefix](#input\_prefix) | n/a | `string` | n/a | yes | +| [suffix](#input\_suffix) | n/a | `string` | n/a | yes | +| [vpc\_source](#input\_vpc\_source) | n/a | `string` | n/a | yes | +| [backend\_psc\_fr\_id](#input\_backend\_psc\_fr\_id) | n/a | `string` | `null` | no | +| [enable\_backend](#input\_enable\_backend) | n/a | `bool` | `false` | no | +| [enable\_frontend](#input\_enable\_frontend) | n/a | `bool` | `false` | no | +| [frontend\_psc\_fr\_id](#input\_frontend\_psc\_fr\_id) | Forwarding-rule names from private-connectivity module (gate vpc\_endpoint creation) | `string` | `null` | no | +| 
[hub\_frontend\_psc\_fr\_id](#input\_hub\_frontend\_psc\_fr\_id) | n/a | `string` | `null` | no | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | n/a | `string` | `null` | no | +| [nat\_dependency](#input\_nat\_dependency) | Opaque value used as depends\_on for the workspace to ensure NAT readiness | `any` | `null` | no | +| [private\_access\_only](#input\_private\_access\_only) | n/a | `bool` | `false` | no | +| [spoke\_subnet\_name](#input\_spoke\_subnet\_name) | n/a | `string` | `null` | no | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | n/a | `string` | `null` | no | +| [spoke\_vpc\_name](#input\_spoke\_vpc\_name) | n/a | `string` | `null` | no | +| [workspace\_name](#input\_workspace\_name) | n/a | `string` | `null` | no | + +## Outputs + +| Name | Description | +|------|-------------| +| [backend\_endpoint\_id](#output\_backend\_endpoint\_id) | Backend mws\_vpc\_endpoint ID (null when no PSC) | +| [frontend\_endpoint\_id](#output\_frontend\_endpoint\_id) | Frontend mws\_vpc\_endpoint ID (null when no PSC) | +| [network\_id](#output\_network\_id) | mws\_networks ID (null when databricks\_managed) | +| [transit\_endpoint\_id](#output\_transit\_endpoint\_id) | Hub-side mws\_vpc\_endpoint ID (null when no hub) | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | + diff --git a/modules/gcp/account/main.tf b/modules/gcp/account/main.tf new file mode 100644 index 00000000..4a277875 --- /dev/null +++ b/modules/gcp/account/main.tf @@ -0,0 +1,49 @@ +locals { + workspace_name = coalesce(var.workspace_name, "${var.prefix}-ws-${var.suffix}") + emit_mws_networks = var.vpc_source != "databricks_managed" + emit_vpc_endpoints = var.frontend_psc_fr_id != null && var.backend_psc_fr_id != null + emit_pas = var.private_access_only +} + +resource "databricks_mws_workspaces" "this" { + account_id = var.databricks_account_id + workspace_name = 
local.workspace_name + location = var.google_region + + cloud_resource_container { + gcp { + project_id = var.google_project + } + } + + network_id = local.emit_mws_networks ? databricks_mws_networks.this[0].network_id : null + private_access_settings_id = local.emit_pas ? databricks_mws_private_access_settings.this[0].private_access_settings_id : null + + token { + comment = "Terraform" + } + + depends_on = [var.nat_dependency] +} + +resource "databricks_mws_networks" "this" { + count = local.emit_mws_networks ? 1 : 0 + + account_id = var.databricks_account_id + network_name = "${var.prefix}-ntw-${var.suffix}" + + gcp_network_info { + network_project_id = var.spoke_vpc_google_project + vpc_id = var.spoke_vpc_name + subnet_id = var.spoke_subnet_name + subnet_region = var.google_region + } + + dynamic "vpc_endpoints" { + for_each = local.emit_vpc_endpoints ? [1] : [] + content { + dataplane_relay = [databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id] + rest_api = [databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id] + } + } +} diff --git a/modules/gcp/account/outputs.tf b/modules/gcp/account/outputs.tf new file mode 100644 index 00000000..bf0782b2 --- /dev/null +++ b/modules/gcp/account/outputs.tf @@ -0,0 +1,29 @@ +output "workspace_id" { + value = databricks_mws_workspaces.this.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = databricks_mws_workspaces.this.workspace_url + description = "Databricks workspace URL" +} + +output "network_id" { + value = local.emit_mws_networks ? databricks_mws_networks.this[0].network_id : null + description = "mws_networks ID (null when databricks_managed)" +} + +output "frontend_endpoint_id" { + value = var.enable_frontend && var.frontend_psc_fr_id != null ? 
databricks_mws_vpc_endpoint.frontend[0].vpc_endpoint_id : null + description = "Frontend mws_vpc_endpoint ID (null when no PSC)" +} + +output "backend_endpoint_id" { + value = var.enable_backend && var.backend_psc_fr_id != null ? databricks_mws_vpc_endpoint.backend[0].vpc_endpoint_id : null + description = "Backend mws_vpc_endpoint ID (null when no PSC)" +} + +output "transit_endpoint_id" { + value = var.enable_frontend && var.hub_frontend_psc_fr_id != null ? databricks_mws_vpc_endpoint.transit[0].vpc_endpoint_id : null + description = "Hub-side mws_vpc_endpoint ID (null when no hub)" +} diff --git a/modules/gcp/account/pas.tf b/modules/gcp/account/pas.tf new file mode 100644 index 00000000..7ec78445 --- /dev/null +++ b/modules/gcp/account/pas.tf @@ -0,0 +1,9 @@ +resource "databricks_mws_private_access_settings" "this" { + count = local.emit_pas ? 1 : 0 + + account_id = var.databricks_account_id + private_access_settings_name = "${var.prefix}-pas-${var.suffix}" + region = var.google_region + public_access_enabled = false + private_access_level = "ACCOUNT" +} diff --git a/modules/gcp/account/tests/byovpc/main.tf b/modules/gcp/account/tests/byovpc/main.tf new file mode 100644 index 00000000..a85e293e --- /dev/null +++ b/modules/gcp/account/tests/byovpc/main.tf @@ -0,0 +1,27 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_name = "fixture-spoke-vpc-abc123" + spoke_subnet_name = "fixture-subnet-abc123" + spoke_vpc_google_project = "fixture-spoke" +} diff --git a/modules/gcp/account/tests/databricks-managed/main.tf b/modules/gcp/account/tests/databricks-managed/main.tf new file mode 100644 index 00000000..79c6f0d9 --- /dev/null +++ b/modules/gcp/account/tests/databricks-managed/main.tf @@ -0,0 +1,24 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "databricks_managed" +} diff --git a/modules/gcp/account/tests/psc-with-pas/main.tf b/modules/gcp/account/tests/psc-with-pas/main.tf new file mode 100644 index 00000000..83b418f0 --- /dev/null +++ b/modules/gcp/account/tests/psc-with-pas/main.tf @@ -0,0 +1,36 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + } + } +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "account" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_name = "fixture-spoke-vpc-abc123" + spoke_subnet_name = "fixture-subnet-abc123" + spoke_vpc_google_project = "fixture-spoke" + hub_vpc_google_project = "fixture-hub" + + frontend_psc_fr_id = "fixture-psc-ws-ep-abc123" + backend_psc_fr_id = "fixture-psc-scc-ep-abc123" + hub_frontend_psc_fr_id = "fixture-hub-psc-ws-ep-abc123" + + enable_frontend = true + enable_backend = true + private_access_only = true +} diff --git a/modules/gcp/account/variables.tf b/modules/gcp/account/variables.tf new file mode 100644 index 00000000..81a6ee79 --- /dev/null +++ b/modules/gcp/account/variables.tf @@ -0,0 +1,89 @@ +variable "prefix" { + type = string +} + +variable "suffix" { + type = string +} + +variable "workspace_name" { + type = string + default = null +} + +variable "databricks_account_id" { + type = string +} + +variable "google_project" { + type = string +} + +variable "google_region" { + type = string +} + +variable "vpc_source" { + type = string + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one of: databricks_managed, create, existing." 
+ } +} + +variable "spoke_vpc_name" { + type = string + default = null +} + +variable "spoke_subnet_name" { + type = string + default = null +} + +variable "spoke_vpc_google_project" { + type = string + default = null +} + +variable "hub_vpc_google_project" { + type = string + default = null +} + +# Forwarding-rule names from private-connectivity module (gate vpc_endpoint creation) +variable "frontend_psc_fr_id" { + type = string + default = null +} + +variable "backend_psc_fr_id" { + type = string + default = null +} + +variable "hub_frontend_psc_fr_id" { + type = string + default = null +} + +variable "enable_frontend" { + type = bool + default = false +} + +variable "enable_backend" { + type = bool + default = false +} + +variable "private_access_only" { + type = bool + default = false +} + +variable "nat_dependency" { + type = any + default = null + description = "Opaque value used as depends_on for the workspace to ensure NAT readiness" +} diff --git a/modules/gcp/account/versions.tf b/modules/gcp/account/versions.tf new file mode 100644 index 00000000..2d66ceef --- /dev/null +++ b/modules/gcp/account/versions.tf @@ -0,0 +1,9 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { + source = "databricks/databricks" + version = ">= 1.0" + } + } +} diff --git a/modules/gcp/account/vpc-endpoints.tf b/modules/gcp/account/vpc-endpoints.tf new file mode 100644 index 00000000..fbf7b5c0 --- /dev/null +++ b/modules/gcp/account/vpc-endpoints.tf @@ -0,0 +1,38 @@ +resource "databricks_mws_vpc_endpoint" "frontend" { + count = var.enable_frontend && var.frontend_psc_fr_id != null ? 
1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-ws-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.spoke_vpc_google_project + psc_endpoint_name = var.frontend_psc_fr_id + endpoint_region = var.google_region + } +} + +resource "databricks_mws_vpc_endpoint" "backend" { + count = var.enable_backend && var.backend_psc_fr_id != null ? 1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-scc-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.spoke_vpc_google_project + psc_endpoint_name = var.backend_psc_fr_id + endpoint_region = var.google_region + } +} + +resource "databricks_mws_vpc_endpoint" "transit" { + count = var.enable_frontend && var.hub_frontend_psc_fr_id != null ? 1 : 0 + + account_id = var.databricks_account_id + vpc_endpoint_name = "${var.prefix}-hub-ep-${var.suffix}" + + gcp_vpc_endpoint_info { + project_id = var.hub_vpc_google_project + psc_endpoint_name = var.hub_frontend_psc_fr_id + endpoint_region = var.google_region + } +} diff --git a/modules/gcp/databricks-workspace/Makefile b/modules/gcp/databricks-workspace/Makefile new file mode 100644 index 00000000..38b83c2e --- /dev/null +++ b/modules/gcp/databricks-workspace/Makefile @@ -0,0 +1,3 @@ +.PHONY: docs +docs: + terraform-docs -c ../../../.terraform-docs.yml . diff --git a/modules/gcp/databricks-workspace/README.md b/modules/gcp/databricks-workspace/README.md new file mode 100644 index 00000000..c39a4cc8 --- /dev/null +++ b/modules/gcp/databricks-workspace/README.md @@ -0,0 +1,89 @@ +# GCP Databricks Workspace Composer + +This module creates a complete Databricks workspace on Google Cloud Platform with full networking, connectivity, and authentication management. + +## Usage + +See the examples in `tests/` for common scenarios. 
+ +## Components + +- **network**: VPC creation or integration (databricks_managed, create, or existing) +- **private_connectivity**: Private Service Connect (PSC) with optional frontend/backend +- **account**: Databricks MWS resources and workspace +- **dns**: Private DNS zones for restricted egress scenarios + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [databricks](#requirement\_databricks) | >= 1.0 | +| [google](#requirement\_google) | >= 4.0 | +| [null](#requirement\_null) | >= 3.0 | +| [random](#requirement\_random) | >= 3.0 | + +## Providers + +| Name | Version | +|------|---------| +| [null](#provider\_null) | 3.2.4 | +| [random](#provider\_random) | 3.8.1 | + +## Modules + +| Name | Source | Version | +|------|--------|---------| +| [account](#module\_account) | ../account | n/a | +| [dns](#module\_dns) | ../dns | n/a | +| [network](#module\_network) | ../network | n/a | +| [private\_connectivity](#module\_private\_connectivity) | ../private-connectivity | n/a | + +## Resources + +| Name | Type | +|------|------| +| [null_resource.preconditions](https://registry.terraform.io/providers/hashicorp/null/latest/docs/resources/resource) | resource | +| [random_string.suffix](https://registry.terraform.io/providers/hashicorp/random/latest/docs/resources/string) | resource | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [databricks\_account\_id](#input\_databricks\_account\_id) | n/a | `string` | n/a | yes | +| [google\_project](#input\_google\_project) | n/a | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | n/a | `string` | n/a | yes | +| [prefix](#input\_prefix) | === Identity =========================================================== | `string` | n/a | yes | +| [existing\_subnet\_name](#input\_existing\_subnet\_name) | n/a | `string` | `null` | no | +| 
[existing\_vpc\_name](#input\_existing\_vpc\_name) | When vpc\_source = "existing" | `string` | `null` | no | +| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | n/a | `string` | `null` | no | +| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | n/a | `string` | `null` | no | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | === Required when restricted\_egress = true ============================= | `string` | `null` | no | +| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | n/a | `bool` | `false` | no | +| [pod\_cidr](#input\_pod\_cidr) | n/a | `string` | `null` | no | +| [private\_access\_only](#input\_private\_access\_only) | n/a | `bool` | `false` | no | +| [private\_link\_backend](#input\_private\_link\_backend) | n/a | `bool` | `false` | no | +| [private\_link\_frontend](#input\_private\_link\_frontend) | === Connectivity feature flags ========================================= | `bool` | `false` | no | +| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | n/a | `string` | `null` | no | +| [restricted\_egress](#input\_restricted\_egress) | n/a | `bool` | `false` | no | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | When vpc\_source = "create" | `string` | `null` | no | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | n/a | `string` | `null` | no | +| [subnet\_cidr](#input\_subnet\_cidr) | n/a | `string` | `null` | no | +| [svc\_cidr](#input\_svc\_cidr) | n/a | `string` | `null` | no | +| [tags](#input\_tags) | n/a | `map(string)` | `{}` | no | +| [vpc\_source](#input\_vpc\_source) | One of: databricks\_managed, create, existing | `string` | `"databricks_managed"` | no | +| [workspace\_name](#input\_workspace\_name) | n/a | `string` | `null` | no | + +## Outputs + +| Name | Description | +|------|-------------| +| [hub\_vpc\_id](#output\_hub\_vpc\_id) | Hub VPC ID (null when not restricted\_egress) | +| [network\_id](#output\_network\_id) | mws\_networks ID (null when databricks\_managed) | +| 
[spoke\_vpc\_id](#output\_spoke\_vpc\_id) | Spoke VPC ID (null when databricks\_managed) | +| [suffix](#output\_suffix) | Random suffix used in resource names | +| [vpc\_id](#output\_vpc\_id) | Spoke VPC ID (null when databricks\_managed) | +| [workspace\_id](#output\_workspace\_id) | Databricks workspace ID | +| [workspace\_url](#output\_workspace\_url) | Databricks workspace URL | + \ No newline at end of file diff --git a/modules/gcp/databricks-workspace/main.tf b/modules/gcp/databricks-workspace/main.tf new file mode 100644 index 00000000..0db4f22b --- /dev/null +++ b/modules/gcp/databricks-workspace/main.tf @@ -0,0 +1,149 @@ +locals { + databricks_managed = var.vpc_source == "databricks_managed" + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + any_private_link = var.private_link_frontend || var.private_link_backend + spoke_project = coalesce(var.spoke_vpc_google_project, var.google_project) +} + +resource "random_string" "suffix" { + length = 6 + special = false + upper = false + + lifecycle { + ignore_changes = [special, upper] + } +} + +# Cross-variable preconditions. +resource "null_resource" "preconditions" { + lifecycle { + precondition { + condition = !var.restricted_egress || local.create_vpc + error_message = "restricted_egress=true requires vpc_source=\"create\" (hub-spoke topology needs us to own both VPCs)." + } + precondition { + condition = !var.restricted_egress || local.any_private_link + error_message = "restricted_egress=true requires at least one of private_link_frontend or private_link_backend." + } + precondition { + condition = !var.restricted_egress || (var.hub_vpc_google_project != null && var.hub_vpc_cidr != null && var.psc_subnet_cidr != null) + error_message = "restricted_egress=true requires hub_vpc_google_project, hub_vpc_cidr, and psc_subnet_cidr." 
+ } + precondition { + condition = !local.create_vpc || (var.spoke_vpc_cidr != null && var.subnet_cidr != null) + error_message = "vpc_source=\"create\" requires spoke_vpc_cidr and subnet_cidr." + } + precondition { + condition = !local.use_existing_vpc || (var.existing_vpc_name != null && var.existing_subnet_name != null) + error_message = "vpc_source=\"existing\" requires existing_vpc_name and existing_subnet_name." + } + precondition { + condition = !local.databricks_managed || (!var.private_link_frontend && !var.private_link_backend && !var.restricted_egress) + error_message = "vpc_source=\"databricks_managed\" forbids private_link_frontend, private_link_backend, and restricted_egress." + } + } +} + +module "network" { + source = "../network" + count = local.databricks_managed ? 0 : 1 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + vpc_source = var.vpc_source + spoke_vpc_google_project = local.spoke_project + + spoke_vpc_cidr = var.spoke_vpc_cidr + subnet_cidr = var.subnet_cidr + pod_cidr = var.pod_cidr + svc_cidr = var.svc_cidr + + existing_vpc_name = var.existing_vpc_name + existing_subnet_name = var.existing_subnet_name + + create_hub = var.restricted_egress + hub_vpc_google_project = var.hub_vpc_google_project + hub_vpc_cidr = var.hub_vpc_cidr + is_spoke_vpc_shared = var.is_spoke_vpc_shared + workspace_google_project = var.google_project +} + +module "private_connectivity" { + source = "../private-connectivity" + count = local.any_private_link ? 1 : 0 + + prefix = var.prefix + suffix = random_string.suffix.result + google_region = var.google_region + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + spoke_vpc_cidr = var.spoke_vpc_cidr + + hub_vpc_id = var.restricted_egress ? module.network[0].hub_vpc_id : null + hub_vpc_self_link = var.restricted_egress ? 
module.network[0].hub_vpc_self_link : null + hub_vpc_google_project = var.hub_vpc_google_project + hub_subnet_name = var.restricted_egress ? module.network[0].hub_subnet_name : null + hub_vpc_cidr = var.hub_vpc_cidr + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + restrict_egress = var.restricted_egress + psc_subnet_cidr = var.psc_subnet_cidr + + hive_metastore_ip = var.hive_metastore_ip +} + +module "account" { + source = "../account" + + prefix = var.prefix + suffix = random_string.suffix.result + workspace_name = var.workspace_name + databricks_account_id = var.databricks_account_id + google_project = var.google_project + google_region = var.google_region + vpc_source = var.vpc_source + + spoke_vpc_name = local.databricks_managed ? null : module.network[0].spoke_vpc_name + spoke_subnet_name = local.databricks_managed ? null : module.network[0].spoke_subnet_name + spoke_vpc_google_project = local.spoke_project + hub_vpc_google_project = var.hub_vpc_google_project + + frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].frontend_psc_fr_id : null + backend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].backend_psc_fr_id : null + hub_frontend_psc_fr_id = local.any_private_link ? module.private_connectivity[0].hub_frontend_psc_fr_id : null + + enable_frontend = var.private_link_frontend + enable_backend = var.private_link_backend + private_access_only = var.private_access_only + + nat_dependency = local.databricks_managed ? null : module.network[0].nat_id +} + +module "dns" { + source = "../dns" + count = var.restricted_egress ? 
1 : 0 + + prefix = var.prefix + google_region = var.google_region + + hub_vpc_id = module.network[0].hub_vpc_id + hub_vpc_self_link = module.network[0].hub_vpc_self_link + hub_vpc_google_project = var.hub_vpc_google_project + + spoke_vpc_id = module.network[0].spoke_vpc_id + spoke_vpc_self_link = module.network[0].spoke_vpc_self_link + spoke_vpc_google_project = local.spoke_project + + workspace_url = module.account.workspace_url + + frontend_psc_ip_spoke = module.private_connectivity[0].frontend_psc_ip_spoke + frontend_psc_ip_hub = module.private_connectivity[0].frontend_psc_ip_hub + backend_psc_ip_spoke = module.private_connectivity[0].backend_psc_ip_spoke +} diff --git a/modules/gcp/databricks-workspace/outputs.tf b/modules/gcp/databricks-workspace/outputs.tf new file mode 100644 index 00000000..af8e9e00 --- /dev/null +++ b/modules/gcp/databricks-workspace/outputs.tf @@ -0,0 +1,34 @@ +output "workspace_id" { + value = module.account.workspace_id + description = "Databricks workspace ID" +} + +output "workspace_url" { + value = module.account.workspace_url + description = "Databricks workspace URL" +} + +output "network_id" { + value = module.account.network_id + description = "mws_networks ID (null when databricks_managed)" +} + +output "vpc_id" { + value = try(module.network[0].spoke_vpc_id, null) + description = "Spoke VPC ID (null when databricks_managed)" +} + +output "spoke_vpc_id" { + value = try(module.network[0].spoke_vpc_id, null) + description = "Spoke VPC ID (null when databricks_managed)" +} + +output "hub_vpc_id" { + value = try(module.network[0].hub_vpc_id, null) + description = "Hub VPC ID (null when not restricted_egress)" +} + +output "suffix" { + value = random_string.suffix.result + description = "Random suffix used in resource names" +} diff --git a/modules/gcp/databricks-workspace/tests/basic/main.tf b/modules/gcp/databricks-workspace/tests/basic/main.tf new file mode 100644 index 00000000..2a96df8b --- /dev/null +++ 
b/modules/gcp/databricks-workspace/tests/basic/main.tf @@ -0,0 +1,28 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "databricks_managed" +} diff --git a/modules/gcp/databricks-workspace/tests/byovpc/main.tf b/modules/gcp/databricks-workspace/tests/byovpc/main.tf new file mode 100644 index 00000000..18ac0dc2 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/byovpc/main.tf @@ -0,0 +1,30 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} diff --git a/modules/gcp/databricks-workspace/tests/existing-vpc/main.tf b/modules/gcp/databricks-workspace/tests/existing-vpc/main.tf new file mode 100644 index 00000000..a6b992c5 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/existing-vpc/main.tf @@ -0,0 +1,30 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "existing" + existing_vpc_name = "preexisting-vpc" + existing_subnet_name = "preexisting-subnet" +} diff --git a/modules/gcp/databricks-workspace/tests/negative-existing-missing-name/main.tf b/modules/gcp/databricks-workspace/tests/negative-existing-missing-name/main.tf new file mode 100644 index 00000000..31ee19e4 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/negative-existing-missing-name/main.tf @@ -0,0 +1,29 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +# precondition fail: vpc_source="existing" requires 
existing_vpc_name + existing_subnet_name +module "workspace" { + source = "../.." + + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "existing" +} diff --git a/modules/gcp/databricks-workspace/tests/negative-managed-with-psc/main.tf b/modules/gcp/databricks-workspace/tests/negative-managed-with-psc/main.tf new file mode 100644 index 00000000..696e7641 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/negative-managed-with-psc/main.tf @@ -0,0 +1,30 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +# precondition fail: vpc_source="databricks_managed" forbids private_link_frontend +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "databricks_managed" + private_link_frontend = true +} diff --git a/modules/gcp/databricks-workspace/tests/negative-restricted-egress-managed/main.tf b/modules/gcp/databricks-workspace/tests/negative-restricted-egress-managed/main.tf new file mode 100644 index 00000000..df6ca744 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/negative-restricted-egress-managed/main.tf @@ -0,0 +1,30 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +# precondition fail: restricted_egress=true requires vpc_source="create" +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "databricks_managed" + restricted_egress = true +} diff --git a/modules/gcp/databricks-workspace/tests/negative-restricted-egress-missing-hub/main.tf b/modules/gcp/databricks-workspace/tests/negative-restricted-egress-missing-hub/main.tf new file mode 100644 index 00000000..f93d3cb4 --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/negative-restricted-egress-missing-hub/main.tf @@ -0,0 +1,34 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +# precondition fail: restricted_egress=true requires hub_vpc_google_project, hub_vpc_cidr, psc_subnet_cidr +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + private_link_frontend = true + private_link_backend = true + restricted_egress = true +} diff --git a/modules/gcp/databricks-workspace/tests/psc-isolated/main.tf b/modules/gcp/databricks-workspace/tests/psc-isolated/main.tf new file mode 100644 index 00000000..cc60d2ae --- /dev/null +++ b/modules/gcp/databricks-workspace/tests/psc-isolated/main.tf @@ -0,0 +1,41 @@ +terraform { + required_version = ">= 1.5" + required_providers { + databricks = { source = "databricks/databricks" } + google = { source = "hashicorp/google" } + } +} + +provider "google" { + project = "fixture-workspace" + region = "us-central1" +} + +provider "databricks" { + host = "https://accounts.gcp.databricks.com" + account_id = "00000000-0000-0000-0000-000000000000" +} + +module "workspace" { + source = "../.." 
+ + prefix = "fixture" + databricks_account_id = "00000000-0000-0000-0000-000000000000" + google_project = "fixture-workspace" + google_region = "us-central1" + + vpc_source = "create" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + + private_link_frontend = true + private_link_backend = true + private_access_only = true + restricted_egress = true + + spoke_vpc_google_project = "fixture-spoke" + hub_vpc_google_project = "fixture-hub" + is_spoke_vpc_shared = true + hub_vpc_cidr = "10.1.0.0/24" + psc_subnet_cidr = "10.0.255.0/28" +} diff --git a/modules/gcp/databricks-workspace/variables.tf b/modules/gcp/databricks-workspace/variables.tf new file mode 100644 index 00000000..14b959c0 --- /dev/null +++ b/modules/gcp/databricks-workspace/variables.tf @@ -0,0 +1,121 @@ +# === Identity =========================================================== +variable "prefix" { + type = string +} + +variable "databricks_account_id" { + type = string +} + +variable "google_project" { + type = string +} + +variable "google_region" { + type = string +} + +variable "workspace_name" { + type = string + default = null +} + +variable "tags" { + type = map(string) + default = {} +} + +# === VPC source ========================================================= +variable "vpc_source" { + type = string + default = "databricks_managed" + description = "One of: databricks_managed, create, existing" + validation { + condition = contains(["databricks_managed", "create", "existing"], var.vpc_source) + error_message = "vpc_source must be one of: databricks_managed, create, existing." 
+ } +} + +# When vpc_source = "create" +variable "spoke_vpc_cidr" { + type = string + default = null +} + +variable "subnet_cidr" { + type = string + default = null +} + +variable "pod_cidr" { + type = string + default = null +} + +variable "svc_cidr" { + type = string + default = null +} + +# When vpc_source = "existing" +variable "existing_vpc_name" { + type = string + default = null +} + +variable "existing_subnet_name" { + type = string + default = null +} + +# === Connectivity feature flags ========================================= +variable "private_link_frontend" { + type = bool + default = false +} + +variable "private_link_backend" { + type = bool + default = false +} + +variable "private_access_only" { + type = bool + default = false +} + +variable "restricted_egress" { + type = bool + default = false +} + +# === Required when restricted_egress = true ============================= +variable "hub_vpc_google_project" { + type = string + default = null +} + +variable "spoke_vpc_google_project" { + type = string + default = null +} + +variable "is_spoke_vpc_shared" { + type = bool + default = false +} + +variable "hub_vpc_cidr" { + type = string + default = null +} + +variable "psc_subnet_cidr" { + type = string + default = null +} + +variable "hive_metastore_ip" { + type = string + default = null +} diff --git a/modules/gcp/databricks-workspace/versions.tf b/modules/gcp/databricks-workspace/versions.tf new file mode 100644 index 00000000..ead1a86d --- /dev/null +++ b/modules/gcp/databricks-workspace/versions.tf @@ -0,0 +1,21 @@ +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + databricks = { + source = "databricks/databricks" + version = ">= 1.0" + } + random = { + source = "hashicorp/random" + version = ">= 3.0" + } + null = { + source = "hashicorp/null" + version = ">= 3.0" + } + } +} diff --git a/modules/gcp/dns/Makefile b/modules/gcp/dns/Makefile new file mode 100644 
index 00000000..17b32ec8 --- /dev/null +++ b/modules/gcp/dns/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp/dns/README.md b/modules/gcp/dns/README.md new file mode 100644 index 00000000..467f3d54 --- /dev/null +++ b/modules/gcp/dns/README.md @@ -0,0 +1,65 @@ +# modules/gcp/dns + +Private DNS zones (hub + spoke) used with restricted-egress workspaces. + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [google](#requirement\_google) | >= 4.0 | + +## Providers + +| Name | Version | +|------|---------| +| [google](#provider\_google) | 7.31.0 | + +## Modules + +No modules. + +## Resources + +| Name | Type | +|------|------| +| [google_dns_managed_zone.gcr](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_managed_zone.google_apis](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_managed_zone.hub_dbx](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_managed_zone.pkg_dev](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_managed_zone.spoke_dbx](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_managed_zone) | resource | +| [google_dns_record_set.gcr_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.gcr_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| 
[google_dns_record_set.google_apis_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.google_apis_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.hub_dp](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.hub_psc_auth](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.hub_workspace_url](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.pkg_dev_a](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.pkg_dev_cname](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.spoke_dp](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.spoke_tunnel](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | +| [google_dns_record_set.spoke_workspace_url](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/dns_record_set) | resource | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [backend\_psc\_ip\_spoke](#input\_backend\_psc\_ip\_spoke) | n/a | `string` | n/a | yes | +| [frontend\_psc\_ip\_spoke](#input\_frontend\_psc\_ip\_spoke) | PSC IPs | `string` | n/a | yes | +| [google\_region](#input\_google\_region) | n/a | `string` | n/a | yes | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | n/a | `string` | n/a | yes | +| 
[hub\_vpc\_id](#input\_hub\_vpc\_id) | Hub | `string` | n/a | yes | +| [hub\_vpc\_self\_link](#input\_hub\_vpc\_self\_link) | n/a | `string` | n/a | yes | +| [prefix](#input\_prefix) | n/a | `string` | n/a | yes | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | n/a | `string` | n/a | yes | +| [spoke\_vpc\_id](#input\_spoke\_vpc\_id) | Spoke | `string` | n/a | yes | +| [spoke\_vpc\_self\_link](#input\_spoke\_vpc\_self\_link) | n/a | `string` | n/a | yes | +| [workspace\_url](#input\_workspace\_url) | Workspace | `string` | n/a | yes | +| [frontend\_psc\_ip\_hub](#input\_frontend\_psc\_ip\_hub) | n/a | `string` | `null` | no | + +## Outputs + +No outputs. + diff --git a/modules/gcp/dns/hub.tf b/modules/gcp/dns/hub.tf new file mode 100644 index 00000000..206d072f --- /dev/null +++ b/modules/gcp/dns/hub.tf @@ -0,0 +1,145 @@ +locals { + # Regex extracts the workspace DNS id (numeric.numeric) from the URL. + workspace_dns_id = regex("[0-9]+\\.[0-9]+", var.workspace_url) +} + +# === gcp.databricks.com (hub) ============================================ +resource "google_dns_managed_zone" "hub_dbx" { + name = "${var.prefix}-hub-gcp-databricks-com" + project = var.hub_vpc_google_project + dns_name = "gcp.databricks.com." 
+ description = "Private DNS zone for Databricks PSC management" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "hub_workspace_url" { + name = "${local.workspace_dns_id}.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +resource "google_dns_record_set" "hub_psc_auth" { + name = "${var.google_region}.psc-auth.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +resource "google_dns_record_set" "hub_dp" { + name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.hub_dbx.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.hub_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_hub] +} + +# === gcr.io ============================================================== +resource "google_dns_managed_zone" "gcr" { + name = "${var.prefix}-gcr-io" + project = var.hub_vpc_google_project + dns_name = "gcr.io." 
+ description = "Private DNS zone for GCR private resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "gcr_cname" { + name = "*.${google_dns_managed_zone.gcr.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.gcr.name + type = "CNAME" + ttl = 300 + rrdatas = ["gcr.io."] +} + +resource "google_dns_record_set" "gcr_a" { + name = google_dns_managed_zone.gcr.dns_name + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.gcr.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"] +} + +# === googleapis.com ====================================================== +resource "google_dns_managed_zone" "google_apis" { + name = "${var.prefix}-google-apis" + project = var.hub_vpc_google_project + dns_name = "googleapis.com." + description = "Private DNS zone for Google APIs resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "google_apis_cname" { + name = "*.${google_dns_managed_zone.google_apis.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.google_apis.name + type = "CNAME" + ttl = 300 + rrdatas = ["restricted.googleapis.com."] +} + +resource "google_dns_record_set" "google_apis_a" { + name = "restricted.${google_dns_managed_zone.google_apis.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.google_apis.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.4", "199.36.153.5", "199.36.153.6", "199.36.153.7"] +} + +# === pkg.dev ============================================================= +resource "google_dns_managed_zone" "pkg_dev" { + name = "${var.prefix}-pkg-dev" + project = var.hub_vpc_google_project + dns_name = "pkg.dev." 
+ description = "Private DNS zone for Artifact Registry (pkg.dev) resolution" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.hub_vpc_id + } + } +} + +resource "google_dns_record_set" "pkg_dev_cname" { + name = "*.${google_dns_managed_zone.pkg_dev.dns_name}" + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.pkg_dev.name + type = "CNAME" + ttl = 300 + rrdatas = ["pkg.dev."] +} + +resource "google_dns_record_set" "pkg_dev_a" { + name = google_dns_managed_zone.pkg_dev.dns_name + project = var.hub_vpc_google_project + managed_zone = google_dns_managed_zone.pkg_dev.name + type = "A" + ttl = 300 + rrdatas = ["199.36.153.8", "199.36.153.9", "199.36.153.10", "199.36.153.11"] +} diff --git a/modules/gcp/dns/outputs.tf b/modules/gcp/dns/outputs.tf new file mode 100644 index 00000000..19cbc3d5 --- /dev/null +++ b/modules/gcp/dns/outputs.tf @@ -0,0 +1 @@ +# This module has no outputs; DNS records are terminal. diff --git a/modules/gcp/dns/spoke.tf b/modules/gcp/dns/spoke.tf new file mode 100644 index 00000000..25c3bc2e --- /dev/null +++ b/modules/gcp/dns/spoke.tf @@ -0,0 +1,41 @@ +# === gcp.databricks.com (spoke) ========================================== +resource "google_dns_managed_zone" "spoke_dbx" { + name = "${var.prefix}-spoke-gcp-databricks-com" + project = var.spoke_vpc_google_project + dns_name = "gcp.databricks.com." 
+ description = "Private DNS zone for Databricks PSC management" + visibility = "private" + + private_visibility_config { + networks { + network_url = var.spoke_vpc_id + } + } +} + +resource "google_dns_record_set" "spoke_workspace_url" { + name = "${local.workspace_dns_id}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_spoke] +} + +resource "google_dns_record_set" "spoke_dp" { + name = "dp-${local.workspace_dns_id}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.frontend_psc_ip_spoke] +} + +resource "google_dns_record_set" "spoke_tunnel" { + name = "tunnel.${var.google_region}.${google_dns_managed_zone.spoke_dbx.dns_name}" + project = var.spoke_vpc_google_project + managed_zone = google_dns_managed_zone.spoke_dbx.name + type = "A" + ttl = 300 + rrdatas = [var.backend_psc_ip_spoke] +} diff --git a/modules/gcp/dns/tests/hub-and-spoke/main.tf b/modules/gcp/dns/tests/hub-and-spoke/main.tf new file mode 100644 index 00000000..baebc0dd --- /dev/null +++ b/modules/gcp/dns/tests/hub-and-spoke/main.tf @@ -0,0 +1,29 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-spoke" + region = "us-central1" +} + +module "dns" { + source = "../.." 
+ + prefix = "fixture" + google_region = "us-central1" + + hub_vpc_id = "projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-hub/global/networks/hub-vpc" + hub_vpc_google_project = "fixture-hub" + + spoke_vpc_id = "projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_self_link = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc" + spoke_vpc_google_project = "fixture-spoke" + + workspace_url = "https://1234567890123456.7.gcp.databricks.com" + + frontend_psc_ip_spoke = "10.0.255.4" + frontend_psc_ip_hub = "10.1.0.10" + backend_psc_ip_spoke = "10.0.255.5" +} diff --git a/modules/gcp/dns/variables.tf b/modules/gcp/dns/variables.tf new file mode 100644 index 00000000..cd56786c --- /dev/null +++ b/modules/gcp/dns/variables.tf @@ -0,0 +1,52 @@ +variable "prefix" { + type = string +} + +variable "google_region" { + type = string +} + +# Hub +variable "hub_vpc_id" { + type = string +} + +variable "hub_vpc_self_link" { + type = string +} + +variable "hub_vpc_google_project" { + type = string +} + +# Spoke +variable "spoke_vpc_id" { + type = string +} + +variable "spoke_vpc_self_link" { + type = string +} + +variable "spoke_vpc_google_project" { + type = string +} + +# Workspace +variable "workspace_url" { + type = string +} + +# PSC IPs +variable "frontend_psc_ip_spoke" { + type = string +} + +variable "frontend_psc_ip_hub" { + type = string + default = null +} + +variable "backend_psc_ip_spoke" { + type = string +} diff --git a/modules/gcp/dns/versions.tf b/modules/gcp/dns/versions.tf new file mode 100644 index 00000000..de067e7d --- /dev/null +++ b/modules/gcp/dns/versions.tf @@ -0,0 +1,9 @@ +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} diff --git a/modules/gcp/network/Makefile b/modules/gcp/network/Makefile new file mode 100644 index 00000000..17b32ec8 --- 
/dev/null +++ b/modules/gcp/network/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp/network/README.md b/modules/gcp/network/README.md new file mode 100644 index 00000000..278c8630 --- /dev/null +++ b/modules/gcp/network/README.md @@ -0,0 +1,77 @@ +# modules/gcp/network + +VPC, subnet, router, NAT, peering, and Shared-VPC binding for the Databricks GCP composer. + + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [google](#requirement\_google) | >= 4.0 | + +## Providers + +| Name | Version | +|------|---------| +| [google](#provider\_google) | 6.46.0 | + +## Modules + +No modules. + +## Resources + +| Name | Type | +|------|------| +| [google_compute_network.hub_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource | +| [google_compute_network.spoke_vpc](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network) | resource | +| [google_compute_network_peering.hub_to_spoke](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering) | resource | +| [google_compute_network_peering.spoke_to_hub](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_network_peering) | resource | +| [google_compute_router.router](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router) | resource | +| [google_compute_router_nat.nat](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_router_nat) | resource | +| [google_compute_shared_vpc_host_project.host](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_host_project) | resource | +| 
[google_compute_shared_vpc_service_project.service](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_shared_vpc_service_project) | resource | +| [google_compute_subnetwork.hub_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | +| [google_compute_subnetwork.spoke_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | +| [google_compute_network.existing_spoke](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_network) | data source | +| [google_compute_subnetwork.existing_spoke_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/data-sources/compute_subnetwork) | data source | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [google\_region](#input\_google\_region) | GCP region for all network resources | `string` | n/a | yes | +| [prefix](#input\_prefix) | Prefix for generated resource names | `string` | n/a | yes | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | GCP project hosting the spoke VPC | `string` | n/a | yes | +| [suffix](#input\_suffix) | Random suffix passed by the composer for uniqueness | `string` | n/a | yes | +| [vpc\_source](#input\_vpc\_source) | Either 'create' (Terraform creates a VPC) or 'existing' (data-source lookup) | `string` | n/a | yes | +| [create\_hub](#input\_create\_hub) | Create a hub VPC + subnet + peering with the spoke. Composer passes restricted\_egress here. 
| `bool` | `false` | no | +| [existing\_subnet\_name](#input\_existing\_subnet\_name) | Name of pre-existing subnet (required when vpc\_source=existing) | `string` | `null` | no | +| [existing\_vpc\_name](#input\_existing\_vpc\_name) | Name of pre-existing VPC (required when vpc\_source=existing) | `string` | `null` | no | +| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | CIDR for the hub subnet (required when create\_hub=true) | `string` | `null` | no | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | GCP project hosting the hub VPC (required when create\_hub=true) | `string` | `null` | no | +| [is\_spoke\_vpc\_shared](#input\_is\_spoke\_vpc\_shared) | If true, bind the spoke VPC's project as a Shared-VPC host and the workspace project as a service project | `bool` | `false` | no | +| [pod\_cidr](#input\_pod\_cidr) | GKE secondary range for pods (optional) | `string` | `null` | no | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | CIDR for the spoke subnet primary range (required when vpc\_source=create) | `string` | `null` | no | +| [subnet\_cidr](#input\_subnet\_cidr) | CIDR for the spoke subnet (required when vpc\_source=create) | `string` | `null` | no | +| [subnet\_name](#input\_subnet\_name) | Override for spoke subnet name (default: "{prefix}-subnet-{suffix}") | `string` | `null` | no | +| [svc\_cidr](#input\_svc\_cidr) | GKE secondary range for services (optional) | `string` | `null` | no | +| [workspace\_google\_project](#input\_workspace\_google\_project) | Workspace project (used for Shared-VPC service binding) | `string` | `null` | no | + +## Outputs + +| Name | Description | +|------|-------------| +| [hub\_subnet\_name](#output\_hub\_subnet\_name) | Name of the hub subnet (null when create\_hub=false) | +| [hub\_vpc\_id](#output\_hub\_vpc\_id) | ID of the hub VPC (null when create\_hub=false) | +| [hub\_vpc\_name](#output\_hub\_vpc\_name) | Name of the hub VPC (null when create\_hub=false) | +| 
[hub\_vpc\_self\_link](#output\_hub\_vpc\_self\_link) | Self-link of the hub VPC (null when create\_hub=false) | +| [nat\_id](#output\_nat\_id) | ID of the Cloud NAT (null when vpc\_source=existing) | +| [spoke\_subnet\_id](#output\_spoke\_subnet\_id) | ID of the spoke subnet | +| [spoke\_subnet\_name](#output\_spoke\_subnet\_name) | Name of the spoke subnet | +| [spoke\_subnet\_self\_link](#output\_spoke\_subnet\_self\_link) | Self-link of the spoke subnet | +| [spoke\_vpc\_id](#output\_spoke\_vpc\_id) | ID of the spoke VPC | +| [spoke\_vpc\_name](#output\_spoke\_vpc\_name) | Name of the spoke VPC | +| [spoke\_vpc\_self\_link](#output\_spoke\_vpc\_self\_link) | Self-link of the spoke VPC | + diff --git a/modules/gcp/network/main.tf b/modules/gcp/network/main.tf new file mode 100644 index 00000000..48283568 --- /dev/null +++ b/modules/gcp/network/main.tf @@ -0,0 +1,131 @@ +locals { + create_vpc = var.vpc_source == "create" + use_existing_vpc = var.vpc_source == "existing" + + subnet_name = coalesce(var.subnet_name, "${var.prefix}-subnet-${var.suffix}") +} + +# === Spoke VPC (created) ================================================ +resource "google_compute_network" "spoke_vpc" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-spoke-vpc-${var.suffix}" + project = var.spoke_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} + +resource "google_compute_subnetwork" "spoke_subnet" { + count = local.create_vpc ? 1 : 0 + + name = local.subnet_name + project = var.spoke_vpc_google_project + network = google_compute_network.spoke_vpc[0].id + region = var.google_region + ip_cidr_range = var.subnet_cidr + private_ip_google_access = true + + dynamic "secondary_ip_range" { + for_each = var.pod_cidr != null ? [1] : [] + content { + range_name = "pods" + ip_cidr_range = var.pod_cidr + } + } + + dynamic "secondary_ip_range" { + for_each = var.svc_cidr != null ? 
[1] : [] + content { + range_name = "services" + ip_cidr_range = var.svc_cidr + } + } +} + +resource "google_compute_router" "router" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-router-${var.suffix}" + project = var.spoke_vpc_google_project + region = var.google_region + network = google_compute_network.spoke_vpc[0].id +} + +resource "google_compute_router_nat" "nat" { + count = local.create_vpc ? 1 : 0 + + name = "${var.prefix}-nat-${var.suffix}" + project = var.spoke_vpc_google_project + router = google_compute_router.router[0].name + region = var.google_region + nat_ip_allocate_option = "AUTO_ONLY" + source_subnetwork_ip_ranges_to_nat = "ALL_SUBNETWORKS_ALL_IP_RANGES" +} + +# === Spoke VPC (data lookup) ============================================ +data "google_compute_network" "existing_spoke" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_vpc_name + project = var.spoke_vpc_google_project +} + +data "google_compute_subnetwork" "existing_spoke_subnet" { + count = local.use_existing_vpc ? 1 : 0 + + name = var.existing_subnet_name + project = var.spoke_vpc_google_project + region = var.google_region +} + +# === Hub VPC ============================================================ +resource "google_compute_network" "hub_vpc" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-vpc-${var.suffix}" + project = var.hub_vpc_google_project + auto_create_subnetworks = false + routing_mode = "GLOBAL" +} + +resource "google_compute_subnetwork" "hub_subnet" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-hub-subnet-${var.suffix}" + project = var.hub_vpc_google_project + network = google_compute_network.hub_vpc[0].id + region = var.google_region + ip_cidr_range = var.hub_vpc_cidr + private_ip_google_access = true +} + +# === Peering ============================================================ +resource "google_compute_network_peering" "hub_to_spoke" { + count = var.create_hub ? 
1 : 0 + + name = "${var.prefix}-hub-spoke-${var.suffix}" + network = google_compute_network.hub_vpc[0].self_link + peer_network = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link +} + +resource "google_compute_network_peering" "spoke_to_hub" { + count = var.create_hub ? 1 : 0 + + name = "${var.prefix}-spoke-hub-${var.suffix}" + network = local.create_vpc ? google_compute_network.spoke_vpc[0].self_link : data.google_compute_network.existing_spoke[0].self_link + peer_network = google_compute_network.hub_vpc[0].self_link +} + +# === Shared VPC ========================================================= +resource "google_compute_shared_vpc_host_project" "host" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + project = var.spoke_vpc_google_project +} + +resource "google_compute_shared_vpc_service_project" "service" { + count = var.create_hub && var.is_spoke_vpc_shared && var.workspace_google_project != var.spoke_vpc_google_project ? 1 : 0 + + host_project = google_compute_shared_vpc_host_project.host[0].project + service_project = var.workspace_google_project +} diff --git a/modules/gcp/network/outputs.tf b/modules/gcp/network/outputs.tf new file mode 100644 index 00000000..958fe271 --- /dev/null +++ b/modules/gcp/network/outputs.tf @@ -0,0 +1,66 @@ +output "spoke_vpc_id" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].id : ( + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].id : null + ) + description = "ID of the spoke VPC" +} + +output "spoke_vpc_name" { + value = local.create_vpc ? google_compute_network.spoke_vpc[0].name : ( + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].name : null + ) + description = "Name of the spoke VPC" +} + +output "spoke_vpc_self_link" { + value = local.create_vpc ? 
google_compute_network.spoke_vpc[0].self_link : ( + local.use_existing_vpc ? data.google_compute_network.existing_spoke[0].self_link : null + ) + description = "Self-link of the spoke VPC" +} + +output "spoke_subnet_id" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].id : ( + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].id : null + ) + description = "ID of the spoke subnet" +} + +output "spoke_subnet_name" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].name : ( + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].name : null + ) + description = "Name of the spoke subnet" +} + +output "spoke_subnet_self_link" { + value = local.create_vpc ? google_compute_subnetwork.spoke_subnet[0].self_link : ( + local.use_existing_vpc ? data.google_compute_subnetwork.existing_spoke_subnet[0].self_link : null + ) + description = "Self-link of the spoke subnet" +} + +output "nat_id" { + value = local.create_vpc ? google_compute_router_nat.nat[0].id : null + description = "ID of the Cloud NAT (null when vpc_source=existing)" +} + +output "hub_vpc_id" { + value = var.create_hub ? google_compute_network.hub_vpc[0].id : null + description = "ID of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_name" { + value = var.create_hub ? google_compute_network.hub_vpc[0].name : null + description = "Name of the hub VPC (null when create_hub=false)" +} + +output "hub_vpc_self_link" { + value = var.create_hub ? google_compute_network.hub_vpc[0].self_link : null + description = "Self-link of the hub VPC (null when create_hub=false)" +} + +output "hub_subnet_name" { + value = var.create_hub ? 
google_compute_subnetwork.hub_subnet[0].name : null + description = "Name of the hub subnet (null when create_hub=false)" +} diff --git a/modules/gcp/network/tests/create-with-hub/main.tf b/modules/gcp/network/tests/create-with-hub/main.tf new file mode 100644 index 00000000..2d27e0bd --- /dev/null +++ b/modules/gcp/network/tests/create-with-hub/main.tf @@ -0,0 +1,26 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "fixture-spoke-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" + + create_hub = true + hub_vpc_google_project = "fixture-hub-project" + hub_vpc_cidr = "10.1.0.0/24" + is_spoke_vpc_shared = true + workspace_google_project = "fixture-workspace-project" +} diff --git a/modules/gcp/network/tests/create/main.tf b/modules/gcp/network/tests/create/main.tf new file mode 100644 index 00000000..29d729bb --- /dev/null +++ b/modules/gcp/network/tests/create/main.tf @@ -0,0 +1,20 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." + + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "create" + spoke_vpc_google_project = "fixture-project" + spoke_vpc_cidr = "10.0.0.0/16" + subnet_cidr = "10.0.0.0/22" +} diff --git a/modules/gcp/network/tests/existing/main.tf b/modules/gcp/network/tests/existing/main.tf new file mode 100644 index 00000000..8935be2c --- /dev/null +++ b/modules/gcp/network/tests/existing/main.tf @@ -0,0 +1,20 @@ +terraform { + required_version = ">= 1.5" +} + +provider "google" { + project = "fixture-project" + region = "us-central1" +} + +module "network" { + source = "../.." 
+ + prefix = "fixture" + suffix = "abc123" + google_region = "us-central1" + vpc_source = "existing" + spoke_vpc_google_project = "fixture-project" + existing_vpc_name = "preexisting-vpc" + existing_subnet_name = "preexisting-subnet" +} diff --git a/modules/gcp/network/variables.tf b/modules/gcp/network/variables.tf new file mode 100644 index 00000000..17e96d61 --- /dev/null +++ b/modules/gcp/network/variables.tf @@ -0,0 +1,104 @@ +variable "prefix" { + type = string + description = "Prefix for generated resource names" +} + +variable "suffix" { + type = string + description = "Random suffix passed by the composer for uniqueness" +} + +variable "google_region" { + type = string + description = "GCP region for all network resources" +} + +variable "vpc_source" { + type = string + description = "Either 'create' (Terraform creates a VPC) or 'existing' (data-source lookup)" + validation { + condition = contains(["create", "existing"], var.vpc_source) + error_message = "vpc_source must be 'create' or 'existing'." 
+ } +} + +# Spoke project always required +variable "spoke_vpc_google_project" { + type = string + description = "GCP project hosting the spoke VPC" +} + +# === Used when vpc_source = "create" ==================================== +variable "spoke_vpc_cidr" { + type = string + default = null + description = "CIDR for the spoke subnet primary range (required when vpc_source=create)" +} + +variable "subnet_cidr" { + type = string + default = null + description = "CIDR for the spoke subnet (required when vpc_source=create)" +} + +variable "subnet_name" { + type = string + default = null + description = "Override for spoke subnet name (default: \"{prefix}-subnet-{suffix}\")" +} + +variable "pod_cidr" { + type = string + default = null + description = "GKE secondary range for pods (optional)" +} + +variable "svc_cidr" { + type = string + default = null + description = "GKE secondary range for services (optional)" +} + +# === Used when vpc_source = "existing" ================================== +variable "existing_vpc_name" { + type = string + default = null + description = "Name of pre-existing VPC (required when vpc_source=existing)" +} + +variable "existing_subnet_name" { + type = string + default = null + description = "Name of pre-existing subnet (required when vpc_source=existing)" +} + +# === Hub configuration (only when create_hub = true) ==================== +variable "create_hub" { + type = bool + default = false + description = "Create a hub VPC + subnet + peering with the spoke. Composer passes restricted_egress here." 
+} + +variable "hub_vpc_google_project" { + type = string + default = null + description = "GCP project hosting the hub VPC (required when create_hub=true)" +} + +variable "hub_vpc_cidr" { + type = string + default = null + description = "CIDR for the hub subnet (required when create_hub=true)" +} + +variable "is_spoke_vpc_shared" { + type = bool + default = false + description = "If true, bind the spoke VPC's project as a Shared-VPC host and the workspace project as a service project" +} + +variable "workspace_google_project" { + type = string + default = null + description = "Workspace project (used for Shared-VPC service binding)" +} diff --git a/modules/gcp/network/versions.tf b/modules/gcp/network/versions.tf new file mode 100644 index 00000000..de067e7d --- /dev/null +++ b/modules/gcp/network/versions.tf @@ -0,0 +1,9 @@ +terraform { + required_version = ">= 1.5" + required_providers { + google = { + source = "hashicorp/google" + version = ">= 4.0" + } + } +} diff --git a/modules/gcp/private-connectivity/Makefile b/modules/gcp/private-connectivity/Makefile new file mode 100644 index 00000000..17b32ec8 --- /dev/null +++ b/modules/gcp/private-connectivity/Makefile @@ -0,0 +1,7 @@ +.PHONY: docs test_docs + +docs: + terraform-docs -c ../../../.terraform-docs.yml . + +test_docs: + terraform-docs -c ../../../.terraform-docs.yml --output-check . diff --git a/modules/gcp/private-connectivity/README.md b/modules/gcp/private-connectivity/README.md new file mode 100644 index 00000000..bbe06d36 --- /dev/null +++ b/modules/gcp/private-connectivity/README.md @@ -0,0 +1,73 @@ +# modules/gcp/private-connectivity + +GCP-side PSC endpoints + restricted-egress firewall for the Databricks GCP composer. 
+ + +## Requirements + +| Name | Version | +|------|---------| +| [terraform](#requirement\_terraform) | >= 1.5 | +| [google](#requirement\_google) | >= 4.0 | + +## Providers + +| Name | Version | +|------|---------| +| [google](#provider\_google) | 7.31.0 | + +## Modules + +No modules. + +## Resources + +| Name | Type | +|------|------| +| [google_compute_address.backend_address](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | +| [google_compute_address.frontend_address_hub](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | +| [google_compute_address.frontend_address_spoke](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_address) | resource | +| [google_compute_firewall.hub_ingress](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_firewall.spoke_allow_ctl_plane](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_firewall.spoke_allow_google_apis](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_firewall.spoke_allow_hive](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_firewall.spoke_default_deny_egress](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_firewall) | resource | +| [google_compute_forwarding_rule.backend_fr](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | +| [google_compute_forwarding_rule.frontend_fr_hub](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | +| 
[google_compute_forwarding_rule.frontend_fr_spoke](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_forwarding_rule) | resource | +| [google_compute_subnetwork.psc_subnet](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/compute_subnetwork) | resource | + +## Inputs + +| Name | Description | Type | Default | Required | +|------|-------------|------|---------|:--------:| +| [google\_region](#input\_google\_region) | n/a | `string` | n/a | yes | +| [prefix](#input\_prefix) | n/a | `string` | n/a | yes | +| [psc\_subnet\_cidr](#input\_psc\_subnet\_cidr) | CIDR for the dedicated PSC subnet in the spoke VPC | `string` | n/a | yes | +| [spoke\_vpc\_cidr](#input\_spoke\_vpc\_cidr) | n/a | `string` | n/a | yes | +| [spoke\_vpc\_google\_project](#input\_spoke\_vpc\_google\_project) | n/a | `string` | n/a | yes | +| [spoke\_vpc\_id](#input\_spoke\_vpc\_id) | Spoke network refs | `string` | n/a | yes | +| [spoke\_vpc\_self\_link](#input\_spoke\_vpc\_self\_link) | n/a | `string` | n/a | yes | +| [suffix](#input\_suffix) | n/a | `string` | n/a | yes | +| [enable\_backend](#input\_enable\_backend) | n/a | `bool` | `false` | no | +| [enable\_frontend](#input\_enable\_frontend) | Feature flags | `bool` | `false` | no | +| [hive\_metastore\_ip](#input\_hive\_metastore\_ip) | Regional Hive metastore IP (looked up via internal map if null) | `string` | `null` | no | +| [hub\_subnet\_name](#input\_hub\_subnet\_name) | n/a | `string` | `null` | no | +| [hub\_vpc\_cidr](#input\_hub\_vpc\_cidr) | n/a | `string` | `null` | no | +| [hub\_vpc\_google\_project](#input\_hub\_vpc\_google\_project) | n/a | `string` | `null` | no | +| [hub\_vpc\_id](#input\_hub\_vpc\_id) | Hub network refs (nullable when no hub) | `string` | `null` | no | +| [hub\_vpc\_self\_link](#input\_hub\_vpc\_self\_link) | n/a | `string` | `null` | no | +| [restrict\_egress](#input\_restrict\_egress) | n/a | `bool` | `false` | no | + +## Outputs + +| Name 
| Description | +|------|-------------| +| [backend\_psc\_fr\_id](#output\_backend\_psc\_fr\_id) | Name of the backend (SCC) PSC forwarding rule (null when enable\_backend=false) | +| [backend\_psc\_ip\_spoke](#output\_backend\_psc\_ip\_spoke) | IP address of the spoke-side backend PSC endpoint | +| [frontend\_psc\_fr\_id](#output\_frontend\_psc\_fr\_id) | Name of the frontend PSC forwarding rule (null when enable\_frontend=false) | +| [frontend\_psc\_ip\_hub](#output\_frontend\_psc\_ip\_hub) | IP address of the hub-side frontend PSC endpoint (null when no hub) | +| [frontend\_psc\_ip\_spoke](#output\_frontend\_psc\_ip\_spoke) | IP address of the spoke-side frontend PSC endpoint | +| [hub\_frontend\_psc\_fr\_id](#output\_hub\_frontend\_psc\_fr\_id) | Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend) | +| [psc\_subnet\_self\_link](#output\_psc\_subnet\_self\_link) | Self-link of the PSC subnet | + diff --git a/modules/gcp/private-connectivity/firewall.tf b/modules/gcp/private-connectivity/firewall.tf new file mode 100644 index 00000000..97342c67 --- /dev/null +++ b/modules/gcp/private-connectivity/firewall.tf @@ -0,0 +1,97 @@ +# Egress firewall stack — only emitted when restrict_egress = true. + +# === Spoke deny-egress ================================================== +resource "google_compute_firewall" "spoke_default_deny_egress" { + count = var.restrict_egress ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-default-deny-egress" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1100 + destination_ranges = ["0.0.0.0/0"] + source_ranges = [] + + deny { + protocol = "all" + } +} + +# === Spoke allow Google APIs ============================================ +resource "google_compute_firewall" "spoke_allow_google_apis" { + count = var.restrict_egress ? 
1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-google-apis" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = [ + "199.36.153.4/30", + "199.36.153.8/30", + "34.126.0.0/18" + ] + + allow { + protocol = "all" + } +} + +# === Spoke allow Databricks control plane (to PSC IPs) ================== +resource "google_compute_firewall" "spoke_allow_ctl_plane" { + count = var.restrict_egress && var.enable_frontend && var.enable_backend ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-databricks-control-plane" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = [ + "${google_compute_forwarding_rule.backend_fr[0].ip_address}/32", + "${google_compute_forwarding_rule.frontend_fr_spoke[0].ip_address}/32" + ] + + allow { + protocol = "tcp" + ports = ["443"] + } +} + +# === Spoke allow managed Hive (conditional on hive_metastore_ip) ======== +resource "google_compute_firewall" "spoke_allow_hive" { + count = var.restrict_egress && local.hive_metastore_ip != "" ? 1 : 0 + + name = "${var.prefix}-spoke-${var.suffix}-to-${var.google_region}-managed-hive" + project = var.spoke_vpc_google_project + network = var.spoke_vpc_self_link + + direction = "EGRESS" + priority = 1000 + destination_ranges = ["${local.hive_metastore_ip}/32"] + + allow { + protocol = "tcp" + ports = ["3306"] + } +} + +# === Hub ingress from spoke ============================================= +resource "google_compute_firewall" "hub_ingress" { + count = var.restrict_egress && local.hub_present ? 
1 : 0 + + name = "${var.prefix}-hub-${var.suffix}-ingress" + project = var.hub_vpc_google_project + network = var.hub_vpc_self_link + + direction = "INGRESS" + priority = 1000 + destination_ranges = [] + source_ranges = [var.spoke_vpc_cidr] + + allow { + protocol = "all" + } +} diff --git a/modules/gcp-with-psc-exfiltration-protection/main.tf b/modules/gcp/private-connectivity/locals.tf similarity index 79% rename from modules/gcp-with-psc-exfiltration-protection/main.tf rename to modules/gcp/private-connectivity/locals.tf index c587c4b3..e2c8cead 100644 --- a/modules/gcp-with-psc-exfiltration-protection/main.tf +++ b/modules/gcp/private-connectivity/locals.tf @@ -1,16 +1,4 @@ -##################################################### -# Local Values and Random String Resource -##################################################### - -# --------------------------------------------------- -# Local Value: Extract Workspace DNS ID -# --------------------------------------------------- locals { - # Extracts a numeric identifier from the Databricks workspace URL. - # The regex pattern "[0-9]+\.[0-9]+" matches the first occurrence of two groups of digits separated by a dot (e.g., "1234567890123456.1234"). - # This value is typically used to generate unique DNS names for the workspace. 
- workspace_dns_id = regex("[0-9]+\\.[0-9]+", databricks_mws_workspaces.databricks_workspace.workspace_url) - google_frontend_psc_targets = { "asia-northeast1" = "projects/general-prod-asianortheast1-01/regions/asia-northeast1/serviceAttachments/plproxy-psc-endpoint-all-ports" "asia-south1" = "projects/gen-prod-asias1-01/regions/asia-south1/serviceAttachments/plproxy-psc-endpoint-all-ports" @@ -44,20 +32,14 @@ locals { "us-west1" = "projects/prod-gcp-us-west1/regions/us-west1/serviceAttachments/ngrok-psc-endpoint" "us-west4" = "projects/prod-gcp-us-west4/regions/us-west4/serviceAttachments/ngrok-psc-endpoint" } -} -# --------------------------------------------------- -# Random String Resource: Suffix Generator -# --------------------------------------------------- -resource "random_string" "suffix" { - lifecycle { - ignore_changes = [ - special, - upper - ] + # Regional default Hive Metastore IPs per Databricks docs: + # https://docs.gcp.databricks.com/en/resources/ip-domain-region.html#addresses-for-default-metastore + # NOTE: kept empty initially; override via var.hive_metastore_ip. + default_hive_metastore_ips = { } - special = false - upper = false - length = 6 + hive_metastore_ip = var.hive_metastore_ip != null ? var.hive_metastore_ip : try(local.default_hive_metastore_ips[var.google_region], "") + + hub_present = var.hub_vpc_id != null } diff --git a/modules/gcp/private-connectivity/outputs.tf b/modules/gcp/private-connectivity/outputs.tf new file mode 100644 index 00000000..916bcf9c --- /dev/null +++ b/modules/gcp/private-connectivity/outputs.tf @@ -0,0 +1,34 @@ +output "psc_subnet_self_link" { + value = google_compute_subnetwork.psc_subnet.self_link + description = "Self-link of the PSC subnet" +} + +output "frontend_psc_fr_id" { + value = var.enable_frontend ? 
google_compute_forwarding_rule.frontend_fr_spoke[0].name : null + description = "Name of the frontend PSC forwarding rule (null when enable_frontend=false)" +} + +output "backend_psc_fr_id" { + value = var.enable_backend ? google_compute_forwarding_rule.backend_fr[0].name : null + description = "Name of the backend (SCC) PSC forwarding rule (null when enable_backend=false)" +} + +output "hub_frontend_psc_fr_id" { + value = local.hub_present && var.enable_frontend ? google_compute_forwarding_rule.frontend_fr_hub[0].name : null + description = "Name of the hub-side frontend PSC forwarding rule (null when no hub or no frontend)" +} + +output "frontend_psc_ip_spoke" { + value = var.enable_frontend ? google_compute_address.frontend_address_spoke[0].address : null + description = "IP address of the spoke-side frontend PSC endpoint" +} + +output "backend_psc_ip_spoke" { + value = var.enable_backend ? google_compute_address.backend_address[0].address : null + description = "IP address of the spoke-side backend PSC endpoint" +} + +output "frontend_psc_ip_hub" { + value = local.hub_present && var.enable_frontend ? 
google_compute_address.frontend_address_hub[0].address : null
+  description = "IP address of the hub-side frontend PSC endpoint (null when no hub)"
+}
diff --git a/modules/gcp/private-connectivity/psc.tf b/modules/gcp/private-connectivity/psc.tf
new file mode 100644
index 00000000..485abb2e
--- /dev/null
+++ b/modules/gcp/private-connectivity/psc.tf
@@ -0,0 +1,78 @@
+# === PSC Subnet (spoke) =================================================
+resource "google_compute_subnetwork" "psc_subnet" {
+  name                     = "${var.prefix}-psc-subnet-${var.suffix}"
+  project                  = var.spoke_vpc_google_project
+  network                  = var.spoke_vpc_id
+  region                   = var.google_region
+  ip_cidr_range            = var.psc_subnet_cidr
+  private_ip_google_access = true
+}
+
+# === Backend (SCC) PSC endpoint — spoke =================================
+resource "google_compute_address" "backend_address" {
+  count = var.enable_backend ? 1 : 0
+
+  name         = "${var.prefix}-psc-scc-ip-${var.suffix}"
+  project      = var.spoke_vpc_google_project
+  region       = var.google_region
+  subnetwork   = google_compute_subnetwork.psc_subnet.name
+  address_type = "INTERNAL"
+}
+
+resource "google_compute_forwarding_rule" "backend_fr" {
+  count = var.enable_backend ? 1 : 0
+
+  name                  = "${var.prefix}-psc-scc-ep-${var.suffix}"
+  project               = var.spoke_vpc_google_project
+  region                = var.google_region
+  network               = var.spoke_vpc_id
+  ip_address            = google_compute_address.backend_address[0].id
+  target                = local.google_backend_psc_targets[var.google_region]
+  load_balancing_scheme = ""
+}
+
+# === Frontend PSC endpoint — spoke ======================================
+resource "google_compute_address" "frontend_address_spoke" {
+  count = var.enable_frontend ? 1 : 0
+
+  name         = "${var.prefix}-psc-ws-ip-${var.suffix}"
+  project      = var.spoke_vpc_google_project
+  region       = var.google_region
+  subnetwork   = google_compute_subnetwork.psc_subnet.name
+  address_type = "INTERNAL"
+}
+
+resource "google_compute_forwarding_rule" "frontend_fr_spoke" {
+  count = var.enable_frontend ? 1 : 0
+
+  name                  = "${var.prefix}-psc-ws-ep-${var.suffix}"
+  project               = var.spoke_vpc_google_project
+  region                = var.google_region
+  network               = var.spoke_vpc_id
+  ip_address            = google_compute_address.frontend_address_spoke[0].id
+  target                = local.google_frontend_psc_targets[var.google_region]
+  load_balancing_scheme = ""
+}
+
+# === Frontend PSC endpoint — hub (transit) ==============================
+resource "google_compute_address" "frontend_address_hub" {
+  count = local.hub_present && var.enable_frontend ? 1 : 0
+
+  name         = "${var.prefix}-hub-psc-ws-ip-${var.suffix}"
+  project      = var.hub_vpc_google_project
+  region       = var.google_region
+  subnetwork   = var.hub_subnet_name
+  address_type = "INTERNAL"
+}
+
+resource "google_compute_forwarding_rule" "frontend_fr_hub" {
+  count = local.hub_present && var.enable_frontend ? 1 : 0
+
+  name                  = "${var.prefix}-hub-psc-ws-ep-${var.suffix}"
+  project               = var.hub_vpc_google_project
+  region                = var.google_region
+  network               = var.hub_vpc_id
+  ip_address            = google_compute_address.frontend_address_hub[0].id
+  target                = local.google_frontend_psc_targets[var.google_region]
+  load_balancing_scheme = ""
+}
diff --git a/modules/gcp/private-connectivity/tests/full-isolated/main.tf b/modules/gcp/private-connectivity/tests/full-isolated/main.tf
new file mode 100644
index 00000000..f7a11a3c
--- /dev/null
+++ b/modules/gcp/private-connectivity/tests/full-isolated/main.tf
@@ -0,0 +1,32 @@
+terraform {
+  required_version = ">= 1.5"
+}
+
+provider "google" {
+  project = "fixture-spoke"
+  region  = "us-central1"
+}
+
+module "pc" {
+  source = "../.."
+
+  prefix        = "fixture"
+  suffix        = "abc123"
+  google_region = "us-central1"
+
+  spoke_vpc_id             = "projects/fixture-spoke/global/networks/spoke-vpc"
+  spoke_vpc_self_link      = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc"
+  spoke_vpc_google_project = "fixture-spoke"
+  spoke_vpc_cidr           = "10.0.0.0/16"
+
+  hub_vpc_id             = "projects/fixture-hub/global/networks/hub-vpc"
+  hub_vpc_self_link      = "https://www.googleapis.com/compute/v1/projects/fixture-hub/global/networks/hub-vpc"
+  hub_vpc_google_project = "fixture-hub"
+  hub_subnet_name        = "fixture-hub-subnet-abc123"
+  hub_vpc_cidr           = "10.1.0.0/24"
+
+  enable_frontend = true
+  enable_backend  = true
+  restrict_egress = true
+  psc_subnet_cidr = "10.0.255.0/28"
+}
diff --git a/modules/gcp/private-connectivity/tests/no-egress/main.tf b/modules/gcp/private-connectivity/tests/no-egress/main.tf
new file mode 100644
index 00000000..6c318d4c
--- /dev/null
+++ b/modules/gcp/private-connectivity/tests/no-egress/main.tf
@@ -0,0 +1,26 @@
+terraform {
+  required_version = ">= 1.5"
+}
+
+provider "google" {
+  project = "fixture-spoke"
+  region  = "us-central1"
+}
+
+module "pc" {
+  source = "../.."
+
+  prefix        = "fixture"
+  suffix        = "abc123"
+  google_region = "us-central1"
+
+  spoke_vpc_id             = "projects/fixture-spoke/global/networks/spoke-vpc"
+  spoke_vpc_self_link      = "https://www.googleapis.com/compute/v1/projects/fixture-spoke/global/networks/spoke-vpc"
+  spoke_vpc_google_project = "fixture-spoke"
+  spoke_vpc_cidr           = "10.0.0.0/16"
+
+  enable_frontend = true
+  enable_backend  = false
+  restrict_egress = false
+  psc_subnet_cidr = "10.0.255.0/28"
+}
diff --git a/modules/gcp/private-connectivity/variables.tf b/modules/gcp/private-connectivity/variables.tf
new file mode 100644
index 00000000..a539e9d0
--- /dev/null
+++ b/modules/gcp/private-connectivity/variables.tf
@@ -0,0 +1,57 @@
+variable "prefix" { type = string }
+variable "suffix" { type = string }
+variable "google_region" { type = string }
+
+# Spoke network refs
+variable "spoke_vpc_id" { type = string }
+variable "spoke_vpc_self_link" { type = string }
+variable "spoke_vpc_google_project" { type = string }
+variable "spoke_vpc_cidr" { type = string }
+
+# Hub network refs (nullable when no hub)
+variable "hub_vpc_id" {
+  type    = string
+  default = null
+}
+variable "hub_vpc_self_link" {
+  type    = string
+  default = null
+}
+variable "hub_vpc_google_project" {
+  type    = string
+  default = null
+}
+variable "hub_subnet_name" {
+  type    = string
+  default = null
+}
+variable "hub_vpc_cidr" {
+  type    = string
+  default = null
+}
+
+# Feature flags
+variable "enable_frontend" {
+  type    = bool
+  default = false
+}
+variable "enable_backend" {
+  type    = bool
+  default = false
+}
+variable "restrict_egress" {
+  type    = bool
+  default = false
+}
+
+# PSC subnet CIDR
+variable "psc_subnet_cidr" {
+  type        = string
+  description = "CIDR for the dedicated PSC subnet in the spoke VPC"
+}
+
+variable "hive_metastore_ip" {
+  type        = string
+  default     = null
+  description = "Regional Hive metastore IP (looked up via internal map if null)"
+}
diff --git a/modules/gcp/private-connectivity/versions.tf b/modules/gcp/private-connectivity/versions.tf
new file mode 100644
index 00000000..de067e7d
--- /dev/null
+++ b/modules/gcp/private-connectivity/versions.tf
@@ -0,0 +1,9 @@
+terraform {
+  required_version = ">= 1.5"
+  required_providers {
+    google = {
+      source  = "hashicorp/google"
+      version = ">= 4.0"
+    }
+  }
+}
diff --git a/modules/gcp/service-account/Makefile b/modules/gcp/service-account/Makefile
new file mode 100644
index 00000000..17b32ec8
--- /dev/null
+++ b/modules/gcp/service-account/Makefile
@@ -0,0 +1,7 @@
+.PHONY: docs test_docs
+
+docs:
+	terraform-docs -c ../../../.terraform-docs.yml .
+
+test_docs:
+	terraform-docs -c ../../../.terraform-docs.yml --output-check .
diff --git a/modules/gcp-sa-provisioning/README.md b/modules/gcp/service-account/README.md
similarity index 97%
rename from modules/gcp-sa-provisioning/README.md
rename to modules/gcp/service-account/README.md
index 75b57fcf..35adbf9e 100644
--- a/modules/gcp-sa-provisioning/README.md
+++ b/modules/gcp/service-account/README.md
@@ -34,7 +34,7 @@ No requirements.
 | Name | Version |
 |------|---------|
-| [google](#provider\_google) | n/a |
+| [google](#provider\_google) | 7.31.0 |

 ## Modules

diff --git a/modules/gcp-sa-provisioning/init.tf b/modules/gcp/service-account/init.tf
similarity index 100%
rename from modules/gcp-sa-provisioning/init.tf
rename to modules/gcp/service-account/init.tf
diff --git a/modules/gcp-sa-provisioning/main.tf b/modules/gcp/service-account/main.tf
similarity index 100%
rename from modules/gcp-sa-provisioning/main.tf
rename to modules/gcp/service-account/main.tf
diff --git a/modules/gcp-sa-provisioning/outputs.tf b/modules/gcp/service-account/outputs.tf
similarity index 100%
rename from modules/gcp-sa-provisioning/outputs.tf
rename to modules/gcp/service-account/outputs.tf
diff --git a/modules/gcp-sa-provisioning/variables.tf b/modules/gcp/service-account/variables.tf
similarity index 100%
rename from modules/gcp-sa-provisioning/variables.tf
rename to modules/gcp/service-account/variables.tf
diff --git a/modules/gcp/unity-catalog/Makefile b/modules/gcp/unity-catalog/Makefile
new file mode 100644
index 00000000..17b32ec8
--- /dev/null
+++ b/modules/gcp/unity-catalog/Makefile
@@ -0,0 +1,7 @@
+.PHONY: docs test_docs
+
+docs:
+	terraform-docs -c ../../../.terraform-docs.yml .
+
+test_docs:
+	terraform-docs -c ../../../.terraform-docs.yml --output-check .
diff --git a/modules/gcp/unity-catalog/README.md b/modules/gcp/unity-catalog/README.md
new file mode 100644
index 00000000..b5bdb04d
--- /dev/null
+++ b/modules/gcp/unity-catalog/README.md
@@ -0,0 +1,50 @@
+# modules/gcp/unity-catalog
+
+Unity Catalog metastore, GCS bucket, storage credential, external location, and catalog for GCP Databricks workspaces. Called by examples after the workspace exists (uses workspace-scoped Databricks provider alias).
+
+
+## Requirements
+
+No requirements.
+
+## Providers
+
+| Name | Version |
+|------|---------|
+| [databricks](#provider\_databricks) | 1.114.2 |
+| [databricks.workspace](#provider\_databricks.workspace) | 1.114.2 |
+| [google](#provider\_google) | 7.31.0 |
+
+## Modules
+
+No modules.
+
+## Resources
+
+| Name | Type |
+|------|------|
+| [databricks_catalog.main](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/catalog) | resource |
+| [databricks_external_location.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/external_location) | resource |
+| [databricks_metastore.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/metastore) | resource |
+| [databricks_metastore_assignment.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/metastore_assignment) | resource |
+| [databricks_storage_credential.this](https://registry.terraform.io/providers/databricks/databricks/latest/docs/resources/storage_credential) | resource |
+| [google_storage_bucket.ext_bucket](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket) | resource |
+| [google_storage_bucket_iam_member.unity_cred_admin](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket_iam_member) | resource |
+| [google_storage_bucket_iam_member.unity_cred_reader](https://registry.terraform.io/providers/hashicorp/google/latest/docs/resources/storage_bucket_iam_member) | resource |
+
+## Inputs
+
+| Name | Description | Type | Default | Required |
+|------|-------------|------|---------|:--------:|
+| [catalog\_name](#input\_catalog\_name) | Name to assign to default catalog | `string` | n/a | yes |
+| [databricks\_workspace\_id](#input\_databricks\_workspace\_id) | The unique identifier of the Databricks workspace in which resources will be managed. | `any` | n/a | yes |
+| [databricks\_workspace\_url](#input\_databricks\_workspace\_url) | The URL of the Databricks workspace to which resources will be deployed (e.g., https://.gcp.databricks.com). | `any` | n/a | yes |
+| [google\_project](#input\_google\_project) | The Google Cloud project ID where the Databricks workspace and associated resources will be created. | `string` | n/a | yes |
+| [google\_region](#input\_google\_region) | Google Cloud region where the resources will be created | `string` | n/a | yes |
+| [metastore\_name](#input\_metastore\_name) | Name to assign to regional metastore | `string` | n/a | yes |
+| [prefix](#input\_prefix) | Prefix to use in generated resources name | `string` | n/a | yes |
+
+## Outputs
+
+No outputs.
+
diff --git a/modules/gcp-unity-catalog/databricks-cloud-resources.tf b/modules/gcp/unity-catalog/databricks-cloud-resources.tf
similarity index 100%
rename from modules/gcp-unity-catalog/databricks-cloud-resources.tf
rename to modules/gcp/unity-catalog/databricks-cloud-resources.tf
diff --git a/modules/gcp-unity-catalog/gcs.tf b/modules/gcp/unity-catalog/gcs.tf
similarity index 100%
rename from modules/gcp-unity-catalog/gcs.tf
rename to modules/gcp/unity-catalog/gcs.tf
diff --git a/modules/gcp-unity-catalog/terraform.tf b/modules/gcp/unity-catalog/terraform.tf
similarity index 100%
rename from modules/gcp-unity-catalog/terraform.tf
rename to modules/gcp/unity-catalog/terraform.tf
diff --git a/modules/gcp-unity-catalog/variables.tf b/modules/gcp/unity-catalog/variables.tf
similarity index 100%
rename from modules/gcp-unity-catalog/variables.tf
rename to modules/gcp/unity-catalog/variables.tf
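
The `modules/gcp/unity-catalog` README added in this change lists seven required inputs and three provider configurations (`databricks`, `databricks.workspace`, `google`). The following is a minimal sketch of how a caller might wire them together, not the repository's own example. It assumes the module declares `databricks.workspace` as a configuration alias in its `terraform.tf`, and that the default `databricks` provider is account-scoped. The relative `source` path, the root-level variables, and all literal values are hypothetical placeholders.

```hcl
# Assumption: a default (account-level) databricks provider and a google
# provider are already configured elsewhere in the root module.

# Workspace-scoped alias, as the README says the module uses one.
# var.databricks_workspace_url is a hypothetical root-module variable.
provider "databricks" {
  alias = "workspace"
  host  = var.databricks_workspace_url
}

module "unity_catalog" {
  # Hypothetical relative path from an example directory.
  source = "../../modules/gcp/unity-catalog"

  # Assumes the module accepts these provider configurations by name.
  providers = {
    databricks           = databricks
    databricks.workspace = databricks.workspace
    google               = google
  }

  # Required inputs, taken from the README's Inputs table.
  prefix                   = "demo"                       # placeholder
  google_project           = var.google_project           # hypothetical variable
  google_region            = "us-central1"                # placeholder
  databricks_workspace_id  = var.databricks_workspace_id  # hypothetical variable
  databricks_workspace_url = var.databricks_workspace_url
  metastore_name           = "demo-metastore"             # placeholder
  catalog_name             = "main"                       # placeholder
}
```

Because the README reports no outputs, downstream configuration would have to reference the catalog and metastore by the names passed in here rather than by module outputs.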