diff --git a/README.md b/README.md
index 29c6b55..4813b90 100644
--- a/README.md
+++ b/README.md
@@ -1,11 +1,14 @@
-# KCP-aware Dependency Controller
+# kcp-aware Dependency Controller
+[](https://github.com/opendefensecloud/dependency-controller/actions/workflows/golang.yaml)
+[](https://goreportcard.com/report/go.opendefense.cloud/dependency-controller)
+[](https://pkg.go.dev/go.opendefense.cloud/dependency-controller)
[](https://scorecard.dev/viewer/?uri=github.com/opendefensecloud/dependency-controller)
[](https://github.com/opendefensecloud/dependency-controller/releases/latest)
## Problem Statement
-In KCP, APIs can be offered to users via APIExports by a multitude of providers.
+In kcp, APIs can be offered to users via APIExports by a multitude of providers.
For IaaS services, however, there is a critical shortcoming:
IaaS APIs typically depend on each other -- for example, a VM is provisioned in a VPC
and therefore depends on it. If the VPC is deleted, it pulls the rug out from under the VM.
@@ -21,13 +24,13 @@ flowchart TD
A["Provider creates
DependencyRule
(e.g. VM → VPC)"] --> B["Both binaries discover rule
via dep-ctrl APIExport"]
B --> C["Controller:
Install ValidatingWebhook
in dependency provider workspace"]
- B --> E["Webhook:
Start indexed cache watching
dependent type via APIExport VW"]
+ B --> E["Webhook:
Register rule metadata
(dependent GVR, field paths)
in RuleRegistry"]
- E --> F["Informer indexes dependents
by field paths
(e.g. .spec.vpcRef.name)"]
+ E --> F["Registry holds rule metadata only
— no cache of dependents"]
F --> G{"Consumer tries to delete
dependency (e.g. VPC)"}
G --> H["Webhook intercepts DELETE"]
- H --> I["Query indexed cache:
any VMs where .spec.vpcRef.name = my-vpc?"]
+ H --> I["List VMs in consumer workspace
via kcp front-proxy;
in-memory filter:
.spec.vpcRef.name == my-vpc?"]
I -- Yes --> J["Deny deletion
'still referenced by VirtualMachine/my-vm'"]
I -- No --> K["Allow deletion"]
@@ -43,8 +46,9 @@ flowchart TD
Along with their APIExport, providers create `DependencyRule` objects to describe how their
resources depend on others. A single rule attaches to one dependent resource type (via its
-APIExport reference) and lists all of its dependencies with field paths that describe where
-the reference lives:
+APIExport reference in the same workspace as the rule) and lists each dependency together
+with the **dependency provider's** APIExport reference (workspace path + name) and the field
+path inside the dependent resource where the reference lives:
```yaml
apiVersion: dependencies.opendefense.cloud/v1alpha1
@@ -57,13 +61,20 @@ spec:
group: compute.example.com
version: v1alpha1
kind: VirtualMachine
+ resource: virtualmachines
dependencies:
- - group: network.example.com
+ - apiExportRef:
+ path: root:providers:network
+ name: network.example.com
+ group: network.example.com
version: v1alpha1
resource: vpcs
fieldRef:
path: ".spec.vpcRef.name"
- - group: network.example.com
+ - apiExportRef:
+ path: root:providers:network
+ name: network.example.com
+ group: network.example.com
version: v1alpha1
resource: subnets
fieldRef:
@@ -76,32 +87,44 @@ The system runs as two binaries, deployed together via a single Helm chart, that
both watch `DependencyRule` objects via the dep-ctrl APIExport:
**Controller** (`cmd/controller`) -- handles infrastructure setup:
+
- Installs `ValidatingWebhookConfiguration` in each provider workspace whose
resources are protected as dependencies
-- All provider workspace access goes through the dep-ctrl APIExport's virtual
- workspace, authorized by `permissionClaims` on the APIExport
+- Webhook management goes through the dep-ctrl APIExport's virtual workspace,
+ authorized by the `validatingwebhookconfigurations` `permissionClaim`.
+ Workspace-path resolution (translating `apiExportRef.path` into a logical
+ cluster name) goes through the kcp front-proxy directly, authorized by plain
+ RBAC on `tenancy.kcp.io/workspaces` plus a binding to the kcp-predefined
+ `system:kcp:workspace:access` ClusterRole.
**Webhook** (`cmd/webhook`) -- handles admission:
-- Maintains a dedicated indexed cache per rule, watching the dependent resource
- type via the provider's APIExport virtual workspace
-- Serves admission requests, querying indexed caches to block deletion of
- resources that are still referenced
-### Indexed Cache
+- Watches `DependencyRule` objects via the dep-ctrl APIExport's virtual
+ workspace and stores parsed metadata (dependent GVR + field paths) in an
+ in-memory `RuleRegistry`.
+- On each DELETE admission request, finds matching rules in the registry,
+ lists dependent resources directly in the consumer workspace via the kcp
front-proxy, and filters in memory by the configured field path to block
+ deletion of still-referenced resources.
+
+### Rule Registry
-For each DependencyRule, the webhook server starts a multicluster manager that watches the
-dependent resource type (e.g., VirtualMachines) via the referenced APIExport's virtual
-workspace. Field indices are registered on the dependent informer for each dependency
-target's field path (e.g., `.spec.vpcRef.name`), enabling O(1) lookups by referenced
-resource name.
+The webhook keeps an in-memory `RuleRegistry` populated by reconciling
+`DependencyRule` objects through the dep-ctrl APIExport's virtual workspace.
+Each entry holds rule metadata only -- the dependent's GroupVersionResource
and the field paths that hold dependency references -- not the dependent
+resources themselves. Dependent listing happens on demand per admission
+request (see [Admission Webhook](#admission-webhook) below).
### Admission Webhook
-A KCP ValidatingAdmissionWebhook intercepts DELETE requests. When a delete is attempted,
-the webhook queries the indexed caches to find dependent resources that reference the
-resource being deleted. If any are found, the request is denied with a clear error message
-listing the dependents. Finalizers are intentionally avoided as they conflict with KCP's
-sync-agent.
+A kcp ValidatingAdmissionWebhook intercepts DELETE requests. When a delete is attempted,
+the webhook looks up matching rules in the registry, builds a per-request dynamic client
+targeting the consumer workspace via the kcp front-proxy
+(`{base}/clusters/{logicalCluster}`), issues a `List` for the dependent type, and filters the
+results in memory by the rule's field path. If any blockers are found, the request is denied with a
+clear error message listing the dependents. Finalizers are intentionally avoided as they
+conflict with kcp's sync-agent.
### Architecture
@@ -112,7 +135,7 @@ in those workspaces. Consumer workspaces do not need to bind to the dep-ctrl exp
```mermaid
graph LR
- subgraph DC["Dep-Ctrl Workspace"]
+ subgraph DC["dep-ctrl Workspace"]
DCExport["APIExport:
DependencyRule
+ permissionClaims"]
end
@@ -121,37 +144,38 @@ graph LR
end
subgraph WB["Webhook Binary"]
- WH["Rule Cache Manager
· Indexed Caches (per rule)
· Deletion Validator"]
+ WH["DependencyRule Reconciler
· Rule Registry (metadata)
· Deletion Validator"]
end
- subgraph CP["Compute Provider WS"]
+ subgraph CP["Compute Provider Workspace"]
CPBinding["APIBinding: dep-ctrl
(claims accepted)"]
CPExport["APIExport: compute"]
CPRule["DependencyRule:
VM → VPC"]
end
- subgraph NP["Network Provider WS"]
+ subgraph NP["Network Provider Workspace"]
NPBinding["APIBinding: dep-ctrl
(claims accepted)"]
NPExport["APIExport: VPCs"]
NPWebhook["ValidatingWebhook"]
end
- subgraph ROOT["Root Workspace"]
- ROOTROLE["ClusterRoles
(workspaces/content +
workspace resolution)"]
+ subgraph ROOT["Workspace-resolution RBAC
(typical: root; alt: per-shard system:admin)"]
+ ROOTROLE["ClusterRole binding:
tenancy.kcp.io/workspaces get,list,watch
+ system:kcp:workspace:access"]
end
- subgraph CW["Consumer WS"]
+ subgraph CW["Consumer Workspace"]
CWBindings["APIBindings:
compute, network"]
CWResources["VPC, VM"]
end
CPBinding -->|binds to| DCExport
NPBinding -->|binds to| DCExport
- Ctrl -.->|watches rules via VW| DCExport
- Ctrl -.->|installs webhook via VW| NP
- WH -.->|watches rules via VW| DCExport
- WH -.->|watches VMs via| CPExport
+ Ctrl -.->|watches rules via virtual workspace| DCExport
+ Ctrl -.->|installs webhook via virtual workspace| NP
+ Ctrl -.->|resolves workspace paths
via kcp front-proxy| ROOTROLE
+ WH -.->|watches rules via virtual workspace| DCExport
NPWebhook -.->|dispatches DELETE to| WH
+ WH -.->|on DELETE: lists dependents
via kcp front-proxy| CW
CWBindings -->|binds to| CPExport
CWBindings -->|binds to| NPExport
@@ -164,27 +188,25 @@ graph LR
style CW fill:#fef3c7,color:#664d03
```
-**Two levels of multicluster watching:**
-
-1. **DependencyRule reconciler** (both binaries) watches rules via the dep-ctrl's own
- APIExport virtual workspace, discovering provider workspaces that bind to the dep-ctrl
- export.
-
-2. **Indexed cache** (webhook only, dynamic per-rule) watches the dependent resource type
- (e.g., VMs) via the referenced APIExport's virtual workspace. Field indices enable the
- webhook to quickly find dependents referencing a given resource.
+**Multicluster watching happens at one level only:** both binaries watch
+`DependencyRule` objects via the dep-ctrl APIExport's virtual workspace,
+which spans every provider workspace bound to it. Dependent resources
+(e.g., VMs) are not watched -- the webhook lists them on demand from the
+consumer workspace via the kcp front-proxy when validating a DELETE.
For detailed architecture documentation, see [docs/architecture.md](docs/architecture.md).
For a step-by-step deployment walkthrough, see [docs/getting-started.md](docs/getting-started.md).
+For development setup and project layout, see [docs/development.md](docs/development.md).
### RBAC Model
-The system uses static bootstrap RBAC in three kcp locations. No dynamic RBAC is
-created at runtime.
+The system relies on static bootstrap RBAC plus one `permissionClaim` declared
+on the dep-ctrl APIExport. No dynamic RBAC is created at runtime.
#### permissionClaims on the dep-ctrl APIExport
The dep-ctrl APIExport declares a `permissionClaim` for:
+
- `validatingwebhookconfigurations` (admissionregistration.k8s.io) -- to install webhooks
Provider workspaces that bind to the dep-ctrl APIExport must **accept** this claim
@@ -193,47 +215,51 @@ in binding workspaces through the virtual workspace.
#### Bootstrap RBAC (static, applied at deployment)
-**Root workspace** -- both components need `workspaces/content` access to enter child
-workspaces. The controller additionally needs `workspaces` read access to resolve
-workspace paths to logical cluster names.
-
-**Dep-ctrl workspace** -- the controller needs `apiexportendpointslices` read access
-for VW URL discovery and full CRUD on `apiexports/content` to manage webhooks in
-binding workspaces via the VW.
-
-No shard-wide RBAC is needed. The webhook watches dependent resources through the
-dep-ctrl APIExport's virtual workspace, authorized by dynamically managed
-permissionClaims. Providers accept these claims in their APIBinding.
-
-See [docs/getting-started.md](docs/getting-started.md) for the full deployment guide
-using [kcp-operator](https://github.com/kcp-dev/helm-charts).
+Three categories of static RBAC must be in place at deployment time:
+
+**Per-shard `system:admin` RBAC (webhook)** -- grants the webhook ServiceAccount
+`get,list` on all resources (`*/*`). The webhook needs this during admission to list dependent
+resources directly in any consumer workspace via the kcp front-proxy. Because
+kcp's `BootstrapPolicyAuthorizer` reads bindings from each shard's local
+`system:admin` workspace and bindings do not propagate across shards, this
+binding must be applied **once per kcp shard** through a direct (non-front-proxy)
+connection.
+
+**Workspace-resolution RBAC (controller)** -- the controller needs
+`tenancy.kcp.io/workspaces` `get,list,watch` plus workspace-content access -- the
+canonical way is to bind the kcp-predefined `system:kcp:workspace:access`
+ClusterRole, which grants the `access` verb on the non-resource URL `/`. Both
+must be in place in every **parent** of a workspace the controller operates on.
+The controller uses these rules to translate a `DependencyRule`'s
+`apiExportRef.path` (e.g., `root:providers:network`) into the underlying logical
+cluster name. In a typical deployment where provider workspaces live directly
+under `root`, granting them in the `root` workspace is enough; deeper paths need
+the same bindings in each intermediate parent. As an alternative, the bindings
+may be applied in each shard's `system:admin` workspace -- those cover every
+workspace on the shard and implicitly satisfy any parent the resolver needs to
+traverse, at the cost of being applied once per shard (as with the webhook
+binding above).
+
+**Dep-ctrl workspace RBAC (both components)** -- both binaries need
+`apis.kcp.io/apiexportendpointslices` `get,list,watch` (to discover the dep-ctrl
+APIExport's virtual-workspace URLs) and `apis.kcp.io/apiexports/content` on the
+dep-ctrl APIExport. The controller uses the latter to manage
+`ValidatingWebhookConfiguration` objects in binding workspaces through the
+virtual workspace; the webhook uses it to watch `DependencyRule` objects through
+the same virtual workspace.
+
+Webhook installation in provider workspaces is authorized by the
+`validatingwebhookconfigurations` permissionClaim above, not by RBAC. Dependent
+listing during admission is authorized by the per-shard `system:admin` binding,
+not by the dep-ctrl APIExport.
## Development
-### Prerequisites
-
-- Go 1.26+
-- [kcp](https://github.com/kcp-dev/kcp) binary (for integration tests)
+The fastest way to get a working dev environment is the [Nix flake](flake.nix)
+together with [direnv](https://direnv.net/): `direnv allow` (or `nix develop`)
+drops you into a shell with Go, `golangci-lint`, `helm`, `kind`, and the kcp
+toolchain on `$PATH`. After that, `pre-commit install` registers the project's
+hooks.
-### Build
-
-```sh
-make build
-```
-
-### Run Tests
-
-```sh
-# Unit and integration tests (requires kcp binary)
-make test
-
-# E2E tests (requires kind, helm, docker)
-# Deploys a multi-shard kcp via kcp-operator (root + shard1)
-make test-e2e
-```
-
-### Generate Code
-
-```sh
-make generate
-```
+For project layout, the full `make` target reference, integration- and e2e-test
+internals, and shard-config tips, see [docs/development.md](docs/development.md).
diff --git a/docs/development.md b/docs/development.md
index 1dbf002..d48ba65 100644
--- a/docs/development.md
+++ b/docs/development.md
@@ -2,8 +2,38 @@
## Prerequisites
+The recommended setup is the [dev shell](#dev-shell) below -- it provides the
+full toolchain. If you're not using Nix, you need:
+
- Go 1.26+
- A kcp binary (downloaded automatically by `make kcp`)
+- `golangci-lint`, `helm`, `kind`, `docker`, and `pre-commit` on `$PATH`
+
+## Dev shell
+
+The repo ships a [Nix flake](../flake.nix) wired up via [direnv](https://direnv.net/)
+(see [.envrc](../.envrc)). With both installed, the dev shell auto-loads on `cd`
+and provides Go 1.26.2, `golangci-lint`, `gopls`, `helm`, `kind`, `task`, the kcp
+toolchain, and the rest of the dependencies.
+
+```sh
+direnv allow # one-time, on first entry
+# or, without direnv:
+nix develop
+```
+
+The shell is defined by [`opendefensecloud/dev-kit`](https://github.com/opendefensecloud/dev-kit)
+via the `dev-kit` flake input -- adding tools project-wide is a PR there, not here.
+
+## Pre-commit hooks
+
+```sh
+pre-commit install # registers the hooks listed in .pre-commit-config.yaml
+```
+
+The configured hooks cover trailing whitespace, YAML/JSON syntax, `yamllint`,
+`shellcheck`, `gofmt`, `go vet`, `go mod tidy`, `golangci-lint` (manual stage),
+and `helm lint`.
## Project Structure