CM-1040: Extract shared ApplyResource helper and migrate istiocsr to SSA#420
CM-1040: Extract shared ApplyResource helper and migrate istiocsr to SSA#420sebrandon1 wants to merge 1 commit intoopenshift:masterfrom
Conversation
|
@sebrandon1: This pull request references CM-1040 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
WalkthroughAdds a generic Server-Side Apply utility ChangesGeneric Resource Applier and Controller Refactor
sequenceDiagram
participant Reconciler as Reconciler
participant ApplyUtil as ApplyResource
participant API as Kubernetes API Server
participant Recorder as EventRecorder
participant Logger as Logger
Reconciler->>ApplyUtil: Call ApplyResource(ctx, client, logger, recorder, owner, desired, existing, fieldOwner, hasChanged)
ApplyUtil->>API: Get resource (exists check)
API-->>ApplyUtil: Exists / NotFound / Error
alt exists & hasChanged==false
ApplyUtil->>Logger: emit debug "no drift, skipping"
ApplyUtil-->>Reconciler: return nil
else not exists or hasChanged==true
ApplyUtil->>API: Server-Side Apply (Patch, Force=true, FieldManager=fieldOwner)
API-->>ApplyUtil: Success / Error
alt success
ApplyUtil->>Recorder: Event(fmt.Sprintf("Applied %s/%s", ns, name))
ApplyUtil->>Logger: emit info "applied resource"
ApplyUtil-->>Reconciler: return nil
else error
ApplyUtil->>Logger: emit error
ApplyUtil-->>Reconciler: return propagated error
end
end
🎯 4 (Complex) | ⏱️ ~45 minutes 🚥 Pre-merge checks | ✅ 11 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (11 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: sebrandon1 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (2)
pkg/controller/istiocsr/install_instiocsr_test.go (1)
68-84: ⚡ Quick winPreserve generated-name simulation for
ClusterRolein these fakes.
createOrApplyClusterRoles()still depends onCreatemutating the object with a generated name before status is written and before theClusterRoleBindinggets itsRoleRef.Name. These stubs only backfillClusterRoleBinding, so the test can still pass even if the role name is left empty. I'd add a*rbacv1.ClusterRolebranch in eachCreateCallsblock as well.Representative tweak
m.CreateCalls(func(ctx context.Context, obj client.Object, option ...client.CreateOption) error { switch o := obj.(type) { + case *rbacv1.ClusterRole: + role := testClusterRole() + role.DeepCopyInto(o) case *appsv1.Deployment: if !reflect.DeepEqual(o.GetLabels(), labels) { return fmt.Errorf("labels mismatch in %v resource; got: %v, want: %v", o, o.GetLabels(), labels) }Also applies to: 110-117, 131-138
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controller/istiocsr/install_instiocsr_test.go` around lines 68 - 84, The CreateCalls test stubs need to simulate Create mutating a ClusterRole with a generated name so createOrApplyClusterRoles() sees a non-empty Role.Name; update the CreateCalls handlers (the ones that currently switch over *appsv1.Deployment and *rbacv1.ClusterRoleBinding) to also include a case for *rbacv1.ClusterRole that sets o.Name to a generated value (e.g., append "-generated" or a deterministic string) and preserves labels so subsequent logic that reads the ClusterRole's name (and the ClusterRoleBinding RoleRef.Name) behaves as in real Create; apply this same addition to the other CreateCalls blocks mentioned.pkg/controller/trustmanager/webhooks_test.go (1)
207-221: ⚡ Quick winKeep the error assertion specific to the webhook config.
ApplyResourcestill includes the resource name in the returned error, so shortening these expectations to justfailed to ... resourcemakes this table much less discriminating. A wrong object name or an extra apply call in this path would still satisfy the assertion. I'd keep the expected substring specific totrustManagerWebhookConfigNamehere, and mirror that in the other migrated reconciliation tests.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controller/trustmanager/webhooks_test.go` around lines 207 - 221, The test's error expectation is too generic; update the table entries (the case with name "patch error propagates" and similar migrated reconciliation tests) to assert the error message includes the specific webhook resource name by matching the substring that contains trustManagerWebhookConfigName (the Reconciler's ApplyResource error includes the resource name), e.g. expect the returned error to contain trustManagerWebhookConfigName along with "failed to apply resource", and adjust other tests that assert "failed to check if resource" / "failed to apply resource" to similarly include trustManagerWebhookConfigName so the assertions target the specific webhook config rather than any resource.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/controller/common/applier.go`:
- Around line 33-36: The resourceName construction in applier.go currently uses
fmt.Sprintf("%s/%s", desired.GetNamespace(), desired.GetName()) which yields
"/name" for cluster-scoped objects; change the logic to detect an empty
namespace (desired.GetNamespace() == "") and build resourceName as just
desired.GetName(), otherwise build it as namespace + "/" + name so
cluster-scoped resources do not get a leading slash; update any use of the
existing resourceName variable accordingly.
---
Nitpick comments:
In `@pkg/controller/istiocsr/install_instiocsr_test.go`:
- Around line 68-84: The CreateCalls test stubs need to simulate Create mutating
a ClusterRole with a generated name so createOrApplyClusterRoles() sees a
non-empty Role.Name; update the CreateCalls handlers (the ones that currently
switch over *appsv1.Deployment and *rbacv1.ClusterRoleBinding) to also include a
case for *rbacv1.ClusterRole that sets o.Name to a generated value (e.g., append
"-generated" or a deterministic string) and preserves labels so subsequent logic
that reads the ClusterRole's name (and the ClusterRoleBinding RoleRef.Name)
behaves as in real Create; apply this same addition to the other CreateCalls
blocks mentioned.
In `@pkg/controller/trustmanager/webhooks_test.go`:
- Around line 207-221: The test's error expectation is too generic; update the
table entries (the case with name "patch error propagates" and similar migrated
reconciliation tests) to assert the error message includes the specific webhook
resource name by matching the substring that contains
trustManagerWebhookConfigName (the Reconciler's ApplyResource error includes the
resource name), e.g. expect the returned error to contain
trustManagerWebhookConfigName along with "failed to apply resource", and adjust
other tests that assert "failed to check if resource" / "failed to apply
resource" to similarly include trustManagerWebhookConfigName so the assertions
target the specific webhook config rather than any resource.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 683e1b36-07c6-463b-8cd3-d29a46add8d5
📒 Files selected for processing (26)
pkg/controller/common/applier.gopkg/controller/istiocsr/certificates.gopkg/controller/istiocsr/certificates_test.gopkg/controller/istiocsr/constants.gopkg/controller/istiocsr/install_instiocsr_test.gopkg/controller/istiocsr/install_istiocsr.gopkg/controller/istiocsr/networkpolicies.gopkg/controller/istiocsr/rbacs.gopkg/controller/istiocsr/rbacs_test.gopkg/controller/istiocsr/serviceaccounts.gopkg/controller/istiocsr/serviceaccounts_test.gopkg/controller/istiocsr/services.gopkg/controller/istiocsr/services_test.gopkg/controller/trustmanager/certificates.gopkg/controller/trustmanager/certificates_test.gopkg/controller/trustmanager/controller_test.gopkg/controller/trustmanager/deployments.gopkg/controller/trustmanager/deployments_test.gopkg/controller/trustmanager/rbacs.gopkg/controller/trustmanager/rbacs_test.gopkg/controller/trustmanager/serviceaccounts.gopkg/controller/trustmanager/serviceaccounts_test.gopkg/controller/trustmanager/services.gopkg/controller/trustmanager/services_test.gopkg/controller/trustmanager/webhooks.gopkg/controller/trustmanager/webhooks_test.go
Both istiocsr and trustmanager had 8-10 nearly identical createOrApply methods with duplicated reconciliation boilerplate. trustmanager used Server-Side Apply while istiocsr used Create/UpdateWithRetry. Add a generic common.ApplyResource[T]() helper that handles the common Exists/drift-check/Patch pattern with a pluggable hasChanged callback. The helper derives the Kubernetes kind from the type parameter for clear error messages and uses client.ObjectKeyFromObject for consistent resource name formatting. Migrate all trustmanager methods and istiocsr simple methods to use it. ClusterRole/ClusterRoleBinding methods in istiocsr are kept as-is since they use GenerateName with List fallback and Delete for immutable RoleRef changes. The istioCSRCreateRecon warning events are dropped from the migrated methods since SSA is inherently idempotent.
ca8b171 to
fdd735e
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/controller/istiocsr/serviceaccounts_test.go (1)
22-25: ⚡ Quick winCover the no-op branch explicitly.
Now that this table has
assertCalls, the"serviceaccount reconciliation successful"case should also assertPatchCallCount() == 0. Right now that case still passes ifApplyResourcestarts patching unchangedServiceAccounts on every reconcile, which is one of the main regression risks in this SSA migration.Suggested assertion
{ name: "serviceaccount reconciliation successful", preReq: func(r *Reconciler, m *fakes.FakeCtrlClient) { m.ExistsCalls(func(ctx context.Context, ns types.NamespacedName, obj client.Object) (bool, error) { switch o := obj.(type) { case *corev1.ServiceAccount: serviceaccount := testServiceAccount() serviceaccount.DeepCopyInto(o) } return true, nil }) }, + assertCalls: func(t *testing.T, mock *fakes.FakeCtrlClient) { + if mock.PatchCallCount() != 0 { + t.Errorf("createOrApplyServiceAccounts() Patch call count: %d, want 0", mock.PatchCallCount()) + } + }, },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/controller/istiocsr/serviceaccounts_test.go` around lines 22 - 25, Update the "serviceaccount reconciliation successful" testcase in the table-driven test in serviceaccounts_test.go to explicitly assert that no patch operations occurred: inside its assertCalls function, call mock.PatchCallCount() and require it equals 0 (on the provided *fakes.FakeCtrlClient) to ensure ApplyResource did not issue patches for unchanged ServiceAccounts; keep other existing assertions and reference the Reconciler and ApplyResource behavior as context.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/controller/istiocsr/rbacs.go`:
- Around line 265-269: createOrApplyRoleBindings and
createOrApplyRoleBindingForLeases currently always call common.ApplyResource
(Server-Side Apply) but RoleBinding.roleRef is immutable—mirror the
ClusterRoleBinding handler's pattern: after calling common.ApplyResource (with
hasObjectChanged from getRoleBindingObject/hasObjectChanged), detect when the
failure or diff indicates only a roleRef drift (compare desired.RoleRef to the
live object.RoleRef), then delete the existing rbacv1.RoleBinding and recreate
the desired object (preserving owner refs/events/fieldOwner) as a fallback;
implement this delete-then-create flow in both createOrApplyRoleBindings and
createOrApplyRoleBindingForLeases using the same logic used in the
ClusterRoleBinding handler to avoid permanent SSA validation errors.
---
Nitpick comments:
In `@pkg/controller/istiocsr/serviceaccounts_test.go`:
- Around line 22-25: Update the "serviceaccount reconciliation successful"
testcase in the table-driven test in serviceaccounts_test.go to explicitly
assert that no patch operations occurred: inside its assertCalls function, call
mock.PatchCallCount() and require it equals 0 (on the provided
*fakes.FakeCtrlClient) to ensure ApplyResource did not issue patches for
unchanged ServiceAccounts; keep other existing assertions and reference the
Reconciler and ApplyResource behavior as context.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 3367fe15-60ea-4966-a17e-6222e23128a6
📒 Files selected for processing (26)
pkg/controller/common/applier.gopkg/controller/istiocsr/certificates.gopkg/controller/istiocsr/certificates_test.gopkg/controller/istiocsr/constants.gopkg/controller/istiocsr/install_instiocsr_test.gopkg/controller/istiocsr/install_istiocsr.gopkg/controller/istiocsr/networkpolicies.gopkg/controller/istiocsr/rbacs.gopkg/controller/istiocsr/rbacs_test.gopkg/controller/istiocsr/serviceaccounts.gopkg/controller/istiocsr/serviceaccounts_test.gopkg/controller/istiocsr/services.gopkg/controller/istiocsr/services_test.gopkg/controller/trustmanager/certificates.gopkg/controller/trustmanager/certificates_test.gopkg/controller/trustmanager/controller_test.gopkg/controller/trustmanager/deployments.gopkg/controller/trustmanager/deployments_test.gopkg/controller/trustmanager/rbacs.gopkg/controller/trustmanager/rbacs_test.gopkg/controller/trustmanager/serviceaccounts.gopkg/controller/trustmanager/serviceaccounts_test.gopkg/controller/trustmanager/services.gopkg/controller/trustmanager/services_test.gopkg/controller/trustmanager/webhooks.gopkg/controller/trustmanager/webhooks_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- pkg/controller/trustmanager/certificates.go
- pkg/controller/trustmanager/certificates_test.go
| func (r *Reconciler) createOrApplyRoleBindings(istiocsr *v1alpha1.IstioCSR, serviceAccount string, resourceLabels map[string]string) error { | ||
| desired := r.getRoleBindingObject(serviceAccount, istiocsr.GetNamespace(), istiocsr.Spec.IstioCSRConfig.Istio.Namespace, resourceLabels) | ||
|
|
||
| roleBindingName := fmt.Sprintf("%s/%s", desired.GetNamespace(), desired.GetName()) | ||
| r.log.V(4).Info("reconciling rolebinding resource", "name", roleBindingName) | ||
| fetched := &rbacv1.RoleBinding{} | ||
| exist, err := r.Exists(r.ctx, client.ObjectKeyFromObject(desired), fetched) | ||
| if err != nil { | ||
| return common.FromClientError(err, "failed to check %s rolebinding resource already exists", roleBindingName) | ||
| } | ||
|
|
||
| if exist { | ||
| if istioCSRCreateRecon { | ||
| r.eventRecorder.Eventf(istiocsr, corev1.EventTypeWarning, "ResourceAlreadyExists", "%s rolebinding resource already exists, maybe from previous installation", roleBindingName) | ||
| } | ||
| if hasObjectChanged(desired, fetched) { | ||
| r.log.V(1).Info("rolebinding has been modified, updating to desired state", "name", roleBindingName) | ||
| if err := r.UpdateWithRetry(r.ctx, desired); err != nil { | ||
| return common.FromClientError(err, "failed to update %s rolebinding resource", roleBindingName) | ||
| } | ||
| r.eventRecorder.Eventf(istiocsr, corev1.EventTypeNormal, "Reconciled", "rolebinding resource %s reconciled back to desired state", roleBindingName) | ||
| } else { | ||
| r.log.V(4).Info("rolebinding resource already exists and is in expected state", "name", roleBindingName) | ||
| } | ||
| } | ||
|
|
||
| if !exist { | ||
| if err := r.Create(r.ctx, desired); err != nil { | ||
| return common.FromClientError(err, "failed to create %s rolebinding resource", roleBindingName) | ||
| } | ||
| r.eventRecorder.Eventf(istiocsr, corev1.EventTypeNormal, "Reconciled", "rolebinding resource %s created", roleBindingName) | ||
| } | ||
|
|
||
| return nil | ||
| return common.ApplyResource(r.ctx, r.CtrlClient, r.log, r.eventRecorder, istiocsr, desired, &rbacv1.RoleBinding{}, fieldOwner, | ||
| func(d, e *rbacv1.RoleBinding) bool { return hasObjectChanged(d, e) }, | ||
| ) |
There was a problem hiding this comment.
🧩 Analysis chain
🌐 Web query:
Is roleRefimmutable on KubernetesRoleBinding resources, and does changing it require deleting and recreating the binding instead of patching/applying it?
💡 Result:
Yes, roleRef is immutable on Kubernetes RoleBinding resources. Changing it requires deleting and recreating the binding, as patching or applying an update to roleRef will result in a validation error.[1][2] The official Kubernetes API documentation explicitly states: "This field is immutable."[1] The RBAC documentation confirms: "After you create a binding, you cannot change the Role or ClusterRole that it refers to. If you try to change a binding's roleRef, you get a validation error. If you do want to change the roleRef for a binding, you need to remove the binding object and create a replacement."[2] This behavior is enforced in the validation code and recently formalized with the +k8s:immutable tag in Kubernetes source code (as of early 2026).[3] Multiple sources, including Stack Overflow discussions citing the docs and code, report the exact error: "cannot change roleRef".[4][5]
Citations:
- 1: https://kubernetes.io/docs/reference/kubernetes-api/authorization-resources/role-binding-v1
- 2: https://kubernetes.io/docs/reference/access-authn-authz/rbac/
- 3: Migrate RoleBinding.RoleRef to +k8s:immutable declarative validation kubernetes/kubernetes#137094
- 4: https://stackoverflow.com/questions/69069078/why-kubernetes-doesnt-allow-roleref-of-rolebinding-to-be-updated
- 5: https://stackoverflow.com/questions/59825777/rbac-kubectl-add-patch-to-existing-rolebinding
🏁 Script executed:
# First, let's locate and examine the file mentioned in the review
fd -t f "rbacs.go" --exec wc -l {} \;Repository: openshift/cert-manager-operator
Length of output: 157
🏁 Script executed:
# Now let's read the relevant sections of the file
cat -n pkg/controller/istiocsr/rbacs.go | sed -n '260,300p'Repository: openshift/cert-manager-operator
Length of output: 2800
🏁 Script executed:
# Check the ClusterRoleBinding section mentioned (lines 329-350)
cat -n pkg/controller/istiocsr/rbacs.go | sed -n '320,360p'Repository: openshift/cert-manager-operator
Length of output: 2181
🏁 Script executed:
# Let's also check the common.ApplyResource function to understand its behavior
fd -t f -name "*.go" | xargs grep -l "func.*ApplyResource" | head -5Repository: openshift/cert-manager-operator
Length of output: 306
🏁 Script executed:
# Search for ApplyResource implementation
grep -r "func.*ApplyResource" --include="*.go" | head -10Repository: openshift/cert-manager-operator
Length of output: 281
🏁 Script executed:
# Alternative search with rg
rg "func.*ApplyResource" --type go -A 5Repository: openshift/cert-manager-operator
Length of output: 929
🏁 Script executed:
# Read the ApplyResource implementation
cat -n pkg/controller/common/applier.go | head -150Repository: openshift/cert-manager-operator
Length of output: 2180
🏁 Script executed:
# Check file size first
wc -l pkg/controller/common/applier.goRepository: openshift/cert-manager-operator
Length of output: 111
Preserve the delete/recreate path for RoleBinding.RoleRef drift.
Both createOrApplyRoleBindings (265–269) and createOrApplyRoleBindingForLeases (294–298) use ApplyResource, which patches via Server-Side Apply. Kubernetes treats RoleBinding.roleRef as immutable, so attempting to patch a changed roleRef will fail with a validation error and permanently break reconciliation. The ClusterRoleBinding handler (329–350) already implements the required delete/recreate fallback when roleRef changes; apply the same pattern here.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@pkg/controller/istiocsr/rbacs.go` around lines 265 - 269,
createOrApplyRoleBindings and createOrApplyRoleBindingForLeases currently always
call common.ApplyResource (Server-Side Apply) but RoleBinding.roleRef is
immutable—mirror the ClusterRoleBinding handler's pattern: after calling
common.ApplyResource (with hasObjectChanged from
getRoleBindingObject/hasObjectChanged), detect when the failure or diff
indicates only a roleRef drift (compare desired.RoleRef to the live
object.RoleRef), then delete the existing rbacv1.RoleBinding and recreate the
desired object (preserving owner refs/events/fieldOwner) as a fallback;
implement this delete-then-create flow in both createOrApplyRoleBindings and
createOrApplyRoleBindingForLeases using the same logic used in the
ClusterRoleBinding handler to avoid permanent SSA validation errors.
|
@sebrandon1: The following test failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
common.ApplyResource[T]()helper for SSA-based reconciliation with pluggable drift detection callbackscreateOrApplymethods to use the shared helpercreateOrApplymethods from Create/UpdateWithRetry to SSA via the shared helperistioCSRCreateReconwarning events from migrated methods (SSA is inherently idempotent)Test plan
go test ./pkg/controller/common/...passesgo test ./pkg/controller/istiocsr/...passesgo test ./pkg/controller/trustmanager/...passesgo build ./...succeedsmake lintshows only pre-existing issues (9 total, none introduced)Summary by CodeRabbit