CM-966: Implement centralized TLS profile fetching and application without istio-csr changes#416
CM-966: Implement centralized TLS profile fetching and application without istio-csr changes#416arun717 wants to merge 24 commits intoopenshift:masterfrom
Conversation
Add new package to map OpenShift TLS security profiles to cert-manager CLI flags. This enables centralized TLS configuration from the cluster APIServer resource. Features: - EffectiveSpec resolves TLS profiles (Old, Intermediate, Modern, Custom) - Defaults to Intermediate profile when nil or empty - CertManagerWebhookTLSArgs generates flags for webhook main + metrics listeners - CertManagerOperandMetricsTLSArgs generates flags for controller/cainjector metrics - Converts OpenSSL cipher names to IANA format using library-go Note: Curve preferences are not yet configurable due to upstream cert-manager limitations - operands inherit Go's default curve ordering. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Implement deployment hook to fetch and apply TLS configuration from the cluster APIServer resource to cert-manager operand deployments. The hook: - Fetches apiserver.config.openshift.io/cluster resource - Resolves TLS profile using tlsprofile.EffectiveSpec - Applies appropriate TLS flags based on deployment type: * cert-manager-webhook: main TLS + metrics TLS flags * cert-manager (controller): metrics TLS flags only * cert-manager-cainjector: metrics TLS flags only - Merges TLS args into existing container args with override semantics - Skips deployments with != 1 container Includes comprehensive unit tests covering: - All TLS profile types (Old, Intermediate, Modern, Custom) - Nil profile handling (defaults to Intermediate) - Flag override behavior - Multi-container deployment skipping - Unknown deployment handling - Error handling when APIServer resource not found Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Register withClusterTLSProfileFromAPIServer hook in the generic deployment controller for cert-manager operands. The hook is only registered when infraInformers.Applicable() returns true (OpenShift clusters), ensuring the operator remains platform-agnostic and doesn't break on non-OpenShift environments. Adds APIServer informer to the controller's watch list to trigger reconciliation when the cluster TLS profile changes. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add permissions to read apiservers resources from config.openshift.io API group. This is required for the TLS profile hook to fetch the cluster APIServer configuration. Changes: - Add apiservers to kubebuilder RBAC annotation in certmanager_controller.go - Update config/rbac/role.yaml with generated permissions - Update bundle CSV with generated permissions Without these permissions, the operator cannot fetch the cluster TLS profile and will fail to configure cert-manager operands correctly on OpenShift clusters. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add documentation for the --certificate-request-minimum-backoff-duration controller flag in the CertManagerSpec ControllerConfig examples. Also add the flag to the list of supported controller args in the validation hook to allow users to configure initial backoff duration when CertificateRequests fail. This flag controls the cert-manager trigger controller's backoff behavior, with backoff doubling on consecutive failures up to 32h. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Add period to updateWebhookClientConfig function comment to comply with godoc style guidelines. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Move tls_profile_hook.go and tls_profile_hook_test.go from pkg/controller/certmanager/ back to pkg/controller/deployment/ to fix package mismatch and compilation errors. Changes: - Move TLS profile hook files to deployment package - Export WithClusterTLSProfileFromAPIServer for use by certmanager package - Add tls_helpers.go with deployment name constants and mergeContainerArgs - Update generic_deployment_controller.go to import and use deploymentpkg - Update all test references to use capitalized function name This fixes the "found packages certmanager and deployment" error that occurred after the directory structure was changed during rebase. All unit tests now pass ✅ - pkg/controller/deployment: 94.4% coverage - pkg/controller/certmanager: 41.1% coverage - pkg/tlsprofile: 63.6% coverage Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Change the RBAC permissions for the apiservers resource from full mutating verbs (get;list;watch;create;update;patch;delete) to read-only (get;list;watch). The operator only needs to read the cluster APIServer resource to fetch the TLS security profile. It does not need to create or modify the APIServer resource, so granting mutating permissions violates the principle of least privilege. Changes: - Split the config.openshift.io RBAC marker into two separate rules - apiservers: get;list;watch (read-only) - certmanagers, clusteroperators, clusteroperators/status, infrastructures: get;list;watch;create;update;patch;delete (full access) - Regenerated config/rbac/role.yaml - Regenerated bundle/manifests CSV - Regenerated CRD manifests All unit tests pass ✅ Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Expose tls.Config construction from a resolved TLSProfileSpec so operator code (and future callers) can align outbound HTTPS with the cluster TLS security profile: minimum version, cipher suites, and explicit curve order. Signed-off-by: Super User <root@amaurya-thinkpadp1gen7.bengluru.csb> Made-with: Cursor
|
@arun717: This pull request references CM-966 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
WalkthroughAdds cluster TLS profile support for cert-manager: new tlsprofile utilities and client TLS builder, a deployment hook that applies TLS CLI args from the APIServer resource, wiring in the deployment controller and manager, RBAC to read APIServer, context-threading updates, and related unit/e2e/test and dependency bumps. ChangesCluster TLS profile integration
Sequence Diagram(s)sequenceDiagram
participant Reconciler as Cert-Manager Reconciler
participant DepCtrl as Generic Deployment Controller
participant Hook as TLS Profile Hook
participant Lister as APIServer Informer Lister
participant TLSProf as TLSProfile Utilities
Reconciler->>DepCtrl: Reconcile deployment
alt infra informers available
DepCtrl->>Hook: Invoke WithClusterTLSProfileFromAPIServer
Hook->>Lister: Lister.Get("cluster")
Lister-->>Hook: APIServer resource
Hook->>TLSProf: EffectiveSpec(TLSSecurityProfile)
TLSProf-->>Hook: TLSProfileSpec
Hook->>TLSProf: CertManagerWebhookTLSArgs / OperandMetricsTLSArgs
TLSProf-->>Hook: TLS CLI args
Hook->>Hook: mergeContainerArgs(existingArgs, extraArgs)
Hook-->>DepCtrl: Updated Deployment (container args)
end
DepCtrl-->>Reconciler: Reconciliation complete
Estimated code review effort🎯 4 (Complex) | ⏱️ ~50 minutes ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
pkg/controller/istiocsr/install_istiocsr.go (1)
55-58:⚠️ Potential issue | 🟠 MajorThread the request context through the final update.
The
ctxparameter is accepted at line 12, but the finalUpdateWithRetryat line 56 still usesr.ctx. This drops the controller-runtime deadline and cancellation for the last write, negating the benefit of accepting a request-scoped context.Proposed fix
- if err := r.UpdateWithRetry(r.ctx, istiocsr); err != nil { + if err := r.UpdateWithRetry(ctx, istiocsr); err != nil {🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@pkg/controller/istiocsr/install_istiocsr.go` around lines 55 - 58, The final UpdateWithRetry call is using the controller's background context (r.ctx) instead of the request-scoped ctx parameter, losing deadline/cancellation; change the call to use the passed-in ctx (i.e., call r.UpdateWithRetry(ctx, istiocsr) when addProcessedAnnotation(istiocsr) triggers) so the update honors the request context while keeping the same error wrapping and message.test/e2e/utils_test.go (1)
1567-1574:⚠️ Potential issue | 🟡 MinorPropagate the context into the HTTP request too.
Right now
ctxis only used for the APIServer lookup. The actual GET can keep running past the polling context, which makes cancellation less predictable.🔧 Proposed fix
- req, err := http.NewRequest("GET", url, nil) + req, err := http.NewRequestWithContext(ctx, http.MethodGet, url, nil)🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@test/e2e/utils_test.go` around lines 1567 - 1574, The HTTP request needs to carry the test context so cancellation propagates: replace http.NewRequest("GET", url, nil) with creating the request with ctx (use http.NewRequestWithContext or attach ctx via req.WithContext(ctx)) before calling client.Do(req) so the polling context used for APIServer lookup also cancels the actual GET; ensure the same error handling around request creation remains and pass that context-aware req into client.Do.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@pkg/tlsprofile/tlsprofile.go`:
- Around line 38-41: The current default branch silently returns the
Intermediate profile (using
configv1.TLSProfiles[configv1.TLSProfileIntermediateType]) for unknown
TLSSecurityProfile.Type values, which can mask misconfiguration; instead, in the
function handling TLSSecurityProfile.Type (referencing TLSSecurityProfile.Type,
configv1.TLSProfiles and configv1.TLSProfileIntermediateType), validate the
incoming Type and return a clear error (or an explicit validation error) when
the Type is unrecognized rather than falling back silently—include the unknown
Type value in the error message so callers can detect and handle
invalid/misconfigured TLS profile types.
---
Outside diff comments:
In `@pkg/controller/istiocsr/install_istiocsr.go`:
- Around line 55-58: The final UpdateWithRetry call is using the controller's
background context (r.ctx) instead of the request-scoped ctx parameter, losing
deadline/cancellation; change the call to use the passed-in ctx (i.e., call
r.UpdateWithRetry(ctx, istiocsr) when addProcessedAnnotation(istiocsr) triggers)
so the update honors the request context while keeping the same error wrapping
and message.
In `@test/e2e/utils_test.go`:
- Around line 1567-1574: The HTTP request needs to carry the test context so
cancellation propagates: replace http.NewRequest("GET", url, nil) with creating
the request with ctx (use http.NewRequestWithContext or attach ctx via
req.WithContext(ctx)) before calling client.Do(req) so the polling context used
for APIServer lookup also cancels the actual GET; ensure the same error handling
around request creation remains and pass that context-aware req into client.Do.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e340c8bc-5fba-4d0d-8a12-df126917250b
⛔ Files ignored due to path filters (1)
test/go.sumis excluded by!**/*.sum
📒 Files selected for processing (22)
api/operator/v1alpha1/certmanager_types.gobundle/manifests/cert-manager-operator.clusterserviceversion.yamlbundle/manifests/operator.openshift.io_certmanagers.yamlconfig/crd/bases/operator.openshift.io_certmanagers.yamlconfig/rbac/role.yamlpkg/controller/certmanager/certmanager_controller.gopkg/controller/certmanager/generic_deployment_controller.gopkg/controller/deployment/tls_helpers.gopkg/controller/deployment/tls_profile_hook.gopkg/controller/deployment/tls_profile_hook_test.gopkg/controller/istiocsr/controller.gopkg/controller/istiocsr/controller_test.gopkg/controller/istiocsr/install_instiocsr_test.gopkg/controller/istiocsr/install_istiocsr.gopkg/operator/setup_manager.gopkg/tlsprofile/client_config.gopkg/tlsprofile/client_config_test.gopkg/tlsprofile/tlsprofile.gopkg/tlsprofile/tlsprofile_test.gotest/e2e/issuer_selfsigned_ca_test.gotest/e2e/utils_test.gotest/go.mod
…intermediate type.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/tlsprofile/tlsprofile.go`:
- Around line 17-31: The built-in-profile branches currently return shallow
copies (spec := *configv1.TLSProfiles[...]) that still share the Ciphers backing
array; change those branches in EffectiveSpec so you deep-copy the spec before
returning — e.g., copy the struct into a new variable and replace spec.Ciphers
with a newly allocated slice containing copied values (same approach used in the
custom branch). Apply this to the branches referencing
configv1.TLSProfileOldType, configv1.TLSProfileIntermediateType, and
configv1.TLSProfileModernType so callers can safely mutate spec.Ciphers without
altering the global configv1.TLSProfiles.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 1e16aa2d-ea24-4a65-b766-ee05e57c82c9
📒 Files selected for processing (2)
pkg/tlsprofile/tlsprofile.gopkg/tlsprofile/tlsprofile_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
pkg/tlsprofile/tlsprofile_test.go (1)
80-97: ⚡ Quick winAdd a direct unit test for
CertManagerOperandMetricsTLSArgs.This file validates webhook TLS args, but the separate operand-metrics helper is still untested. A small test here would prevent drift between the two flag builders.
Suggested test addition
+func TestCertManagerOperandMetricsTLSArgs_joinsCiphers(t *testing.T) { + spec := &configv1.TLSProfileSpec{ + Ciphers: []string{"ECDHE-RSA-AES128-GCM-SHA256"}, + MinTLSVersion: configv1.VersionTLS12, + } + args := CertManagerOperandMetricsTLSArgs(spec) + argMap := map[string]string{} + for _, a := range args { + parts := strings.SplitN(a, "=", 2) + if len(parts) != 2 { + t.Fatalf("bad arg %q", a) + } + argMap[parts[0]] = parts[1] + } + if argMap["--metrics-tls-min-version"] != string(configv1.VersionTLS12) { + t.Fatalf("unexpected metrics min version: %q", argMap["--metrics-tls-min-version"]) + } + if !strings.Contains(argMap["--metrics-tls-cipher-suites"], "TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256") { + t.Fatalf("unexpected metrics tls ciphers: %q", argMap["--metrics-tls-cipher-suites"]) + } +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/tlsprofile/tlsprofile_test.go` around lines 80 - 97, Add a new unit test in tlsprofile_test.go analogous to TestCertManagerWebhookTLSArgs that directly exercises CertManagerOperandMetricsTLSArgs: construct a configv1.TLSProfileSpec with Ciphers and MinTLSVersion, call CertManagerOperandMetricsTLSArgs(spec), parse the returned args into a map like the existing test, and assert that the generated --tls-cipher-suites (or the operand-metrics-specific flag produced by CertManagerOperandMetricsTLSArgs) contains the expected normalized cipher name (e.g. TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256); ensure the test name clearly references CertManagerOperandMetricsTLSArgs and mirrors the parsing/assertion style used in TestCertManagerWebhookTLSArgs.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@pkg/tlsprofile/tlsprofile.go`:
- Around line 47-66: Both exported helpers CertManagerWebhookTLSArgs and
CertManagerOperandMetricsTLSArgs currently dereference the incoming spec and
will panic on a nil input; add a nil guard at the top of each function (e.g., if
spec == nil) and return an empty []string{} immediately to avoid a runtime panic
before calling joinIANACiphers(spec.Ciphers) or converting spec.MinTLSVersion to
string. Ensure this nil-check is placed before any use of spec so both functions
safely handle nil callers.
---
Nitpick comments:
In `@pkg/tlsprofile/tlsprofile_test.go`:
- Around line 80-97: Add a new unit test in tlsprofile_test.go analogous to
TestCertManagerWebhookTLSArgs that directly exercises
CertManagerOperandMetricsTLSArgs: construct a configv1.TLSProfileSpec with
Ciphers and MinTLSVersion, call CertManagerOperandMetricsTLSArgs(spec), parse
the returned args into a map like the existing test, and assert that the
generated --tls-cipher-suites (or the operand-metrics-specific flag produced by
CertManagerOperandMetricsTLSArgs) contains the expected normalized cipher name
(e.g. TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256); ensure the test name clearly
references CertManagerOperandMetricsTLSArgs and mirrors the parsing/assertion
style used in TestCertManagerWebhookTLSArgs.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: 54a08f14-de28-4f16-b6fe-502a900f8a11
📒 Files selected for processing (2)
pkg/tlsprofile/tlsprofile.gopkg/tlsprofile/tlsprofile_test.go
|
@arun717 i wonder if we should take tlsAdherence into consideration after re-reading the EP. |
There was a problem hiding this comment.
🧹 Nitpick comments (2)
pkg/tlsprofile/tlsprofile_test.go (2)
75-77: ⚡ Quick winStrengthen custom-profile assertions beyond count only.
len(spec.Ciphers) == 2can pass even if cipher content/order or min TLS version is wrong. Assert exact values to catch regressions earlier.Proposed change
if len(spec.Ciphers) != 2 { t.Fatalf("cipher count: %d", len(spec.Ciphers)) } + if spec.Ciphers[0] != "TLS_AES_128_GCM_SHA256" || spec.Ciphers[1] != "ECDHE-RSA-AES128-GCM-SHA256" { + t.Fatalf("unexpected ciphers: %#v", spec.Ciphers) + } + if spec.MinTLSVersion != configv1.VersionTLS12 { + t.Fatalf("unexpected min TLS version: %q", spec.MinTLSVersion) + } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/tlsprofile/tlsprofile_test.go` around lines 75 - 77, The test currently only checks len(spec.Ciphers) which misses content/order and other fields; update the test that inspects spec (e.g., spec.Ciphers and spec.MinTLSVersion) to assert exact expected values and order (compare the cipher slice to the expected slice and compare spec.MinTLSVersion to the expected constant) so the test fails on wrong ciphers or TLS version—locate the assertions around spec.Ciphers in tlsprofile_test.go and replace the length-only check with explicit equality checks for the slice contents and the MinTLSVersion field.
15-32: ⚡ Quick winIsolate potential global mutation to keep failures contained.
At Line 21, if deep-copy behavior regresses, this test can mutate shared
TLSProfilesstate and affect later tests. Add cleanup to restore the original value.Proposed change
func TestEffectiveSpec_builtinDeepCopiesCiphers(t *testing.T) { ref := configv1.TLSProfiles[configv1.TLSProfileIntermediateType] if len(ref.Ciphers) == 0 { t.Fatal("expected non-empty intermediate cipher list") } origFirst := ref.Ciphers[0] + t.Cleanup(func() { + ref.Ciphers[0] = origFirst + }) spec, err := EffectiveSpec(&configv1.TLSSecurityProfile{Type: configv1.TLSProfileIntermediateType}) if err != nil { t.Fatal(err) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@pkg/tlsprofile/tlsprofile_test.go` around lines 15 - 32, The test mutates a cipher in spec which may leak into global TLSProfiles (ref), so capture the original value (origFirst) then immediately schedule a cleanup that restores ref.Ciphers[0] = origFirst using defer so the global state is reset even on failures; locate the setup that captures origFirst and add a defer to reassign ref.Ciphers[0] back to origFirst, keeping the rest of the assertions (EffectiveSpec, spec/spec2) unchanged.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Nitpick comments:
In `@pkg/tlsprofile/tlsprofile_test.go`:
- Around line 75-77: The test currently only checks len(spec.Ciphers) which
misses content/order and other fields; update the test that inspects spec (e.g.,
spec.Ciphers and spec.MinTLSVersion) to assert exact expected values and order
(compare the cipher slice to the expected slice and compare spec.MinTLSVersion
to the expected constant) so the test fails on wrong ciphers or TLS
version—locate the assertions around spec.Ciphers in tlsprofile_test.go and
replace the length-only check with explicit equality checks for the slice
contents and the MinTLSVersion field.
- Around line 15-32: The test mutates a cipher in spec which may leak into
global TLSProfiles (ref), so capture the original value (origFirst) then
immediately schedule a cleanup that restores ref.Ciphers[0] = origFirst using
defer so the global state is reset even on failures; locate the setup that
captures origFirst and add a defer to reassign ref.Ciphers[0] back to origFirst,
keeping the rest of the assertions (EffectiveSpec, spec/spec2) unchanged.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e6b9e61b-0839-459e-ab45-50a5f6c9970d
📒 Files selected for processing (2)
pkg/tlsprofile/tlsprofile.gopkg/tlsprofile/tlsprofile_test.go
✅ Files skipped from review due to trivial changes (1)
- pkg/tlsprofile/tlsprofile.go
…d library-go with release-4.22
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: arun717 The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
180bb2c to
04d6f88
Compare
…v0.35.1 and running make bundle locally.
|
@arun717: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Same description as [https://github.com//pull/409]
Removed the changes dependent on istio-csr upstream changes.
Summary by CodeRabbit
New Features
Improvements
Documentation
Tests