OCPBUGS-85243: Set aws-load-balancer-scheme on public HCP router service#8458
OCPBUGS-85243: Set aws-load-balancer-scheme on public HCP router service#8458typeid wants to merge 1 commit intoopenshift:mainfrom
Conversation
|
Pipeline controller notification For optional jobs, comment This repository is configured in: LGTM mode |
|
Skipping CI for Draft Pull Request. |
|
@typeid: This pull request references Jira Issue OCPBUGS-85243, which is invalid:
Comment The bug has been updated to refer to the pull request using the external bug tracker. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Repository: openshift/coderabbit/.coderabbit.yaml Review profile: CHILL Plan: Enterprise Run ID: 📒 Files selected for processing (5)
✅ Files skipped from review due to trivial changes (1)
🚧 Files skipped from review as they are similar to previous changes (3)
WalkthroughRouter service reconciliation now manages AWS load balancer scheme: for external routers it sets ChangesAWS Router Scheme Handling
Sequence Diagram(s)sequenceDiagram
participant Reconciler
participant KubeAPI as Kubernetes API (Service)
participant AWS as AWS Load Balancer
Reconciler->>KubeAPI: Get router Service
alt platform == AWS and internal == false
Reconciler->>KubeAPI: Remove aws-load-balancer-internal annotation (if present)
Reconciler->>KubeAPI: Set aws-load-balancer-scheme = "internet-facing"
else platform == AWS and internal == true
Reconciler->>KubeAPI: Remove aws-load-balancer-scheme annotation (if present)
Reconciler->>KubeAPI: Ensure aws-load-balancer-internal = "true"
end
Reconciler->>KubeAPI: Update Service
KubeAPI->>AWS: AWS provisions/updates LB based on annotations
AWS-->>KubeAPI: LB status
KubeAPI-->>Reconciler: Service updated/status observed
🎯 3 (Moderate) | ⏱️ ~20 minutes 🚥 Pre-merge checks | ✅ 10 | ❌ 2❌ Failed checks (2 warnings)
✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
e53b716 to
c29c038
Compare
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #8458 +/- ##
==========================================
+ Coverage 37.50% 37.54% +0.03%
==========================================
Files 751 751
Lines 91992 92034 +42
==========================================
+ Hits 34505 34552 +47
+ Misses 54844 54841 -3
+ Partials 2643 2641 -2
... and 4 files with indirect coverage changes
Flags with carried forward coverage won't be shown. Click here to find out more. 🚀 New features to boost your workflow:
|
c29c038 to
8050e6e
Compare
There was a problem hiding this comment.
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
control-plane-operator/controllers/hostedcontrolplane/infra/infra_test.go (1)
629-633:⚠️ Potential issue | 🟠 Major | ⚡ Quick winVerify that
CompareWithFixturegolden files have been regenerated with the new AWS load balancer annotation.The fixture files in
control-plane-operator/controllers/hostedcontrolplane/infra/testdata/forAWS_Public_Route,AWS_PublicAndPrivate_Route,AWS_Public_KAS_LoadBalancer, andAWS_PublicAndPrivate_KAS_LoadBalancertest cases do not currently contain theservice.beta.kubernetes.io/aws-load-balancer-scheme: internet-facingannotation on their router services. However, the test code defines the public router service with this annotation. Whentestutil.CompareWithFixtureruns, it will compare generated resources (which now include the annotation) against these fixture files (which do not), causing test failures. The fixtures must be regenerated to include the new annotation.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@control-plane-operator/controllers/hostedcontrolplane/infra/infra_test.go` around lines 629 - 633, The test failures are due to golden fixtures missing the new AWS annotation on router Services; regenerate the fixtures used by testutil.CompareWithFixture so the files under the testdata for the cases AWS_Public_Route, AWS_PublicAndPrivate_Route, AWS_Public_KAS_LoadBalancer and AWS_PublicAndPrivate_KAS_LoadBalancer include the service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing annotation on the router Service objects; run the same test/golden-generation workflow you used previously (the code path that builds resources via collectResources and validates via testutil.CompareWithFixture) to update the fixture files so the generated resources match the expected fixtures.
🧹 Nitpick comments (1)
control-plane-operator/controllers/hostedcontrolplane/ingress/router_test.go (1)
62-103: ⚡ Quick winConsider adding a reverse-direction test for the external→internal annotation cleanup.
TestReconcileRouterService_SwitchFromInternalToExternalcovers the internal→external path. The symmetric case — where a service object already carriesaws-load-balancer-scheme: internet-facingand is then reconciled asinternal=true— is never explicitly asserted. That covers thedeleteatrouter.goline 32 in a meaningful way (currentlyTestReconcileRouterServiceAnnotationsstarts with an empty service so thedeleteis a no-op there).🧪 Suggested additional test function
+// When switching a service from external to internal +// it should remove the aws-load-balancer-scheme annotation. +func TestReconcileRouterService_SwitchFromExternalToInternal(t *testing.T) { + hcp := &hyperv1.HostedControlPlane{} + hcp.Spec.Platform.Type = hyperv1.AWSPlatform + + svc := &corev1.Service{} + svc.Annotations = map[string]string{ + "service.beta.kubernetes.io/aws-load-balancer-scheme": "internet-facing", + } + + if err := ReconcileRouterService(svc, true /* internal */, false /* cross-zone */, hcp); err != nil { + t.Fatalf("ReconcileRouterService returned error: %v", err) + } + + if _, exists := svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-scheme"]; exists { + t.Fatalf("expected aws-load-balancer-scheme to be removed after switching to internal") + } + if got := svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-internal"]; got != "true" { + t.Fatalf("expected aws-load-balancer-internal to be 'true', got %q", got) + } +}🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@control-plane-operator/controllers/hostedcontrolplane/ingress/router_test.go` around lines 62 - 103, Add a symmetric test that starts with a Service already annotated "service.beta.kubernetes.io/aws-load-balancer-scheme":"internet-facing", call ReconcileRouterService(svc, true /* internal */, true /* cross-zone */, hcp) and assert that the scheme annotation is removed and the "service.beta.kubernetes.io/aws-load-balancer-internal" annotation is now present (e.g. set to "true"); name it TestReconcileRouterService_SwitchFromExternalToInternal to mirror TestReconcileRouterService_SwitchFromInternalToExternal and ensure this exercises the code path in ReconcileRouterService that deletes the scheme annotation and sets the internal annotation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Outside diff comments:
In `@control-plane-operator/controllers/hostedcontrolplane/infra/infra_test.go`:
- Around line 629-633: The test failures are due to golden fixtures missing the
new AWS annotation on router Services; regenerate the fixtures used by
testutil.CompareWithFixture so the files under the testdata for the cases
AWS_Public_Route, AWS_PublicAndPrivate_Route, AWS_Public_KAS_LoadBalancer and
AWS_PublicAndPrivate_KAS_LoadBalancer include the
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facing annotation
on the router Service objects; run the same test/golden-generation workflow you
used previously (the code path that builds resources via collectResources and
validates via testutil.CompareWithFixture) to update the fixture files so the
generated resources match the expected fixtures.
---
Nitpick comments:
In
`@control-plane-operator/controllers/hostedcontrolplane/ingress/router_test.go`:
- Around line 62-103: Add a symmetric test that starts with a Service already
annotated
"service.beta.kubernetes.io/aws-load-balancer-scheme":"internet-facing", call
ReconcileRouterService(svc, true /* internal */, true /* cross-zone */, hcp) and
assert that the scheme annotation is removed and the
"service.beta.kubernetes.io/aws-load-balancer-internal" annotation is now
present (e.g. set to "true"); name it
TestReconcileRouterService_SwitchFromExternalToInternal to mirror
TestReconcileRouterService_SwitchFromInternalToExternal and ensure this
exercises the code path in ReconcileRouterService that deletes the scheme
annotation and sets the internal annotation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: e3090993-8f32-4a26-a773-3c7e60c6300a
📒 Files selected for processing (3)
control-plane-operator/controllers/hostedcontrolplane/infra/infra_test.gocontrol-plane-operator/controllers/hostedcontrolplane/ingress/router.gocontrol-plane-operator/controllers/hostedcontrolplane/ingress/router_test.go
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@control-plane-operator/controllers/hostedcontrolplane/ingress/router.go`:
- Around line 30-37: The svc annotation handling in router.go (inside the
reconcile that sets svc.Annotations based on the internal boolean) only adds the
desired AWS annotation but does not remove the opposite key, which leaves stale
mutually-exclusive keys (service.beta.kubernetes.io/aws-load-balancer-internal
vs service.beta.kubernetes.io/aws-load-balancer-scheme) after an
internal↔external flip; update the logic in that reconcile block (where svc and
internal are used) so that when internal is true you set/reset
"service.beta.kubernetes.io/aws-load-balancer-internal"="true" and explicitly
delete "service.beta.kubernetes.io/aws-load-balancer-scheme", and when internal
is false you set
"service.beta.kubernetes.io/aws-load-balancer-scheme"="internet-facing" and
explicitly delete "service.beta.kubernetes.io/aws-load-balancer-internal" to
ensure no stale mutually-exclusive annotations remain.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: fbda403e-5c8f-4819-ae99-29888467b063
⛔ Files ignored due to path filters (2)
control-plane-operator/controllers/hostedcontrolplane/infra/testdata/zz_fixture_TestReconcileInfrastructure_AWS_PublicAndPrivate_Route.yamlis excluded by!**/testdata/**control-plane-operator/controllers/hostedcontrolplane/infra/testdata/zz_fixture_TestReconcileInfrastructure_AWS_Public_Route.yamlis excluded by!**/testdata/**
📒 Files selected for processing (3)
control-plane-operator/controllers/hostedcontrolplane/infra/infra_test.gocontrol-plane-operator/controllers/hostedcontrolplane/ingress/router.gocontrol-plane-operator/controllers/hostedcontrolplane/ingress/router_test.go
🚧 Files skipped from review as they are similar to previous changes (2)
- control-plane-operator/controllers/hostedcontrolplane/ingress/router_test.go
- control-plane-operator/controllers/hostedcontrolplane/infra/infra_test.go
8050e6e to
b891f34
Compare
|
/jira refresh |
|
@typeid: This pull request references Jira Issue OCPBUGS-85243, which is invalid:
Comment DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
There was a problem hiding this comment.
♻️ Duplicate comments (1)
control-plane-operator/controllers/hostedcontrolplane/ingress/router.go (1)
30-38:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove stale mutually-exclusive AWS annotations during reconciliation.
Line 31 and Line 38 set opposite-mode annotations, but neither branch removes the other key. This can leave conflicting annotations on pre-existing Services and cause incorrect NLB exposure after mode changes.
Proposed fix
svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-type"] = "nlb" if internal { + delete(svc.Annotations, "service.beta.kubernetes.io/aws-load-balancer-scheme") svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-internal"] = "true" } else { + delete(svc.Annotations, "service.beta.kubernetes.io/aws-load-balancer-internal") // The AWS Load Balancer Controller (used on EKS) defaults to scheme=internal: // https://kubernetes-sigs.github.io/aws-load-balancer-controller/v2.4/guide/service/annotations/ // The in-tree AWS cloud provider (used on OpenShift) defaults to internet-facing: // https://cloud-provider-aws.sigs.k8s.io/service_controller/ // Set the annotation explicitly so the public router works on both. svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-scheme"] = "internet-facing" }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@control-plane-operator/controllers/hostedcontrolplane/ingress/router.go` around lines 30 - 38, The reconciliation currently sets svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-internal"] = "true" in the internal branch and svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-scheme"] = "internet-facing" in the public branch but never removes the opposite key, leaving conflicting annotations on pre-existing Services; update the code in router.go (the block that mutates svc.Annotations) so that when setting the internal annotation you also delete svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-scheme"], and when setting the scheme you delete svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-internal"], ensuring the Service has only the correct AWS annotation after reconciliation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@control-plane-operator/controllers/hostedcontrolplane/ingress/router.go`:
- Around line 30-38: The reconciliation currently sets
svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-internal"] =
"true" in the internal branch and
svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-scheme"] =
"internet-facing" in the public branch but never removes the opposite key,
leaving conflicting annotations on pre-existing Services; update the code in
router.go (the block that mutates svc.Annotations) so that when setting the
internal annotation you also delete
svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-scheme"], and when
setting the scheme you delete
svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-internal"],
ensuring the Service has only the correct AWS annotation after reconciliation.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Repository YAML (base), Central YAML (inherited)
Review profile: CHILL
Plan: Enterprise
Run ID: 73e026e2-d497-42d1-8885-c0b1c9286c69
⛔ Files ignored due to path filters (2)
control-plane-operator/controllers/hostedcontrolplane/infra/testdata/zz_fixture_TestReconcileInfrastructure_AWS_PublicAndPrivate_Route.yamlis excluded by!**/testdata/**control-plane-operator/controllers/hostedcontrolplane/infra/testdata/zz_fixture_TestReconcileInfrastructure_AWS_Public_Route.yamlis excluded by!**/testdata/**
📒 Files selected for processing (3)
control-plane-operator/controllers/hostedcontrolplane/infra/infra_test.gocontrol-plane-operator/controllers/hostedcontrolplane/ingress/router.gocontrol-plane-operator/controllers/hostedcontrolplane/ingress/router_test.go
|
/jira refresh |
|
@typeid: This pull request references Jira Issue OCPBUGS-85243, which is valid. The bug has been moved to the POST state. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test conformance-e2e |
|
@typeid: This pull request references Jira Issue OCPBUGS-85243, which is valid. 3 validation(s) were run on this bug
DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test e2e-conformance |
|
/test e2e-aks e2e-aws e2e-aws-upgrade-hypershift-operator e2e-azure-self-managed e2e-kubevirt-aws-ovn-reduced e2e-v2-aws |
|
/retest |
1 similar comment
|
/retest |
AI Test Failure AnalysisJob: Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6 |
AI Test Failure AnalysisJob: Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6 |
AI Test Failure AnalysisJob: Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6 |
Test Resultse2e-aws
e2e-aks
|
|
/retest |
1 similar comment
|
/retest |
jparrill
left a comment
There was a problem hiding this comment.
Review with two suggestions -- both non-blocking. The fix is correct and safe for merge as-is.
| // The in-tree AWS cloud provider (used on OpenShift) defaults to internet-facing: | ||
| // https://cloud-provider-aws.sigs.k8s.io/service_controller/ | ||
| // Set the annotation explicitly so the public router works on both. | ||
| svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-scheme"] = "internet-facing" |
There was a problem hiding this comment.
suggestion (non-blocking): The PR description mentions "clean up mutually exclusive annotations" but the production code only adds -- it never deletes the conflicting annotation. While the current architecture uses separate Service objects for public/private (so the transition scenario can't happen today), defensive cleanup costs nothing and follows the pattern already used in oauth/service.go:77:
if internal {
svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-internal"] = "true"
delete(svc.Annotations, "service.beta.kubernetes.io/aws-load-balancer-scheme")
} else {
svc.Annotations["service.beta.kubernetes.io/aws-load-balancer-scheme"] = "internet-facing"
delete(svc.Annotations, "service.beta.kubernetes.io/aws-load-balancer-internal")
}This would also make the code match what the infra_test.go fixture helper already expects (the privateService helper does delete(s.Annotations, "...scheme")).
There was a problem hiding this comment.
Makes sense, thanks! Updated.
|
|
||
| // When reconciling an external (non-internal) AWS router service | ||
| // it should set aws-load-balancer-scheme to internet-facing. | ||
| func TestReconcileRouterService_WhenExternalAWS_ItShouldSetInternetFacingScheme(t *testing.T) { |
There was a problem hiding this comment.
nit: Good test -- gomega + Gherkin naming, positive and negative assertions. Consider adding a complementary subtest that pre-populates the conflicting annotation to verify cleanup:
t.Run("When switching from internal to external it should remove the internal annotation", func(t *testing.T) {
g := NewGomegaWithT(t)
svc := &corev1.Service{}
svc.Annotations = map[string]string{
"service.beta.kubernetes.io/aws-load-balancer-internal": "true",
}
err := ReconcileRouterService(svc, false, true, hcp)
g.Expect(err).ToNot(HaveOccurred())
g.Expect(svc.Annotations).To(HaveKeyWithValue(
"service.beta.kubernetes.io/aws-load-balancer-scheme", "internet-facing"))
g.Expect(svc.Annotations).ToNot(HaveKey(
"service.beta.kubernetes.io/aws-load-balancer-internal"))
})This would pair with the defensive delete() calls suggested in router.go.
There was a problem hiding this comment.
Added a test case for the switching back and forth.
b891f34 to
e1bf047
Compare
The AWS Load Balancer Controller (used on EKS) defaults to scheme=internal when the aws-load-balancer-scheme annotation is not set, while the in-tree AWS cloud provider (used on OpenShift) defaults to internet-facing. This causes the public HCP router NLB to be created as internal on EKS, making KAS and OAuth unreachable from outside the VPC for PublicAndPrivate clusters. Set the annotation explicitly on the public router service so it works on both OpenShift and EKS management clusters. Also clean up mutually exclusive annotations (aws-load-balancer-internal vs aws-load-balancer-scheme) when switching between internal and external. Signed-off-by: Claudio Busse <cbusse@redhat.com> Commit-Message-Assisted-by: Claude (via Claude Code)
e1bf047 to
01f2912
Compare
|
I have all the evidence needed. Here is the analysis: Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryThis is a CI infrastructure failure, not a code or test problem. The Prow CI pod for the Root CauseThe CI pod could not be scheduled because the
Of the 73 nodes in the cluster, all were excluded:
This means the small pool of nodes matching Recommendations
Evidence
|
|
/test ci/prow/security ci/prow/e2e-aks ci/prow/e2e-aws ci/prow/e2e-aws-upgrade-hypershift-operator ci/prow/e2e-azure-self-managed ci/prow/e2e-kubevirt-aws-ovn-reduced ci/prow/e2e-v2-aws |
|
/test e2e-conformance |
|
/test security e2e-aks e2e-aws e2e-aws-upgrade-hypershift-operator e2e-azure-self-managed e2e-kubevirt-aws-ovn-reduced e2e-v2-aws |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: jparrill, typeid The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
@typeid: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
service.beta.kubernetes.io/aws-load-balancer-scheme: internet-facingon the public HCP router service so it creates an internet-facing NLB on both OpenShift (in-tree cloud provider) and EKS (AWS Load Balancer Controller) management clustersaws-load-balancer-internalvsaws-load-balancer-scheme) when switching between internal and externalProblem
The AWS Load Balancer Controller (used on EKS) defaults to
scheme=internalwhenaws-load-balancer-schemeis not set (docs). The in-tree AWS cloud provider (used on OpenShift) defaults tointernet-facing(docs).For
PublicAndPrivateclusters usingtype: Routewith explicit hostnames, HyperShift deploys a dedicated HCP router with a public LoadBalancer service. Without the scheme annotation, this NLB is created as internal on EKS, making KAS and OAuth unreachable from outside the VPC.Test plan
internet-facingannotation on public services🤖 Generated with Claude Code
Summary by CodeRabbit
Bug Fixes
Tests