CNTRLPLANE-2207: Upgrade to CAPI 1.11#7590
Conversation
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughThe pull request makes three configuration and dependency-related updates. It adds a new staticcheck exclusion rule in 🚥 Pre-merge checks | ✅ 10✅ Passed checks (10 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
|
Skipping CI for Draft Pull Request. |
|
@clebs: This pull request references CNTRLPLANE-2207 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
@clebs: This pull request references CNTRLPLANE-2207 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
/test e2e-aws-minimal verify |
|
/test e2e-aks |
AI Test Failure AnalysisJob: Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6 |
|
/test e2e-aws e2e-aks |
1 similar comment
|
/test e2e-aws e2e-aks |
|
/test e2e-aks |
|
/test e2e-aws e2e-aks |
|
/test e2e-aws-v2 e2e-aks-4-22 e2e-aws-4-22 e2e-kubevirt-aws-ovn-reduced e2e-aws-upgrade-hypershift-operator |
|
/test all |
|
@clebs: This pull request references CNTRLPLANE-2207 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target either version "5.0." or "openshift-5.0.", but it targets "openshift-4.22" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
|
Approving the API changes as they are gomod/vendor only updates. /approve |
|
/retest-required |
|
/approve |
|
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: clebs, everettraven, jparrill The full list of commands accepted by this bot can be found here. The pull request process is described here DetailsNeeds approval from an approver in each of these files:
Approvers can indicate their approval by writing |
|
/test e2e-gke e2e-v2-gke |
|
As we discussed, I'll continue working on |
| return err | ||
| } | ||
|
|
||
| imageOverride, err = backwardcompat.GetBackwardCompatibleCAPIImage(cpContext, pullSecret, r.RegistryProvider.GetReleaseProvider(), releaseVersion, ImageStreamCAPI) |
There was a problem hiding this comment.
can this func declaration be removed: GetBackwardCompatibleCAPIImage?
| Name: sa.Name, | ||
| Namespace: sa.Namespace, | ||
| }, | ||
| { |
There was a problem hiding this comment.
can you point me to where the provider need access to this?
Should we instead create a dedicated targeted clusterRole with only what's needed here hypershift/control-plane-operator/controllers/hostedcontrolplane/v2/assets/capi-provider?
There was a problem hiding this comment.
It' s been a while but IIRC I added this so the provider would have access to CRDs to be able to do conversions.
I will look into creating a separate role + binding + SA for this.
There was a problem hiding this comment.
I have given a second look to this and the Role we are binding capi-provider to (hypershift-cluster-api) is already minimal. It only grants get, watch and list of CRDs and is only used by capi-manager SA and capi-provider SA.
I am not sure we would benefit of having a dedicated role here.
- Upgrade all CAPI modules to 1.11. - Update changed import paths - Silence depreciation linter errors - Update make cluster-api goal.
CAPI 1.11 defaults to v1beta2 storage. Override to v1beta1 for HyperShift compatibility. Signed-off-by: Borja Clemente <bclement@redhat.com>
Signed-off-by: Borja Clemente <bclement@redhat.com>
Signed-off-by: Borja Clemente <bclement@redhat.com>
Remove the temporary hardocded CAPI image overrides now that hypershift supports CAPI 1.11 Signed-off-by: Borja Clemente <bclement@redhat.com>
For conversion to work, the CAPI provider needs to be able to access CRDs cluster-wide to list available versions. Signed-off-by: Borja Clemente <bclement@redhat.com>
Update TestScaleFromZero to support both CAPI 1.11+ native Status.Capacity and pre-1.11 annotation-based capacity information. In CAPI 1.11, cluster-api-provider-aws now populates Status.Capacity directly on AWSMachineTemplate, making the workaround annotations unnecessary. The HyperShift controller detects this and skips setting annotations when Status.Capacity is present. The test now: - First checks AWSMachineTemplate.Status.Capacity (CAPI 1.11+) - Falls back to MachineDeployment annotations (pre-CAPI 1.11) - Logs the capacity source for debugging This makes the test backward compatible and fixes the failure in PR openshift#7590.
Setting the MinReadySeconds default to 0 explicitly on the nodepool controller causes infinite reconciliaiton due to a lossy v1beta1 -> v1beta2 conversion and flipping value between 0 and nil. Removing the explicit setting should not have any other side effect since the zero value of the field is the same. Signed-off-by: Borja Clemente <bclement@redhat.com>
…mplete check Replace the 1-second sleep workaround for OCPBUGS-77922 with a deterministic cross-check of the v1beta2 conversion-data annotation. In CAPI v1.11+, the v1beta1 UpdatedReplicas field maps from deprecated.v1beta1.updatedReplicas rather than the native upToDateReplicas, which can transiently disagree. When v1beta1 fields indicate completion, we now verify against the authoritative v1beta2 status in the conversion-data annotation before declaring complete. Jira: https://issues.redhat.com/browse/OCPBUGS-77922 Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
The word uptodate and all its casing variants are a false positive on codespell. They are defined as such in CAPI. Signed-off-by: Borja Clemente <bclement@redhat.com>
| // Initialize CAPI v1beta1 conversion support. | ||
| // CAPI v1beta1 types need an apiVersionGetter to convert object references | ||
| // from v1beta2 (Hub) ContractVersionedObjectReference back to v1beta1 corev1.ObjectReference. | ||
| // The getter resolves GroupKind to the preferred API version string using the scheme. |
There was a problem hiding this comment.
which object references to core objects might we have in capi resouces?
There was a problem hiding this comment.
As discussed, this block will be removed because it actually always falls back to the default and therefore has no effect.
|
@clebs: The following tests failed, say
Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Test Failure Analysis CompleteJob Information
Test Failure AnalysisErrorSummaryBoth GKE e2e jobs fail with identical root cause: the CAPI v1.11 upgrade (PR #7590) introduces Root CauseThe PR upgrades Cluster API from v1beta1 to v1.11 (v1beta2). This introduces a CRD version conversion path: when any component requests The conversion webhook endpoint is unreachable (
Cascade of failures:
Both jobs exhibit the exact same failure mode, confirming this is a deterministic regression introduced by the CAPI 1.11 bump, not a flake. Recommendations
Evidence
|
What this PR does / why we need it:
Bumps hypershift to use CAPI
v1.11including the following tasks:v1.11compatible version ingo.mod.v1beta1as storage version.v1beta1<->v1beta2.Which issue(s) this PR fixes:
Fixes CNTRLPLANE-2207
Special notes for your reviewer:
Checklist:
Summary by CodeRabbit
Release Notes