OCPSTRAT-1986: Enhancement: API-driven Azure topology and Swift private connectivity for HyperShift#1985
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
…yperShift Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
00b0101 to
f7111cd
Compare
|
@muraee: This pull request references OCPSTRAT-1986 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the feature to target either version "5.0." or "openshift-5.0.", but it targets "openshift-4.22" instead. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository. |
| `omitzero`; old code ignores unknown fields during | ||
| deserialization. The `Type` value "Swift" is preserved as | ||
| a string. | ||
| - **N-1 round-trip risk:** Old code re-serializing an object |
There was a problem hiding this comment.
Wouldn't an API version be the way to mitigate this? Old client writing old API version doesn't have their intent squash fields they don't know about
There was a problem hiding this comment.
Can you clarify what you mean here? HyperShift currently has a single API version (v1beta1) with no conversion webhooks — are you suggesting this enhancement should introduce a new API version?
- Clarify "private-only" scope (control-plane only, not openshift-ingress) - Expand scope to include all per-cluster isAroHCP() call sites (CNO, CCM, storage, registry) - Add exact CEL validation rules as kubebuilder markers on types - Add AzurePlatformSpec-level CEL rule for topology→private.type requirement - Change validation from operator-side to CEL admission - Add PublicEndpointExposed status condition with full API spec - Expand Day-2 transition section with HO→HCP propagation, reverse flow, and transition scope - Add CPO rollout considerations including backport scenario - Add Phase 2 verification query and Phase 3 CPO version prerequisite - Fix annotation precedence inconsistency (API field takes precedence, annotation is fallback) - Fix CRD schema ordering to consistently say "before" (not "before or simultaneously") - Update graduation criteria for expanded scope Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
| type AzureSwiftSpec struct { | ||
| // +required | ||
| // +kubebuilder:validation:MinLength=1 | ||
| PodNetworkInstance string `json:"podNetworkInstance"` |
There was a problem hiding this comment.
Would changing this on a live cluster break Swift networking? Should it be immutable?
There was a problem hiding this comment.
Changing it doesn't permanently break the cluster — the router pod restarts and gets new IPs from the new pod network instance. If the new one is valid, connectivity resumes after the restart. It causes a temporary disruption but is recoverable, so we're keeping it mutable.
There was a problem hiding this comment.
it's easier to make it immutable later, if we decide to.
There was a problem hiding this comment.
it's easier to make it immutable later, if we decide to.
This would be a breaking change, while relaxing from immutable to mutable is allowed. Let's remove ambiguity and decide now what it needs to be.
@bennerv is there a known case for this to be mutable? let's make it immutable otherwise.
Please include CEL for this here, make sure to include prevention for unsetting at the parent level. We can also include the field docs already here for clarity
|
/lgtm |
| version carrying the updated CPO. Until then, the N-1 CPO | ||
| continues to function via the legacy annotation fallback. | ||
| If clusters cannot be upgraded to the new minor stream in a | ||
| reasonable timeframe, a CPO backport to the previous minor |
There was a problem hiding this comment.
can we please specify concrete ocp versions and timelines we need to support in ARO?
Also would it be possible we express intent in ARO-RP/Maestro for new HCs to be created without the annotation and no-op on existing so they remain on the old path until they upgrade and we don't care?
There was a problem hiding this comment.
Phase 1 is developed in 4.23 and backported to 4.22. Phase 3 cleanup proceeds once all clusters are running CPO 4.22+.
For the migration strategy: yes, for new HCs ARO-RP/Maestro sets the API fields from day one and skips the annotation. For existing HCs, ARO-RP/Maestro can set the API fields at any time — the old CPO ignores them and continues using the annotation. Once the cluster upgrades to 4.22+ the new CPO starts consuming the API fields. No active coordinated migration needed.
e264200 to
766513c
Compare
|
New changes are detected. LGTM label has been removed. |
- Call out CEL immutability rule change as deliberate validation loosening - Clarify UseSharedIngressHCP/HC naming distinct from isAroHCP() in main.go - Add envtests for CEL rule validation to integration test plan Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
766513c to
bfa8ff7
Compare
|
@muraee: all tests passed! Full PR test history. Your PR dashboard. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
Summary
IsAroHCP()/ env-var checks with API-driven per-cluster decisions usingTopologyandPrivatefields on HostedCluster/HostedControlPlaneAzurePrivateTypeenum withSwiftvariant and introduceAzureSwiftSpecto replace theSwiftPodNetworkInstanceAnnotationannotationTopology=Private), causing shared ingress to drop all public endpoint exposureTest plan
🤖 Generated with Claude Code