feat(hypershift/gcp): add e2e-v2 GKE workflow and presubmit job #77007

Merged
openshift-merge-bot[bot] merged 2 commits into openshift:main from cristianoveiga:cntrlplane-2904 on Apr 9, 2026

@cristianoveiga cristianoveiga commented Mar 27, 2026

Summary

CNTRLPLANE-2904: Add a v2 e2e CI workflow for HyperShift GCP on GKE.

  • hypershift-gcp-create chain — creates a GCP HostedCluster using the hypershift create cluster gcp CLI and waits for version rollout
  • hypershift-gcp-destroy chain — destroys the HostedCluster CR with a grace period for ExternalDNS cleanup
  • hypershift-gcp-gke-e2e-v2 workflow — reuses all v1 pre steps (GKE provisioning, prerequisites, operator install, WIF/network setup) and adds the new create/destroy chains with the shared hypershift-e2e-v2 test chain
  • e2e-v2-gke presubmit — optional job triggered on GCP-related file changes
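
The "waits for version rollout" behavior of the create chain can be pictured as a bounded polling loop. The sketch below is an illustration only: `wait_until` is a hypothetical helper, not a function from this PR, and the demo probes a temporary marker file rather than a real HostedCluster. In the actual chain the probe would be whatever command reports rollout status, which this page does not show.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not from the PR): retry a command until it succeeds
# or the timeout expires.
# Usage: wait_until TIMEOUT_SECONDS INTERVAL_SECONDS CMD [ARGS...]
wait_until() {
    local timeout_seconds="$1" interval_seconds="$2"
    shift 2
    local deadline=$(( SECONDS + timeout_seconds ))
    while true; do
        if "$@"; then
            return 0
        fi
        if (( SECONDS >= deadline )); then
            echo "timed out after ${timeout_seconds}s waiting for: $*" >&2
            return 1
        fi
        sleep "${interval_seconds}"
    done
}

# Demo: a marker file stands in for "rollout complete".
marker="$(mktemp)"
wait_until 5 1 test -f "${marker}" && echo "rollout complete"
rm -f "${marker}"
```

Keeping the probe as an argument means the same loop could wrap any status check, and the explicit deadline keeps a stuck rollout from consuming the whole job timeout.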

Also includes the DNS zone name fix and error surfacing from #76993.

v1 vs v2 differences

| | v1 (e2e-gke) | v2 (e2e-v2-gke) |
| --- | --- | --- |
| Cluster lifecycle | Managed internally by test framework | Explicit create/destroy chains |
| Test binary | hack/ci-test-e2e.sh (v1) | bin/test-e2e-v2 (Ginkgo v2) |
| Cleanup | ExternalDNS killed with GKE cluster | HostedCluster CR deleted first, ExternalDNS cleans up DNS |
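
The cleanup row is the main behavioral difference, so here is a rough sketch of the v2 ordering: delete the HostedCluster CR first, then pause so ExternalDNS can remove its records before the management cluster goes away. Everything specific in it is an assumption for illustration: the use of `oc delete hostedcluster` (the real chain may call the hypershift CLI instead), the `clusters` namespace, the 60-second grace period, and the `run` and `destroy_hosted_cluster` helper names. It defaults to a dry run and only prints the commands.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Print the command instead of executing it unless DRY_RUN=0 is set.
run() {
    if [[ "${DRY_RUN:-1}" == "1" ]]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Hypothetical v2-style teardown: remove the HostedCluster CR, then wait a
# grace period so ExternalDNS can clean up DNS records.
destroy_hosted_cluster() {
    local name="$1" namespace="$2" grace_seconds="$3"
    run oc delete hostedcluster "${name}" --namespace "${namespace}" --wait=true
    run sleep "${grace_seconds}"
}

# Placeholder values; none of these come from the PR.
destroy_hosted_cluster "example-hc" "clusters" 60
```

In v1 the ordering question does not arise because ExternalDNS is torn down together with the GKE cluster, which is why DNS records could be left behind.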

Dependencies

Test plan

  • make update succeeds
  • Step registry validation passes
  • Rehearsal: /pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

🤖 Generated with Claude Code

@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

@cristianoveiga cristianoveiga marked this pull request as ready for review March 27, 2026 22:00
@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

@cristianoveiga
Contributor Author

/retest-required

@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@cristianoveiga cristianoveiga force-pushed the cntrlplane-2904 branch 2 times, most recently from b8004ec to 0387ab2 Compare March 28, 2026 15:12
@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-ci openshift-ci Bot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 30, 2026
@openshift-ci-robot openshift-ci-robot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Mar 30, 2026
@openshift-ci openshift-ci Bot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 1, 2026
@openshift-ci-robot openshift-ci-robot removed the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Apr 1, 2026
CNTRLPLANE-2904: Add a v2 e2e CI workflow for HyperShift GCP on GKE.

- hypershift-gcp-create chain: creates a GCP HostedCluster using the
  hypershift CLI and waits for version rollout
- hypershift-gcp-destroy chain: destroys the HostedCluster CR with
  grace period for ExternalDNS cleanup
- hypershift-gcp-gke-e2e-v2 workflow: reuses v1 pre steps with new
  create/destroy chains and shared hypershift-e2e-v2 test chain
- e2e-v2-gke presubmit: optional job triggered on GCP file changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

@openshift-ci-robot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@devguyio
Contributor

devguyio commented Apr 8, 2026

/cc @devguyio

@openshift-ci openshift-ci Bot requested a review from devguyio April 8, 2026 10:51
Member

@cblecker cblecker left a comment

A few nits on documentation accuracy, a missing file guard in the destroy chain, and some shellcheck findings. Nothing blocking — nice work on the v2 workflow structure.

commands: |-
  set -exuo pipefail

  CLUSTER_NAME="$(cat ${SHARED_DIR}/cluster-name)"
Member

nit: If hypershift create cluster gcp fails before writing ${SHARED_DIR}/cluster-name, this cat will error under set -euo pipefail and the destroy step aborts without attempting cleanup.

Since the step has best_effort: true, it won't fail the job, but it will produce a noisy error. The existing hypershift-gcp-gke-deprovision-commands.sh demonstrates a guard pattern:

if [[ ! -f "${SHARED_DIR}/cluster-name" ]]; then
    echo "WARNING: cluster-name not found — create step may not have completed. Skipping destroy."
    exit 0
fi

Also, shellcheck flags ${SHARED_DIR}/cluster-name as unquoted here (SC2086) — should be "${SHARED_DIR}/cluster-name".
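
Putting the two points together (the file guard and the quoted path), the start of the destroy step could look roughly like the sketch below. `read_cluster_name` is a hypothetical helper name, and the demo uses a throwaway directory in place of the real `SHARED_DIR`; the actual fix in the PR may be structured differently.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print the cluster name recorded by the create step,
# or warn on stderr and return non-zero when the file is missing.
read_cluster_name() {
    local shared_dir="$1"
    local name_file="${shared_dir}/cluster-name"
    if [[ ! -f "${name_file}" ]]; then
        echo "WARNING: ${name_file} not found; create step may not have completed." >&2
        return 1
    fi
    cat "${name_file}"
}

# Demo against a temporary directory standing in for SHARED_DIR.
demo_dir="$(mktemp -d)"
if ! read_cluster_name "${demo_dir}"; then
    echo "Skipping destroy."
fi
echo "example-hc" > "${demo_dir}/cluster-name"
echo "Would destroy: $(read_cluster_name "${demo_dir}")"
rm -rf "${demo_dir}"
```

Calling the helper inside `if` keeps `set -e` from aborting the step on the missing-file path, so the script can log a clear message and exit cleanly instead of failing on `cat`.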

Contributor Author

Addressed

2. hypershift-gcp-gke-provision: Create GCP projects, VPC, and GKE cluster
3. hypershift-gcp-gke-prerequisites: Install CRDs and cert-manager on GKE
4. hypershift-install: Install the HyperShift operator
5. hypershift-gcp-control-plane-setup: Configure WIF and webhook TLS
Member

nit: The hypershift-gcp-control-plane-setup step configures Workload Identity for PSC operator and ExternalDNS — there's no webhook TLS configuration in that step.

Suggestion:

5. hypershift-gcp-control-plane-setup: Configure GCP Workload Identity for PSC and ExternalDNS

Contributor Author

Good catch - this was outdated, indeed. Fixed.

  documentation: "Number of nodes for the hosted cluster NodePool."
- name: HYPERSHIFT_GCP_BOOT_IMAGE
  default: ""
  documentation: "GCP boot image for hosted cluster nodes (RHCOS image path). If empty, uses the default from the release image."
Member

nit: The doc says "If empty, uses the default from the release image" but the code on line 49 falls back to a hardcoded RHCOS image path (projects/rhcos-cloud/global/images/rhcos-9-6-20250925-0-gcp-x86-64), not a dynamic default from the release image.

Suggestion to match reality:

documentation: "GCP boot image for hosted cluster nodes (RHCOS image path). If empty, falls back to a pinned default (see TODO GCP-440)."
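
For readers without the script open, the fallback being described is the ordinary "empty means pinned default" pattern. A minimal sketch follows, assuming the script uses shell parameter expansion; the code on line 49 is not shown on this page, so `resolve_boot_image` and `DEFAULT_BOOT_IMAGE` are hypothetical names. Only the image path itself is taken from the comment above.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Pinned default quoted in the review comment above.
DEFAULT_BOOT_IMAGE="projects/rhcos-cloud/global/images/rhcos-9-6-20250925-0-gcp-x86-64"

# Hypothetical helper: return the requested image, or the pinned default
# when the argument is empty or omitted.
resolve_boot_image() {
    local requested="${1:-}"
    echo "${requested:-${DEFAULT_BOOT_IMAGE}}"
}

# An empty HYPERSHIFT_GCP_BOOT_IMAGE resolves to the pinned path.
resolve_boot_image "${HYPERSHIFT_GCP_BOOT_IMAGE:-}"
```

Because the default is a fixed string rather than something read from the release image, the documentation wording should say "pinned", which is what the suggestion above does.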

Contributor Author

Addressed.

'

if [[ $? -ne 0 ]]; then
cat << EOF > ${ARTIFACT_DIR}/junit_hosted_cluster.xml
Member

nit (shellcheck SC2086): ${ARTIFACT_DIR} is unquoted in the heredoc redirect here and on line 119. Same for hostedcluster/${HC_NAME} on lines 110-111.

Low risk in CI (these paths won't contain spaces), but quoting would be more correct:

cat << EOF > "${ARTIFACT_DIR}/junit_hosted_cluster.xml"
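
To show why the quoting matters outside CI, here is a small self-contained sketch that writes a failure report into a directory whose name contains a space. The XML body and the `write_failure_junit` helper are assumptions for illustration; the heredoc content in the actual script is not visible in this diff excerpt.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: write a minimal failing JUnit report. The redirect
# target is quoted, so directories containing spaces work.
write_failure_junit() {
    local artifact_dir="$1" message="$2"
    cat << EOF > "${artifact_dir}/junit_hosted_cluster.xml"
<testsuite name="hosted cluster" tests="1" failures="1">
  <testcase name="hosted cluster rollout">
    <failure message="${message}"></failure>
  </testcase>
</testsuite>
EOF
}

# Demo: a path with a space would break an unquoted redirect.
artifact_dir="$(mktemp -d)/artifacts with space"
mkdir -p "${artifact_dir}"
write_failure_junit "${artifact_dir}" "HostedCluster rollout did not complete"
ls "${artifact_dir}"
rm -rf "$(dirname "${artifact_dir}")"
```

With the unquoted form, bash reports an "ambiguous redirect" for such a path instead of writing the file. The message is interpolated as-is, so a real implementation would also need to escape XML special characters.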

Contributor Author

Addressed.


Reads infrastructure configuration from SHARED_DIR files created by
hypershift-gcp-gke-provision, hypershift-gcp-hosted-cluster-setup,
and hypershift-gcp-control-plane-setup.
Member

nit: hypershift-gcp-control-plane-setup doesn't write any SHARED_DIR files consumed by this chain — it only annotates ServiceAccounts and restarts deployments. All the files read here (wif-*, *-sa, hc-vpc-name, hc-subnet-name, etc.) come from hypershift-gcp-hosted-cluster-setup, while gcp-region and hosted-cluster-project-id come from hypershift-gcp-gke-provision.

Suggestion:

Reads infrastructure configuration from SHARED_DIR files created by
hypershift-gcp-gke-provision and hypershift-gcp-hosted-cluster-setup.
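
Since the chain depends on files written by two earlier steps, an up-front check that names every missing input makes ordering mistakes easier to spot. The sketch below is an assumption about how that could be done, not code from the PR: `missing_shared_files` is a hypothetical helper, the file names are a subset of those mentioned above, and the region value is a placeholder.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper: print each expected file that is absent from the
# given directory; return non-zero if any are missing.
missing_shared_files() {
    local shared_dir="$1"
    shift
    local missing=0 name
    for name in "$@"; do
        if [[ ! -f "${shared_dir}/${name}" ]]; then
            echo "${name}"
            missing=1
        fi
    done
    return "${missing}"
}

# Demo: only gcp-region exists, so the other three names are printed.
demo_dir="$(mktemp -d)"
echo "us-central1" > "${demo_dir}/gcp-region"  # placeholder value
if ! missing_shared_files "${demo_dir}" \
        gcp-region hosted-cluster-project-id hc-vpc-name hc-subnet-name; then
    echo "some inputs are missing; an earlier step may not have run" >&2
fi
rm -rf "${demo_dir}"
```

Reporting all missing files in one pass avoids the fix-one-rerun-find-the-next cycle that a bare `cat` under `set -e` produces.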

Contributor Author

Fixed.

- Add guard for missing cluster-name file in destroy chain
- Fix quoting for SHARED_DIR and ARTIFACT_DIR paths (SC2086)
- Fix control-plane-setup description (WIF for PSC/ExternalDNS, not webhook TLS)
- Fix HYPERSHIFT_GCP_BOOT_IMAGE docs to reference pinned default and TODO GCP-440
- Remove incorrect hypershift-gcp-control-plane-setup SHARED_DIR attribution

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
@cristianoveiga
Contributor Author

/pj-rehearse pull-ci-openshift-hypershift-main-e2e-v2-gke

@openshift-merge-bot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot
Contributor

[REHEARSALNOTIFIER]
@cristianoveiga: the pj-rehearse plugin accommodates running rehearsal tests for the changes in this PR. Expand 'Interacting with pj-rehearse' for usage details. The following rehearsable tests have been affected by this change:

| Test name | Repo | Type | Reason |
| --- | --- | --- | --- |
| pull-ci-openshift-hypershift-main-e2e-kubevirt-aws-ovn | openshift/hypershift | presubmit | Presubmit changed |
| pull-ci-openshift-hypershift-main-e2e-v2-gke | openshift/hypershift | presubmit | Presubmit changed |
| pull-ci-openshift-hypershift-main-okd-scos-e2e-aws-ovn | openshift/hypershift | presubmit | Presubmit changed |

Interacting with pj-rehearse

Comment: /pj-rehearse to run up to 5 rehearsals
Comment: /pj-rehearse skip to opt-out of rehearsals
Comment: /pj-rehearse {test-name}, with each test separated by a space, to run one or more specific rehearsals
Comment: /pj-rehearse more to run up to 10 rehearsals
Comment: /pj-rehearse max to run up to 25 rehearsals
Comment: /pj-rehearse auto-ack to run up to 5 rehearsals, and add the rehearsals-ack label on success
Comment: /pj-rehearse list to get an up-to-date list of affected jobs
Comment: /pj-rehearse abort to abort all active rehearsals
Comment: /pj-rehearse network-access-allowed to allow rehearsals of tests that have the restrict_network_access field set to false. This must be executed by an openshift org member who is not the PR author

Once you are satisfied with the results of the rehearsals, comment: /pj-rehearse ack to unblock merge. When the rehearsals-ack label is present on your PR, merge will no longer be blocked by rehearsals.
If you would like the rehearsals-ack label removed, comment: /pj-rehearse reject to re-block merging.

@cristianoveiga
Contributor Author

/pj-rehearse ack

@openshift-merge-bot
Contributor

@cristianoveiga: now processing your pj-rehearse request. Please allow up to 10 minutes for jobs to trigger or cancel.

@openshift-merge-bot openshift-merge-bot Bot added the rehearsals-ack Signifies that rehearsal jobs have been acknowledged label Apr 9, 2026
@cblecker
Member

cblecker commented Apr 9, 2026

/lgtm
/approve

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Apr 9, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Apr 9, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cblecker, cristianoveiga

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Apr 9, 2026
@openshift-ci
Contributor

openshift-ci Bot commented Apr 9, 2026

@cristianoveiga: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit f625a66 into openshift:main Apr 9, 2026
18 checks passed
HarshwardhanPatil07 pushed a commit to HarshwardhanPatil07/release that referenced this pull request Apr 23, 2026
…shift#77007)

* feat(hypershift/gcp): add e2e-v2 GKE workflow and presubmit job

CNTRLPLANE-2904: Add a v2 e2e CI workflow for HyperShift GCP on GKE.

- hypershift-gcp-create chain: creates a GCP HostedCluster using the
  hypershift CLI and waits for version rollout
- hypershift-gcp-destroy chain: destroys the HostedCluster CR with
  grace period for ExternalDNS cleanup
- hypershift-gcp-gke-e2e-v2 workflow: reuses v1 pre steps with new
  create/destroy chains and shared hypershift-e2e-v2 test chain
- e2e-v2-gke presubmit: optional job triggered on GCP file changes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* fix(hypershift/gcp): address review feedback on e2e-v2 GKE workflow

- Add guard for missing cluster-name file in destroy chain
- Fix quoting for SHARED_DIR and ARTIFACT_DIR paths (SC2086)
- Fix control-plane-setup description (WIF for PSC/ExternalDNS, not webhook TLS)
- Fix HYPERSHIFT_GCP_BOOT_IMAGE docs to reference pinned default and TODO GCP-440
- Remove incorrect hypershift-gcp-control-plane-setup SHARED_DIR attribution

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Prucek pushed a commit to Prucek/release that referenced this pull request Apr 29, 2026

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
lgtm: Indicates that a PR is ready to be merged.
rehearsals-ack: Signifies that rehearsal jobs have been acknowledged


4 participants