GCP-410: feat(gcp): add HCCO credential propagation for GCP image registry #7896

Merged
openshift-merge-bot[bot] merged 2 commits into openshift:main from
cblecker:feat/gcp-410-hcco-image-registry-creds
Apr 30, 2026

Conversation

@cblecker
Member

@cblecker cblecker commented Mar 9, 2026

Summary

Enables the GCP image registry operator to authenticate to GCS in HyperShift-hosted
clusters by propagating a WIF credential secret into the guest cluster's
openshift-image-registry namespace.

  • Add support/gcputil package exposing BuildWorkloadIdentityCredentials and the
    associated ExternalAccountCredential types so both the HO and HCCO can share the
    credential builder without an import cycle
  • Update the HO's internal GCP platform to call gcputil.BuildWorkloadIdentityCredentials
    (identical behaviour, no functional change to the HO side)
  • Add control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp
    package with SetupOperandCredentials, following the Azure HCCO pattern; currently
    manages a single credential — openshift-image-registry/installer-cloud-credentials
    populated with a WIF JSON blob for the image registry GSA
  • Wire GCP into reconcileCloudCredentialSecrets in the HCCO resources controller
  • Guard secret upsert on namespace existence to handle the bootstrap race (namespace may
    not yet exist on first reconcile); mirrors the same pattern used for AWS
  • Skip credential creation when the ImageRegistry capability is explicitly disabled

The service_account.json data key matches the CCO-provisioned format expected by
cluster-image-registry-operator. Only image registry is wired through HCCO for GCP;
other GCP operators receive their credentials directly from HO-provisioned secrets.
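
As a rough illustration of the WIF JSON blob described above, the following minimal sketch assembles an `external_account` credential document and rejects empty inputs. The function name `buildWIFCredentials`, the Go type names, and the token file path are assumptions made for this example and are not taken from the PR; only the JSON field names follow Google's published external account credential format.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// externalAccountCredential mirrors the "external_account" JSON layout read by
// Google auth libraries. The Go type names are illustrative, not the PR's.
type externalAccountCredential struct {
	Type                           string           `json:"type"`
	Audience                       string           `json:"audience"`
	SubjectTokenType               string           `json:"subject_token_type"`
	TokenURL                       string           `json:"token_url"`
	ServiceAccountImpersonationURL string           `json:"service_account_impersonation_url"`
	CredentialSource               credentialSource `json:"credential_source"`
}

type credentialSource struct {
	File   string                 `json:"file"`
	Format credentialSourceFormat `json:"format"`
}

type credentialSourceFormat struct {
	Type string `json:"type"`
}

// tokenFile is an assumed path for the projected service account token.
const tokenFile = "/var/run/secrets/openshift/serviceaccount/token"

// buildWIFCredentials is a hypothetical stand-in for the shared builder.
// It validates the four inputs and returns the credential JSON as a string.
func buildWIFCredentials(projectNumber, poolID, providerID, saEmail string) (string, error) {
	switch {
	case projectNumber == "":
		return "", errors.New("project number cannot be empty")
	case poolID == "":
		return "", errors.New("pool ID cannot be empty")
	case providerID == "":
		return "", errors.New("provider ID cannot be empty")
	case saEmail == "":
		return "", errors.New("service account email cannot be empty")
	}
	cred := externalAccountCredential{
		Type: "external_account",
		Audience: fmt.Sprintf(
			"//iam.googleapis.com/projects/%s/locations/global/workloadIdentityPools/%s/providers/%s",
			projectNumber, poolID, providerID),
		SubjectTokenType: "urn:ietf:params:oauth:token-type:jwt",
		TokenURL:         "https://sts.googleapis.com/v1/token",
		ServiceAccountImpersonationURL: fmt.Sprintf(
			"https://iamcredentials.googleapis.com/v1/projects/-/serviceAccounts/%s:generateAccessToken",
			saEmail),
		CredentialSource: credentialSource{
			File:   tokenFile,
			Format: credentialSourceFormat{Type: "text"},
		},
	}
	out, err := json.MarshalIndent(cred, "", "  ")
	if err != nil {
		return "", fmt.Errorf("marshalling credentials: %w", err)
	}
	return string(out), nil
}

func main() {
	// All values below are placeholders.
	out, err := buildWIFCredentials("123456789012", "example-pool", "example-provider",
		"registry@example-project.iam.gserviceaccount.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out)
}
```

In this sketch the returned string is what would be stored under the service_account.json key of the secret.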

Test plan

  • go test ./support/gcputil/... — unit tests covering happy path and all validation
    error branches (ProjectNumber, PoolID, ProviderID, ServiceAccountEmail empty)
  • go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/...
    — unit tests covering: capability enabled (secret created with correct WIF JSON structure),
    capability disabled (secret skipped), target namespace absent (skipped without error)
  • make test — full unit test suite
  • E2E: create a GCP HCP cluster with imageRegistry GSA set, verify
    openshift-image-registry/installer-cloud-credentials exists in the guest cluster with
    a valid service_account.json key, and confirm the image registry operator reaches
    Available (tracked in GCP-413)

@openshift-ci-robot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Mar 9, 2026
@openshift-ci-robot

openshift-ci-robot commented Mar 9, 2026

@cblecker: This pull request references GCP-410 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

  • Adds GCP image registry credential propagation to the Hosted Cluster Config Operator (HCCO), enabling the cluster-image-registry-operator in guest clusters to authenticate to GCS via Workload Identity Federation
  • Creates installer-cloud-credentials secret in openshift-image-registry with service_account.json containing WIF external_account JSON credentials
  • Follows the established Azure pattern with a separate gcp package and exported SetupOperandCredentials function
  • Respects ImageRegistry capability gating — secret is not created when the capability is disabled

Test plan

  • go build ./control-plane-operator/... compiles without errors
  • go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/... — all unit tests pass
  • go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/... — all existing tests still pass
  • make lint — no lint issues
  • E2E: Deploy a GCP hosted cluster with image registry enabled and verify the installer-cloud-credentials secret is created in openshift-image-registry with valid WIF credentials

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Contributor

coderabbitai Bot commented Mar 9, 2026

Walkthrough

Adds GCP Workload Identity credential construction and reconciliation for hosted clusters: introduces SetupOperandCredentials to enumerate per-credential configs, validate capability gates, build external_account JSON via gcputil.BuildWorkloadIdentityCredentials, and upsert operand Secrets (e.g., image-registry service_account.json). Adds a manifest helper for the GCP image-registry secret, integrates GCP into cloud credential reconciliation, moves an in-file builder to support/gcputil, and adds unit tests for the new flows and validation.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| GCP Credential Management | control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/gcp.go, control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/gcp_test.go | New SetupOperandCredentials function and internal reconcile logic that iterate credential configs, apply capability gates, build Workload Identity JSON via gcputil.BuildWorkloadIdentityCredentials, and upsert Secrets (service_account.json). Tests cover success, capability gating, and validation error paths. |
| Credential Manifest | control-plane-operator/hostedclusterconfigoperator/controllers/resources/manifests/creds.go | Adds GCPImageRegistryCloudCredsSecret() helper returning openshift-image-registry/installer-cloud-credentials Secret template. |
| Reconciliation Integration | control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go | Imports GCP resources and adds a hyperv1.GCPPlatform branch to call gcpresources.SetupOperandCredentials during cloud credential reconciliation. |
| Shared GCP Utility | support/gcputil/gcputil.go, support/gcputil/gcputil_test.go | New public utility BuildWorkloadIdentityCredentials and supporting types that validate inputs (project number, pool ID, provider ID, service account email), assemble external_account JSON pointing at STS and token file, and unit tests for output and validation. |
| Hypershift Host GCP Changes | hypershift-operator/controllers/hostedcluster/internal/platform/gcp/gcp.go, hypershift-operator/controllers/hostedcluster/internal/platform/gcp/gcp_test.go | Removed in-file credential builder and delegated to gcputil.BuildWorkloadIdentityCredentials; corresponding unit tests for the removed builder were deleted. |

Sequence Diagram(s)

sequenceDiagram
    participant Reconciler as reconcileCloudCredentialSecrets
    participant GCP as gcpresources.SetupOperandCredentials
    participant Builder as gcputil.BuildWorkloadIdentityCredentials
    participant Upsert as upsertProvider
    participant K8s as Kubernetes API

    Reconciler->>GCP: Invoke for HCP with GCP platform
    GCP->>GCP: Enumerate credential configs (capability checks)
    GCP->>Builder: Request JSON credential for serviceAccountEmail + WIF config
    Builder->>Builder: Validate fields (project#, poolID, providerID, email)
    alt valid
        Builder-->>GCP: Return external_account JSON
        GCP->>Upsert: Upsert Secret with `service_account.json`
        Upsert->>K8s: Apply Secret
        K8s-->>Upsert: Success
        Upsert-->>GCP: Success
    else invalid
        Builder-->>GCP: Return validation error
        GCP-->>Reconciler: Aggregate errors
    end
    GCP-->>Reconciler: Return result/errors
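
The flow in the diagram can be approximated with a small, self-contained sketch. It uses an in-memory `fakeCluster` in place of the Kubernetes API, and every name in it (`credentialConfig`, `setupOperandCredentials`, the `enabled` callback) is hypothetical rather than copied from the PR. It shows the three behaviours the walkthrough calls out: skipping when a capability is disabled, skipping when the target namespace is absent, and aggregating builder errors. `errors.Join` requires Go 1.20 or later.

```go
package main

import (
	"errors"
	"fmt"
)

type secretKey struct{ Namespace, Name string }

// fakeCluster is an in-memory stand-in for the guest cluster API.
type fakeCluster struct {
	namespaces map[string]bool
	secrets    map[secretKey]map[string][]byte
}

// credentialConfig describes one operand credential (hypothetical shape).
type credentialConfig struct {
	key                 secretKey
	serviceAccountEmail string
	// enabled reports whether the owning capability is on; nil means always on.
	enabled func(disabledCaps map[string]bool) bool
}

// setupOperandCredentials walks the configs, applies the gates, builds the
// credential JSON, and writes it under the service_account.json key.
func setupOperandCredentials(c *fakeCluster, disabledCaps map[string]bool,
	configs []credentialConfig, build func(email string) (string, error)) error {
	var errs []error
	for _, cfg := range configs {
		if cfg.enabled != nil && !cfg.enabled(disabledCaps) {
			continue // capability disabled: do not create the secret
		}
		if !c.namespaces[cfg.key.Namespace] {
			continue // namespace not created yet: retry on a later reconcile
		}
		data, err := build(cfg.serviceAccountEmail)
		if err != nil {
			errs = append(errs, fmt.Errorf("building credentials for %s/%s: %w",
				cfg.key.Namespace, cfg.key.Name, err))
			continue
		}
		c.secrets[cfg.key] = map[string][]byte{"service_account.json": []byte(data)}
	}
	return errors.Join(errs...) // nil when no errors were collected
}

func main() {
	key := secretKey{Namespace: "openshift-image-registry", Name: "installer-cloud-credentials"}
	configs := []credentialConfig{{
		key:                 key,
		serviceAccountEmail: "registry@example-project.iam.gserviceaccount.com", // placeholder
		enabled:             func(d map[string]bool) bool { return !d["ImageRegistry"] },
	}}
	build := func(email string) (string, error) {
		return fmt.Sprintf(`{"type":"external_account","for":%q}`, email), nil
	}
	cluster := &fakeCluster{
		namespaces: map[string]bool{},
		secrets:    map[secretKey]map[string][]byte{},
	}

	// First reconcile: namespace missing, so nothing is written and no error is returned.
	fmt.Println(setupOperandCredentials(cluster, nil, configs, build), len(cluster.secrets))

	// Later reconcile: namespace exists, so the secret is written.
	cluster.namespaces[key.Namespace] = true
	fmt.Println(setupOperandCredentials(cluster, nil, configs, build), len(cluster.secrets))
}
```

Collecting errors instead of returning on the first failure lets one bad credential config be reported without blocking the others, which matters more once a second operand is added.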

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 40.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ❓ Inconclusive | PR adds standard Go unit tests, not Ginkgo tests; check specifically requests Ginkgo test code review. | Clarify whether check applies to all test types or only Ginkgo tests; standard Go tests can be assessed with adapted criteria. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
|---|---|---|
| Stable And Deterministic Test Names | ✅ Passed | Test files use standard Go testing conventions, not Ginkgo, so Ginkgo test name stability requirement does not apply. |
| Title check | ✅ Passed | The title clearly and specifically describes the main change: adding GCP credential propagation for image registry to HCCO, which is the primary objective of this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/platform/gcp PR/issue for GCP (GCPPlatform) platform and removed do-not-merge/needs-area labels Mar 9, 2026
@openshift-ci openshift-ci Bot requested review from csrwng and jimdaga March 9, 2026 20:00
@cblecker cblecker marked this pull request as draft March 9, 2026 20:01
@cblecker cblecker marked this pull request as ready for review March 9, 2026 20:01
@openshift-ci openshift-ci Bot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. and removed do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. labels Mar 9, 2026
@cblecker
Member Author

/test e2e-aws-techpreview

@ckandag
Contributor

ckandag commented Mar 10, 2026

LGTM
Please consider testing this in a live environment if you haven't already.

@cblecker cblecker force-pushed the feat/gcp-410-hcco-image-registry-creds branch from 53670d4 to d4bfcbb Compare March 10, 2026 06:04
@cblecker
Member Author

/test e2e-aws-techpreview

Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/gcp_test.go`:
- Around line 102-104: Test currently accepts any error in the else branch;
change the assertion to require a Kubernetes NotFound error so the
capability-gating check is specific. Replace the
g.Expect(err).To(HaveOccurred()) assertion with an explicit check using
apierrors.IsNotFound(err) (e.g.,
g.Expect(apierrors.IsNotFound(err)).To(BeTrue())), and add the import for
"k8s.io/apimachinery/pkg/api/errors" aliased as apierrors if not already
present; keep the same g test variable and location in gcp_test.go.
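
To illustrate why the stricter assertion matters, here is a minimal sketch using only the standard library. The sentinel `errNotFound` and the helper names are stand-ins invented for this example; in the real test the equivalent check would be `apierrors.IsNotFound(err)` against the fake client's error. The point is that an "any error" assertion also passes when the lookup fails for an unrelated reason.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound status error.
var errNotFound = errors.New("secret not found")

// errTimeout represents an unrelated failure, such as a transport error.
var errTimeout = errors.New("request timed out")

// getSecret looks up a secret in an in-memory store. If fail is non-nil it is
// returned as-is to simulate a failure unrelated to the secret's existence.
func getSecret(store map[string]string, name string, fail error) (string, error) {
	if fail != nil {
		return "", fail
	}
	v, ok := store[name]
	if !ok {
		return "", fmt.Errorf("get %q: %w", name, errNotFound)
	}
	return v, nil
}

// looseCheck mirrors an assertion that accepts any error.
func looseCheck(err error) bool { return err != nil }

// strictCheck mirrors an assertion that accepts only a not-found error.
func strictCheck(err error) bool { return errors.Is(err, errNotFound) }

func main() {
	empty := map[string]string{}
	_, missing := getSecret(empty, "installer-cloud-credentials", nil)
	_, broken := getSecret(empty, "installer-cloud-credentials", errTimeout)

	fmt.Println(looseCheck(missing), strictCheck(missing)) // true true
	fmt.Println(looseCheck(broken), strictCheck(broken))   // true false
}
```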

In
`@control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/gcp.go`:
- Around line 73-79: When cfg.capabilityChecker(hcp.Spec.Capabilities) returns
false you must remove any previously created resource instead of simply
continuing; update the loop in the reconciliation (where configs is iterated and
cfg.capabilityChecker is checked) to call the appropriate delete logic for the
secret produced by cfg.manifestFunc() (the installer-cloud-credentials secret)
when the capability is off: call cfg.manifestFunc() or otherwise resolve the
secret name, check for its existence in the cluster, and delete it (handling
not-found as success) before continuing so the GCP credentials are removed when
ImageRegistry/capability is disabled.
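
The behaviour this finding proposes can be sketched as follows. This is an illustration of the reviewer's suggestion only, not a description of what was merged (the PR description speaks of skipping creation). `secretStore`, `reconcileSecret`, and `errNotFound` are hypothetical names, and the in-memory map stands in for the cluster API.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound stands in for a Kubernetes NotFound status error.
var errNotFound = errors.New("not found")

// secretStore is an in-memory stand-in for the guest cluster's secrets.
type secretStore map[string][]byte

// remove deletes a secret and reports errNotFound when it does not exist.
func (s secretStore) remove(name string) error {
	if _, ok := s[name]; !ok {
		return errNotFound
	}
	delete(s, name)
	return nil
}

// reconcileSecret writes the secret when the capability is enabled and removes
// any previously created copy when it is disabled. A not-found result on
// delete is treated as success so the operation is idempotent.
func reconcileSecret(s secretStore, name string, enabled bool, payload []byte) error {
	if !enabled {
		if err := s.remove(name); err != nil && !errors.Is(err, errNotFound) {
			return fmt.Errorf("deleting %s: %w", name, err)
		}
		return nil
	}
	s[name] = payload
	return nil
}

func main() {
	const name = "installer-cloud-credentials"
	store := secretStore{}

	// Capability on: the secret is created.
	fmt.Println(reconcileSecret(store, name, true, []byte("{}")), len(store))

	// Capability turned off: the stale secret is removed.
	fmt.Println(reconcileSecret(store, name, false, nil), len(store))

	// Reconciling again with the capability off is a no-op, not an error.
	fmt.Println(reconcileSecret(store, name, false, nil), len(store))
}
```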

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: 35f02a7e-c7ce-4648-ab02-57e9173991aa

📥 Commits

Reviewing files that changed from the base of the PR and between 53670d4 and d4bfcbb.

📒 Files selected for processing (4)
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/gcp.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/gcp_test.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/manifests/creds.go
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • control-plane-operator/hostedclusterconfigoperator/controllers/resources/resources.go

@cblecker cblecker force-pushed the feat/gcp-410-hcco-image-registry-creds branch from d4bfcbb to 3128d86 Compare March 10, 2026 06:27
@openshift-ci-robot

openshift-ci-robot commented Mar 10, 2026

@cblecker: This pull request references GCP-410 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Summary

  • Adds GCP image registry credential propagation to the Hosted Cluster Config Operator (HCCO), enabling the cluster-image-registry-operator in guest clusters to authenticate to GCS via Workload Identity Federation
  • Creates installer-cloud-credentials secret in openshift-image-registry with service_account.json containing WIF external_account JSON credentials
  • Follows the established Azure pattern with a separate gcp package and exported SetupOperandCredentials function
  • Respects ImageRegistry capability gating — secret is not created when the capability is disabled

Test plan

  • go build ./control-plane-operator/... compiles without errors
  • go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp/... — all unit tests pass
  • go test ./control-plane-operator/hostedclusterconfigoperator/controllers/resources/... — all existing tests still pass
  • make lint — no lint issues
  • E2E: Deploy a GCP hosted cluster with image registry enabled and verify the installer-cloud-credentials secret is created in openshift-image-registry with valid WIF credentials

Summary by CodeRabbit

  • New Features

  • Added GCP Workload Identity credential support with automated reconciliation for hosted clusters.

  • Automatic creation and management of image registry credentials in GCP, conditioned on cluster capability checks.

  • Tests

  • Added comprehensive tests covering credential construction, validation, conditional secret creation, and emitted credential JSON.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@cblecker
Member Author

/test e2e-aws-techpreview

@openshift-ci openshift-ci Bot added the area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release label Mar 11, 2026
@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Apr 29, 2026
@openshift-merge-bot
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-22
/test e2e-aws-4-22
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws

@hypershift-jira-solve-ci

AI Test Failure Analysis

Job: pull-ci-openshift-hypershift-main-e2e-azure-self-managed | Build: 2049481613913362432 | Cost: $3.280589150000001 | Failed step: hypershift-azure-run-e2e-self-managed

View full analysis report


Generated by hypershift-analyze-e2e-failure post-step using Claude claude-opus-4-6

@cblecker
Member Author

/retest-required

@cblecker
Member Author

/test e2e-aws-4-22

@cblecker
Member Author

/verified later

@openshift-ci-robot

@cblecker: /verified later <@username> requires at least one GitHub @username to be specified (it can be a comma delimited list). It indicates the engineer(s) that will be performing the verification. See https://docs.ci.openshift.org/docs/architecture/jira/#premerge-verification for more information.

@cblecker
Member Author

/verified later @cblecker

@openshift-ci-robot openshift-ci-robot added verified-later verified Signifies that the PR passed pre-merge verification criteria labels Apr 30, 2026
@openshift-ci-robot

@cblecker: This PR has been marked to be verified later by @cblecker.

…shared package

Move the GCP Workload Identity Federation credential builder out of the
HO's internal GCP platform package and into a new shared support/gcputil
package, so the HCCO can reuse it without creating an import cycle.

- Add support/gcputil package with BuildWorkloadIdentityCredentials and
  exported ExternalAccountCredential/CredentialSource/CredentialSourceFormat
  types
- Update hypershift-operator's GCP platform to call gcputil.BuildWorkloadIdentityCredentials
  instead of the now-removed private buildGCPWorkloadIdentityCredentials
- Remove the now-duplicate private type definitions and test coverage from
  the HO package (tests live in support/gcputil)

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Christoph Blecker <cblecker@redhat.com>
… image registry

Propagate GCP Workload Identity Federation credentials into the guest
cluster's openshift-image-registry namespace so the image registry
operator can authenticate to GCS.

- Add control-plane-operator/hostedclusterconfigoperator/controllers/resources/gcp
  package with SetupOperandCredentials, following the Azure HCCO pattern
- Add GCPImageRegistryCloudCredsSecret manifest targeting
  openshift-image-registry/installer-cloud-credentials with service_account.json
- Wire SetupOperandCredentials into the resources controller's
  reconcileCloudCredentialSecrets GCP case
- Guard secret upsert on namespace existence to handle bootstrap race
- Skip credential creation when ImageRegistry capability is disabled

Assisted-by: Claude:claude-sonnet-4-6
Signed-off-by: Christoph Blecker <cblecker@redhat.com>
@cblecker cblecker force-pushed the feat/gcp-410-hcco-image-registry-creds branch from 715f629 to 440fbf4 Compare April 30, 2026 15:53
@openshift-ci-robot openshift-ci-robot removed the verified Signifies that the PR passed pre-merge verification criteria label Apr 30, 2026
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Apr 30, 2026
@cblecker
Member Author

/verified later @cblecker

@openshift-ci-robot openshift-ci-robot added verified-later verified Signifies that the PR passed pre-merge verification criteria labels Apr 30, 2026
@openshift-ci-robot

@cblecker: This PR has been marked to be verified later by @cblecker.

@cblecker
Member Author

/pipeline required

@openshift-merge-bot
Contributor

Scheduling tests matching the pipeline_run_if_changed or not excluded by pipeline_skip_if_only_changed parameters:
/test e2e-aks-4-22
/test e2e-aws-4-22
/test e2e-aks
/test e2e-aws
/test e2e-aws-upgrade-hypershift-operator
/test e2e-azure-self-managed
/test e2e-kubevirt-aws-ovn-reduced
/test e2e-v2-aws

@cristianoveiga
Contributor

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Apr 30, 2026
@openshift-merge-bot
Contributor

Tests from second stage were triggered manually. Pipeline can be controlled only manually, until HEAD changes. Use the /pipeline required command to trigger second stage.

@hypershift-jira-solve-ci

hypershift-jira-solve-ci Bot commented Apr 30, 2026

Confirmed: --e2e.gcp-network-sa is completely absent from the command line. The old version of the CI step script didn't extract it, didn't write it to SHARED_DIR, and the run-e2e step (which conditionally reads the file) got an empty value. Since Go's flag.StringVar with an empty default produces an empty string, and the --network-service-account flag is required, the cluster creation fails.

I now have all the evidence needed for the report.

Test Failure Analysis Complete

Job Information

Test Failure Analysis

Error

hypershift_framework.go:501: failed to create cluster, tearing down: failed to render cluster manifests: required flag(s) "network-service-account" not set

Summary

The TestCreateCluster e2e test failed immediately at cluster creation because the required --network-service-account CLI flag was not provided. This flag (added to main on 2026-04-16 via commit 2fe0329) requires a GCP service account email for the Cloud Network Config Controller. The CI step script in openshift/release that provisions the infrastructure (hypershift-gcp-hosted-cluster-setup-commands.sh) was not updated to extract and pass this value until PR #77415 merged at 18:24 UTC — roughly 2.5 hours after this Prow job started at 15:55 UTC. The job ran against the old script that lacked the NETWORK_SA extraction, so the --e2e.gcp-network-sa flag was never passed to the test binary. This is a CI infrastructure timing issue, not a bug in PR #7896 (which only touches HCCO credential propagation code and does not modify CLI flags or the e2e test framework).

Root Cause

The failure is caused by a missing CI step configuration, not by PR #7896 code changes.

The dependency chain:

  1. Commit 2fe0329 (merged 2026-04-16) added a required --network-service-account flag to hypershift create cluster gcp (cmd/cluster/gcp/create.go). This flag specifies the GCP service account for the Cloud Network Config Controller and is validated via util.ValidateRequiredOption().

  2. The e2e test framework (test/e2e/util/hypershift_framework.go:501) invokes create cluster gcp internally. The e2e binary accepts --e2e.gcp-network-sa (defined in test/e2e/e2e_test.go:181) and maps it to the CLI's NetworkServiceAccount field via test/e2e/util/options.go:464.

  3. The CI step hypershift-gcp-hosted-cluster-setup in openshift/release runs hypershift create iam gcp, which does create the cloud-network service account (confirmed in the setup log at line 102: ciaca478c8295-cloud-network@cicaf2694c-hosted-cluster.iam.gserviceaccount.com). However, the old version of the script did not extract this SA from the IAM output JSON or write it to ${SHARED_DIR}/network-sa.

  4. The hypershift-gcp-run-e2e step conditionally reads ${SHARED_DIR}/network-sa, but since the file didn't exist, NETWORK_SA was empty. The --e2e.gcp-network-sa flag was either omitted entirely or passed as an empty string, causing the required flag validation to fail.

  5. The fix (openshift/release PR #77415, feat(GCP-431): pass network service account to hypershift GCP e2e tests; see https://redhat.atlassian.net/browse/GCP-431) was merged at 18:24 UTC on 2026-04-30, but this Prow job started at 15:55 UTC, 2.5 hours before the fix landed.

PR #7896 is innocent. It modifies control-plane-operator/hostedclusterconfigoperator/, hypershift-operator/controllers/hostedcluster/internal/platform/gcp/, and support/gcputil/ — none of which touch CLI flags, e2e flag registration, or CI step scripts.
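
The failure mode in steps 1 and 4 can be reproduced in miniature. The sketch below uses Go's standard `flag` package and a hand-written check, so it is an assumption-laden analogue: the real CLI's flag library and the signature of util.ValidateRequiredOption are not shown in this thread, and `parseCreateFlags` is a hypothetical helper. It demonstrates that a string flag with an empty default is indistinguishable from an omitted flag, so both trip the required-flag validation.

```go
package main

import (
	"flag"
	"fmt"
	"io"
)

// parseCreateFlags parses a single string flag and enforces that it is set.
// The flag name matches the one in the error above; everything else is
// illustrative.
func parseCreateFlags(args []string) (string, error) {
	fs := flag.NewFlagSet("create-cluster-gcp", flag.ContinueOnError)
	fs.SetOutput(io.Discard) // keep usage text out of the example output

	var networkSA string
	fs.StringVar(&networkSA, "network-service-account", "",
		"GCP service account email for the Cloud Network Config Controller")

	if err := fs.Parse(args); err != nil {
		return "", err
	}
	// An omitted flag and an explicitly empty value both leave networkSA == "".
	if networkSA == "" {
		return "", fmt.Errorf("required flag(s) %q not set", "network-service-account")
	}
	return networkSA, nil
}

func main() {
	// Flag omitted, as in the failing job.
	_, err := parseCreateFlags(nil)
	fmt.Println(err)

	// Flag passed with an empty value, as when the shared file is missing.
	_, err = parseCreateFlags([]string{"--network-service-account="})
	fmt.Println(err)

	// Flag passed with a placeholder value.
	sa, err := parseCreateFlags([]string{"--network-service-account=net@example-project.iam.gserviceaccount.com"})
	fmt.Println(sa, err)
}
```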

Recommendations
  1. Retest the PR: the CI infrastructure fix (openshift/release#77415, feat(GCP-431): pass network service account to hypershift GCP e2e tests) has already merged. A /retest should pass now that the hypershift-gcp-hosted-cluster-setup script correctly extracts and propagates the cloud-network service account.

  2. No code changes needed in PR #7896: this failure is entirely a CI plumbing issue that has been resolved independently.

  3. Future prevention — When adding required CLI flags (like --network-service-account), the corresponding CI step script update in openshift/release should be merged before or simultaneously with the hypershift repo change that makes the flag required, to avoid a window where CI is broken.

Evidence
| Evidence | Detail |
|---|---|
| Primary error | failed to render cluster manifests: required flag(s) "network-service-account" not set (hypershift_framework.go:501) |
| Flag introduced | Commit 2fe0329 on 2026-04-16 added required --network-service-account to cmd/cluster/gcp/create.go |
| E2e flag mapping | --e2e.gcp-network-sa → GCPNetworkServiceAccount → NetworkServiceAccount (e2e_test.go:181 → options.go:464) |
| SA created in CI | Setup log confirms cloud-network SA created: ciaca478c8295-cloud-network@cicaf2694c-hosted-cluster.iam.gserviceaccount.com |
| SA not extracted | Setup log lines 130-155 show awk extractions for 5 SAs but no NETWORK_SA extraction; the old script lacked it |
| Flag absent from test | Run-e2e command line includes all other --e2e.gcp-* flags but not --e2e.gcp-network-sa |
| Fix merged after job | openshift/release#77415 merged at 2026-04-30T18:24:36Z; job started at 2026-04-30T15:55:44Z (2.5 hours earlier) |
| PR #7896 files | Only touches control-plane-operator/, hypershift-operator/, support/gcputil/; no CLI or e2e framework changes |
| Test result | 8 tests total: 6 skipped, 2 failures (TestCreateCluster + TestCreateCluster/Teardown) |

@openshift-ci
Contributor

openshift-ci Bot commented Apr 30, 2026

@cblecker: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-gke | 440fbf4 | link | false | /test e2e-gke |
| ci/prow/e2e-v2-gke | 440fbf4 | link | false | /test e2e-v2-gke |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot Bot merged commit 23d5637 into openshift:main Apr 30, 2026
43 of 45 checks passed
@cblecker cblecker deleted the feat/gcp-410-hcco-image-registry-creds branch April 30, 2026 19:57
Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/platform/gcp PR/issue for GCP (GCPPlatform) platform jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged. verified Signifies that the PR passed pre-merge verification criteria verified-later

5 participants