
CNTRLPLANE-2262: Add Azure scale-from-zero support #8337

Open
jhjaggars wants to merge 6 commits into openshift:main from jhjaggars:azure-scale-from-zero

Conversation


@jhjaggars jhjaggars commented Apr 24, 2026

Extend the existing scale-from-zero autoscaling framework to support Azure by implementing an Azure instance type provider that queries the Azure Resource SKUs API for VM size specifications and writing capacity annotations on MachineDeployments.

Changes:

  • Add Azure instancetype.Provider using armcompute.ResourceSKUsClient
  • Add AzureMachineTemplate case to scale_from_zero.go type switch
  • Extend supportedScaleFromZeroPlatform() for Azure
  • Extend reconcileScaleFromZeroAnnotations() for Azure
  • Update autoscalerEnabledCondition() to accept Azure with min=0
  • Update effectiveMin guard in capi.go to allow min=0 for Azure
  • Add "azure" to supportedProviders in main.go and install.go
  • Add Azure provider initialization with credential file parsing
  • Update CRD CEL validation to allow min=0 for Azure platform
  • Add unit tests for Azure provider and extended type switches
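
The AzureMachineTemplate case added to the scale_from_zero.go type switch can be sketched roughly as follows. The type names below are simplified stand-ins for the real CAPI/CAPZ template types, not the actual HyperShift code:

```go
package main

import "fmt"

// Simplified stand-ins for the real CAPI machine template types.
type AWSMachineTemplate struct{ InstanceType string }
type AzureMachineTemplate struct{ VMSize string }

// instanceTypeFromTemplate mirrors the shape of the scale_from_zero.go
// type switch: each supported platform template yields the instance size
// used to look up capacity information for the annotations.
func instanceTypeFromTemplate(template any) (string, error) {
	switch t := template.(type) {
	case *AWSMachineTemplate:
		return t.InstanceType, nil
	case *AzureMachineTemplate:
		return t.VMSize, nil
	default:
		return "", fmt.Errorf("unsupported machine template type %T", template)
	}
}

func main() {
	size, err := instanceTypeFromTemplate(&AzureMachineTemplate{VMSize: "Standard_D4s_v5"})
	fmt.Println(size, err)
}
```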

What this PR does / why we need it:

Which issue(s) this PR fixes:

Fixes

Special notes for your reviewer:

Checklist:

  • Subject and description added to both commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

Summary by CodeRabbit

  • New Features

    • Scale-from-zero autoscaling now supported on Azure as well as AWS; operator/CLI accept Azure as a provider and use Azure SKU data for instance-type info.
  • Bug Fixes

    • Replica and autoscaler behavior updated so min=0 is honored for Azure where supported.
  • Documentation

    • API and CRD docs updated to reflect Azure support for scale-from-zero.
  • Tests

    • Added/updated tests covering Azure scale-from-zero, instance-type parsing, and annotation behavior.

@openshift-merge-bot

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To manually trigger all second-stage jobs, use the /pipeline required command.

This repository is configured in: LGTM mode


openshift-ci Bot commented Apr 24, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all


openshift-ci-robot commented Apr 24, 2026

@jhjaggars: This pull request references CNTRLPLANE-2262 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the epic to target the "5.0.0" version, but no target version was set.

Details

In response to this (the PR description, quoted above):

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) Apr 24, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) Apr 24, 2026
@openshift-ci openshift-ci Bot added the following labels Apr 24, 2026:
  • area/api — indicates the PR includes changes for the API
  • area/cli — indicates the PR includes changes for CLI
  • area/hypershift-operator — indicates the PR includes changes for the hypershift operator and API, outside an OCP release
  • area/platform/azure — PR/issue for the Azure (AzurePlatform) platform
and removed the do-not-merge/needs-area label.

openshift-ci Bot commented Apr 24, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: jhjaggars
Once this PR has been reviewed and has the lgtm label, please assign csrwng for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details: Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment


coderabbitai Bot commented Apr 24, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

📝 Walkthrough

This PR extends scale-from-zero support to Azure in addition to AWS. CRD XValidation for NodePoolSpec now permits autoScaling.min=0 on Azure. The operator CLI and bootstrap accept --scale-from-zero-provider=azure and initialize an Azure instancetype provider that queries and caches Azure Resource SKUs. Controller logic and tests were updated so autoscaling min 0 is honored for Azure, NodePool reconciliation reads AzureMachineTemplate VM sizes, and scale-from-zero annotation reconciliation handles Azure templates.

Changes

Scale-from-zero: Azure support + Azure instancetype provider

| Layer | File(s) | Summary |
|---|---|---|
| API / Schema | api/hypershift/v1beta1/nodepool_types.go, api/.../nodepools*.yaml | XValidation and OpenAPI docs for spec.autoScaling.min broadened: autoScaling.min=0 allowed for platform.type == Azure as well as AWS. |
| Core controller behavior | hypershift-operator/controllers/nodepool/nodepool_controller.go, hypershift-operator/controllers/nodepool/capi.go, hypershift-operator/controllers/nodepool/conditions.go, hypershift-operator/controllers/nodepool/scale_from_zero.go | Reconcile gate switched to configurable ScaleFromZeroPlatform; enforcement that bumped effective min to 1 excludes Azure; reconcileScaleFromZeroAnnotations and setScaleFromZeroAnnotationsOnObject gain Azure handling (read AzureMachineTemplate VMSize and apply annotations/taints accordingly). |
| Instantiation / Provider implementation | hypershift-operator/controllers/nodepool/instancetype/azure/provider.go | New Azure instancetype Provider: lazy-loads and caches Azure Resource SKUs (paginated), transforms SKUs into InstanceTypeInfo (vCPU, MemoryMb, GPUs, CPU arch), exposes GetInstanceTypeInfo. |
| Provider tests | hypershift-operator/controllers/nodepool/instancetype/azure/provider_test.go | New tests and mocks validating SKU transformation, capability parsing, error cases, GetInstanceTypeInfo lookup behavior, and the capability helper. |
| CLI / bootstrap wiring | hypershift-operator/main.go, cmd/install/install.go, go.mod | --scale-from-zero-provider accepts azure; main reads Azure creds JSON (subscriptionId, clientId, clientSecret, tenantId, location), constructs an Azure credential and ResourceSKUs client, and initializes the Azure instancetype provider; go.mod adds the Azure SDK dependency entry. |
| Tests / Integration | hypershift-operator/controllers/nodepool/scale_from_zero_test.go, hypershift-operator/controllers/nodepool/capi_test.go, cmd/install/assets/crds/.../stable.nodepools.autoscaling.testsuite.yaml, docs, e2e import formatting | Unit tests updated to expect that Azure can scale from zero and to cover Azure template cases; CRD test suite and docs updated to reflect Azure support; small e2e import reformat. |
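
The broadened CRD validation described above might look roughly like the following. This is an illustrative CEL sketch, not the exact rule or message from the generated manifests:

```yaml
# Illustrative sketch of the broadened NodePoolSpec XValidation;
# the exact expression and message in the generated CRD may differ.
x-kubernetes-validations:
  - rule: >-
      !has(self.autoScaling) || self.autoScaling.min >= 1 ||
      self.platform.type == 'AWS' || self.platform.type == 'Azure'
    message: "autoScaling.min=0 (scale-from-zero) is only supported on AWS and Azure platforms"
```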

Sequence Diagram(s)

```mermaid
sequenceDiagram
  participant OperatorMain as Operator (main/boot)
  participant AzureSKUs as Azure Resource SKUs API
  participant InstTypeProv as Azure Instancetype Provider
  participant K8sAPI as Kubernetes API (CAPI objects)
  participant NodePoolCtrl as NodePool Controller

  OperatorMain->>AzureSKUs: Read credentials & location\ncreate SKUs client
  OperatorMain->>InstTypeProv: NewProvider(skuClient, location)
  Note over InstTypeProv: Provider initialized (cache empty)

  NodePoolCtrl->>K8sAPI: Get NodePool / AzureMachineTemplate
  K8sAPI-->>NodePoolCtrl: return AzureMachineTemplate (VMSize)
  NodePoolCtrl->>InstTypeProv: GetInstanceTypeInfo(ctx, vmSize)
  InstTypeProv->>AzureSKUs: ListPager() / Paginate SKUs
  AzureSKUs-->>InstTypeProv: SKU pages
  InstTypeProv-->>NodePoolCtrl: InstanceTypeInfo (vCPUs, Memory, GPUs)
  NodePoolCtrl->>K8sAPI: Patch node template annotations\n(scale-from-zero capacity/taints)
  K8sAPI-->>NodePoolCtrl: Patch result
```

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 31.25%, which is insufficient; the required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (11 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'CNTRLPLANE-2262: Add Azure scale-from-zero support' accurately and concisely describes the primary change in the PR, clearly communicating the main objective of extending scale-from-zero functionality to the Azure platform. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names are static, deterministic strings. Tests use standard Go testing with t.Run() and static name literals. No dynamic content found in test titles. |
| Test Structure And Quality | ✅ Passed | Check designed for Ginkgo tests. PR contains only standard Go table-driven unit tests, no Ginkgo framework. Not applicable. |
| Microshift Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added to this PR. The PR only modifies unit tests and reorders imports in existing e2e tests. The custom check is not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added. PR adds only unit tests for the Azure scale-from-zero implementation. SNO compatibility check does not apply. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Adds Azure scale-from-zero support. No topology-breaking scheduling constraints: no affinity rules, node selectors, topology spread, or topology-dependent replica logic. |
| Ote Binary Stdout Contract | ✅ Passed | PR adds Azure scale-from-zero support without violating the OTE Binary Stdout Contract. New code uses JSON-based logging (stderr) and structured error handling only. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | No new Ginkgo e2e tests were added. Only unit tests using standard Go testing.T were added. The e2e test file had import-only changes. Check does not apply. |




Comment @coderabbitai help to get the list of available commands and usage tips.


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 4

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
cmd/install/install.go (1)

251-263: ⚠️ Potential issue | 🟡 Minor

Expose Azure in the CLI help text as well.

Validation accepts azure here, but the --scale-from-zero-provider help string still says Platform type for scale-from-zero autoscaling (aws) at Line 394. hypershift install --help will still advertise AWS-only support.

✏️ Suggested follow-up
```diff
-	cmd.PersistentFlags().StringVar(&opts.ScaleFromZeroProvider, "scale-from-zero-provider", opts.ScaleFromZeroProvider, "Platform type for scale-from-zero autoscaling (aws)")
+	cmd.PersistentFlags().StringVar(&opts.ScaleFromZeroProvider, "scale-from-zero-provider", opts.ScaleFromZeroProvider, "Platform type for scale-from-zero autoscaling (aws, azure)")
```
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@cmd/install/install.go` around lines 251 - 263, the CLI help text still
advertises AWS-only support for --scale-from-zero-provider even though
supportedProviders includes "azure"; update the help string for the
ScaleFromZeroProvider flag (the option described as "Platform type for
scale-from-zero autoscaling (aws)") to list both aws and azure (or to build the
list dynamically from supportedProviders) so the help text matches the
validation in supportedProviders and the flag's behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@api/hypershift/v1beta1/nodepool_types.go`:
- Line 109: Update the comment on the NodePoolAutoScaling.Min field to reflect
that scale-from-zero (min=0) is supported for both AWS and Azure platforms;
locate the NodePoolAutoScaling struct and the Min field comment in
nodepool_types.go (symbol: NodePoolAutoScaling.Min) and change the text that
currently mentions only AWS to mention "AWS and Azure" so the CRD/OpenAPI schema
and docs match the XValidation rule. Ensure the wording mirrors the XValidation
message: "Scale-from-zero (autoScaling.min=0) is supported for AWS and Azure
platforms."

In `@hypershift-operator/controllers/nodepool/instancetype/azure/provider.go`:
- Around line 48-50: The cache is being set before the full Azure SKU pagination
succeeds, causing partial results to persist after pager/NextPage() failures;
modify loadSKUs to populate a local temporary map (e.g., tempCache) while
walking pages and only assign it to p.cache (and any related fields) after the
entire walk succeeds, and ensure GetInstanceTypeInfo still checks p.cache==nil
to trigger reloads; apply the same pattern to the other similar block around the
63-75 logic so that p.cache is only updated on successful completion of the full
SKU load.
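
The load-into-a-temporary-map-then-swap pattern this comment asks for can be sketched as below. The `provider`, `skuPage`, and `loadSKUs` names are illustrative stand-ins (the real code walks an armcompute ResourceSKUs pager), but the publishing discipline is the point:

```go
package main

import (
	"errors"
	"fmt"
)

type InstanceTypeInfo struct {
	VCPU     int32
	MemoryMb int64
}

// skuPage stands in for one page returned by the Azure ResourceSKUs pager.
type skuPage struct {
	infos map[string]InstanceTypeInfo
	err   error
}

var errPageFailed = errors.New("page fetch failed")

type provider struct {
	cache map[string]InstanceTypeInfo
}

// loadSKUs accumulates results in a temporary map and publishes it to p.cache
// only after every page has been fetched successfully, so a mid-pagination
// failure never leaves a partially populated cache behind.
func (p *provider) loadSKUs(pages []skuPage) error {
	temp := make(map[string]InstanceTypeInfo)
	for _, page := range pages {
		if page.err != nil {
			return fmt.Errorf("listing resource SKUs: %w", page.err)
		}
		for name, info := range page.infos {
			temp[name] = info
		}
	}
	p.cache = temp // publish only on full success
	return nil
}

func main() {
	p := &provider{}
	good := []skuPage{{infos: map[string]InstanceTypeInfo{"Standard_D2s_v5": {VCPU: 2, MemoryMb: 8192}}}}
	fmt.Println(p.loadSKUs(good), len(p.cache))
}
```

Because `GetInstanceTypeInfo` checks `p.cache == nil` to decide whether to reload, keeping the cache unset on failure also guarantees the next lookup retries the full SKU walk.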

In `@hypershift-operator/controllers/nodepool/nodepool_controller.go`:
- Line 437: The current platform check returns true for Azure unconditionally
which lets Azure NodePools enter the scale-from-zero path even when the
configured provider (scale-from-zero-provider) is AWS; update the platform gate
to require both the cluster platform and the configured InstanceTypeProvider
match: modify the function that currently returns "return platform ==
hyperv1.AWSPlatform || platform == hyperv1.AzurePlatform" to instead check the
configured provider (e.g., scaleFromZeroProvider/InstanceTypeProvider) and only
return true when platform==AWS && provider==aws OR platform==Azure &&
provider==azure (use the actual flag/field name used to hold the
--scale-from-zero-provider value and the InstanceTypeProvider symbol in the
reconciler).

In `@hypershift-operator/main.go`:
- Around line 487-499: The code parses Azure credentials into the azureCreds
struct but only validates Location; update the validation after json.Unmarshal
to ensure SubscriptionID, ClientID, ClientSecret and TenantID are non-empty
before creating the client. Specifically, in the block that defines azureCreds
and calls json.Unmarshal, add checks for azureCreds.SubscriptionID,
azureCreds.ClientID, azureCreds.ClientSecret and azureCreds.TenantID and return
descriptive fmt.Errorf errors (or a single aggregated error) if any are empty so
client creation logic (using these fields) never runs with missing values.
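
A minimal sketch of the fail-fast validation this comment asks for, using only the credential-file fields the PR description lists (subscriptionId, clientId, clientSecret, tenantId, location). The struct and function names are illustrative, not the actual main.go code:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// azureCreds mirrors the credential-file fields named in the PR description.
type azureCreds struct {
	SubscriptionID string `json:"subscriptionId"`
	ClientID       string `json:"clientId"`
	ClientSecret   string `json:"clientSecret"`
	TenantID       string `json:"tenantId"`
	Location       string `json:"location"`
}

// parseAzureCreds unmarshals the credentials file and rejects any payload with
// a missing field, so client construction never runs with empty values.
func parseAzureCreds(data []byte) (*azureCreds, error) {
	var creds azureCreds
	if err := json.Unmarshal(data, &creds); err != nil {
		return nil, fmt.Errorf("parsing azure credentials file: %w", err)
	}
	for name, value := range map[string]string{
		"subscriptionId": creds.SubscriptionID,
		"clientId":       creds.ClientID,
		"clientSecret":   creds.ClientSecret,
		"tenantId":       creds.TenantID,
		"location":       creds.Location,
	} {
		if value == "" {
			return nil, fmt.Errorf("azure credentials file is missing required field %q", name)
		}
	}
	return &creds, nil
}

func main() {
	creds, err := parseAzureCreds([]byte(`{"subscriptionId":"s","clientId":"c","clientSecret":"x","tenantId":"t","location":"eastus"}`))
	fmt.Println(creds.Location, err)
}
```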

---

Outside diff comments:
In `@cmd/install/install.go`:
- Around line 251-263: The CLI help text still advertises AWS-only support for
--scale-from-zero-provider even though supportedProviders includes "azure";
update the help string for the ScaleFromZeroProvider flag (the option described
as "Platform type for scale-from-zero autoscaling (aws)") to list both aws and
azure (or to build the list dynamically from supportedProviders) so the help
text matches the validation in supportedProviders and the flag's behavior.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: ffada2e8-18eb-4f47-9fc5-f9200533350a

📥 Commits

Reviewing files that changed from the base of the PR and between c1a8bb6 and 7c4cfe6.

📒 Files selected for processing (11)
  • api/hypershift/v1beta1/nodepool_types.go
  • cmd/install/install.go
  • hypershift-operator/controllers/nodepool/capi.go
  • hypershift-operator/controllers/nodepool/capi_test.go
  • hypershift-operator/controllers/nodepool/conditions.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider_test.go
  • hypershift-operator/controllers/nodepool/nodepool_controller.go
  • hypershift-operator/controllers/nodepool/scale_from_zero.go
  • hypershift-operator/controllers/nodepool/scale_from_zero_test.go
  • hypershift-operator/main.go

Comment thread api/hypershift/v1beta1/nodepool_types.go
Comment thread hypershift-operator/controllers/nodepool/nodepool_controller.go Outdated
Comment thread hypershift-operator/main.go

codecov Bot commented Apr 24, 2026

Codecov Report

❌ Patch coverage is 59.89305% with 75 lines in your changes missing coverage. Please review.
✅ Project coverage is 37.58%. Comparing base (37f46b9) to head (a4e2f66).
⚠️ Report is 17 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| hypershift-operator/main.go | 0.00% | 61 Missing ⚠️ |
| ...erator/controllers/nodepool/nodepool_controller.go | 0.00% | 10 Missing ⚠️ |
| ...ontrollers/nodepool/instancetype/azure/provider.go | 97.16% | 2 Missing and 1 partial ⚠️ |
| ...rshift-operator/controllers/nodepool/conditions.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8337      +/-   ##
==========================================
+ Coverage   37.53%   37.58%   +0.04%     
==========================================
  Files         751      752       +1     
  Lines       92025    92199     +174     
==========================================
+ Hits        34543    34649     +106     
- Misses      54841    54908      +67     
- Partials     2641     2642       +1     
| Files with missing lines | Coverage Δ |
|---|---|
| cmd/install/install.go | 52.60% <100.00%> (ø) |
| hypershift-operator/controllers/nodepool/capi.go | 68.87% <100.00%> (ø) |
| ...t-operator/controllers/nodepool/scale_from_zero.go | 100.00% <100.00%> (ø) |
| ...rshift-operator/controllers/nodepool/conditions.go | 54.06% <0.00%> (+0.13%) ⬆️ |
| ...ontrollers/nodepool/instancetype/azure/provider.go | 97.16% <97.16%> (ø) |
| ...erator/controllers/nodepool/nodepool_controller.go | 40.96% <0.00%> (-0.34%) ⬇️ |
| hypershift-operator/main.go | 0.00% <0.00%> (ø) |

... and 1 file with indirect coverage changes

| Flag | Coverage Δ |
|---|---|
| cmd-support | 32.76% <100.00%> (+<0.01%) ⬆️ |
| cpo-hostedcontrolplane | 36.77% <ø> (ø) |
| cpo-other | 37.76% <ø> (ø) |
| hypershift-operator | 48.02% <59.67%> (+0.09%) ⬆️ |
| other | 27.77% <ø> (ø) |

Flags with carried forward coverage won't be shown.

@jhjaggars jhjaggars force-pushed the azure-scale-from-zero branch 2 times, most recently from 0b01bc1 to b3477bf on May 7, 2026 at 15:31
@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hypershift-operator/controllers/nodepool/capi.go`:
- Around line 774-775: The code sets effectiveMin to 1 for Azure based solely on
nodePool.Spec.Platform.Type; change this to also require that the
operator/runtime has Azure scale-from-zero support wired up. Update the
condition around effectiveMin (the place that checks nodePool.Spec.Platform.Type
and sets effectiveMin) to call the runtime-configured check (e.g., a function or
map such as supportsScaleFromZero(platform) or scaleFromZeroProviders[platform])
and only force effectiveMin=1 when the platform is not supported for
scale-from-zero or when the runtime config does not indicate Azure
scale-from-zero is enabled; apply the same guarded change to the other identical
branch referenced by the comment (the block around the other effectiveMin
handling). Ensure you reference and use the existing runtime config/provider
flag/function rather than just nodePool.Spec.Platform.Type.

In `@hypershift-operator/main.go`:
- Around line 527-537: The Azure scale-from-zero path creates credentials and a
ResourceSKUs client with nil options, ignoring AZURE_CLOUD_NAME; update the
NewClientSecretCredential and armcompute.NewResourceSKUsClient calls to use the
same cloud-specific client options used elsewhere (the resolved
cloud/environment options derived from AZURE_CLOUD_NAME) so the credential and
skuClient target the correct sovereign endpoints (use azureCreds and the
resolved azure cloud options when constructing cred and skuClient before
assigning instanceTypeProvider and scaleFromZeroPlatform and logging).
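
One way to feed cloud-specific options in, sketched without the Azure SDK: resolve AZURE_CLOUD_NAME to the matching AAD authority host first, then pass the resolved configuration into the credential and client constructors. The function name and the set of recognized cloud-name strings below are assumptions for illustration; in the real fix this selection would populate the cloud configuration handed to NewClientSecretCredential and armcompute.NewResourceSKUsClient via their client options:

```go
package main

import (
	"fmt"
	"strings"
)

// authorityHostFor maps an AZURE_CLOUD_NAME value to the well-known AAD
// authority host for that cloud. An empty value falls back to the public
// cloud; unrecognized names are rejected rather than silently defaulted.
func authorityHostFor(cloudName string) (string, error) {
	switch strings.TrimSpace(cloudName) {
	case "", "AzurePublicCloud":
		return "https://login.microsoftonline.com/", nil
	case "AzureUSGovernmentCloud":
		return "https://login.microsoftonline.us/", nil
	case "AzureChinaCloud":
		return "https://login.chinacloudapi.cn/", nil
	default:
		return "", fmt.Errorf("unrecognized AZURE_CLOUD_NAME %q", cloudName)
	}
}

func main() {
	host, err := authorityHostFor("AzureUSGovernmentCloud")
	fmt.Println(host, err)
}
```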

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 255fb7ac-fd14-43de-b4ad-7f7fc9cd02c4

📥 Commits

Reviewing files that changed from the base of the PR and between 0b01bc1 and b3477bf.

⛔ Files ignored due to path filters (8)
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • cmd/install/assets/crds/hypershift-operator/tests/nodepools.hypershift.openshift.io/stable.nodepools.autoscaling.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/nodepool_types.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (12)
  • api/hypershift/v1beta1/nodepool_types.go
  • cmd/install/install.go
  • go.mod
  • hypershift-operator/controllers/nodepool/capi.go
  • hypershift-operator/controllers/nodepool/capi_test.go
  • hypershift-operator/controllers/nodepool/conditions.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider_test.go
  • hypershift-operator/controllers/nodepool/nodepool_controller.go
  • hypershift-operator/controllers/nodepool/scale_from_zero.go
  • hypershift-operator/controllers/nodepool/scale_from_zero_test.go
  • hypershift-operator/main.go
✅ Files skipped from review due to trivial changes (1)
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider_test.go
🚧 Files skipped from review as they are similar to previous changes (5)
  • hypershift-operator/controllers/nodepool/scale_from_zero_test.go
  • cmd/install/install.go
  • api/hypershift/v1beta1/nodepool_types.go
  • hypershift-operator/controllers/nodepool/conditions.go
  • hypershift-operator/controllers/nodepool/nodepool_controller.go

Comment thread hypershift-operator/controllers/nodepool/capi.go Outdated
Comment thread hypershift-operator/main.go Outdated
@jhjaggars jhjaggars force-pushed the azure-scale-from-zero branch from b3477bf to 242417f on May 7, 2026 at 19:22
@openshift-ci openshift-ci Bot added the area/documentation label (indicates the PR includes changes for documentation) May 7, 2026
@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 1

♻️ Duplicate comments (1)
hypershift-operator/controllers/nodepool/capi.go (1)

774-775: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Azure min=0 guard still checks only platform type, not runtime scale-from-zero configuration.

The exemption added for hyperv1.AzurePlatform allows effectiveMin=0 for any Azure NodePool, regardless of whether the operator was started with the Azure scale-from-zero provider wired (--scale-from-zero-provider=azure). Without that provider, the scale-from-zero capacity annotations are never written, so the autoscaler receives a zero-minimum pool with no instance-type metadata and cannot scale back up — permanently stalling the pool.

The fix should key this exemption off the runtime-configured provider set rather than the static platform type. The identical issue exists in both setMachineDeploymentReplicas (Line 774) and setMachineSetReplicas (Line 1083).

```shell
#!/bin/bash
# Look for any existing runtime check or helper that exposes whether a given platform
# has scale-from-zero support configured (e.g. supportedScaleFromZeroPlatform,
# scaleFromZeroProviders, or similar).
rg -n --type=go -C4 'scaleFromZero|ScaleFromZero|scale_from_zero' \
  --glob '!*_test.go'
```

Also applies to: 1083-1085

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/nodepool/capi.go` around lines 774 - 775, The
Azure min=0 exemption currently checks nodePool.Spec.Platform.Type directly;
update the condition in both setMachineDeploymentReplicas and
setMachineSetReplicas so it verifies the runtime-configured scale-from-zero
provider set (e.g., call the existing helper/flag that exposes configured
providers such as scaleFromZeroProviders / supportedScaleFromZeroPlatform or the
operator config tied to --scale-from-zero-provider) instead of checking
hyperv1.AzurePlatform; change the if that sets effectiveMin to 0 to require the
runtime provider to include "azure" (or the helper to return true) before
allowing effectiveMin==0 so pools only get zero-min when the operator actually
supports Azure scale-from-zero.
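
The guard the review asks for can be sketched as follows. Platform and provider values here are illustrative strings rather than the real hyperv1 types and reconciler fields, but the shape matches the requested fix: honor min=0 only when the NodePool's platform agrees with the provider the operator was started with (--scale-from-zero-provider), otherwise bump the effective minimum to 1:

```go
package main

import "fmt"

// effectiveMin returns the minimum replica count the controller should
// enforce: a requested min of 0 is kept only when the platform has a
// matching runtime-configured scale-from-zero provider.
func effectiveMin(platform, configuredProvider string, requestedMin int32) int32 {
	scaleFromZeroSupported := (platform == "AWS" && configuredProvider == "aws") ||
		(platform == "Azure" && configuredProvider == "azure")
	if requestedMin == 0 && !scaleFromZeroSupported {
		return 1 // no capacity annotations will be written, so min=0 would stall the pool
	}
	return requestedMin
}

func main() {
	// Azure NodePool, but the operator was started with the AWS provider:
	// min=0 is not honored.
	fmt.Println(effectiveMin("Azure", "aws", 0))
}
```

Applying the same helper in both setMachineDeploymentReplicas and setMachineSetReplicas keeps the two branches from drifting apart.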
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@hypershift-operator/controllers/nodepool/instancetype/azure/provider.go`:
- Around line 130-136: The code currently ignores parse errors and accepts
negative GPU counts; instead validate and fail fast: after calling
getCapabilityValue and obtaining gpuStr, attempt strconv.ParseInt(gpuStr, 10,
32) and if err != nil or the parsed value is negative, return an error (or log
and propagate) with context including gpuStr and the SKU identifier rather than
silently setting info.GPU; only assign info.GPU = int32(gpu) when parsing
succeeds and gpu >= 0. Use the existing gpuStr, getCapabilityValue,
strconv.ParseInt and info.GPU symbols to locate and implement the checks and
error propagation.
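
The validation this comment describes can be sketched as below. The function name and the skuName parameter are hypothetical, used only to carry context into the error; the point is that parse failures and negative values are surfaced instead of being silently ignored:

```go
package main

import (
	"fmt"
	"strconv"
)

// parseGPUCount parses a GPU capability string from an Azure SKU and fails
// fast on non-numeric or negative values, including the SKU name and raw
// string in the error for context.
func parseGPUCount(skuName, gpuStr string) (int32, error) {
	gpu, err := strconv.ParseInt(gpuStr, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("sku %s: invalid GPU capability value %q: %w", skuName, gpuStr, err)
	}
	if gpu < 0 {
		return 0, fmt.Errorf("sku %s: negative GPU capability value %d", skuName, gpu)
	}
	return int32(gpu), nil
}

func main() {
	n, err := parseGPUCount("Standard_NC6", "1")
	fmt.Println(n, err)
}
```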

---

Duplicate comments:
In `@hypershift-operator/controllers/nodepool/capi.go`:
- Around line 774-775: The Azure min=0 exemption currently checks
nodePool.Spec.Platform.Type directly; update the condition in both
setMachineDeploymentReplicas and setMachineSetReplicas so it verifies the
runtime-configured scale-from-zero provider set (e.g., call the existing
helper/flag that exposes configured providers such as scaleFromZeroProviders /
supportedScaleFromZeroPlatform or the operator config tied to
--scale-from-zero-provider) instead of checking hyperv1.AzurePlatform; change
the if that sets effectiveMin to 0 to require the runtime provider to include
"azure" (or the helper to return true) before allowing effectiveMin==0 so pools
only get zero-min when the operator actually supports Azure scale-from-zero.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 4a1461b0-6355-45e0-b5b5-14af5c2bc3e3

📥 Commits

Reviewing files that changed from the base of the PR and between b3477bf and 242417f.

⛔ Files ignored due to path filters (10)
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/AAA_ungated.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/GCPPlatform.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/OpenStack.yaml is excluded by !**/zz_generated.featuregated-crd-manifests/**
  • cmd/install/assets/crds/hypershift-operator/tests/nodepools.hypershift.openshift.io/stable.nodepools.autoscaling.testsuite.yaml is excluded by !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-CustomNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-Default.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-TechPreviewNoUpgrade.crd.yaml is excluded by !**/zz_generated.crd-manifests/**, !cmd/install/assets/**/*.yaml
  • docs/content/reference/aggregated-docs.md is excluded by !docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md is excluded by !docs/content/reference/api.md
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/nodepool_types.go is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (12)
  • api/hypershift/v1beta1/nodepool_types.go
  • cmd/install/install.go
  • go.mod
  • hypershift-operator/controllers/nodepool/capi.go
  • hypershift-operator/controllers/nodepool/capi_test.go
  • hypershift-operator/controllers/nodepool/conditions.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider_test.go
  • hypershift-operator/controllers/nodepool/nodepool_controller.go
  • hypershift-operator/controllers/nodepool/scale_from_zero.go
  • hypershift-operator/controllers/nodepool/scale_from_zero_test.go
  • hypershift-operator/main.go
🚧 Files skipped from review as they are similar to previous changes (9)
  • api/hypershift/v1beta1/nodepool_types.go
  • hypershift-operator/controllers/nodepool/conditions.go
  • cmd/install/install.go
  • hypershift-operator/controllers/nodepool/capi_test.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider_test.go
  • hypershift-operator/controllers/nodepool/scale_from_zero_test.go
  • hypershift-operator/main.go
  • hypershift-operator/controllers/nodepool/nodepool_controller.go
  • hypershift-operator/controllers/nodepool/scale_from_zero.go

@hypershift-jira-solve-ci

hypershift-jira-solve-ci Bot commented May 7, 2026


Test Failure Analysis Complete

Job Information

  • Prow Job: Red Hat Konflux / hypershift-operator-main-on-pull-request, hypershift-cli-mce-50-on-pull-request, hypershift-release-mce-50-on-pull-request
  • Build ID: hypershift-operator-main-on-pull-request-lrw27, hypershift-cli-mce-50-on-pull-request-95tgt, hypershift-release-mce-50-on-pull-request-csprs
  • PR: #8337 (CNTRLPLANE-2262: Add Azure scale-from-zero support)
  • Namespace: crt-redhat-acm-tenant
  • Infrastructure: Konflux stone-prd-rh01
  • Failure Window: 2026-05-07T19:22:29Z – 2026-05-07T19:40:13Z

Test Failure Analysis

Error

task clone-repository has the status "Failed":
error: Command failed after 10 tries with status 1
Command exited with non-zero status 1

Summary

All three Konflux pipeline runs failed identically at the clone-repository Tekton task, which uses the /ko-app/git-init binary (from quay.io/konflux-ci/git-clone) with 10 retry attempts. Each attempt averaged ~1.7 minutes (17 minutes total / 10 retries), far exceeding the normal <30-second shallow clone time for this repository. The failures are not caused by the PR code changes — they are a Konflux infrastructure issue: the init task succeeded (container image pull worked), but outbound network connectivity from the Konflux cluster to GitHub was degraded, causing every git clone attempt to time out. An active Quay.io HTTP 502 outage on the same Red Hat infrastructure (started 2026-05-07T18:05Z, status: investigating at failure time) correlates with the degraded connectivity. Identical Konflux checks for the same three pipelines passed on PR #8459 five hours earlier, and all non-Konflux CI jobs (GitHub Actions) on this PR cloned and ran successfully.

Root Cause

Transient Konflux infrastructure networking issue causing git clone timeouts from the Konflux build cluster (stone-prd-rh01.pg1f.p1.openshiftapps.com) to GitHub.

The failure chain:

  1. A head_ref_force_pushed event at 19:22:19Z triggered all three Konflux pipelines at 19:22:29Z
  2. The init task succeeded in 5 seconds on all three runs, confirming pod scheduling and container image pulls worked normally
  3. The clone-repository task's /ko-app/git-init binary attempted to clone openshift/hypershift (204 MB repo, shallow depth=1) 10 times, each attempt failing with exit code 1 after ~1.7 minutes
  4. The ~1.7-minute-per-attempt timing indicates network-level timeouts (a healthy clone takes <30 seconds)
  5. An active Quay.io HTTP 502 outage (created 2026-05-07T18:05:44Z, impact: critical, status: investigating) was ongoing on the same Red Hat managed infrastructure. While Quay pulls were reported "restored" at 18:20Z, the incident remained under active investigation through the failure window, indicating persistent infrastructure instability

Evidence that this is NOT a code issue is summarized in the Evidence table below.

Recommendations
  1. Re-trigger the Konflux pipelines — Push an empty commit or use the Konflux re-run mechanism. The Quay.io outage should have resolved by now, restoring normal network connectivity from the Konflux cluster.

  2. No code changes needed — The Azure scale-from-zero PR changes are not related to this failure. The PR's GitHub Actions CI (envtest, build, codespell) all passed, confirming the code compiles and tests pass.

  3. Monitor Red Hat status page — Before re-triggering, verify the Quay.io HTTP 502 incident has been resolved to avoid another clone failure.

  4. Address the separate lint/verify failures — The GitHub Actions lint and verify checks also failed on this PR. These are code-related issues unrelated to the Konflux clone failures and should be investigated separately.

| Evidence | Detail |
| --- | --- |
| Failed Task | clone-repository — identical across all 3 pipeline runs |
| Error Message | Command failed after 10 tries with status 1 |
| Duration | 17 minutes (10 retries × ~1.7 min each; normal clone <30s) |
| Init Task | ✅ Succeeded in 5s (image pull works, pod scheduling works) |
| Pipeline Run IDs | lrw27, 95tgt, csprs (namespace: crt-redhat-acm-tenant) |
| Force Push Timing | head_ref_force_pushed at 19:22:19Z → pipelines triggered 19:22:29Z |
| Quay.io Outage | HTTP 502 on Pull/Push — created 18:05:44Z, impact: critical, status: investigating at failure time |
| Infrastructure | stone-prd-rh01.pg1f.p1.openshiftapps.com (shared with Quay.io) |
| PR #8459 Konflux | ✅ All 4 pipelines passed at 14:10–14:30Z same day (same infra) |
| GitHub Actions CI | ✅ 15+ checks passed (envtest, build, codespell, container-sync) |
| Repo Size | 204,695 KB (~200 MB) — large but within normal limits |
| PR Source | Fork: jhjaggars/hypershift — other fork PRs passed Konflux |

jhjaggars and others added 5 commits May 7, 2026 16:41
Extend the existing scale-from-zero autoscaling framework to support
Azure by implementing an Azure instance type provider that queries the
Azure Resource SKUs API for VM size specifications and writing capacity
annotations on MachineDeployments.

Changes:
- Add Azure instancetype.Provider using armcompute.ResourceSKUsClient
- Add AzureMachineTemplate case to scale_from_zero.go type switch
- Extend supportedScaleFromZeroPlatform() for Azure
- Extend reconcileScaleFromZeroAnnotations() for Azure
- Update autoscalerEnabledCondition() to accept Azure with min=0
- Update effectiveMin guard in capi.go to allow min=0 for Azure
- Add "azure" to supportedProviders in main.go and install.go
- Add Azure provider initialization with credential file parsing
- Update CRD CEL validation to allow min=0 for Azure platform
- Add unit tests for Azure provider and extended type switches

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Update NodePoolAutoScaling.Min field comment and CRD validation rule
  to reflect Azure support alongside AWS
- Regenerate CRD manifests with updated docs and validation
- Fix partial SKU cache on Azure pager failure: build into local map
  and assign to cache only after full walk succeeds
- Tighten platform gate: add ScaleFromZeroPlatform field so annotations
  are only set when nodepool platform matches the configured provider
- Validate all required Azure credential fields (subscriptionId,
  clientId, clientSecret, tenantId, location) upfront with a clear
  error listing missing fields

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Update CRD test suite to match the updated validation rule that
allows autoScaling.min=0 on Azure platform:
- Change Azure min=0 test from expecting failure to expecting success
- Update Agent and KubeVirt error messages to include Azure

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Run make generate update to sync generated API docs,
aggregated docs, go.mod, and vendored type files.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Lowercase error string for Azure scale-from-zero credentials
- Fix gci import ordering in main.go, provider_test.go,
  scale_from_zero_test.go, and nodepool_test.go

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jhjaggars jhjaggars force-pushed the azure-scale-from-zero branch from 242417f to 826f26a Compare May 7, 2026 20:42
@openshift-ci openshift-ci Bot added the area/testing Indicates the PR includes changes for e2e testing label May 7, 2026

@coderabbitai coderabbitai Bot left a comment


♻️ Duplicate comments (2)
hypershift-operator/controllers/nodepool/capi.go (2)

1083-1085: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Same unguarded Azure platform-type check in setMachineSetReplicas.

Same issue as lines 774–776: effectiveMin=0 is allowed for Azure based on platform type alone, with no check for runtime-configured Azure scale-from-zero support.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/nodepool/capi.go` around lines 1083 - 1085,
In setMachineSetReplicas the AzurePlatform check is unguarded so effectiveMin
may be set to 1 based solely on platform type; update the condition to also
verify the runtime-configured Azure scale-from-zero feature flag (the same
runtime check used earlier around lines with the guarded Azure check) before
allowing effectiveMin to remain 0. Specifically, change the if that references
nodePool.Spec.Platform.Type == hyperv1.AzurePlatform to require the
runtime-scale-from-zero check (e.g., call the existing
isAzureScaleFromZeroEnabled/clusterConfig.ScaleFromZero.Azure/feature helper
used elsewhere) so Azure only gets scale-from-zero behavior when the runtime
config enables it, leaving AWS and other logic unchanged.

774-776: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

[Still unresolved from previous review] Gate Azure min=0 on runtime-configured provider, not just platform type.

This guard continues to permit effectiveMin=0 for any Azure NodePool based solely on nodePool.Spec.Platform.Type, regardless of whether the operator was started with --scale-from-zero-provider=azure. If the Azure instancetype provider is not wired up at startup, the scale-from-zero annotation path won't populate the capacity metadata the autoscaler needs—leaving the pool permanently stuck at 0 replicas with no recovery path.

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@hypershift-operator/controllers/nodepool/capi.go` around lines 774 - 776, The
current guard sets effectiveMin=1 for non-AWS/Azure based only on
nodePool.Spec.Platform.Type, which still allows Azure pools to stay at 0 when
the operator wasn't started with the Azure scale-from-zero provider; update the
condition in the block that assigns effectiveMin (around the effectiveMin
variable usage) to also check the operator's runtime configuration for enabled
scale-from-zero providers (e.g., consult the operator config or the
ScaleFromZeroProviders/scaleFromZeroProvider flag mechanism used at startup) and
only permit min=0 for Azure when the Azure provider is actually enabled; in
practice change the if that references nodePool.Spec.Platform.Type and
hyperv1.AzurePlatform to require both platform==Azure and
providerEnabled("azure") (or the equivalent runtime-config boolean/collection
used by the operator) before allowing effectiveMin to remain 0.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@hypershift-operator/controllers/nodepool/capi.go`:
- Around line 1083-1085: In setMachineSetReplicas the AzurePlatform check is
unguarded so effectiveMin may be set to 1 based solely on platform type; update
the condition to also verify the runtime-configured Azure scale-from-zero
feature flag (the same runtime check used earlier around lines with the guarded
Azure check) before allowing effectiveMin to remain 0. Specifically, change the
if that references nodePool.Spec.Platform.Type == hyperv1.AzurePlatform to
require the runtime-scale-from-zero check (e.g., call the existing
isAzureScaleFromZeroEnabled/clusterConfig.ScaleFromZero.Azure/feature helper
used elsewhere) so Azure only gets scale-from-zero behavior when the runtime
config enables it, leaving AWS and other logic unchanged.
- Around line 774-776: The current guard sets effectiveMin=1 for non-AWS/Azure
based only on nodePool.Spec.Platform.Type, which still allows Azure pools to
stay at 0 when the operator wasn't started with the Azure scale-from-zero
provider; update the condition in the block that assigns effectiveMin (around
the effectiveMin variable usage) to also check the operator's runtime
configuration for enabled scale-from-zero providers (e.g., consult the operator
config or the ScaleFromZeroProviders/scaleFromZeroProvider flag mechanism used
at startup) and only permit min=0 for Azure when the Azure provider is actually
enabled; in practice change the if that references nodePool.Spec.Platform.Type
and hyperv1.AzurePlatform to require both platform==Azure and
providerEnabled("azure") (or the equivalent runtime-config boolean/collection
used by the operator) before allowing effectiveMin to remain 0.
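The gating both prompts ask for can be sketched in a few lines of Go. This is a hedged illustration of the suggested fix, not the actual `capi.go` code: the platform constants, the `enabledProviders` set, and the `effectiveMin` signature are stand-ins for HyperShift's real symbols and the `--scale-from-zero-provider` flag plumbing.

```go
package main

import "fmt"

type Platform string

const (
	AWSPlatform   Platform = "AWS"
	AzurePlatform Platform = "Azure"
)

// effectiveMin returns the minimum replica count the autoscaler may use.
// A requested min of 0 is only honored when the pool's platform has a
// scale-from-zero provider wired up at operator startup; otherwise the
// floor is forced to 1, because without capacity annotations a
// zero-replica pool could never scale back up.
func effectiveMin(requested int32, platform Platform, enabledProviders map[Platform]bool) int32 {
	if requested == 0 && !enabledProviders[platform] {
		return 1
	}
	return requested
}

func main() {
	// Operator started without --scale-from-zero-provider=azure.
	enabled := map[Platform]bool{AWSPlatform: true}
	fmt.Println(effectiveMin(0, AzurePlatform, enabled)) // forced to 1
	fmt.Println(effectiveMin(0, AWSPlatform, enabled))   // 0 allowed
}
```

The same predicate would back both `setMachineDeploymentReplicas` and `setMachineSetReplicas`, so the two call sites the reviewer flagged cannot drift apart.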

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository: openshift/coderabbit/.coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: d92e2f97-9c70-40b5-9329-3c634ae4f483

📥 Commits

Reviewing files that changed from the base of the PR and between 242417f and 826f26a.

⛔ Files ignored due to path filters (1)
  • vendor/github.com/openshift/hypershift/api/hypershift/v1beta1/nodepool_types.go is excluded by !**/vendor/**, !vendor/**
📒 Files selected for processing (22)
  • api/hypershift/v1beta1/nodepool_types.go
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/AAA_ungated.yaml
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/GCPPlatform.yaml
  • api/hypershift/v1beta1/zz_generated.featuregated-crd-manifests/nodepools.hypershift.openshift.io/OpenStack.yaml
  • cmd/install/assets/crds/hypershift-operator/tests/nodepools.hypershift.openshift.io/stable.nodepools.autoscaling.testsuite.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-CustomNoUpgrade.crd.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-Default.crd.yaml
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-TechPreviewNoUpgrade.crd.yaml
  • cmd/install/install.go
  • docs/content/reference/aggregated-docs.md
  • docs/content/reference/api.md
  • go.mod
  • hypershift-operator/controllers/nodepool/capi.go
  • hypershift-operator/controllers/nodepool/capi_test.go
  • hypershift-operator/controllers/nodepool/conditions.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider_test.go
  • hypershift-operator/controllers/nodepool/nodepool_controller.go
  • hypershift-operator/controllers/nodepool/scale_from_zero.go
  • hypershift-operator/controllers/nodepool/scale_from_zero_test.go
  • hypershift-operator/main.go
  • test/e2e/nodepool_test.go
✅ Files skipped from review due to trivial changes (3)
  • test/e2e/nodepool_test.go
  • docs/content/reference/aggregated-docs.md
  • cmd/install/assets/crds/hypershift-operator/zz_generated.crd-manifests/nodepools-Default.crd.yaml
🚧 Files skipped from review as they are similar to previous changes (9)
  • hypershift-operator/controllers/nodepool/conditions.go
  • cmd/install/install.go
  • hypershift-operator/controllers/nodepool/capi_test.go
  • go.mod
  • hypershift-operator/controllers/nodepool/scale_from_zero.go
  • hypershift-operator/controllers/nodepool/nodepool_controller.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider.go
  • hypershift-operator/controllers/nodepool/instancetype/azure/provider_test.go
  • hypershift-operator/main.go

@jhjaggars
Contributor Author

/test all

@openshift-ci
Contributor

openshift-ci Bot commented May 8, 2026

@jhjaggars: all tests passed!


- Gate effectiveMin=0 on runtime-configured scaleFromZeroPlatform instead
  of static platform type check, preventing stalled pools when the
  scale-from-zero provider isn't wired up
- Resolve AZURE_CLOUD_NAME for credential and SKU client construction in
  scale-from-zero init, matching sovereign cloud support used elsewhere
- Return errors on invalid/negative GPU values in transformSKU instead of
  silently skipping, with VM size in error messages for debuggability

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@jhjaggars jhjaggars marked this pull request as ready for review May 8, 2026 20:10

Labels

  • area/api: Indicates the PR includes changes for the API
  • area/cli: Indicates the PR includes changes for CLI
  • area/documentation: Indicates the PR includes changes for documentation
  • area/hypershift-operator: Indicates the PR includes changes for the hypershift operator and API - outside an OCP release
  • area/platform/azure: PR/issue for Azure (AzurePlatform) platform
  • area/testing: Indicates the PR includes changes for e2e testing
  • do-not-merge/work-in-progress: Indicates that a PR should not merge because it is a work in progress.
  • jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.
