Skip to content

OCPCLOUD-1401: refactor to run same test against both mapi and capi#442

Open
sunzhaohua2 wants to merge 4 commits intoopenshift:masterfrom
sunzhaohua2:interface1
Open

OCPCLOUD-1401: refactor to run same test against both mapi and capi#442
sunzhaohua2 wants to merge 4 commits intoopenshift:masterfrom
sunzhaohua2:interface1

Conversation

@sunzhaohua2
Copy link
Copy Markdown
Contributor

Test result:

For backend MAPI, authority CAPI

zhsun:cluster-api-actuator-pkg/ (interface1) $ export TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=CAPI
zhsun:cluster-api-actuator-pkg/ (interface1) $ ./hack/ci-integration.sh -focus "Unified MachineSet creation on aws" -v
2025/10/24 16:24:31 maxprocs: Leaving GOMAXPROCS=14: CPU quota undefined
Running Suite: Machine Suite - /Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg
====================================================================================================
Random Seed: 1761294271

Will run 2 of 73 specs
------------------------------
[BeforeSuite] 
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/e2e_test.go:75
[BeforeSuite] PASSED [1.850 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
Unified MachineSet creation on aws creates a new MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:44
• [153.377 seconds]
------------------------------
Unified MachineSet creation on aws creates a spot instance MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:55
  STEP: Verifying spot instance configuration contains spot-related fields @ 10/24/25 16:30:12.73
• [173.657 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.009 seconds]
------------------------------

Ran 2 of 73 Specs in 328.887 seconds
SUCCESS! -- 2 Passed | 0 Failed | 0 Pending | 71 Skipped
PASS

Ginkgo ran 1 suite in 5m41.706834708s
Test Suite Passed

For MAPI

zhsun:cluster-api-actuator-pkg/ (interface1✗) $ export TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=MAPI
zhsun:cluster-api-actuator-pkg/ (interface1✗) $ ./hack/ci-integration.sh -focus "Unified MachineSet creation on aws" -v                   [16:30:44]
2025/10/24 16:30:47 maxprocs: Leaving GOMAXPROCS=14: CPU quota undefined
Running Suite: Machine Suite - /Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg
====================================================================================================
Random Seed: 1761294647

Will run 2 of 73 specs
------------------------------
[BeforeSuite] 
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/e2e_test.go:75
[BeforeSuite] PASSED [15.585 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
Unified MachineSet creation on aws creates a new MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:44
• [175.443 seconds]
------------------------------
Unified MachineSet creation on aws creates a spot instance MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:55
  STEP: Verifying spot instance configuration contains spot-related fields @ 10/24/25 16:37:05.831
• [178.894 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.009 seconds]
------------------------------

Ran 2 of 73 Specs in 369.925 seconds
SUCCESS! -- 2 Passed | 0 Failed | 0 Pending | 71 Skipped
PASS

Ginkgo ran 1 suite in 6m18.131194917s
Test Suite Passed

For CAPI

zhsun:cluster-api-actuator-pkg/ (interface1✗) $ export TEST_BACKEND_TYPE=CAPI TEST_AUTHORITATIVE_API=CAPI
zhsun:cluster-api-actuator-pkg/ (interface1✗) $ ./hack/ci-integration.sh -focus "Unified MachineSet creation on aws" -v
2025/10/24 17:41:44 maxprocs: Leaving GOMAXPROCS=14: CPU quota undefined
Running Suite: Machine Suite - /Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg
====================================================================================================
Random Seed: 1761298904

Will run 2 of 73 specs
------------------------------
[BeforeSuite] 
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/e2e_test.go:75
[BeforeSuite] PASSED [1.716 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
Unified MachineSet creation on aws creates a new MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:44
• [173.566 seconds]
------------------------------
Unified MachineSet creation on aws creates a spot instance MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:55
  STEP: Verifying spot instance configuration contains spot-related fields @ 10/24/25 17:47:41.688
• [171.759 seconds]
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.009 seconds]
------------------------------

Ran 2 of 73 Specs in 347.050 seconds
SUCCESS! -- 2 Passed | 0 Failed | 0 Pending | 71 Skipped
PASS

Ginkgo ran 1 suite in 5m58.1822375s
Test Suite Passed

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Oct 27, 2025
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Oct 27, 2025

@sunzhaohua2: This pull request references OCPCLOUD-1401 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Test result:

For backend MAPI, authority CAPI

zhsun:cluster-api-actuator-pkg/ (interface1) $ export TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=CAPI
zhsun:cluster-api-actuator-pkg/ (interface1) $ ./hack/ci-integration.sh -focus "Unified MachineSet creation on aws" -v
2025/10/24 16:24:31 maxprocs: Leaving GOMAXPROCS=14: CPU quota undefined
Running Suite: Machine Suite - /Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg
====================================================================================================
Random Seed: 1761294271

Will run 2 of 73 specs
------------------------------
[BeforeSuite] 
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/e2e_test.go:75
[BeforeSuite] PASSED [1.850 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
Unified MachineSet creation on aws creates a new MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:44
• [153.377 seconds]
------------------------------
Unified MachineSet creation on aws creates a spot instance MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:55
 STEP: Verifying spot instance configuration contains spot-related fields @ 10/24/25 16:30:12.73
• [173.657 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.009 seconds]
------------------------------

Ran 2 of 73 Specs in 328.887 seconds
SUCCESS! -- 2 Passed | 0 Failed | 0 Pending | 71 Skipped
PASS

Ginkgo ran 1 suite in 5m41.706834708s
Test Suite Passed

For MAPI

zhsun:cluster-api-actuator-pkg/ (interface1✗) $ export TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=MAPI
zhsun:cluster-api-actuator-pkg/ (interface1✗) $ ./hack/ci-integration.sh -focus "Unified MachineSet creation on aws" -v                   [16:30:44]
2025/10/24 16:30:47 maxprocs: Leaving GOMAXPROCS=14: CPU quota undefined
Running Suite: Machine Suite - /Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg
====================================================================================================
Random Seed: 1761294647

Will run 2 of 73 specs
------------------------------
[BeforeSuite] 
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/e2e_test.go:75
[BeforeSuite] PASSED [15.585 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
Unified MachineSet creation on aws creates a new MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:44
• [175.443 seconds]
------------------------------
Unified MachineSet creation on aws creates a spot instance MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:55
 STEP: Verifying spot instance configuration contains spot-related fields @ 10/24/25 16:37:05.831
• [178.894 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.009 seconds]
------------------------------

Ran 2 of 73 Specs in 369.925 seconds
SUCCESS! -- 2 Passed | 0 Failed | 0 Pending | 71 Skipped
PASS

Ginkgo ran 1 suite in 6m18.131194917s
Test Suite Passed

For CAPI

zhsun:cluster-api-actuator-pkg/ (interface1✗) $ export TEST_BACKEND_TYPE=CAPI TEST_AUTHORITATIVE_API=CAPI
zhsun:cluster-api-actuator-pkg/ (interface1✗) $ ./hack/ci-integration.sh -focus "Unified MachineSet creation on aws" -v
2025/10/24 17:41:44 maxprocs: Leaving GOMAXPROCS=14: CPU quota undefined
Running Suite: Machine Suite - /Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg
====================================================================================================
Random Seed: 1761298904

Will run 2 of 73 specs
------------------------------
[BeforeSuite] 
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/e2e_test.go:75
[BeforeSuite] PASSED [1.716 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
Unified MachineSet creation on aws creates a new MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:44
• [173.566 seconds]
------------------------------
Unified MachineSet creation on aws creates a spot instance MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:55
 STEP: Verifying spot instance configuration contains spot-related fields @ 10/24/25 17:47:41.688
• [171.759 seconds]
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.009 seconds]
------------------------------

Ran 2 of 73 Specs in 347.050 seconds
SUCCESS! -- 2 Passed | 0 Failed | 0 Pending | 71 Skipped
PASS

Ginkgo ran 1 suite in 5m58.1822375s
Test Suite Passed

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

Comment thread e2e.test Outdated
@miyadav
Copy link
Copy Markdown
Member

miyadav commented Nov 4, 2025

/lgtm

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Nov 4, 2025
@damdo
Copy link
Copy Markdown
Member

damdo commented Nov 5, 2025

/assign @theobarberbany @mdbooth

Comment thread Makefile.test Outdated
Comment thread pkg/unified/backends/capi_backend.go Outdated
Comment thread pkg/unified/framework.go Outdated
Comment thread pkg/unified/e2e/test_helpers.go Outdated
Comment thread docs/unified.md Outdated
Comment thread docs/unified.md Outdated
Comment thread docs/unified.md Outdated
Comment thread docs/unified.md Outdated
Comment thread docs/unified.md Outdated
@openshift-ci openshift-ci Bot removed the lgtm Indicates that a PR is ready to be merged. label Nov 10, 2025
@RadekManak
Copy link
Copy Markdown
Contributor

/lgtm
/approve
Thank you!

@openshift-ci openshift-ci Bot added the lgtm Indicates that a PR is ready to be merged. label Nov 12, 2025
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Nov 12, 2025

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: RadekManak

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 12, 2025
Comment thread pkg/unified/backends/capi_backend.go Outdated
Comment thread pkg/unified/config/test_config.go Outdated
}

// Read configuration from environment variables
if backendType := os.Getenv("TEST_BACKEND_TYPE"); backendType != "" {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have CI jobs setting this? How do we plan on using this? I reckon a follow up PR would be good to demonstrate :)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe like this openshift/release#71453

@theobarberbany
Copy link
Copy Markdown
Contributor

theobarberbany commented Nov 13, 2025

Code looks OK.

I'm not sure i get the why of why we've done this. I think I understand the desire to de-duplicate a lot of code between our CAPI and MAPI tests. Is doing this in this way a good idea? To add additional abstraction to e2es?

I'm struggling to understand how we'll use this, in order to review it - e.g - how do I know which backend I'm using, the logs look identical across CAPI and MAPI?

Unified MachineSet creation on aws creates a new MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:44
• [175.443 seconds]
------------------------------
Unified MachineSet creation on aws creates a spot instance MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:55
  STEP: Verifying spot instance configuration contains spot-related fields @ 10/24/25 16:37:05.831
• [178.894 seconds]

I also wonder, do we then need tests for our testing framework? Does it reduce confidence in the tests built this way, or make debugging harder?

Comment thread pkg/unified/e2e/aws_machineset.go Outdated
"sigs.k8s.io/controller-runtime/pkg/envtest/komega"
)

var _ = Describe("Unified MachineSet creation on aws", testframework.LabelDisruptive, Ordered, func() {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think these Describes contain enough context for someone without context to understand what's going on.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have updated

Comment thread pkg/unified/config/template_config.go Outdated
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Nov 20, 2025

New changes are detected. LGTM label has been removed.

@sunzhaohua2
Copy link
Copy Markdown
Contributor Author

Hi @theobarberbany thanks for your questions.

I'm not sure i get the why of why we've done this. I think I understand the desire to de-duplicate a lot of code between our CAPI and MAPI tests. Is doing this in this way a good idea? To add additional abstraction to e2es?

This idea comes from Joel: https://redhat-internal.slack.com/archives/GE2HQ9QP4/p1727430915984039?thread_ts=1727419633.062849&cid=GE2HQ9QP4. Based on my previous understanding from @shellyyang1989 , MAPI and CAPI will coexist for multiple releases during the migration period and may not even be droped.

This way can ensure same test coverage for both APIs and we can test mixed authority cases, such as MachineSet is MAPI-authoritative and a Machine is CAPI-authoritative, these cases are real production cases we must support.

Now QE are required to migrate all the cases from OpenShift-test-private repo to public repo, almost all the mapi cases are in OpenShift-test-private repo
if we move directory directly from private repo to this repo, the code style is not same. OpenShift-test-private primarily uses oc CLI, while this repo use client-go directly.
if we rewrite these mapi cases in this repo, we also need to write same for capi, require more work than the unified way.

For some complex cases, if this way is not suitable, we can use the original method. when selecting cases in the Makefile, we can choose the unified||mapi and unified||capi cases.

I'm struggling to understand how we'll use this, in order to review it - e.g - how do I know which backend I'm using, the logs look identical across CAPI and MAPI?

Unified MachineSet creation on aws creates a new MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:44
• [175.443 seconds]
------------------------------
Unified MachineSet creation on aws creates a spot instance MachineSet via unified backend [disruptive]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:55
  STEP: Verifying spot instance configuration contains spot-related fields @ 10/24/25 16:37:05.831
• [178.894 seconds]

I updated the Describe description, do you think this is ok?

$ make test-mapi                                   
TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=MAPI hack/ci-integration.sh  --label-filter='unified'
2025/11/19 20:44:38 maxprocs: Leaving GOMAXPROCS=14: CPU quota undefined
Running Suite: Machine Suite - /Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg
====================================================================================================
Random Seed: 1763556278

Will run 3 of 74 specs
------------------------------
[BeforeSuite] 
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/e2e_test.go:76
[BeforeSuite] PASSED [1.964 seconds]
------------------------------
SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
------------------------------
MachineSet creation on AWS (backend: MAPI, authority: MAPI) creates a new MachineSet [disruptive, unified]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:43
  STEP: Framework initialized with backend: MAPI, authoritativeAPI: MAPI @ 11/19/25 20:44:51.28
  STEP: Creating a MachineTemplate @ 11/19/25 20:44:51.283
  STEP: Creating a MachineSet from the template @ 11/19/25 20:44:52.035
  STEP: Waiting for Machines to become Running @ 11/19/25 20:44:52.546
• [290.928 seconds]
------------------------------
MachineSet creation on AWS (backend: MAPI, authority: MAPI) creates a spot instance MachineSet [disruptive, unified]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:56
  STEP: Creating a MachineTemplate with spot instance configuration @ 11/19/25 20:49:42.21
  STEP: Creating a MachineSet from the spot template @ 11/19/25 20:49:42.463
  STEP: Waiting for Machines to become Running or skipping on capacity error @ 11/19/25 20:49:42.969
  STEP: Verifying spot instance configuration is correctly applied @ 11/19/25 20:53:47.003
• [245.036 seconds]
------------------------------
MachineSet creation on AWS (backend: MAPI, authority: MAPI) creates a MachineSet with EFA network interface type [disruptive, unified]
/Users/zhsun/go/src/github.com/openshift/cluster-api-actuator-pkg/pkg/unified/e2e/aws_machineset.go:86
  STEP: Checking if region supports EFA @ 11/19/25 20:53:47.249
  STEP: Creating a MachineTemplate with EFA network interface and c5n.9xlarge instance type @ 11/19/25 20:53:47.503
  STEP: Creating a MachineSet from the EFA template @ 11/19/25 20:53:47.762
  STEP: Waiting for Machines to become Running or skipping on capacity error @ 11/19/25 20:53:48.32
  STEP: Verifying EFA network interface configuration is correctly applied @ 11/19/25 20:57:28.333
• [221.337 seconds]
------------------------------
[ReportAfterSuite] Autogenerated ReportAfterSuite for --junit-report
autogenerated by Ginkgo
[ReportAfterSuite] PASSED [0.006 seconds]
------------------------------

Ran 3 of 74 Specs in 759.270 seconds
SUCCESS! -- 3 Passed | 0 Failed | 0 Pending | 71 Skipped
PASS

I also wonder, do we then need tests for our testing framework? Does it reduce confidence in the tests built this way, or make debugging harder?

yes, good concern. I think if MAPI and CAPI both pass the same test, this prove the framework is working and we could add unit tests for some functions (like configureAWSMAPIProviderSpec and configureAWSCAPIProviderSpec) to increase confidence.

@openshift-bot
Copy link
Copy Markdown

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci Bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Feb 19, 2026
@openshift-bot
Copy link
Copy Markdown

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci Bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Mar 21, 2026
@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 21, 2026

Walkthrough

Adds a unified E2E testing framework with pluggable MAPI/CAPI backends, two backend implementations, template configuration and test config loading, AWS-specific e2e tests and helpers, Make targets for backend-specific test runs, a small .gitignore update, and test bootstrap wiring for controller-runtime komega.

Changes

Cohort / File(s) Summary
Build & CI
\.gitignore, Makefile
Ignore *.test; add phony Make targets test-mapi, test-capi, test-mapi-with-capi-auth, and aggregate test-all that call hack/ci-integration.sh with backend/authority env vars and Ginkgo label filters.
Test bootstrap
pkg/e2e_test.go
Blank-import unified e2e package and register controller-runtime client with komega.SetClient in BeforeSuite.
Framework labels & utils
pkg/framework/ginkgo-labels.go, pkg/framework/utils.go
Add exported LabelUnified ginkgo label and exported MergeLabels(...map[string]string) map[string]string.
Unified backend interface & factory
pkg/unified/backends/backend_interface.go
Add MachineBackend interface, param/status structs (BackendMachineSetParams, BackendMachineTemplateParams, MachineSetStatus), and NewBackend/NewMAPIBackend/NewCAPIBackend` factory functions.
CAPI backend implementation
pkg/unified/backends/capi_backend.go
Add capiBackend implementing MachineBackend for CAPI: MachineSet/template create/delete/wait/status, AWS CAPI template handling and default spec derivation; Azure/GCP paths not implemented.
MAPI backend implementation
pkg/unified/backends/mapi_backend.go
Add mapiBackend implementing MachineBackend for MAPI: MachineSet/template create/delete/wait/status, ProviderSpec conversion and AWS MAPI provider spec defaults; Azure/GCP not implemented.
Template configuration
pkg/unified/config/template_config.go
Add MachineTemplateConfig (AWS/Azure/GCP structs) and ConfigureMachineTemplate to apply conditional mutations to AWS CAPI templates and AWS MAPI provider specs (instance type, spot, tenancy, EFA, tags, volumes, KMS).
Test config loader
pkg/unified/config/test_config.go
Add TestConfig and BackendType (MAPI/CAPI), LoadTestConfig() parsing TEST_BACKEND_TYPE and TEST_AUTHORITATIVE_API, and GetTestConfigOrDie().
Unified framework coordinator
pkg/unified/framework.go
Add unified.UnifiedFramework that loads test config, constructs a backend via factory, exposes backend-type accessors and forwards template/machineset operations to the backend.
E2E helpers & AWS tests
pkg/unified/e2e/test_helpers.go, pkg/unified/e2e/aws_machineset.go
Add TestHelper with lifecycle/creation/wait/delete/verification helpers, region detection and capacity-error skip logic; add AWS MachineSet e2e suite covering creation, spot instance, and EFA interface tests with platform/region gating.

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@theobarberbany
Copy link
Copy Markdown
Contributor

/remove-lifecycle rotten

@openshift-ci openshift-ci Bot removed the lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. label Mar 27, 2026
@theobarberbany
Copy link
Copy Markdown
Contributor

@coderabbitai full

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Mar 27, 2026

@theobarberbany Sure, I'll redo the full review of this PR now.

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@pkg/unified/backends/capi_backend.go`:
- Around line 25-28: The function getDefaultAWSCAPIMachineSpec is currently
non-deterministic because it picks the first AWSMachineTemplate from a list and
falls back to a hardcoded DefaultAMIID when none are found; change it to
deterministically select a known worker template (e.g., look up by a specific
name or label/annotation) instead of using the list order, and if no matching
AWSMachineTemplate exists return an explicit error rather than substituting
DefaultAMIID; update the code paths that reference AWSMachineTemplate and
DefaultAMIID to either use the selected template's AMI/fields or propagate the
failure so callers can handle the missing template rather than relying on a
region-specific hardcoded AMI.
- Line 82: The Eventually(...) calls are invoking client.Create(...) immediately
so transient failures aren't retried; wrap the call in a closure so Eventually
re-executes it. Replace patterns like Eventually(client.Create(ctx, machineSet),
framework.WaitMedium, framework.RetryMedium).Should(...) with Eventually(func()
error { return client.Create(ctx, machineSet) }, framework.WaitMedium,
framework.RetryMedium).Should(Succeed(), ...). Do the same for the other
client.Create(...) occurrence (the create for the other resource), keeping the
original wait/retry args and message.

In `@pkg/unified/backends/mapi_backend.go`:
- Around line 147-150: CreateMachineTemplate currently drops ctx and client when
calling createAWSMachineTemplate, causing getDefaultAWSMAPIProviderSpec to rely
on global komega state (komega.List / komega.SetClient) and to pick the first
MachineSet arbitrarily; update the call chain to pass ctx and client through:
change createAWSMachineTemplate(params) to createAWSMachineTemplate(ctx, client,
params) and update getDefaultAWSMAPIProviderSpec to accept (ctx, client, ...)
and use the provided runtimeclient.Client to list/fetch MachineSets (instead of
komega.List) and choose a deliberate source (e.g., a named or filtered
MachineSet) for defaults; apply the same ctx/client pass-through refactor for
the other platform-specific creation paths that currently discard ctx/client
(the other create*MachineTemplate callers in this file).
- Line 79: The Eventually call currently evaluates client.Create(ctx,
machineSet) immediately, so wrap the create in a closure like Eventually(func()
error { ... }) to ensure it is executed on each poll; inside the closure call
client.Create(ctx, machineSet) and treat k8s apierrors.IsAlreadyExists(err) as a
nil/success return (i.e., return nil when AlreadyExists) so transient failures
are retried and a late-created object doesn’t fail the test; update the call
site that references machineSet and client to this pattern and keep the same
timeout/assertion (Should(Succeed(), ...)).

In `@pkg/unified/config/template_config.go`:
- Around line 23-32: The MAPI path ignores AWSMachineConfig fields
NonRootVolumes and KMSKey causing inconsistent machine specs; update
configureAWSMAPIProviderSpec to map AWSMachineConfig.KMSKey and NonRootVolumes
into the MAPI provider spec (or explicitly return an error when those fields are
set) so behavior matches the CAPI path; locate the configureAWSMAPIProviderSpec
function and add handling to translate KMSKey into the MAPI KMS/CSE fields and
convert each VolumeConfig from NonRootVolumes into the equivalent MAPI block, or
add validation that rejects unsupported fields when present.

In `@pkg/unified/config/test_config.go`:
- Around line 29-31: The TestConfig loader currently defaults BackendType and
AuthoritativeAPI to BackendTypeMAPI when env values are unknown; change it to
trim incoming TEST_BACKEND_TYPE and TEST_AUTHORITATIVE_API, validate against
allowed backend enum values, and return an error (fail fast) if the trimmed
value is empty or unrecognized instead of silently assigning BackendTypeMAPI;
update the TestConfig construction/loader (fields BackendType and
AuthoritativeAPI and any function that reads those env vars) to perform
strings.TrimSpace on inputs and explicit validation against the allowed backend
constants, returning an error to the caller on invalid input so CI will fail for
typos or space-padded values.

In `@pkg/unified/e2e/test_helpers.go`:
- Around line 77-93: The check in WaitForMachinesRunningOrSkipOnCapacityError
currently marshals the original machineSet (parameter machineSet) which is stale
and misses per-machine InsufficientInstanceCapacity errors; instead, poll the
backing Machine and AWSMachine resources for updated status/errors before
deciding to Skip: inside WaitForMachinesRunningOrSkipOnCapacityError (which
currently calls helper.framework.WaitForMachinesRunning), query the cluster for
the current Machines (and their provider-specific AWSMachine status) that belong
to the MachineSet, inspect each machine/AWSMachine status/conditions/messages
for "InsufficientInstanceCapacity" (or equivalent provider error), and only call
Skip when such an error is observed on a current backing resource; otherwise
continue to call helper.framework.WaitForMachinesRunning as the success
condition.
- Around line 158-166: The test currently marshals the entire awsTemplate and
searches that string, which can match metadata/status; change it to marshal and
search only awsTemplate.Spec.Template.Spec (or directly assert the structured
fields) so the assertions verify the applied machine spec; update the code
around the fullTemplateBytes/configString and the loop over searchStrings to use
the Spec.Template.Spec object (or its marshalled string) when calling
Expect(...).To(ContainSubstring(...)) so matches come from the actual machine
spec rather than the whole AWSMachineTemplate.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 29bc2f36-ac76-4ff6-bcbb-e385264af812

📥 Commits

Reviewing files that changed from the base of the PR and between cc0dbce and 31b84a6.

📒 Files selected for processing (13)
  • .gitignore
  • Makefile
  • pkg/e2e_test.go
  • pkg/framework/ginkgo-labels.go
  • pkg/framework/utils.go
  • pkg/unified/backends/backend_interface.go
  • pkg/unified/backends/capi_backend.go
  • pkg/unified/backends/mapi_backend.go
  • pkg/unified/config/template_config.go
  • pkg/unified/config/test_config.go
  • pkg/unified/e2e/aws_machineset.go
  • pkg/unified/e2e/test_helpers.go
  • pkg/unified/framework.go

Comment thread pkg/unified/backends/capi_backend.go Outdated
Comment on lines +25 to +28
const (
// DefaultAMIID is a example AMI ID for testing.
DefaultAMIID = "ami-0c55b159cbfafe1d0"
)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

getDefaultAWSCAPIMachineSpec is non-deterministic and region-specific.

Line 215 blindly copies the first listed AWSMachineTemplate, and Line 223 falls back to a single hardcoded AMI when none exist. That makes the CAPI path depend on API list order and AWS region/account state, which will make these unified tests flaky across environments. Prefer selecting a known worker template explicitly, or fail fast if no suitable template can be discovered.

Also applies to: 203-247

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/backends/capi_backend.go` around lines 25 - 28, The function
getDefaultAWSCAPIMachineSpec is currently non-deterministic because it picks the
first AWSMachineTemplate from a list and falls back to a hardcoded DefaultAMIID
when none are found; change it to deterministically select a known worker
template (e.g., look up by a specific name or label/annotation) instead of using
the list order, and if no matching AWSMachineTemplate exists return an explicit
error rather than substituting DefaultAMIID; update the code paths that
reference AWSMachineTemplate and DefaultAMIID to either use the selected
template's AMI/fields or propagate the failure so callers can handle the missing
template rather than relying on a region-specific hardcoded AMI.

Copy link
Copy Markdown
Contributor

@theobarberbany theobarberbany Apr 7, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunzhaohua2 I think we may want to look at this? It's surprising behaviour. If we're selecting this default we should let the user / test viewer know.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment thread pkg/unified/backends/capi_backend.go Outdated
Comment thread pkg/unified/backends/mapi_backend.go Outdated
machineSet.Spec.Template.Spec.ProviderSpec = *providerSpec
}

Eventually(client.Create(ctx, machineSet), framework.WaitMedium, framework.RetryMedium).Should(Succeed(), "Should create MAPI MachineSet %s", machineSet.Name)
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Mar 27, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
set -euo pipefail

rg -nP 'Eventually\(\s*client\.(Create|Update|Patch|Delete)\s*\(' pkg/unified/backends

Repository: openshift/cluster-api-actuator-pkg

Length of output: 699


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check imports in mapi_backend.go
echo "=== Checking imports in mapi_backend.go ==="
head -30 pkg/unified/backends/mapi_backend.go | grep -E "^import|apierrors"

echo ""
echo "=== Full imports section ==="
sed -n '/^import/,/^)/p' pkg/unified/backends/mapi_backend.go

echo ""
echo "=== Checking for apierrors usage in the file ==="
rg -n "apierrors" pkg/unified/backends/mapi_backend.go

Repository: openshift/cluster-api-actuator-pkg

Length of output: 1057


🏁 Script executed:

#!/bin/bash
set -euo pipefail

# Check if there are any correct Eventually patterns with function closures in the codebase
echo "=== Looking for Eventually patterns with closures (correct usage) ==="
rg -B2 -A2 'Eventually\(func\(\)' pkg/unified/backends --max-count=5

Repository: openshift/cluster-api-actuator-pkg

Length of output: 850


Fix Eventually to actually retry the create.

client.Create(ctx, machineSet) is evaluated immediately before Eventually starts polling, so transient API failures are never retried. Wrap the call in a closure so it executes on each poll attempt. Also handle AlreadyExists as success to avoid failing if the object was created but the response was lost.

The codebase already demonstrates the correct pattern—see the client.Delete calls that use Eventually(func() error { ... }).

Suggested fix
-	Eventually(client.Create(ctx, machineSet), framework.WaitMedium, framework.RetryMedium).Should(Succeed(), "Should create MAPI MachineSet %s", machineSet.Name)
+	Eventually(func() error {
+		err := client.Create(ctx, machineSet)
+		if apierrors.IsAlreadyExists(err) {
+			return nil
+		}
+		return err
+	}, framework.WaitMedium, framework.RetryMedium).Should(Succeed(), "Should create MAPI MachineSet %s", machineSet.Name)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
Eventually(client.Create(ctx, machineSet), framework.WaitMedium, framework.RetryMedium).Should(Succeed(), "Should create MAPI MachineSet %s", machineSet.Name)
Eventually(func() error {
err := client.Create(ctx, machineSet)
if apierrors.IsAlreadyExists(err) {
return nil
}
return err
}, framework.WaitMedium, framework.RetryMedium).Should(Succeed(), "Should create MAPI MachineSet %s", machineSet.Name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/backends/mapi_backend.go` at line 79, The Eventually call
currently evaluates client.Create(ctx, machineSet) immediately, so wrap the
create in a closure like Eventually(func() error { ... }) to ensure it is
executed on each poll; inside the closure call client.Create(ctx, machineSet)
and treat k8s apierrors.IsAlreadyExists(err) as a nil/success return (i.e.,
return nil when AlreadyExists) so transient failures are retried and a
late-created object doesn’t fail the test; update the call site that references
machineSet and client to this pattern and keep the same timeout/assertion
(Should(Succeed(), ...)).

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to fix this @sunzhaohua2

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment thread pkg/unified/backends/mapi_backend.go Outdated
Comment thread pkg/unified/config/template_config.go
Comment thread pkg/unified/config/test_config.go
Comment thread pkg/unified/e2e/test_helpers.go
Comment thread pkg/unified/e2e/test_helpers.go Outdated
@theobarberbany
Copy link
Copy Markdown
Contributor

theobarberbany commented Apr 7, 2026

@sunzhaohua2 Can we address these please? :) We need to cover all the things coderabbit flagged (i've confirmed above) plus this:

(claude assisted review with the test standards skill from CCAPIO)

Existing Feedback

Comment Status
miyadav: e2e.test binary → .gitignore Fixed
RadekManak: factory pattern in framework.go unnecessary Fixed
RadekManak: short variable names (ms, uf, tpl) Noted by author
RadekManak: docs/unified.md — remove it Fixed
theobarberbany: DeleteMachineTemplate type switch is pointless Open
theobarberbany: CI jobs setting TEST_BACKEND_TYPE? Author linked openshift/release#71453
theobarberbany: Describe blocks lack context Author says updated

New Issues

1. Pervasive use of interface{} where concrete types or generics would work

MachineBackend methods all accept and return interface{} for templates and machinesets, then immediately type-assert with Expect(ok).To(BeTrue()). The compiler can't help you at all here.

The backend pattern could use generics, or at minimum split into typed MAPIBackend/CAPIBackend interfaces with the UnifiedFramework holding the dispatch logic. The current approach pushes all type safety to runtime, which is the opposite of what
you want in test infrastructure.

2. Duplicate BackendType definitions

config.BackendType and backends.BackendType are both type BackendType string with identical MAPI/CAPI constants. framework.go casts between them. Pick one package to own the type.

3. WaitForMachinesRunningOrSkipOnCapacityError — nested Eventually, stale state, and existing helpers ignored

This has three compounding problems:

Nested Eventually: This wraps framework.WaitForMachinesRunning inside an outer Eventually(..., WaitLong, RetryMedium). But WaitForMachinesRunning delegates to framework.WaitForCAPIMachinesRunning / framework.WaitForMachineSet, both of
which contain their own internal Eventually with WaitOverLong (30 minutes). The inner timeout will always dominate — it either succeeds or exhausts 30 minutes, at which point the outer Eventually also fails on the same iteration. The outer
Eventually will never actually retry.

Stale state: The capacity-error check marshals the local in-memory machineSet object, which is never re-fetched from the API server. InsufficientInstanceCapacity will never appear in the serialized JSON, so this skip can never trigger.

Existing helpers: framework.WaitForSpotMachineSet (MAPI) and framework.WaitForCAPIMachinesRunningWithRetry (CAPI) already exist in this repo and solve this exact problem correctly — they poll individual Machine objects for capacity errors via
hasInsufficientCapacity / HasCAPIInsufficientCapacity. This hand-rolled helper should be replaced with calls to those.

4. Missing GinkgoHelper() on test helpers

CreateTemplate, CreateMachineSet, DeleteTemplate, DeleteMachineSet all call Expect internally. Without GinkgoHelper(), failures will point at the helper source line, not the test that called it.

5. StringPtr helper already exists

k8s.io/utils/ptr.To[string]() is already used elsewhere in the diff (ptr.To(DefaultAMIID)). Use that instead of a local StringPtr.

6. Makefile *.test gitignore collision

The *.test gitignore rule would also match a file named Makefile.test if one were added. If the intent is a separate Make include file (as RadekManak suggested), the gitignore should be more specific (e.g., /e2e.test) or the include file named
differently (e.g., tests.mk).

7. Ordered container with shared BeforeAll state

If the first It leaks a resource (cleanup fails), subsequent It blocks run against a dirty cluster. Since they share no isolation beyond name uniqueness (unified-machineset, spot-machineset, efa-machineset), a leaked MachineSet won't directly
collide, but leaked Machines could affect capacity or node counts in later tests.

Test Standards Assessment

Standard Status
Right test level Good — these are e2e tests requiring real infrastructure
Existing helpers Mixed — framework.WaitForMachineSet, framework.DeleteCAPIMachineSets etc. are reused, but WaitForSpotMachineSet / WaitForCAPIMachinesRunningWithRetry are ignored in favour of a broken hand-rolled replacement (see issue 3)
Debuggable failures Mixed — most Expect calls have messages, but helpers call Expect without GinkgoHelper() so failures will point at the wrong line (see issue 4)
BDD names It blocks use action descriptions ("creates a new MachineSet") rather than behaviour specs ("should create...") — minor style nit
Concise assertions Fine
No sleeps No time.Sleep found

Summary

Issue 3 is the most serious — it's a combination of dead code (capacity skip never fires), misleading structure (outer Eventually never retries), and reinvention of existing helpers that already handle this correctly. Issue 1 will cause increasing
maintenance pain as more backends and tests are added. The rest are cleanup.

@sunzhaohua2
Copy link
Copy Markdown
Contributor Author

@coderabbitai full

@coderabbitai
Copy link
Copy Markdown

coderabbitai Bot commented Apr 8, 2026

@sunzhaohua2 Understood — I’ll redo the full review now and post updated findings inline shortly.

✅ Actions performed

Full review triggered.

Copy link
Copy Markdown

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 4

♻️ Duplicate comments (3)
pkg/unified/backends/mapi_backend.go (2)

214-221: ⚠️ Potential issue | 🟠 Major

Don’t derive defaults from an arbitrary MachineSet.

Kubernetes list order is not stable, and different worker MachineSets can carry different subnet, security-group, or zone defaults. Basing new provider specs on Items[0] makes these unified tests nondeterministic and can stamp out an invalid config on multi-set clusters. Select a deliberate source MachineSet before unmarshalling its ProviderSpec.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/backends/mapi_backend.go` around lines 214 - 221, Don't pick an
arbitrary MachineSet by using machineSetList.Items[0]; instead choose a
deterministic source MachineSet before reading its ProviderSpec. Update the code
in mapi_backend.go so that after listing (machineSetList) you iterate and select
a MachineSet by explicit criteria (e.g., matching a known label/role, matching
cluster infra ID/worker pool name, or a specific zone/subnet tag) and assign
that element to the machineSet variable; then unmarshal
machineSet.Spec.Template.Spec.ProviderSpec from that selected MachineSet. Ensure
any helper or selector logic is clearly referenced (machineSetList, machineSet,
ProviderSpec) so the ProviderSpec is derived from a deliberate, deterministic
MachineSet rather than Items[0].

81-83: ⚠️ Potential issue | 🟠 Major

Treat AlreadyExists as a successful retry outcome.

If the create succeeds server-side and the client retries, the next poll sees AlreadyExists and this Eventually will fail until timeout. Handle that as success inside the closure.

Suggested change
 Eventually(func() error {
-	return client.Create(ctx, machineSet)
+	err := client.Create(ctx, machineSet)
+	if apierrors.IsAlreadyExists(err) {
+		return nil
+	}
+	return err
 }, framework.WaitMedium, framework.RetryMedium).Should(Succeed(), "Should create MAPI MachineSet %s", machineSet.Name)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/backends/mapi_backend.go` around lines 81 - 83, The Eventually
closure that calls client.Create(ctx, machineSet) should treat an AlreadyExists
response as success; update the closure used in Eventually(...) so it captures
the error from client.Create, returns nil when the error satisfies
apierrors.IsAlreadyExists(err) and otherwise returns the original error,
ensuring the Eventually succeeds on idempotent create retries; also add or
ensure the import for "k8s.io/apimachinery/pkg/api/errors" (commonly aliased
apierrors) is available for the IsAlreadyExists check.
pkg/unified/backends/capi_backend.go (1)

217-231: ⚠️ Potential issue | 🟠 Major

Pick a deterministic source for the default AWS machine spec.

Choosing Items[0] / workers[0] makes the generated template depend on Kubernetes list order. In clusters with multiple AWSMachineTemplates or worker MachineSets, the same test can inherit a different spec across runs. Select a source by stable criteria, or fail when the selection is ambiguous.

Also applies to: 235-243

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/backends/capi_backend.go` around lines 217 - 231, The code
currently picks Items[0] (in getDefaultAWSCAPIMachineSpec) which is
non-deterministic; change selection to a stable rule (e.g., sort
awsTemplateList.Items by a stable key such as metadata.Name or by a specific
label value) and then return the spec from the first item after sorting, or
explicitly fail when more than one candidate exists to avoid ambiguity; apply
the same deterministic-selection or ambiguity-fail logic to the analogous
selection at the other block (the createDefaultAWSCAPIMachineSpec/worker
MachineSet selection referenced in the comment) so both functions use the same
stable criteria.
🧹 Nitpick comments (1)
pkg/unified/backends/backend_interface.go (1)

15-25: Tighten this backend contract before it spreads further.

The interface{} inputs/outputs here force every implementation to downcast at runtime and turn backend mismatches into helper-side assertions instead of ordinary returned errors. A typed boundary here (or at least runtimeclient.Object plus backend-specific param types) would make the unified layer much safer to evolve.

Also applies to: 29-45

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/backends/backend_interface.go` around lines 15 - 25, The
MachineBackend interface uses interface{} for template and machines which forces
unsafe downcasts; change the signatures to use typed Kubernetes objects
(runtimeclient.Object) and concrete param/result types instead of interface{}:
update MachineBackend, specifically CreateMachineTemplate,
DeleteMachineTemplate, CreateMachineSet, DeleteMachineSet,
WaitForMachineSetDeleted, WaitForMachinesRunning, GetMachineSetStatus and
GetNodesFromMachineSet to accept/return runtimeclient.Object (or a
backend-specific typed struct) and keep
BackendMachineTemplateParams/BackendMachineSetParams and *MachineSetStatus as
the typed parameter/result types; then update all implementing types to
return/accept these typed objects and handle conversion at the backend
implementation boundary rather than in the unified layer.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@Makefile`:
- Around line 96-109: The test-matrix is inconsistent because test-mapi uses
--label-filter='unified||mapi' while test-capi and test-mapi-with-capi-auth use
'unified', causing mismatched suites; change the Makefile so all three matrix
targets (test-mapi, test-capi, test-mapi-with-capi-auth) use the same
--label-filter='unified' to ensure parity, and if you still need legacy-only
MAPI runs extract the 'unified||mapi' filter into a separate PHONY target (e.g.,
legacy-mapi) instead of including it in the matrix.

In `@pkg/unified/backends/capi_backend.go`:
- Around line 120-135: GetMachineSetStatus currently reads status from the
original machineSet object instead of the live API object; update
capiBackend.GetMachineSetStatus to re-fetch the MachineSet from the API using
the provided runtimeclient.Client before reading
AvailableReplicas/ReadyReplicas: perform the type assertion on the input
(machineSet.(*clusterv1beta1.MachineSet)) to obtain name/namespace, then use
client.Get to load the current clusterv1beta1.MachineSet into a new variable and
build MachineSetStatus from that live object (keep existing handling of
ms.Spec.Replicas).

In `@pkg/unified/config/template_config.go`:
- Around line 120-143: The template currently replaces spec.NonRootVolumes (and
elsewhere similar code at the other block) instead of appending, which causes
inconsistent semantics between CAPI and MAPI; change the merge to append
semantics and ensure tag merging is consistent: for AdditionalTags keep the
current per-key assignment into spec.AdditionalTags (creating the map if nil)
and for NonRootVolumes, build the slice from config.NonRootVolumes then if
spec.NonRootVolumes is nil set it to that slice else append the new volumes to
the existing spec.NonRootVolumes (do the same fix in the analogous block around
lines 207-233).

In `@pkg/unified/e2e/test_helpers.go`:
- Around line 141-147: The current TestHelper.DeleteMachineSet only issues the
delete and returns immediately; modify it so after calling
helper.framework.DeleteMachineSet(helper.ctx, helper.client, machineSet) it
waits for the MachineSet and its machines to be fully cleaned up before
returning. Implement a poll-with-timeout using helper.ctx/helper.client (or
existing framework wait helpers) to repeatedly check that the MachineSet
resource no longer exists and that there are no associated Machine/Node objects
(or until helper.framework reports cleanup complete), failing if the timeout
elapses; keep references to DeleteMachineSet, helper.framework.DeleteMachineSet,
helper.ctx and helper.client to locate and update the code.

---

Duplicate comments:
In `@pkg/unified/backends/capi_backend.go`:
- Around line 217-231: The code currently picks Items[0] (in
getDefaultAWSCAPIMachineSpec) which is non-deterministic; change selection to a
stable rule (e.g., sort awsTemplateList.Items by a stable key such as
metadata.Name or by a specific label value) and then return the spec from the
first item after sorting, or explicitly fail when more than one candidate exists
to avoid ambiguity; apply the same deterministic-selection or ambiguity-fail
logic to the analogous selection at the other block (the
createDefaultAWSCAPIMachineSpec/worker MachineSet selection referenced in the
comment) so both functions use the same stable criteria.

In `@pkg/unified/backends/mapi_backend.go`:
- Around line 214-221: Don't pick an arbitrary MachineSet by using
machineSetList.Items[0]; instead choose a deterministic source MachineSet before
reading its ProviderSpec. Update the code in mapi_backend.go so that after
listing (machineSetList) you iterate and select a MachineSet by explicit
criteria (e.g., matching a known label/role, matching cluster infra ID/worker
pool name, or a specific zone/subnet tag) and assign that element to the
machineSet variable; then unmarshal machineSet.Spec.Template.Spec.ProviderSpec
from that selected MachineSet. Ensure any helper or selector logic is clearly
referenced (machineSetList, machineSet, ProviderSpec) so the ProviderSpec is
derived from a deliberate, deterministic MachineSet rather than Items[0].
- Around line 81-83: The Eventually closure that calls client.Create(ctx,
machineSet) should treat an AlreadyExists response as success; update the
closure used in Eventually(...) so it captures the error from client.Create,
returns nil when the error satisfies apierrors.IsAlreadyExists(err) and
otherwise returns the original error, ensuring the Eventually succeeds on
idempotent create retries; also add or ensure the import for
"k8s.io/apimachinery/pkg/api/errors" (commonly aliased apierrors) is available
for the IsAlreadyExists check.

---

Nitpick comments:
In `@pkg/unified/backends/backend_interface.go`:
- Around line 15-25: The MachineBackend interface uses interface{} for template
and machines which forces unsafe downcasts; change the signatures to use typed
Kubernetes objects (runtimeclient.Object) and concrete param/result types
instead of interface{}: update MachineBackend, specifically
CreateMachineTemplate, DeleteMachineTemplate, CreateMachineSet,
DeleteMachineSet, WaitForMachineSetDeleted, WaitForMachinesRunning,
GetMachineSetStatus and GetNodesFromMachineSet to accept/return
runtimeclient.Object (or a backend-specific typed struct) and keep
BackendMachineTemplateParams/BackendMachineSetParams and *MachineSetStatus as
the typed parameter/result types; then update all implementing types to
return/accept these typed objects and handle conversion at the backend
implementation boundary rather than in the unified layer.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: d28df0c9-e108-46a0-a61d-055905d8ae07

📥 Commits

Reviewing files that changed from the base of the PR and between 3bab34a and ced93f7.

⛔ Files ignored due to path filters (2)
  • vendor/k8s.io/apimachinery/pkg/util/rand/rand.go is excluded by !vendor/**, !**/vendor/**
  • vendor/modules.txt is excluded by !vendor/**, !**/vendor/**
📒 Files selected for processing (13)
  • .gitignore
  • Makefile
  • pkg/e2e_test.go
  • pkg/framework/ginkgo-labels.go
  • pkg/framework/utils.go
  • pkg/unified/backends/backend_interface.go
  • pkg/unified/backends/capi_backend.go
  • pkg/unified/backends/mapi_backend.go
  • pkg/unified/config/template_config.go
  • pkg/unified/config/test_config.go
  • pkg/unified/e2e/aws_machineset.go
  • pkg/unified/e2e/test_helpers.go
  • pkg/unified/framework.go

Comment thread Makefile
Comment on lines +96 to +109
.PHONY: test-mapi
test-mapi: ## Run MAPI authoritative testing (MAPI auth, CAPI mirrors)
TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=MAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified||mapi'

.PHONY: test-capi
test-capi: ## Run pure CAPI testing (no MAPI involvement)
TEST_BACKEND_TYPE=CAPI TEST_AUTHORITATIVE_API=CAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified'

.PHONY: test-mapi-with-capi-auth
test-mapi-with-capi-auth: ## Run tests with MAPI backend and CAPI authority (conversion layer)
TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=CAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified'

.PHONY: test-all
test-all: test-mapi test-capi test-mapi-with-capi-auth ## Run all unified framework test scenarios
Copy link
Copy Markdown

@coderabbitai coderabbitai Bot Apr 8, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, let's confirm the exact content of the Makefile at lines 96-109
sed -n '96,109p' Makefile

Repository: openshift/cluster-api-actuator-pkg

Length of output: 840


🏁 Script executed:

# Search for references to these test targets and label filters
rg 'test-mapi|test-capi|label-filter' -A 2 -B 2

Repository: openshift/cluster-api-actuator-pkg

Length of output: 8592


🏁 Script executed:

# Check the hack/ci-integration.sh to understand how label-filter is used
head -50 hack/ci-integration.sh

Repository: openshift/cluster-api-actuator-pkg

Length of output: 502


🏁 Script executed:

# Search for test definitions to understand what unified, mapi labels mean
fd -e go | xargs grep -l 'Label.*unified\|Label.*mapi' 2>/dev/null | head -5

Repository: openshift/cluster-api-actuator-pkg

Length of output: 180


🏁 Script executed:

# Look for any documentation or comments about the test matrix strategy
rg 'test matrix|parity|backend.*authority' -i

Repository: openshift/cluster-api-actuator-pkg

Length of output: 20847


🏁 Script executed:

# Check the ginkgo-labels.go file to understand test labels
cat pkg/framework/ginkgo-labels.go

Repository: openshift/cluster-api-actuator-pkg

Length of output: 1762


🏁 Script executed:

# Search for tests with "mapi" label to see what's included
rg 'Label.*mapi|Label.*unified' -A 2 -B 2

Repository: openshift/cluster-api-actuator-pkg

Length of output: 2555


🏁 Script executed:

# Check if there's any documentation or comments about the test matrix design
rg 'test-mapi|test-capi|test-all' -B 5 -A 5 | head -100

Repository: openshift/cluster-api-actuator-pkg

Length of output: 3860


Keep the backend matrix on the same suite.

test-mapi runs unified||mapi while test-capi and test-mapi-with-capi-auth both run only unified, so test-all does not compare the same test suite across backends. This breaks parity interpretation and allows legacy MAPI-only tests to skew the matrix results.

Suggested change
 .PHONY: test-mapi
 test-mapi: ## Run MAPI authoritative testing (MAPI auth, CAPI mirrors)
-	TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=MAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified||mapi'
+	TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=MAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified'

If legacy MAPI coverage is needed, keep it under a separate target instead of including it in this parity matrix.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
.PHONY: test-mapi
test-mapi: ## Run MAPI authoritative testing (MAPI auth, CAPI mirrors)
TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=MAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified||mapi'
.PHONY: test-capi
test-capi: ## Run pure CAPI testing (no MAPI involvement)
TEST_BACKEND_TYPE=CAPI TEST_AUTHORITATIVE_API=CAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified'
.PHONY: test-mapi-with-capi-auth
test-mapi-with-capi-auth: ## Run tests with MAPI backend and CAPI authority (conversion layer)
TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=CAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified'
.PHONY: test-all
test-all: test-mapi test-capi test-mapi-with-capi-auth ## Run all unified framework test scenarios
.PHONY: test-mapi
test-mapi: ## Run MAPI authoritative testing (MAPI auth, CAPI mirrors)
TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=MAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified'
.PHONY: test-capi
test-capi: ## Run pure CAPI testing (no MAPI involvement)
TEST_BACKEND_TYPE=CAPI TEST_AUTHORITATIVE_API=CAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified'
.PHONY: test-mapi-with-capi-auth
test-mapi-with-capi-auth: ## Run tests with MAPI backend and CAPI authority (conversion layer)
TEST_BACKEND_TYPE=MAPI TEST_AUTHORITATIVE_API=CAPI hack/ci-integration.sh $(GINKGO_ARGS) --label-filter='unified'
.PHONY: test-all
test-all: test-mapi test-capi test-mapi-with-capi-auth ## Run all unified framework test scenarios
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@Makefile` around lines 96 - 109, The test-matrix is inconsistent because
test-mapi uses --label-filter='unified||mapi' while test-capi and
test-mapi-with-capi-auth use 'unified', causing mismatched suites; change the
Makefile so all three matrix targets (test-mapi, test-capi,
test-mapi-with-capi-auth) use the same --label-filter='unified' to ensure
parity, and if you still need legacy-only MAPI runs extract the 'unified||mapi'
filter into a separate PHONY target (e.g., legacy-mapi) instead of including it
in the matrix.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I chose to make test-capi and test-mapi both running unified and backend-specific tests in order to support full backend testing.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sunzhaohua2, understood — that's a valid design goal. One thing worth noting for consistency: if the intent is full backend coverage per target, then test-capi would also need --label-filter='unified||capi' (assuming a capi label exists or gets added) to achieve the same depth of coverage on the CAPI side. As it stands, MAPI gets more test coverage than CAPI in test-all, which is an asymmetry. If that's intentional (e.g., CAPI doesn't have backend-specific tests yet), it's fine to leave as-is — just worth documenting.

Happy to leave this as-is given your explanation.

Comment on lines +120 to +135
func (c *capiBackend) GetMachineSetStatus(ctx context.Context, client runtimeclient.Client, machineSet interface{}) (*MachineSetStatus, error) {
GinkgoHelper()

ms, ok := machineSet.(*clusterv1beta1.MachineSet)
Expect(ok).To(BeTrue(), "Should be CAPI MachineSet, got %T", machineSet)

status := &MachineSetStatus{
Replicas: 0,
AvailableReplicas: ms.Status.AvailableReplicas,
ReadyReplicas: ms.Status.ReadyReplicas,
}
if ms.Spec.Replicas != nil {
status.Replicas = *ms.Spec.Replicas
}

return status, nil
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Re-read the MachineSet before reporting status.

This builds status from the original object passed in, not the live object in the API. After CreateMachineSet, AvailableReplicas and ReadyReplicas here can stay stale unless the caller refreshes first, so backend assertions can report zero even after the wait helper succeeds.

Suggested fix
 func (c *capiBackend) GetMachineSetStatus(ctx context.Context, client runtimeclient.Client, machineSet interface{}) (*MachineSetStatus, error) {
 	GinkgoHelper()

 	ms, ok := machineSet.(*clusterv1beta1.MachineSet)
 	Expect(ok).To(BeTrue(), "Should be CAPI MachineSet, got %T", machineSet)
+
+	current := &clusterv1beta1.MachineSet{}
+	if err := client.Get(ctx, runtimeclient.ObjectKeyFromObject(ms), current); err != nil {
+		return nil, err
+	}

 	status := &MachineSetStatus{
 		Replicas:          0,
-		AvailableReplicas: ms.Status.AvailableReplicas,
-		ReadyReplicas:     ms.Status.ReadyReplicas,
+		AvailableReplicas: current.Status.AvailableReplicas,
+		ReadyReplicas:     current.Status.ReadyReplicas,
 	}
-	if ms.Spec.Replicas != nil {
-		status.Replicas = *ms.Spec.Replicas
+	if current.Spec.Replicas != nil {
+		status.Replicas = *current.Spec.Replicas
 	}

 	return status, nil
 }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
func (c *capiBackend) GetMachineSetStatus(ctx context.Context, client runtimeclient.Client, machineSet interface{}) (*MachineSetStatus, error) {
GinkgoHelper()
ms, ok := machineSet.(*clusterv1beta1.MachineSet)
Expect(ok).To(BeTrue(), "Should be CAPI MachineSet, got %T", machineSet)
status := &MachineSetStatus{
Replicas: 0,
AvailableReplicas: ms.Status.AvailableReplicas,
ReadyReplicas: ms.Status.ReadyReplicas,
}
if ms.Spec.Replicas != nil {
status.Replicas = *ms.Spec.Replicas
}
return status, nil
func (c *capiBackend) GetMachineSetStatus(ctx context.Context, client runtimeclient.Client, machineSet interface{}) (*MachineSetStatus, error) {
GinkgoHelper()
ms, ok := machineSet.(*clusterv1beta1.MachineSet)
Expect(ok).To(BeTrue(), "Should be CAPI MachineSet, got %T", machineSet)
current := &clusterv1beta1.MachineSet{}
if err := client.Get(ctx, runtimeclient.ObjectKeyFromObject(ms), current); err != nil {
return nil, err
}
status := &MachineSetStatus{
Replicas: 0,
AvailableReplicas: current.Status.AvailableReplicas,
ReadyReplicas: current.Status.ReadyReplicas,
}
if current.Spec.Replicas != nil {
status.Replicas = *current.Spec.Replicas
}
return status, nil
}
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/backends/capi_backend.go` around lines 120 - 135,
GetMachineSetStatus currently reads status from the original machineSet object
instead of the live API object; update capiBackend.GetMachineSetStatus to
re-fetch the MachineSet from the API using the provided runtimeclient.Client
before reading AvailableReplicas/ReadyReplicas: perform the type assertion on
the input (machineSet.(*clusterv1beta1.MachineSet)) to obtain name/namespace,
then use client.Get to load the current clusterv1beta1.MachineSet into a new
variable and build MachineSetStatus from that live object (keep existing
handling of ms.Spec.Replicas).

Comment on lines +120 to +143
// Configure additional tags
if len(config.AdditionalTags) > 0 {
if spec.AdditionalTags == nil {
spec.AdditionalTags = make(map[string]string)
}

for k, v := range config.AdditionalTags {
spec.AdditionalTags[k] = v
}
}

// Configure non-root volumes
if len(config.NonRootVolumes) > 0 {
volumes := make([]awsv1.Volume, len(config.NonRootVolumes))
for i, v := range config.NonRootVolumes {
volumes[i] = awsv1.Volume{
DeviceName: v.DeviceName,
Size: v.Size,
Type: awsv1.VolumeType(v.Type),
}
}

spec.NonRootVolumes = volumes
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

Normalize merge semantics across CAPI and MAPI.

The same MachineTemplateConfig does not mean the same thing on both paths today: CAPI merges/replaces existing tags and non-root volumes, while MAPI appends new entries. If the base template/provider spec already contains tags or block devices, unified tests will create materially different machine specs depending on backend.

Also applies to: 207-233

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/config/template_config.go` around lines 120 - 143, The template
currently replaces spec.NonRootVolumes (and elsewhere similar code at the other
block) instead of appending, which causes inconsistent semantics between CAPI
and MAPI; change the merge to append semantics and ensure tag merging is
consistent: for AdditionalTags keep the current per-key assignment into
spec.AdditionalTags (creating the map if nil) and for NonRootVolumes, build the
slice from config.NonRootVolumes then if spec.NonRootVolumes is nil set it to
that slice else append the new volumes to the existing spec.NonRootVolumes (do
the same fix in the analogous block around lines 207-233).

Comment on lines +141 to +147
// DeleteMachineSet safely deletes a machine set.
func (helper *TestHelper) DeleteMachineSet(machineSet interface{}) {
GinkgoHelper()

err := helper.framework.DeleteMachineSet(helper.ctx, helper.client, machineSet)
Expect(err).NotTo(HaveOccurred(), "Should delete MachineSet")
}
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

# First, verify the file exists and check the code at lines 141-147
cat -n pkg/unified/e2e/test_helpers.go | sed -n '141,147p'

Repository: openshift/cluster-api-actuator-pkg

Length of output: 408


🏁 Script executed:

# Search for WaitForMachineSetDeleted or similar wait methods in the framework
rg "WaitForMachineSetDeleted|WaitForMachineSet|DeleteMachineSet" --type go -A 3

Repository: openshift/cluster-api-actuator-pkg

Length of output: 18562


🏁 Script executed:

# Check the test_helpers.go file structure to understand the TestHelper type
grep -n "type TestHelper" pkg/unified/e2e/test_helpers.go -A 10

Repository: openshift/cluster-api-actuator-pkg

Length of output: 551


🏁 Script executed:

# Search for framework interface/struct definition to understand available methods
fd "framework" pkg/unified/e2e --type go -exec grep -l "DeleteMachineSet\|WaitForMachineSet" {} \;

Repository: openshift/cluster-api-actuator-pkg

Length of output: 247


Wait for MachineSet cleanup to finish before returning.

This helper only issues the delete. In the Ordered AWS suite, the next spec can start while the previous machines are still terminating, which leaves quota/capacity behind and makes the later spot/EFA cases flaky.

Suggested change
 func (helper *TestHelper) DeleteMachineSet(machineSet interface{}) {
 	GinkgoHelper()

 	err := helper.framework.DeleteMachineSet(helper.ctx, helper.client, machineSet)
 	Expect(err).NotTo(HaveOccurred(), "Should delete MachineSet")
+	err = helper.framework.WaitForMachineSetDeleted(helper.ctx, helper.client, machineSet)
+	Expect(err).NotTo(HaveOccurred(), "Should wait for MachineSet deletion")
 }
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pkg/unified/e2e/test_helpers.go` around lines 141 - 147, The current
TestHelper.DeleteMachineSet only issues the delete and returns immediately;
modify it so after calling helper.framework.DeleteMachineSet(helper.ctx,
helper.client, machineSet) it waits for the MachineSet and its machines to be
fully cleaned up before returning. Implement a poll-with-timeout using
helper.ctx/helper.client (or existing framework wait helpers) to repeatedly
check that the MachineSet resource no longer exists and that there are no
associated Machine/Node objects (or until helper.framework reports cleanup
complete), failing if the timeout elapses; keep references to DeleteMachineSet,
helper.framework.DeleteMachineSet, helper.ctx and helper.client to locate and
update the code.

@theobarberbany
Copy link
Copy Markdown
Contributor

/test e2e-aws-operator

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 14, 2026

@sunzhaohua2: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

8 participants