Skip to content

CNTRLPLANE-3314: add unified image resolution package#8468

Draft
bryan-cox wants to merge 5 commits intoopenshift:mainfrom
bryan-cox:CNTRLPLANE-3314
Draft

CNTRLPLANE-3314: add unified image resolution package#8468
bryan-cox wants to merge 5 commits intoopenshift:mainfrom
bryan-cox:CNTRLPLANE-3314

Conversation

@bryan-cox
Copy link
Copy Markdown
Member

@bryan-cox bryan-cox commented May 8, 2026

Summary

  • Adds support/imageresolution package consolidating scattered image resolution logic (5 decorator types, 3 utility functions) into a single ProviderSet with a builder API
  • Implements dual resolution paths: resolveForDirectFetch (CLI overrides + ICSP/IDMS mirrors for registry I/O) and resolveForPodSpec (CLI overrides only for management cluster pod specs)
  • ForDataPlane() builder preset enforces that CLI overrides and image overrides cannot leak to data-plane components at Build() time, while still allowing ICSP/IDMS mirrors for disconnected release image fetch (per OCPBUGS-44470)
  • Adds a go/analysis linter (lint-imageresolution) wired into make lint to prevent the old patterns from returning
  • Wires into all 5 operator binaries: hypershift-operator, control-plane-operator, HCCO, ignition-server, karpenter-operator
  • Deletes old decorator chain: RegistryMirrorProviderDecorator, ProviderWithOpenShiftImageRegistryOverridesDecorator, CommonRegistryProvider, and utility functions

Enhancement

openshift/enhancements#1981

Commits

  1. Vendorgolang.org/x/tools/go/analysis + Makefile lint target
  2. New packagesupport/imageresolution/ core library with builder, resolver, caches, adapters, and tests
  3. Lintergo/analysis analyzer with 3 rules (raw map params, banned imports, deprecated types) + test fixtures
  4. HO wiring + deletions — Wire ProviderSet into HO/karpenter, delete old decorators/interfaces/utils, migrate catalogs to ResolverConfig
  5. CPO wiring — Wire dual ProviderSet into CPO (CP + data-plane), HCCO configoperator, ignition-server, OLM catalogs

Key design decisions

  • ForDataPlane() allows ICSP/IDMS mirrors — The CPO needs mirrors to fetch the release image in disconnected environments. resolveForPodSpec does NOT apply mirrors, so data-plane pod specs still see original pullspecs. CLI overrides, dynamic mirror refresh, and image overrides are rejected.
  • ProviderSet.Lookup() rewrites ImageStream tags — After resolving component images through the resolver, the tags are rewritten so releaseinfo.ReleaseImage.ComponentImages() returns the overridden refs.
  • Thread safetymirroredReleaseImage protected by sync.RWMutex, rawMetadata.OpenShiftImageRegistryOverrides synchronized in adapter methods, configSource uses sync.RWMutex for dynamic mirror updates.

Test plan

  • 80+ behavior-driven unit tests with -race detection
  • make verify clean (excluding verify-git-clean)
  • make test — zero failures
  • make lint-imageresolution — zero violations
  • TestDualProviderSetInvariant validates management vs data-plane isolation including mirror fetch
  • Error propagation tests for all configSource failure paths
  • Round-trip tests for ResolverConfig serialization/parsing

🤖 Generated with Claude Code

@openshift-merge-bot
Copy link
Copy Markdown
Contributor

Pipeline controller notification
This repo is configured to use the pipeline controller. Second-stage tests will be triggered either automatically or after lgtm label is added, depending on the repository configuration. The pipeline controller will automatically detect which contexts are required and will utilize /test Prow commands to trigger the second stage.

For optional jobs, comment /test ? to see a list of all defined jobs. To trigger manually all jobs from second stage use /pipeline required command.

This repository is configured in: LGTM mode

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label May 8, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label May 8, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 8, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented May 8, 2026

@bryan-cox: This pull request references CNTRLPLANE-3314 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Summary

  • Adds support/imageresolution package that consolidates scattered image resolution logic into a single, testable package with a builder-based ProviderSet API
  • Implements resolveForDirectFetch (CLI overrides + ICSP/IDMS mirrors for registry I/O) and resolveForPodSpec (CLI overrides only for mgmt cluster pod specs)
  • Includes ForDataPlane() safety invariant on the builder that rejects management-cluster overrides for data-plane image resolution at Build() time

What's included

Component Files
ResolverConfig + serialization config.go, config_test.go
Instance-level caches (release, mirror, metadata) cache.go, cache_test.go
configSource interface config_source.go, config_source_test.go
imageResolver (dual resolution methods) resolver.go, resolver_test.go, testutil.go
releaseInfoProvider release.go, release_test.go
componentProvider component.go, component_test.go
metadataProvider metadata.go, metadata_test.go
ProviderSet builder provider_set.go, provider_set_test.go
Production registry client wrappers registry_client.go

Enhancement

Test plan

  • 69 unit tests passing with -race detection
  • go build and go vet clean
  • Dual-ProviderSet invariant test proves management vs data plane isolation
  • No callers yet — this is the foundation package (PR 1 of multi-PR series)
  • PR 2 will wire operator entrypoints to use this package

🤖 Generated with Claude Code

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented May 8, 2026

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

  • @coderabbitai resume to resume automatic reviews.
  • @coderabbitai review to trigger a single review.

Use the checkboxes below for quick actions:

  • ▶️ Resume reviews
  • 🔍 Trigger review
📝 Walkthrough

Walkthrough

This PR adds a new image resolution subsystem under support/imageresolution: exported ReleaseImage and cloning helpers; TTL in-memory caches for releases, mirror availability, and metadata; ResolverConfig parsing/serialization and static/mutable config sources; an imageResolver that applies CLI registry overrides and mirror selection with availability checks; cached release and metadata providers plus registry-backed fetchers and HTTP mirror checker; a ProviderSet builder with reconcile/mirror-refresh support and adapters exposing previous provider interfaces; and comprehensive unit tests. Integrations update multiple operators and services to use imageresolution.ProviderSet replacing older releaseinfo-based construction.

Sequence Diagram(s)

sequenceDiagram
    participant Client
    participant ReleaseProvider as releaseInfoProvider
    participant Cache as releaseCache
    participant Resolver as imageResolver
    participant ConfigSource as configSource
    participant ReleaseFetcher as releaseFetcher

    Client->>ReleaseProvider: Lookup(pullSpec)
    ReleaseProvider->>Cache: get(pullSpec)
    alt cache miss
        Cache-->>ReleaseProvider: nil
        ReleaseProvider->>Resolver: resolveForDirectFetch(pullSpec)
        Resolver->>ConfigSource: current()
        ConfigSource-->>Resolver: ResolverConfig
        Resolver-->>ReleaseProvider: resolvedSpec
        ReleaseProvider->>ReleaseFetcher: fetch(resolvedSpec, pullSecret)
        ReleaseFetcher-->>ReleaseProvider: ReleaseImage
        ReleaseProvider->>Resolver: resolveForPodSpec(componentImage)
        Resolver-->>ReleaseProvider: podSpecImage
        ReleaseProvider->>Cache: put(pullSpec, ReleaseImage)
        Cache-->>ReleaseProvider: stored
    else cache hit
        Cache-->>ReleaseProvider: ReleaseImage
    end
    ReleaseProvider-->>Client: ReleaseImage
Loading
sequenceDiagram
    participant Client
    participant Resolver as imageResolver
    participant ConfigSource as configSource
    participant MirrorCache as mirrorAvailabilityCache
    participant MirrorChecker as mirrorAvailabilityChecker

    Client->>Resolver: resolveForDirectFetch(ref)
    Resolver->>ConfigSource: current()
    ConfigSource-->>Resolver: ResolverConfig
    Resolver->>Resolver: apply CLI registry overrides
    Resolver->>MirrorCache: get(mirror)
    alt cache miss
        MirrorCache-->>Resolver: miss
        Resolver->>MirrorChecker: isAvailable(mirror)
        MirrorChecker-->>Resolver: available/unavailable
        Resolver->>MirrorCache: set(mirror, available)
    else cache hit
        MirrorCache-->>Resolver: available/unavailable
    end
    alt mirror available
        Resolver-->>Client: resolvedRef (mirror applied)
    else no mirror available
        Resolver-->>Client: resolvedRef (original or CLI-overridden)
    end
Loading

Changes

Image Resolution Core

Layer / File(s) Summary
Data Types and Cloning
support/imageresolution/cache.go
Exports ReleaseImage struct with component images/versions, image stream, and stream metadata. Adds cloning helpers (cloneReleaseImage, cloneStringMap, cloneStringSliceMap).
Caches Implementation
support/imageresolution/cache.go
Implements TTL-based in-memory caches: releaseCache (string → *ReleaseImage), mirrorAvailabilityCache (string → bool), metadataCache (string → any), with expiry handling and concurrency locking.
Component Image Provider
support/imageresolution/component.go
Adds internal componentProvider exposing mapped images and versions, tracking missing component names, with cloning of returned maps.
Resolver Configuration
support/imageresolution/config.go
Adds ResolverConfig with registry overrides and image registry mirrors, plus deterministic serialization, cloning, emptiness checks, and parsing from CLI/env var strings with error handling.
Config Sources
support/imageresolution/config_source.go
Defines configSource interface and two implementations: immutable (staticConfigSource), and thread-safe mutable (mutableConfigSource) with mirror update support.
Image Resolver Logic
support/imageresolution/resolver.go
Implements imageResolver applying registry overrides and mapping source registries to mirror registries with availability checks and caching; supports direct-fetch and pod-spec resolution paths.
Release Fetching and Caching
support/imageresolution/release.go
Defines releaseFetcher interface and releaseInfoProvider with cached release lookup; resolves pullspec, fetches release image, applies pod spec translation and overrides, stores clones in cache.
Metadata Fetching and Caching
support/imageresolution/metadata.go
Adds metadataProvider to resolve image ref, fetch image config via fetcher, with caching keyed by resolved ref, error wrapping, and concurrency control.
Registry-backed Fetchers & Mirror Checker
support/imageresolution/registry_client.go
Adds HTTP-based httpMirrorChecker (HEAD to https://<registry>/v2/ with TLS config), registryReleaseFetcher resolving and converting releaseinfo.ReleaseImage, and registryMetadataFetcher delegating fetchConfig to hyperutil client.
Provider Set and Builder
support/imageresolution/provider_set.go
Implements ProviderSet, ProviderSetBuilder, including mirror refresh function, registry override & mirror configuration, wiring of resolver and cached providers, and data-plane mode with validations.

Adapters Exposing Prior Interfaces

Layer / File(s) Summary
Adapters Implementation
support/imageresolution/adapters.go
Adds adapter wrappers exposing ProviderSet’s release provider and registry metadata provider with interface compatibility, forwarding lookups and exposing config.
Adapter Unit Tests
support/imageresolution/adapters_test.go
Tests ReleaseProvider adapter lookup and config access; ImageMetadataProvider delegation; ProviderSet accessor methods; Reconcile updating config with mirror refresh results and no-op behavior; verifies mirrored release pullspec retrieval.

Unit Tests

Layer / File(s) Summary
Component Provider Tests
support/imageresolution/component_test.go
Tests GetImage behavior and missing tracking, ImageExist lookup, version and component maps.
Cache Tests
support/imageresolution/cache_test.go
Tests release cache hit/miss and expiry, mirror availability cache correctness and independence between instances.
Config Parsing & Serialization Tests
support/imageresolution/config_test.go
Validates serialization and parsing for registry overrides and image registry mirrors; checks empty/config states.
Config Source Tests
support/imageresolution/config_source_test.go
Tests static and mutable config sources retain or update mirror mappings as expected.
Resolver Tests
support/imageresolution/resolver_test.go
Covers direct fetch resolution with overrides and ICSP mirror selection/fallback; pod spec resolution behavior with/without mirrors.
Metadata Provider Tests
support/imageresolution/metadata_test.go
Verifies metadata fetching logic, cache hits prevent repeated fetches, error handling for missing configs, and resolution of overridden refs.
Release Info Provider Tests
support/imageresolution/release_test.go
Validates release lookup uses cached results, applies mirror overrides and image overrides, and errors correctly; invariant test across data-plane and control-plane providers.
Registry Client Conversion Test
support/imageresolution/registry_client_test.go
Validates deep-copying and preservation behavior of release image conversion.

Test Utilities

Layer / File(s) Summary
Test Utility
support/imageresolution/testutil.go
Defines internal fakeMirrorChecker supporting controlled mirror availability simulations.

Operator & Service Integrations

Layer / File(s) Summary
Control Plane Operator
control-plane-operator/main.go
Replaces legacy releaseinfo-based release and metadata provider construction with imageresolution.ProviderSet use; builds separate control-plane and data-plane provider sets; updates HostedControlPlaneReconciler wiring to new APIs.
Hypershift Operator Core & Controllers
Multiple files (hypershift-operator/*)
Updates imports to imageresolution; changes provider parameter types from concrete *RegistryClientImageMetadataProvider and decorator types to interfaces; reconstructs imageresolution provider set with mirror refresh support; updates controller and webhook wiring to use provider set accessor methods.
Ignition Server
ignition-server/cmd/start.go, ignition-server/controllers/local_ignitionprovider.go
Constructs ignition image resolution ProviderSet with registry overrides and mirrors; replaces custom releaseinfo chain with provider set; updates controller wiring to interface-based metadata provider.
Karpenter Operator
karpenter-operator/main.go
Switches ignition controller’s image release and metadata providers to use imageresolution.ProviderSet; replaces layered decorators with unified provider set; changes wiring to use accessor methods.

Estimated Code Review Effort

🎯 4 (Complex) | ⏱️ ~50 minutes

🚥 Pre-merge checks | ✅ 11 | ❌ 1

❌ Failed checks (1 warning)

Check name Status Explanation Resolution
Docstring Coverage ⚠️ Warning Docstring coverage is 9.43% which is insufficient. The required threshold is 80.00%. Write docstrings for the functions missing them to satisfy the coverage threshold.
✅ Passed checks (11 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and specifically identifies the main change: adding a new unified image resolution package to consolidate scattered image resolution logic.
Linked Issues check ✅ Passed Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check ✅ Passed Check skipped because no linked issues were found for this pull request.
Stable And Deterministic Test Names ✅ Passed All test names are stable and deterministic. The PR adds 25 top-level Test functions and 48 unique subtests with descriptive, static names. No dynamic information found in test titles.
Test Structure And Quality ✅ Passed Custom check requires Ginkgo test code (Describe/It blocks). PR uses standard Go testing with Gomega assertions. No Ginkgo usage found in codebase. Check not applicable.
Microshift Test Compatibility ✅ Passed No Ginkgo e2e tests are added in this PR. All test files use standard Go testing patterns with testing.T and Gomega. The check is not applicable.
Single Node Openshift (Sno) Test Compatibility ✅ Passed PR contains only standard Go unit tests in support/imageresolution package. No Ginkgo e2e tests (It, Describe, Context patterns) are present. SNO compatibility check applies only to Ginkgo e2e tests.
Topology-Aware Scheduling Compatibility ✅ Passed PR adds topology-agnostic image resolution library. No deployment manifests, pod specs, or scheduling constraints (affinity/toleration/nodeSelector) are introduced.
Ote Binary Stdout Contract ✅ Passed No OTE Binary Stdout Contract violations found. New imageresolution package contains no process-level code or stdout writes. Modified main.go files use os.Stderr, maintaining JSON protocol compliance.
Ipv6 And Disconnected Network Test Compatibility ✅ Passed Custom check not applicable. PR adds 25 standard Go unit tests (not Ginkgo e2e tests) with mocked dependencies. No IPv4 assumptions or external connectivity requirements detected.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release and removed do-not-merge/needs-area labels May 8, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented May 8, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: bryan-cox

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci Bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 8, 2026
@codecov
Copy link
Copy Markdown

codecov Bot commented May 8, 2026

Codecov Report

❌ Patch coverage is 86.76093% with 103 lines in your changes missing coverage. Please review.
✅ Project coverage is 37.96%. Comparing base (bded456) to head (22feb6d).

Files with missing lines Patch % Lines
hypershift-operator/main.go 0.00% 26 Missing ⚠️
support/imageresolution/registry_client.go 67.64% 22 Missing ⚠️
...-plane-operator/hostedclusterconfigoperator/cmd.go 0.00% 14 Missing ⚠️
karpenter-operator/main.go 0.00% 13 Missing ⚠️
ignition-server/cmd/start.go 0.00% 12 Missing ⚠️
ignition-server/cmd/run_local_ignitionprovider.go 0.00% 5 Missing ⚠️
support/catalogs/images.go 72.72% 3 Missing ⚠️
.../hostedcontrolplane/v2/configoperator/component.go 0.00% 2 Missing ⚠️
support/imageresolution/lint/cmd/main.go 0.00% 2 Missing ⚠️
.../hostedcontrolplane/v2/ignitionserver/component.go 0.00% 1 Missing ⚠️
... and 3 more
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #8468      +/-   ##
==========================================
+ Coverage   37.53%   37.96%   +0.42%     
==========================================
  Files         751      761      +10     
  Lines       92026    92537     +511     
==========================================
+ Hits        34544    35132     +588     
+ Misses      54841    54762      -79     
- Partials     2641     2643       +2     
Files with missing lines Coverage Δ
...ostedcontrolplane/hostedcontrolplane_controller.go 38.66% <100.00%> (ø)
...hostedcontrolplane/v2/configoperator/deployment.go 68.88% <100.00%> (+58.88%) ⬆️
...hostedcontrolplane/v2/ignitionserver/deployment.go 34.69% <100.00%> (ø)
...configoperator/controllers/resources/olm/params.go 83.33% <100.00%> (+83.33%) ⬆️
...hift-operator/controllers/etcdbackup/reconciler.go 74.33% <ø> (ø)
...trollers/hostedcluster/hostedcluster_controller.go 43.23% <100.00%> (ø)
...tedclustersizing/hostedclustersizing_controller.go 69.00% <ø> (ø)
...ition-server/controllers/local_ignitionprovider.go 11.85% <ø> (+0.01%) ⬆️
support/globalconfig/imagecontentsource.go 57.03% <ø> (-2.31%) ⬇️
support/imageresolution/adapters.go 100.00% <100.00%> (ø)
... and 24 more

... and 1 file with indirect coverage changes

Flag Coverage Δ
cmd-support 33.88% <96.02%> (+1.12%) ⬆️
cpo-hostedcontrolplane 37.12% <69.23%> (+0.35%) ⬆️
cpo-other 37.91% <26.31%> (+0.15%) ⬆️
hypershift-operator 47.90% <22.22%> (-0.04%) ⬇️
other 27.80% <0.00%> (+0.02%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 9

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@support/imageresolution/cache.go`:
- Around line 32-39: The get methods currently treat expired entries as misses
but leave stale keys in the map; update releaseCache.get (and similarly
mirrorAvailabilityCache.get and metadataCache.get) to remove expired entries
when observed: when a read finds entry missing or expired, upgrade to a write
lock (or unlock the read lock and take the write lock), check expiry again, and
delete c.entries[key] before returning nil so the map doesn't grow unbounded;
ensure you reference c.entries, c.ttl, entry.timestamp and maintain proper
locking (mu.RLock -> mu.RUnlock -> mu.Lock -> mu.Unlock) to avoid races.

In `@support/imageresolution/component.go`:
- Around line 30-35: componentProvider's accessors ComponentVersions() and
ComponentImages() currently return the backing maps directly; change both to
return defensive copies instead: allocate new maps, copy all key/value pairs
from p.release.ComponentVersions and p.release.ComponentImages into them (handle
nil by returning an empty map or nil consistently), and return those copies so
callers cannot mutate the provider's ReleaseImage state.

In `@support/imageresolution/config_source.go`:
- Around line 13-18: The staticConfigSource currently stores and returns
s.config by value but that is a shallow copy — update newStaticConfigSource to
deep-clone the incoming ResolverConfig (including reference-typed fields like
RegistryOverrides) before storing it on the staticConfigSource, and change
staticConfigSource.current to return a deep-cloned ResolverConfig copy (not the
stored instance) so callers cannot mutate internal maps/slices; implement a
robust clone helper used by both newStaticConfigSource and current, and add a
regression test that mutates a returned config and asserts subsequent current()
calls are unchanged.

In `@support/imageresolution/config.go`:
- Around line 53-64: ParseRegistryOverridesFlag currently silently skips
malformed tokens; change it to validate each token and fail fast by returning an
error instead of dropping invalid entries: update ParseRegistryOverridesFlag to
return (map[string]string, error), check for tokens missing '=' or with empty
key or value and return a descriptive error on first invalid token; apply the
same change to ParseImageRegistryMirrorsEnvVar so both functions validate
"key=value" pairs, return errors on malformed input, and update any callers to
handle the new (map, error) signature.

In `@support/imageresolution/provider_set.go`:
- Around line 47-53: WithRegistryOverrides and WithImageRegistryMirrors
currently store caller-provided maps/slices by reference (fields
registryOverrides and mirrors), so callers can mutate them later; defensively
clone the inputs before assigning to the builder fields and when returning built
objects. Implement small helpers (e.g., cloneStringMap and cloneStringSliceMap)
that copy map[string]string and map[string][]string (deep-copying slices), use
them in ProviderSetBuilder.WithRegistryOverrides,
ProviderSetBuilder.WithImageRegistryMirrors and the other similar setters
referenced in the review (the methods around lines 64-66 and 94-97) so the
builder holds independent copies to avoid shared-state/race hazards.
- Around line 58-62: The builder currently records dynamic mirror settings in
ProviderSetBuilder.WithDynamicImageRegistryMirrors (setting hasDynamicMirrors
and dynamicInterval) but Build() always wires the static source; change Build
(and its callers) to fail fast when dynamic mirrors are requested: detect
b.hasDynamicMirrors inside ProviderSetBuilder.Build (or the function that
constructs the ProviderSet) and return an explicit error like "dynamic image
registry mirrors not implemented" (or propagate an error) instead of silently
ignoring the flag; ensure the error includes the symbol names
(ProviderSetBuilder, WithDynamicImageRegistryMirrors, hasDynamicMirrors,
dynamicInterval) so callers know the feature is unimplemented and can be updated
later.

In `@support/imageresolution/registry_client.go`:
- Around line 18-26: In isAvailable on httpMirrorChecker: stop ignoring the
passed context, stop creating a per-call http.Client with TLS
InsecureSkipVerify, and perform a context-aware HEAD request; specifically,
remove TLS InsecureSkipVerify from the Transport, use
http.NewRequestWithContext(ctx, "HEAD", "https://"+registry+"/v2/", nil) (or
client.Head with context if available), and reuse a shared http.Client instance
on httpMirrorChecker instead of allocating one per call so the request respects
cancellation/timeouts and TLS verification remains enabled.

In `@support/imageresolution/release.go`:
- Around line 37-62: The cached ReleaseImage objects (from p.cache.get /
p.cache.put) are being returned and mutated in place (e.g.,
release.ComponentImages updated after resolveForPodSpec and maps.Copy with
p.imageOverrides), corrupting shared cache state; instead, make and use a deep
copy of the ReleaseImage (including a new map copy of ComponentImages) before
applying resolver.resolveForPodSpec results and image overrides, store that copy
with p.cache.put and return that copy, and when p.cache.get returns a value
return a defensive copy rather than the original cached pointer so callers
cannot mutate the cached object.

In `@support/imageresolution/resolver.go`:
- Around line 51-56: The applyCLIOverrides function currently iterates the
overrides map nondeterministically and may match mid-string substrings; fix it
by collecting the keys of the overrides map, sort them (e.g., by length or
lexicographically) to enforce deterministic iteration, then iterate the sorted
keys and perform strict prefix checks (use ref == src or strings.HasPrefix(ref,
src+"/")) instead of strings.Contains before doing the single replacement;
update applyCLIOverrides to use the sorted keys and prefix-safe match so
mirrors/overrides are applied deterministically and only when the src is a true
prefix.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 0edb3e94-cf36-405a-9f5e-9006198cb503

📥 Commits

Reviewing files that changed from the base of the PR and between bded456 and de6b2e7.

📒 Files selected for processing (18)
  • support/imageresolution/cache.go
  • support/imageresolution/cache_test.go
  • support/imageresolution/component.go
  • support/imageresolution/component_test.go
  • support/imageresolution/config.go
  • support/imageresolution/config_source.go
  • support/imageresolution/config_source_test.go
  • support/imageresolution/config_test.go
  • support/imageresolution/metadata.go
  • support/imageresolution/metadata_test.go
  • support/imageresolution/provider_set.go
  • support/imageresolution/provider_set_test.go
  • support/imageresolution/registry_client.go
  • support/imageresolution/release.go
  • support/imageresolution/release_test.go
  • support/imageresolution/resolver.go
  • support/imageresolution/resolver_test.go
  • support/imageresolution/testutil.go

Comment thread support/imageresolution/cache.go Outdated
Comment thread support/imageresolution/component.go Outdated
Comment thread support/imageresolution/config_source.go Outdated
Comment thread support/imageresolution/config.go Outdated
Comment thread support/imageresolution/provider_set.go Outdated
Comment thread support/imageresolution/provider_set.go Outdated
Comment thread support/imageresolution/registry_client.go Outdated
Comment thread support/imageresolution/release.go
Comment thread support/imageresolution/resolver.go
@openshift-ci openshift-ci Bot added the area/karpenter-operator Indicates the PR includes changes related to the Karpenter operator label May 8, 2026
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
control-plane-operator/main.go (1)

450-478: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep management-cluster registry overrides out of the data-plane ProviderSet.

imageRegistryOverrides is mutated here to include registryOverrides, and then that same map is passed to dpProviderSet. So the data-plane resolver still inherits the management-cluster overrides through the mirror map, despite the comment saying it should not.

Suggested direction
-		if len(registryOverrides) > 0 {
-			if imageRegistryOverrides == nil {
-				imageRegistryOverrides = map[string][]string{}
-			}
-			for registry, override := range registryOverrides {
-				if _, exists := imageRegistryOverrides[registry]; !exists {
-					imageRegistryOverrides[registry] = []string{}
-				}
-				imageRegistryOverrides[registry] = append(imageRegistryOverrides[registry], override)
-			}
-		}
+		cpImageRegistryOverrides := cloneStringSliceMap(imageRegistryOverrides)
+		if len(registryOverrides) > 0 {
+			if cpImageRegistryOverrides == nil {
+				cpImageRegistryOverrides = map[string][]string{}
+			}
+			for registry, override := range registryOverrides {
+				cpImageRegistryOverrides[registry] = append(cpImageRegistryOverrides[registry], override)
+			}
+		}

 		cpProviderSet, err := imageresolution.NewProviderSet().
 			WithRegistryOverrides(registryOverrides).
-			WithImageRegistryMirrors(imageRegistryOverrides).
+			WithImageRegistryMirrors(cpImageRegistryOverrides).
 			WithImageOverrides(componentImages).
 			Build()

 		dpProviderSet, err := imageresolution.NewProviderSet().
 			WithImageRegistryMirrors(imageRegistryOverrides).
 			Build()
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@control-plane-operator/main.go` around lines 450 - 478, The code mutates
imageRegistryOverrides by merging registryOverrides into it, then passes that
mutated map to dpProviderSet which causes management-cluster registry overrides
to leak into the data-plane; to fix, avoid mutating the original
imageRegistryOverrides used for data-plane by creating a separate map for the
control-plane merge (e.g. copy imageRegistryOverrides to a new local map, merge
registryOverrides into that map) and pass that merged map into
cpProviderSet.WithImageRegistryMirrors while passing the unmodified
imageRegistryOverrides (or nil if originally nil) to
dpProviderSet.WithImageRegistryMirrors; update the code around
registryOverrides/imageRegistryOverrides and the calls that build cpProviderSet
and dpProviderSet
(imageresolution.NewProviderSet().WithImageRegistryMirrors(...)) accordingly.
♻️ Duplicate comments (5)
support/imageresolution/provider_set.go (2)

72-79: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clone caller-provided maps before storing or wiring them.

The builder keeps direct references to the input maps/slices and then reuses them in the live ProviderSet. Mutating an input after Build() will change resolution behavior in-place and can race with Reconcile()/lookup paths.

Also applies to: 94-96, 124-165

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/imageresolution/provider_set.go` around lines 72 - 79, The builder
currently stores and later exposes direct references to caller-provided
maps/slices (e.g., in WithRegistryOverrides and WithImageRegistryMirrors), which
allows external mutation to affect the built ProviderSet; clone all input maps
and any nested slices before storing them in the builder and again when wiring
into the live ProviderSet (apply the same cloning pattern referenced around
lines 94-96 and 124-165). Specifically, in
ProviderSetBuilder.WithRegistryOverrides(copy the overrides map into a new map)
and ProviderSetBuilder.WithImageRegistryMirrors(copy the mirrors map and for
each key copy the []string slice, set hasStaticMirrors accordingly), and ensure
Build() or the code that transfers builder fields into ProviderSet also performs
defensive copies so the runtime ProviderSet never holds references to
caller-owned collections.

83-86: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't silently accept WithDynamicImageRegistryMirrors() while Build() ignores it.

This setter records hasDynamicMirrors and dynamicInterval, but Build() never uses the interval and still succeeds even when nothing is wired to refresh mirrors. That leaves a public API that looks enabled but is effectively a no-op.

Also applies to: 119-175, 177-197

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/imageresolution/provider_set.go` around lines 83 - 86, The
WithDynamicImageRegistryMirrors setter records hasDynamicMirrors and
dynamicInterval but Build() ignores them, producing a misleading no-op public
API; update Build (method on ProviderSetBuilder) to check hasDynamicMirrors and
either (a) wire up the dynamic refresh component using dynamicInterval (e.g.,
register/start the mirror refresher) or (b) return an error or panic if the
necessary dynamic refresher implementation is not present, so callers enabling
dynamic mirrors actually get the scheduled behavior; touch
ProviderSetBuilder.Build, ensure dynamicInterval is used to configure the
refresher, and keep hasDynamicMirrors consistent with the final built
ProviderSet.
support/imageresolution/registry_client.go (1)

19-26: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Keep TLS verification enabled and honor the caller's context.

InsecureSkipVerify makes the availability probe MITM-able, and client.Head() here can't be cancelled by the caller. This should be a context-aware HEAD on a reused client with normal TLS verification.

Suggested direction
-type httpMirrorChecker struct{}
+type httpMirrorChecker struct {
+	client *http.Client
+}

-func (h *httpMirrorChecker) isAvailable(_ context.Context, registry string) bool {
-	client := &http.Client{
-		Timeout: 15 * time.Second,
-		Transport: &http.Transport{
-			TLSClientConfig: &tls.Config{InsecureSkipVerify: true},
-		},
-	}
-	resp, err := client.Head("https://" + registry + "/v2/")
+func (h *httpMirrorChecker) isAvailable(ctx context.Context, registry string) bool {
+	req, err := http.NewRequestWithContext(ctx, http.MethodHead, "https://"+registry+"/v2/", nil)
+	if err != nil {
+		return false
+	}
+	resp, err := h.client.Do(req)
 	if err != nil {
 		return false
 	}
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/imageresolution/registry_client.go` around lines 19 - 26, The
isAvailable method currently creates a new http.Client with TLS
InsecureSkipVerify and calls client.Head which ignores the provided context;
change httpMirrorChecker.isAvailable to use a reused http.Client on the
httpMirrorChecker struct (remove TLSClientConfig.InsecureSkipVerify so default
TLS verification is used), create a context-aware HEAD request via
http.NewRequestWithContext(ctx, "HEAD", "https://"+registry+"/v2/", nil),
execute it with client.Do(req), ensure resp.Body is closed, and respect context
cancellation/timeouts instead of calling client.Head.
support/imageresolution/config_source.go (1)

16-21: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Deep-copy ResolverConfig at every source boundary.

These value copies are shallow. current() hands out aliases to the internal maps/slices, and updateMirrors() stores the caller's map directly, so external code can mutate live resolver state outside the lock.

Suggested direction
 func newStaticConfigSource(cfg ResolverConfig) *staticConfigSource {
-	return &staticConfigSource{config: cfg}
+	return &staticConfigSource{config: cfg.Clone()}
 }

 func (s *staticConfigSource) current(_ context.Context) (ResolverConfig, error) {
-	return s.config, nil
+	return s.config.Clone(), nil
 }

 func newMutableConfigSource(cfg ResolverConfig) *mutableConfigSource {
-	return &mutableConfigSource{config: cfg}
+	return &mutableConfigSource{config: cfg.Clone()}
 }

 func (m *mutableConfigSource) current(_ context.Context) (ResolverConfig, error) {
 	m.mu.RLock()
 	defer m.mu.RUnlock()
-	return m.config, nil
+	return m.config.Clone(), nil
 }

 func (m *mutableConfigSource) updateMirrors(mirrors map[string][]string) {
 	m.mu.Lock()
 	defer m.mu.Unlock()
-	m.config.ImageRegistryMirrors = mirrors
+	m.config.ImageRegistryMirrors = cloneStringSliceMap(mirrors)
 }

Also applies to: 29-42

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/imageresolution/config_source.go` around lines 16 - 21, The
ResolverConfig and its internals are being aliased instead of cloned; update
newStaticConfigSource, staticConfigSource.current, and updateMirrors to perform
a deep copy of the ResolverConfig (clone maps, slices and any nested structures)
whenever storing or returning a ResolverConfig so callers never get direct
references to internal maps/slices; specifically, in newStaticConfigSource clone
the incoming cfg before assigning to staticConfigSource.config, in
staticConfigSource.current return a deep-copied ResolverConfig (not s.config),
and in updateMirrors copy the caller-provided map/slices into new map/slice
instances before storing them.
support/imageresolution/cache.go (1)

38-45: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Evict expired entries when they are observed.

These get paths treat expired data as a miss but keep the stale key in the map, so the caches still grow unbounded over time in a long-lived controller.

Also applies to: 75-82, 112-119

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/imageresolution/cache.go` around lines 38 - 45, The get method in
releaseCache currently treats expired entries as misses but leaves them in
c.entries, so fix releaseCache.get by evicting observed expired entries: when
you detect expiry (time.Since(entry.timestamp) > c.ttl) while holding
c.mu.RLock(), release the RLock(), acquire c.mu.Lock(), re-check the entry still
exists and is still expired (compare entry.timestamp and c.ttl), delete it from
c.entries, release the Lock, and return nil; use c.mu, c.entries,
entry.timestamp, c.ttl and releaseCache.get as the targets. Apply the same
read->upgrade-to-write-delete pattern to the other cache getter functions that
check time.Since(... ) so stale keys are removed when observed.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@karpenter-operator/main.go`:
- Around line 162-166: The provider set is being given CLI registry overrides as
mirror entries (WithImageRegistryMirrors(imageRegistryOverrides)), which turns
mandatory CLI rewrites into fallbacks and leaves GetRegistryOverrides() empty;
instead pass the CLI overrides to the provider set via a dedicated method (e.g.,
add WithRegistryOverrides(registryOverrides) or
WithImageRegistryOverrides(registryOverrides) on
imageresolution.NewProviderSet()) and leave imageRegistryOverrides for mirror
configuration only; if the builder lacks that method, add it to the
imageresolution.ProviderSet API and ensure Build() stores the overrides so
GetRegistryOverrides() returns them.

In `@support/imageresolution/adapters.go`:
- Around line 36-44: The Lookup method on releaseProviderAdapter returns the
cached ri.ImageStream and ri.StreamMetadata directly; create and return
defensive copies instead to avoid shared-state mutation. In
releaseProviderAdapter.Lookup, after receiving ri from a.ps.release.Lookup,
deep-copy ri.ImageStream and ri.StreamMetadata (e.g., allocate new
releaseinfo.ReleaseImage and copy fields or use existing clone/copy helpers) and
populate the returned releaseinfo.ReleaseImage with those copies rather than
ri.ImageStream/ri.StreamMetadata so callers cannot mutate provider-held objects.

---

Outside diff comments:
In `@control-plane-operator/main.go`:
- Around line 450-478: The code mutates imageRegistryOverrides by merging
registryOverrides into it, then passes that mutated map to dpProviderSet which
causes management-cluster registry overrides to leak into the data-plane; to
fix, avoid mutating the original imageRegistryOverrides used for data-plane by
creating a separate map for the control-plane merge (e.g. copy
imageRegistryOverrides to a new local map, merge registryOverrides into that
map) and pass that merged map into cpProviderSet.WithImageRegistryMirrors while
passing the unmodified imageRegistryOverrides (or nil if originally nil) to
dpProviderSet.WithImageRegistryMirrors; update the code around
registryOverrides/imageRegistryOverrides and the calls that build cpProviderSet
and dpProviderSet
(imageresolution.NewProviderSet().WithImageRegistryMirrors(...)) accordingly.

---

Duplicate comments:
In `@support/imageresolution/cache.go`:
- Around line 38-45: The get method in releaseCache currently treats expired
entries as misses but leaves them in c.entries, so fix releaseCache.get by
evicting observed expired entries: when you detect expiry
(time.Since(entry.timestamp) > c.ttl) while holding c.mu.RLock(), release the
RLock(), acquire c.mu.Lock(), re-check the entry still exists and is still
expired (compare entry.timestamp and c.ttl), delete it from c.entries, release
the Lock, and return nil; use c.mu, c.entries, entry.timestamp, c.ttl and
releaseCache.get as the targets. Apply the same read->upgrade-to-write-delete
pattern to the other cache getter functions that check time.Since(... ) so stale
keys are removed when observed.

In `@support/imageresolution/config_source.go`:
- Around line 16-21: The ResolverConfig and its internals are being aliased
instead of cloned; update newStaticConfigSource, staticConfigSource.current, and
updateMirrors to perform a deep copy of the ResolverConfig (clone maps, slices
and any nested structures) whenever storing or returning a ResolverConfig so
callers never get direct references to internal maps/slices; specifically, in
newStaticConfigSource clone the incoming cfg before assigning to
staticConfigSource.config, in staticConfigSource.current return a deep-copied
ResolverConfig (not s.config), and in updateMirrors copy the caller-provided
map/slices into new map/slice instances before storing them.

In `@support/imageresolution/provider_set.go`:
- Around line 72-79: The builder currently stores and later exposes direct
references to caller-provided maps/slices (e.g., in WithRegistryOverrides and
WithImageRegistryMirrors), which allows external mutation to affect the built
ProviderSet; clone all input maps and any nested slices before storing them in
the builder and again when wiring into the live ProviderSet (apply the same
cloning pattern referenced around lines 94-96 and 124-165). Specifically, in
ProviderSetBuilder.WithRegistryOverrides(copy the overrides map into a new map)
and ProviderSetBuilder.WithImageRegistryMirrors(copy the mirrors map and for
each key copy the []string slice, set hasStaticMirrors accordingly), and ensure
Build() or the code that transfers builder fields into ProviderSet also performs
defensive copies so the runtime ProviderSet never holds references to
caller-owned collections.
- Around line 83-86: The WithDynamicImageRegistryMirrors setter records
hasDynamicMirrors and dynamicInterval but Build() ignores them, producing a
misleading no-op public API; update Build (method on ProviderSetBuilder) to
check hasDynamicMirrors and either (a) wire up the dynamic refresh component
using dynamicInterval (e.g., register/start the mirror refresher) or (b) return
an error or panic if the necessary dynamic refresher implementation is not
present, so callers enabling dynamic mirrors actually get the scheduled
behavior; touch ProviderSetBuilder.Build, ensure dynamicInterval is used to
configure the refresher, and keep hasDynamicMirrors consistent with the final
built ProviderSet.

In `@support/imageresolution/registry_client.go`:
- Around line 19-26: The isAvailable method currently creates a new http.Client
with TLS InsecureSkipVerify and calls client.Head which ignores the provided
context; change httpMirrorChecker.isAvailable to use a reused http.Client on the
httpMirrorChecker struct (remove TLSClientConfig.InsecureSkipVerify so default
TLS verification is used), create a context-aware HEAD request via
http.NewRequestWithContext(ctx, "HEAD", "https://"+registry+"/v2/", nil),
execute it with client.Do(req), ensure resp.Body is closed, and respect context
cancellation/timeouts instead of calling client.Head.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 1eb395c5-bf2c-448b-b19a-0089253112a9

📥 Commits

Reviewing files that changed from the base of the PR and between de6b2e7 and 1701d94.

📒 Files selected for processing (17)
  • control-plane-operator/main.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_webhook.go
  • hypershift-operator/controllers/hostedclustersizing/hostedclustersizing_controller.go
  • hypershift-operator/controllers/hostedclustersizing/setup.go
  • hypershift-operator/main.go
  • ignition-server/cmd/start.go
  • ignition-server/controllers/local_ignitionprovider.go
  • karpenter-operator/main.go
  • support/imageresolution/adapters.go
  • support/imageresolution/adapters_test.go
  • support/imageresolution/cache.go
  • support/imageresolution/config_source.go
  • support/imageresolution/config_source_test.go
  • support/imageresolution/provider_set.go
  • support/imageresolution/registry_client.go
  • support/imageresolution/registry_client_test.go
  • support/imageresolution/release.go
✅ Files skipped from review due to trivial changes (1)
  • support/imageresolution/config_source_test.go
🚧 Files skipped from review as they are similar to previous changes (1)
  • support/imageresolution/release.go

Comment thread karpenter-operator/main.go
Comment thread support/imageresolution/adapters.go Outdated
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🧹 Nitpick comments (1)
support/imageresolution/provider_set.go (1)

45-52: ⚡ Quick win

Defensively clone refreshed mirrors before storing shared state.

Reconcile() currently assigns the returned mirrors map directly into internal fields. Cloning here avoids aliasing and surprise mutation/race hazards if the refresh implementation reuses that map later.

Suggested fix
 func (ps *ProviderSet) Reconcile(ctx context.Context, client crclient.Client) error {
 	if ps.mirrorRefresh == nil {
 		return nil
 	}
 	mirrors, err := ps.mirrorRefresh(ctx, client)
 	if err != nil {
 		return fmt.Errorf("refreshing image registry mirrors: %w", err)
 	}
-	ps.configSource.updateMirrors(mirrors)
+	mirrorsCopy := cloneStringSliceMap(mirrors)
+	ps.configSource.updateMirrors(mirrorsCopy)
 	if ps.rawMetadata != nil {
-		ps.rawMetadata.OpenShiftImageRegistryOverrides = mirrors
+		ps.rawMetadata.OpenShiftImageRegistryOverrides = cloneStringSliceMap(mirrorsCopy)
 	}
 	return nil
 }
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/imageresolution/provider_set.go` around lines 45 - 52, The refreshed
mirrors map returned by ps.mirrorRefresh should be defensively cloned before
storing into shared state to avoid aliasing and mutation races; create a new map
variable, copy all key/value pairs from mirrors into it (handling nil returned
maps by creating an empty map), and then pass that cloned map to
ps.configSource.updateMirrors and assign it to
ps.rawMetadata.OpenShiftImageRegistryOverrides so the internal state holds an
independent copy.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@support/imageresolution/provider_set.go`:
- Around line 187-198: The ForDataPlane() invariant is bypassable via
WithMirrorRefresh(); update ProviderSetBuilder.validate() to reject
WithMirrorRefresh when b.forDataPlane is true by adding a check (e.g., if
b.forDataPlane && <mirrorRefreshFlag> { return fmt.Errorf("ForDataPlane() is
incompatible with WithMirrorRefresh()") }), where <mirrorRefreshFlag> is the
builder field set by WithMirrorRefresh(); also audit Reconcile() to ensure it
does not inject mirrors into resolver state when forDataPlane is true.

---

Nitpick comments:
In `@support/imageresolution/provider_set.go`:
- Around line 45-52: The refreshed mirrors map returned by ps.mirrorRefresh
should be defensively cloned before storing into shared state to avoid aliasing
and mutation races; create a new map variable, copy all key/value pairs from
mirrors into it (handling nil returned maps by creating an empty map), and then
pass that cloned map to ps.configSource.updateMirrors and assign it to
ps.rawMetadata.OpenShiftImageRegistryOverrides so the internal state holds an
independent copy.
🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

  • Push a commit to this branch (recommended)
  • Create a new PR with the fixes

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 445d3ec6-b69f-4769-b8f5-36e4af898615

📥 Commits

Reviewing files that changed from the base of the PR and between 1701d94 and 9054b91.

📒 Files selected for processing (29)
  • control-plane-operator/main.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_webhook.go
  • hypershift-operator/controllers/hostedclustersizing/hostedclustersizing_controller.go
  • hypershift-operator/controllers/hostedclustersizing/setup.go
  • hypershift-operator/main.go
  • ignition-server/cmd/start.go
  • ignition-server/controllers/local_ignitionprovider.go
  • karpenter-operator/main.go
  • support/imageresolution/adapters.go
  • support/imageresolution/adapters_test.go
  • support/imageresolution/cache.go
  • support/imageresolution/cache_test.go
  • support/imageresolution/component.go
  • support/imageresolution/component_test.go
  • support/imageresolution/config.go
  • support/imageresolution/config_source.go
  • support/imageresolution/config_source_test.go
  • support/imageresolution/config_test.go
  • support/imageresolution/metadata.go
  • support/imageresolution/metadata_test.go
  • support/imageresolution/provider_set.go
  • support/imageresolution/provider_set_test.go
  • support/imageresolution/registry_client.go
  • support/imageresolution/registry_client_test.go
  • support/imageresolution/release.go
  • support/imageresolution/release_test.go
  • support/imageresolution/resolver.go
  • support/imageresolution/resolver_test.go
  • support/imageresolution/testutil.go
✅ Files skipped from review due to trivial changes (6)
  • support/imageresolution/provider_set_test.go
  • support/imageresolution/cache_test.go
  • support/imageresolution/component_test.go
  • support/imageresolution/adapters_test.go
  • support/imageresolution/testutil.go
  • support/imageresolution/config_test.go
🚧 Files skipped from review as they are similar to previous changes (20)
  • support/imageresolution/component.go
  • hypershift-operator/controllers/hostedclustersizing/hostedclustersizing_controller.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_webhook.go
  • support/imageresolution/config_source_test.go
  • support/imageresolution/cache.go
  • support/imageresolution/release.go
  • support/imageresolution/config_source.go
  • support/imageresolution/registry_client_test.go
  • support/imageresolution/resolver_test.go
  • hypershift-operator/main.go
  • support/imageresolution/release_test.go
  • ignition-server/controllers/local_ignitionprovider.go
  • support/imageresolution/metadata_test.go
  • hypershift-operator/controllers/hostedclustersizing/setup.go
  • support/imageresolution/resolver.go
  • support/imageresolution/metadata.go
  • karpenter-operator/main.go
  • ignition-server/cmd/start.go
  • support/imageresolution/adapters.go
  • control-plane-operator/main.go

Comment thread support/imageresolution/provider_set.go
@bryan-cox bryan-cox force-pushed the CNTRLPLANE-3314 branch 2 times, most recently from 5b2759c to 61076c0 Compare May 8, 2026 19:43
Copy link
Copy Markdown
Contributor

@coderabbitai coderabbitai Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

♻️ Duplicate comments (1)
support/imageresolution/registry_client.go (1)

28-31: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Re-enable certificate validation for mirror probes.

Line [29] still sets InsecureSkipVerify: true. Even for a HEAD probe, this allows MITM spoofing of mirror availability decisions. Keep TLS verification on and trust self-signed mirrors via CA roots instead.

Suggested minimal change
 			Transport: &http.Transport{
 				// `#nosec` G402 -- mirrors may use self-signed certs; availability
 				// probe only, no data is transferred.
 				TLSClientConfig: &tls.Config{
-					InsecureSkipVerify: true,
 					MinVersion:         tls.VersionTLS12,
 				},
 			},
#!/bin/bash
# Verify all insecure TLS skips and inspect whether CA-root wiring exists for mirror checks.
rg -n --type=go 'InsecureSkipVerify\s*:\s*true'
rg -n --type=go 'httpMirrorChecker|newHTTPMirrorChecker|RootCAs|TLSClientConfig|additional.*trust|ca.*bundle'
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@support/imageresolution/registry_client.go` around lines 28 - 31, The
TLSClientConfig for mirror probes currently sets InsecureSkipVerify: true which
disables certificate validation; change this in the TLSClientConfig used by the
mirror probe (the struct instance created where TLSClientConfig is set) to
enable verification (remove or set InsecureSkipVerify: false) and wire up
RootCAs instead: load x509.SystemCertPool(), append any configured mirror CA
certs if present, and assign that pool to TLSClientConfig.RootCAs while keeping
MinVersion: tls.VersionTLS12; ensure the HTTP mirror checker (the code that
constructs the TLSClientConfig/transport used by the HEAD probe) uses this
verified config so self-signed mirrors are trusted via CA roots rather than
skipping verification.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Duplicate comments:
In `@support/imageresolution/registry_client.go`:
- Around line 28-31: The TLSClientConfig for mirror probes currently sets
InsecureSkipVerify: true which disables certificate validation; change this in
the TLSClientConfig used by the mirror probe (the struct instance created where
TLSClientConfig is set) to enable verification (remove or set
InsecureSkipVerify: false) and wire up RootCAs instead: load
x509.SystemCertPool(), append any configured mirror CA certs if present, and
assign that pool to TLSClientConfig.RootCAs while keeping MinVersion:
tls.VersionTLS12; ensure the HTTP mirror checker (the code that constructs the
TLSClientConfig/transport used by the HEAD probe) uses this verified config so
self-signed mirrors are trusted via CA roots rather than skipping verification.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 1f77d81e-16e8-4cf9-921c-9a8d68d13a62

📥 Commits

Reviewing files that changed from the base of the PR and between 9054b91 and 5b2759c.

📒 Files selected for processing (29)
  • control-plane-operator/main.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_webhook.go
  • hypershift-operator/controllers/hostedclustersizing/hostedclustersizing_controller.go
  • hypershift-operator/controllers/hostedclustersizing/setup.go
  • hypershift-operator/main.go
  • ignition-server/cmd/start.go
  • ignition-server/controllers/local_ignitionprovider.go
  • karpenter-operator/main.go
  • support/imageresolution/adapters.go
  • support/imageresolution/adapters_test.go
  • support/imageresolution/cache.go
  • support/imageresolution/cache_test.go
  • support/imageresolution/component.go
  • support/imageresolution/component_test.go
  • support/imageresolution/config.go
  • support/imageresolution/config_source.go
  • support/imageresolution/config_source_test.go
  • support/imageresolution/config_test.go
  • support/imageresolution/metadata.go
  • support/imageresolution/metadata_test.go
  • support/imageresolution/provider_set.go
  • support/imageresolution/provider_set_test.go
  • support/imageresolution/registry_client.go
  • support/imageresolution/registry_client_test.go
  • support/imageresolution/release.go
  • support/imageresolution/release_test.go
  • support/imageresolution/resolver.go
  • support/imageresolution/resolver_test.go
  • support/imageresolution/testutil.go
✅ Files skipped from review due to trivial changes (5)
  • support/imageresolution/component_test.go
  • support/imageresolution/config_test.go
  • support/imageresolution/testutil.go
  • support/imageresolution/config_source_test.go
  • support/imageresolution/adapters_test.go
🚧 Files skipped from review as they are similar to previous changes (22)
  • support/imageresolution/provider_set_test.go
  • control-plane-operator/main.go
  • support/imageresolution/config_source.go
  • ignition-server/cmd/start.go
  • ignition-server/controllers/local_ignitionprovider.go
  • karpenter-operator/main.go
  • support/imageresolution/metadata_test.go
  • hypershift-operator/controllers/hostedclustersizing/hostedclustersizing_controller.go
  • support/imageresolution/registry_client_test.go
  • support/imageresolution/release.go
  • support/imageresolution/config.go
  • support/imageresolution/resolver.go
  • support/imageresolution/cache.go
  • hypershift-operator/main.go
  • support/imageresolution/adapters.go
  • support/imageresolution/metadata.go
  • support/imageresolution/release_test.go
  • support/imageresolution/component.go
  • support/imageresolution/resolver_test.go
  • support/imageresolution/cache_test.go
  • hypershift-operator/controllers/hostedcluster/hostedcluster_webhook.go
  • support/imageresolution/provider_set.go

@bryan-cox bryan-cox force-pushed the CNTRLPLANE-3314 branch 2 times, most recently from 0b41b8c to 1cab572 Compare May 8, 2026 20:42
@hypershift-jira-solve-ci
Copy link
Copy Markdown

I now have all the evidence needed. Here is the analysis:

Test Failure Analysis Complete

Job Information

  • Prow Job: N/A — these are GitHub Check Runs from Codecov, not Prow CI jobs
  • Build ID: Check Run IDs 75092260536 (codecov/patch) and 75092258838 (codecov/project)
  • PR: openshift/hypershift#8468CNTRLPLANE-3314: add unified image resolution package
  • Commit: 1cab5722981f3af28e4345a6cc1825fe338060ca
  • Current Status: Both codecov checks have self-resolved — later runs on the same commit passed (codecov/patch: 84.21%, codecov/project: 37.89% +0.35%)

Test Failure Analysis

Error

codecov/patch: 0.00% of diff hit (target 37.53%) — FAILURE
codecov/project: 27.81% (-9.73%) compared to bded456 — FAILURE

Summary

Both Codecov failures were transient race-condition failures caused by Codecov's wait_for_ci: false configuration. Codecov evaluated coverage at 20:47:13 UTC before any of the 5 unit test shards had finished uploading their coverage reports (the earliest shard completed at 20:49:26 UTC — 2 minutes later). With zero coverage uploads received, Codecov reported 0.00% patch coverage and a -9.73% project coverage drop. When the unit tests completed and uploaded their coverage data, Codecov re-evaluated at 21:29:12 UTC and both checks flipped to success (84.21% patch, 37.89% project +0.35%). No code change or re-push was needed.

Root Cause

The root cause is a race condition between Codecov status reporting and GitHub Actions coverage uploads, caused by the wait_for_ci: false setting in the repository's codecov.yml.

Detailed timeline on commit 1cab572:

  1. 20:42:25 UTC — GitHub Actions workflows triggered (Unit Tests, Lint, Verify, etc.)
  2. 20:43:01 UTCtest / Detect Changes job started
  3. 20:44:00–20:44:09 UTC — Unit test shard jobs started (5 shards: cpo-hostedcontrolplane, cpo-other, hypershift-operator, cmd-support, other)
  4. 20:47:12–20:47:14 UTC⚠️ Codecov evaluated coverage with ZERO uploads received → reported 0.00% patch and 27.81% (-9.73%) project → posted failure status checks (IDs 75092258838 and 75092260536)
  5. 20:49:26–20:51:03 UTC — First 3 unit test shards completed and uploaded coverage (other, cpo-hostedcontrolplane, cmd-support)
  6. 21:25:54–21:30:35 UTC — Remaining 2 shards completed (hypershift-operator, cpo-other)
  7. 21:29:11–21:29:12 UTC — ✅ Codecov re-evaluated with all uploads → reported 84.21% patch and 37.89% (+0.35%) project → posted success status checks (IDs 75097918235 and 75097919735)

The wait_for_ci: false setting tells Codecov to publish status checks immediately upon receiving its first webhook notification, without waiting for CI jobs to complete and upload coverage data. Since the unit tests run as a 5-shard matrix (each taking 5–46 minutes), Codecov's initial evaluation at 20:47 UTC had no coverage data to work with — all shards were still running.

The failing check run IDs referenced by the user (75092260536, 75092258838) are the early, premature evaluations. Codecov later created new check runs (75097919735, 75097918235) with the correct results, which superseded the failures.

Recommendations
  1. No action required on this PR — both Codecov checks have already self-resolved to success. The current status is:

    • codecov/patch: ✅ 84.21% (target 37.53%)
    • codecov/project: ✅ 37.89% (+0.35% vs base)
  2. Prevent future false failures — change wait_for_ci: false to wait_for_ci: true in codecov.yml. This tells Codecov to wait until all CI jobs finish uploading coverage before posting status checks, eliminating the race condition:

    codecov:
      branch: main
      notify:
        wait_for_ci: true  # Wait for all CI uploads before reporting
  3. Alternative: configure expected upload count — if wait_for_ci: true causes delays, configure Codecov to expect exactly 5 coverage uploads (matching the 5 test shards) using the after_n_builds setting:

    codecov:
      notify:
        after_n_builds: 5  # Wait for all 5 test shards
  4. These failures can be safely ignored by reviewers — GitHub shows the latest check run result, which is success. The failure IDs in the PR's check run history are stale.

Evidence
Evidence Detail
Failing check run IDs 75092260536 (codecov/patch), 75092258838 (codecov/project)
Failure timestamp 2026-05-08T20:47:12–14Z
Failure message (patch) 0.00% of diff hit (target 37.53%) — zero coverage uploads received
Failure message (project) 27.81% (-9.73%) compared to bded456 — massive drop from missing data
Earliest test shard completion test / Unit Tests (other) at 20:49:26 UTC — 2+ min after Codecov evaluated
Latest test shard completion test / Unit Tests (cpo-other) at 21:30:35 UTC — 43 min after Codecov evaluated
Succeeding check run IDs 75097919735 (codecov/patch), 75097918235 (codecov/project)
Success timestamp 2026-05-08T21:29:11–12Z — after all shards uploaded
Success message (patch) 84.21% of diff hit (target 37.53%)
Success message (project) 37.89% (+0.35%) compared to bded456
Root cause config codecov.ymlnotify.wait_for_ci: false
Same commit, no re-push HEAD SHA unchanged at 1cab5722981f3af28e4345a6cc1825fe338060ca
PR current check status Both codecov checks show ✅ success on latest evaluation

@bryan-cox bryan-cox force-pushed the CNTRLPLANE-3314 branch 3 times, most recently from 6f8e3f4 to 0c4cee3 Compare May 9, 2026 00:55
…esolution target

- Vendor golang.org/x/tools/{go/analysis,go/ast,txtar,internal} for the
  custom go/analysis linter
- Add IMAGERESOLUTION_LINT binary build target and lint-imageresolution
  Makefile target
- Wire lint-imageresolution into lint with generate dependency

Signed-off-by: Bryan Cox <brcox@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
Add support/imageresolution with a ProviderSet builder that consolidates
the scattered decorator chain into a single construction API:

- ProviderSet builder: NewProviderSet().WithRegistryOverrides().
  WithImageRegistryMirrors().WithImageOverrides().ForDataPlane().Build()
- ResolverConfig with RegistryOverridesFlag/ImageRegistryMirrorsEnvVar
  serialization and parsing helpers
- Internal imageResolver with resolveForDirectFetch (CLI overrides +
  ICSP/IDMS mirrors) and resolveForPodSpec (CLI overrides only)
- Thread-safe releaseCache, mirrorAvailabilityCache, and metadataCache
- Mutex-protected mirroredReleaseImage field
- Error propagation from configSource through resolver methods
- ProviderSet.Lookup rewrites ImageStream tags from resolved component
  images so ComponentImages() returns overridden refs
- Adapter layer: ImageMetadataProvider() with synchronized rawMetadata
- Comprehensive behavior-driven tests with Gherkin naming

Signed-off-by: Bryan Cox <brcox@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
bryan-cox added 3 commits May 8, 2026 21:04
Add a custom go/analysis linter that flags bypass of the imageresolution
abstraction boundary:

- Raw override map parameters and struct fields with override-related
  names (registryOverrides, imageRegistryMirrors, etc.)
- Direct imports of go-containerregistry and containers/image
- References to deprecated decorator types (RegistryMirrorProviderDecorator,
  ProviderWithOpenShiftImageRegistryOverridesDecorator, etc.)
- Exempt: imageresolution package, support/util, ignition-server/cmd,
  hostedclusterconfigoperator root package, main packages, test files
- analysistest-based test suite with testdata fixtures

Signed-off-by: Bryan Cox <brcox@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
…types

Wire imageresolution.ProviderSet into HO, karpenter-operator, and
catalog image construction, replacing the old decorator chain:

- HO main.go: ProviderSet with CLI overrides, static mirrors, dynamic
  mirror refresh via WithMirrorRefresh callback
- Karpenter main.go: ProviderSet with mirrors only (no CLI overrides)
- Catalogs images.go: accept ResolverConfig instead of raw map
- HC controller: use ProviderSet directly as ReleaseProvider, remove
  dead RegistryOverrides field
- HC webhook and sizing controller: widen to interface types
- Etcd backup reconciler: remove dead mock methods from fake provider

Delete old abstractions:
- RegistryMirrorProviderDecorator (support/releaseinfo)
- ProviderWithOpenShiftImageRegistryOverridesDecorator (support/releaseinfo)
- ProviderWithRegistryOverrides, ProviderWithOpenShiftImageRegistryOverrides
  interfaces (support/releaseinfo)
- CommonRegistryProvider, RegistryProvider (support/globalconfig)
- ConvertRegistryOverridesToCommandLineFlag and related (support/util)

Signed-off-by: Bryan Cox <brcox@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
… HCCO, and ignition-server

Wire imageresolution.ProviderSet into CPO, HCCO, and ignition-server:

- CPO main.go: dual ProviderSet -- control-plane with CLI overrides +
  mirrors + image overrides, data-plane with ForDataPlane() + ICSP/IDMS
  mirrors for disconnected fetch
- HCP controller: receive both ProviderSets, pass Config() to
  configoperator and ignition-server components
- HCCO configoperator: accept ResolverConfig for --registry-overrides
  and OPENSHIFT_IMG_OVERRIDES serialization
- HCCO cmd.go: ProviderSet with mirrors only
- Ignition-server: ProviderSet with overrides and mirrors
- OLM catalogs and HCCO OLM params: wrap mirrors in ResolverConfig

Signed-off-by: Bryan Cox <brcox@redhat.com>
Commit-Message-Assisted-by: Claude (via Claude Code)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. area/control-plane-operator Indicates the PR includes changes for the control plane operator - in an OCP release area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release area/karpenter-operator Indicates the PR includes changes related to the Karpenter operator do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants