Skip to content

OCPCLOUD-2950: Adopt projected-service-account-tokens-for-credential-providers#1987

Open
theobarberbany wants to merge 5 commits intoopenshift:masterfrom
theobarberbany:tb/cred-provider-projected-service-accounts
Open

OCPCLOUD-2950: Adopt projected-service-account-tokens-for-credential-providers#1987
theobarberbany wants to merge 5 commits intoopenshift:masterfrom
theobarberbany:tb/cred-provider-projected-service-accounts

Conversation

@theobarberbany
Copy link
Copy Markdown

No description provided.

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 27, 2026
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 27, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 27, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign jhjaggars for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@theobarberbany theobarberbany changed the title Adds projected-service-account-tokens-for-credential-providers OCPCLOUD-2950: Adds projected-service-account-tokens-for-credential-providers Apr 27, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Apr 27, 2026
@openshift-ci-robot
Copy link
Copy Markdown

openshift-ci-robot commented Apr 27, 2026

@theobarberbany: This pull request references OCPCLOUD-2950 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the spike to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@theobarberbany theobarberbany force-pushed the tb/cred-provider-projected-service-accounts branch from ed89bf1 to 2262ea2 Compare April 27, 2026 13:23
Copy link
Copy Markdown
Member

@damdo damdo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @theobarberbany

Not an expert but on a first pass, the proposal looks sensible to me. Interested in hearing more from the other parties involved

@theobarberbany theobarberbany force-pushed the tb/cred-provider-projected-service-accounts branch from 2262ea2 to a31c15b Compare April 27, 2026 14:29
@theobarberbany theobarberbany force-pushed the tb/cred-provider-projected-service-accounts branch from ca4bf4e to baf3f25 Compare April 28, 2026 10:09
@theobarberbany theobarberbany changed the title OCPCLOUD-2950: Adds projected-service-account-tokens-for-credential-providers OCPCLOUD-2950: Adopt projected-service-account-tokens-for-credential-providers Apr 28, 2026
@theobarberbany theobarberbany marked this pull request as ready for review April 28, 2026 10:15
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 28, 2026
@theobarberbany theobarberbany force-pushed the tb/cred-provider-projected-service-accounts branch 3 times, most recently from a239a43 to 7f9b92f Compare April 28, 2026 18:44
@theobarberbany theobarberbany force-pushed the tb/cred-provider-projected-service-accounts branch from 7f9b92f to 63e3a61 Compare April 29, 2026 08:46
@openshift-ci
Copy link
Copy Markdown
Contributor

openshift-ci Bot commented Apr 29, 2026

@theobarberbany: The following test failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/markdownlint 63e3a61 link true /test markdownlint

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.


* As a cluster administrator, I want the credential provider configuration to ship with sensible defaults but be customizable, so that I can adapt it to my organization's cloud identity setup without maintaining my own MCO overrides from scratch.

* As a security-conscious customer, I want the installer to not provision a default node identity for image pulls, so that I can reduce the blast radius of node compromise and use day-2 provisioned credentials instead of install-time baked-in identity.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems unnecessary especially since it's listed as a non-goal/out of scope below.

#### Out of scope but enabled by this work

- **Removing install-time node identity from the installer.** KEP-4412 adoption is a prerequisite for this work. See [Removing node identity from install](#removing-node-identity-from-install) for the proposed approach.
- **Removing the ROSA ECR token refresh shim.** ROSA currently works around the lack of credential provider support by injecting a Python script via MachineConfig as a systemd unit on a 4h timer to fetch ECR tokens. KEP-4412 replaces this with the kubelet's native credential provider framework. See [ARO-24037](https://redhat.atlassian.net/browse/ARO-24037).
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reference to ARO card but discusses ROSA. Assuming that's because the ARO card provides context is that RHCOS doesn't ship by default with the xcr-credentials-provider binaries and as a result ROSA is using this "hack" to do it.

#### Out of scope but enabled by this work

- **Removing install-time node identity from the installer.** KEP-4412 adoption is a prerequisite for this work. See [Removing node identity from install](#removing-node-identity-from-install) for the proposed approach.
- **Removing the ROSA ECR token refresh shim.** ROSA currently works around the lack of credential provider support by injecting a Python script via MachineConfig as a systemd unit on a 4h timer to fetch ECR tokens. KEP-4412 replaces this with the kubelet's native credential provider framework. See [ARO-24037](https://redhat.atlassian.net/browse/ARO-24037).
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How does this KEP support mirroring? I thought the issue was primarily the order in which these things were called? The mirroring config is after the credential provider today right?


### API Extensions

None. This enhancement does not introduce new Custom Resource Definitions or modify the OpenShift API surface. The `tokenAttributes` field is part of the upstream kubelet `CredentialProviderConfig` API.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users opt-in to this feature using annotations on the service level? Any dynamic config we need to add to allow them to do this or does it basically just work today if we configure the MCO correctly (which we don't today right?)

is there any reason why a cluster admin might not want this ability/therefore would want a knob to control if it's on/off?

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the What KEP-4412 adoption looks like section, it seems that the "API surface" is "Admin wrote a SA with specific annotations regarding cloud provider"? The MCO enablement seems straightforward so the admin control would just be "I don't write a very specific annotation and the feature is off"


#### OpenShift Kubernetes Engine

TBD — needs input from OKE team.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think there's any difference between OCP and OKE in regards to this area

Comment on lines +307 to +309
1. **ECR (cloud-provider-aws)**: Make the annotation key configurable via args or environment variable. Currently hardcoded to `eks.amazonaws.com/ecr-role-arn` — EKS-specific, not usable for OpenShift without a carry patch.
2. **ACR (cloud-provider-azure)**: KEP-4412 support is already merged ([PR #9907](https://github.com/kubernetes-sigs/cloud-provider-azure/pull/9907)). Annotation keys (`kubernetes.azure.com/acr-client-id`, `kubernetes.azure.com/acr-tenant-id`) are hardcoded in `pkg/credentialprovider/consts.go` — need the same configurability fix as ECR.
3. **GCR (cloud-provider-gcp)**: Implement KEP-4412 support from scratch. Opportunity to establish the configurable-annotation-key pattern from the start.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why do we need something configurable vs documenting the upstream annotations for each cloud?

Copy link
Copy Markdown
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking to avoid mentioning cloud provider managed Kubernetes offerings in our docs, e.g EKS, thinking that could cause confusion? Or just be unexpected and therefore odd UX.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the option work beyond eks? I'd probably just look for a way to add a new version of that name upstream that isn't eks specific

The upstream ask is making annotation keys configurable — not proposing specific keys upstream. Once configurability lands, OpenShift chooses its own defaults independently.

TBD — need to decide on OpenShift annotation keys for each provider. This is user-facing UX and we want reviewer input. Options:
- Per-provider annotations: `openshift.io/ecr-role-arn`, `openshift.io/acr-client-id`, `openshift.io/gcr-service-account`
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What benefit would this provide over just using the upstream defaults?

TBD — need to decide on OpenShift annotation keys for each provider. This is user-facing UX and we want reviewer input. Options:
- Per-provider annotations: `openshift.io/ecr-role-arn`, `openshift.io/acr-client-id`, `openshift.io/gcr-service-account`
- A single generic OpenShift annotation that each plugin interprets
- Accept the upstream vendor-specific keys as-is (requires no carry, but ties OpenShift UX to EKS/AKS conventions)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The azure ones seem fine. The AWS one is less good as it calls out EKS in the name.

Aligning to upstream has benefits in terms of this clearly being an upstream feature and therefore upstream docs making sense

Copy link
Copy Markdown
Contributor

@yuqi-zhang yuqi-zhang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems straightfoward enough from the MCO side, although I think I don't fully appreciate what Removing install-time node identity from the installer entails yet

4. Calls AWS Security Token Service `AssumeRoleWithWebIdentity` with the service account token as the web identity token
5. Returns temporary IAM credentials used for the ECR `GetAuthorizationToken` call

Kubelet config:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to double check, the existing CredentialProviderConfig template would look something like:

  apiVersion: kubelet.config.k8s.io/v1
  kind: CredentialProviderConfig                                                                                                                                     
  providers:
    - name: ecr-credential-provider                                                                                                                                  
      ...
  {{- if isWorkloadIdentity . }}                                                                                                                                     
      tokenAttributes:                                                                                                                                               
        serviceAccountTokenAudience: "sts.amazonaws.com"
        ...                                                                                                                                                          
  {{- end }}

?

How does the service account get managed? Is it purely user created?

The preferred direction is upstream work to allow credential provider plugins to accept a static credential (e.g. a GCP service account key, AWS access key pair) via config file or environment variable. This would allow:

- CCO (mint mode) provisions a scoped credential for image pulls
- MCO ships a CredentialsRequest declaring the credential requirement, following the same pattern used by other OpenShift operators (e.g. the image-registry operator, the ingress operator)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something we plan to do relatively soon, or is it still up for discussion whether we will go forward with this approach?


### API Extensions

None. This enhancement does not introduce new Custom Resource Definitions or modify the OpenShift API surface. The `tokenAttributes` field is part of the upstream kubelet `CredentialProviderConfig` API.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on the What KEP-4412 adoption looks like section, it seems that the "API surface" is "Admin wrote a SA with specific annotations regarding cloud provider"? The MCO enablement seems straightforward so the admin control would just be "I don't write a very specific annotation and the feature is off"

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference Indicates that this PR references a valid Jira ticket of any type.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants