Fix state reconciliation early return on unchanged DaemonSet#2062

Open
Mirza-Samad-Ahmed-Baig wants to merge 1 commit into NVIDIA:main from Mirza-Samad-Ahmed-Baig:fix/state-skel-daemonset-short-circuit

@Mirza-Samad-Ahmed-Baig

Description

Fixes a critical reconciliation bug in stateSkel.createOrUpdateObjs: when a DaemonSet's hash was unchanged, the function returned nil and exited early, which could skip reconciliation of any objects that came later in the list. It now skips only the unchanged DaemonSet and continues with the remaining objects.
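
The difference between the two behaviors can be illustrated with a minimal, self-contained sketch. This is not the operator's code: the object type, the liveHashes map, and both function names are hypothetical stand-ins chosen for illustration, and the real function works on *unstructured.Unstructured objects against the API server.

```go
package main

import "fmt"

// object is a hypothetical, simplified stand-in for *unstructured.Unstructured.
type object struct {
	Kind string
	Name string
	Hash string // stand-in for the hash recorded on a DaemonSet
}

// unchanged reports whether obj is a DaemonSet whose live hash matches the desired one.
func unchanged(obj object, liveHashes map[string]string) bool {
	h, ok := liveHashes[obj.Name]
	return obj.Kind == "DaemonSet" && ok && h == obj.Hash
}

// reconcileEarlyReturn models the old behavior: an unchanged DaemonSet
// ends the whole function, so later objects are never visited.
func reconcileEarlyReturn(desired []object, liveHashes map[string]string) []string {
	var applied []string
	for _, obj := range desired {
		if unchanged(obj, liveHashes) {
			return applied
		}
		applied = append(applied, obj.Kind+"/"+obj.Name)
	}
	return applied
}

// reconcileContinue models the fixed behavior: only the unchanged
// DaemonSet is skipped and the loop moves on to the next object.
func reconcileContinue(desired []object, liveHashes map[string]string) []string {
	var applied []string
	for _, obj := range desired {
		if unchanged(obj, liveHashes) {
			continue
		}
		applied = append(applied, obj.Kind+"/"+obj.Name)
	}
	return applied
}

func main() {
	desired := []object{
		{Kind: "DaemonSet", Name: "driver", Hash: "abc"},
		{Kind: "ConfigMap", Name: "config"},
	}
	liveHashes := map[string]string{"driver": "abc"}

	fmt.Println("old:", reconcileEarlyReturn(desired, liveHashes)) // old: []
	fmt.Println("new:", reconcileContinue(desired, liveHashes))    // new: [ConfigMap/config]
}
```

In this sketch the ConfigMap is only reached when the loop uses continue, which mirrors the ordering [ds, cm] used in the regression test.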

Checklist

  • [x] No secrets, sensitive information, or unrelated changes
  • Lint checks passing (make lint)
  • Generated assets in-sync (make validate-generated-assets)
  • Go mod artifacts in-sync (make validate-modules)
  • [x] Test cases are added for new code paths

Testing

Added a regression unit test that fails on the old behavior and passes with the fix.

@copy-pr-bot

copy-pr-bot bot commented Jan 23, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

Signed-off-by: Mirza-Samad-Ahmed-Baig <Mirzasamadahmedbaig@gmail.com>
Mirza-Samad-Ahmed-Baig force-pushed the fix/state-skel-daemonset-short-circuit branch from f3f4d0c to d3ce837 on January 23, 2026 19:00
@rajathagasthya
Contributor

/ok-to-test d3ce837

err := stateSkel.createOrUpdateObjs(ctx, func(_ *unstructured.Unstructured) error { return nil }, []*unstructured.Unstructured{ds, cm})
require.NoError(t, err)

gotCm := &unstructured.Unstructured{}
Contributor

@rajathagasthya rajathagasthya Feb 3, 2026

Can we also simulate a case where a ConfigMap exists and an update happens to it successfully (while the DaemonSet remains unchanged)?
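
One way to picture the requested scenario is the sketch below, which assumes a hypothetical in-memory cluster map in place of a fake API client. The resource type, the cluster type, and applyAll are illustrative names only and are not part of the operator or of controller-runtime; an actual test would call stateSkel.createOrUpdateObjs and read the ConfigMap back from the client.

```go
package main

import "fmt"

// resource is a hypothetical, simplified stand-in for a Kubernetes object.
type resource struct {
	Kind string
	Name string
	Hash string // only meaningful for DaemonSets in this sketch
	Data string // only meaningful for ConfigMaps in this sketch
}

func key(r resource) string { return r.Kind + "/" + r.Name }

// cluster is a hypothetical in-memory substitute for a fake API client.
type cluster map[string]resource

// applyAll models the fixed loop: an unchanged DaemonSet is skipped,
// everything else is created or updated. It returns the keys written.
func applyAll(c cluster, desired []resource) []string {
	var written []string
	for _, want := range desired {
		have, exists := c[key(want)]
		if exists && want.Kind == "DaemonSet" && have.Hash == want.Hash {
			continue // unchanged DaemonSet: nothing to do, keep going
		}
		c[key(want)] = want
		written = append(written, key(want))
	}
	return written
}

func main() {
	// Both objects already exist; only the ConfigMap content is stale.
	c := cluster{
		"DaemonSet/driver": {Kind: "DaemonSet", Name: "driver", Hash: "abc"},
		"ConfigMap/config": {Kind: "ConfigMap", Name: "config", Data: "old"},
	}
	desired := []resource{
		{Kind: "DaemonSet", Name: "driver", Hash: "abc"},
		{Kind: "ConfigMap", Name: "config", Data: "new"},
	}

	written := applyAll(c, desired)
	fmt.Println(written, c["ConfigMap/config"].Data) // [ConfigMap/config] new
}
```

The assertions a real test would need are the same two shown here: the DaemonSet is left untouched, and the existing ConfigMap carries the updated content afterwards.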

@rahulait
Contributor

rahulait commented Feb 3, 2026

This is kind of OK, as it's not really a bug. When the NVIDIADriver is changed, the hash of every DaemonSet owned by it changes. As of today, there is no case where an NVIDIADriver manages multiple DaemonSets and the hash of only one of them changes, so returning early is kind of OK. But if others think we should evaluate all objects rather than return early, that is fine as well.

3 participants