USHIFT-6796: C2CC: DNS forwarding between clusters #6638

Open
pmtk wants to merge 4 commits into openshift:main from pmtk:c2cc/coredns

Conversation

@pmtk
Member

@pmtk pmtk commented May 8, 2026

Summary by CodeRabbit

  • New Features

    • Cross-cluster DNS blocks added to CoreDNS config enabling remote service discovery; remote cluster DNS addresses are now derived when applicable.
    • Source IP preservation implemented for cross-cluster traffic.
  • Tests

    • New end-to-end DNS test suite validating CoreDNS server blocks, DNS resolution, and HTTP access across clusters.
    • Connectivity tests enhanced to verify source-IP preservation; test helpers and test workloads updated (CGI-based hello service, per-cluster namespaces).
  • Documentation

    • Test keywords added for namespace creation and Corefile validation.

@openshift-ci
Contributor

openshift-ci Bot commented May 8, 2026

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

@openshift-ci openshift-ci Bot added the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) May 8, 2026
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference label (indicates that this PR references a valid Jira ticket of any type) May 8, 2026
@openshift-ci-robot

openshift-ci-robot commented May 8, 2026

@pmtk: This pull request references USHIFT-6796 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "5.0.0" version, but no target version was set.

Details

In response to this:

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Contributor

openshift-ci Bot commented May 8, 2026

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: pmtk

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@coderabbitai
Contributor

coderabbitai Bot commented May 8, 2026

Walkthrough

Adds cross-cluster DNS rendering: compute a DNS IP per resolved remote cluster when a domain is set, render CoreDNS server blocks, expose them to the DNS controller template, and update tests and Robot Framework suites to validate DNS blocks, resolution, and connectivity.
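The DNS IP derivation mentioned above can be sketched roughly as follows. This is a minimal illustration, not the PR's `getClusterDNS`: the helper name `clusterDNSIP` is hypothetical, and it assumes the common Kubernetes convention that the cluster DNS service takes the tenth address of the first service network CIDR.

```go
package main

import (
	"fmt"
	"net/netip"
)

// clusterDNSIP returns the address at offset 10 inside serviceCIDR.
// Assumption: the DNS service IP is the tenth address of the service
// network; the real getClusterDNS in the PR may behave differently.
func clusterDNSIP(serviceCIDR string) (string, error) {
	prefix, err := netip.ParsePrefix(serviceCIDR)
	if err != nil {
		return "", fmt.Errorf("parsing service CIDR %q: %w", serviceCIDR, err)
	}
	// Start from the network address and step forward ten times.
	addr := prefix.Masked().Addr()
	for i := 0; i < 10; i++ {
		addr = addr.Next()
	}
	// Next() yields an invalid Addr on overflow; small prefixes may not
	// contain the tenth address at all.
	if !addr.IsValid() || !prefix.Contains(addr) {
		return "", fmt.Errorf("service CIDR %q cannot hold a DNS address", serviceCIDR)
	}
	return addr.String(), nil
}

func main() {
	for _, cidr := range []string{"10.43.0.0/16", "fd02::/112", "10.43.0.0/30"} {
		ip, err := clusterDNSIP(cidr)
		fmt.Printf("cidr=%s ip=%q err=%v\n", cidr, ip, err)
	}
}
```

Returning an error for a CIDR that is too small mirrors the walkthrough's note that a failed computation surfaces as a validation error.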

Changes

Cross-Cluster DNS Rendering

| Layer | File(s) | Summary |
|---|---|---|
| Data Shape | pkg/config/c2cc.go | ResolvedRemoteCluster gains DNSIP string, populated during validateRemoteCluster when Domain is non-empty (computed from the first service network CIDR). |
| Validation | pkg/config/c2cc.go | validateRemoteCluster computes dnsIP via getClusterDNS for ServiceNetwork[0], stores it in res.DNSIP, and returns a validation error if computation fails. |
| Core DNS Rendering | pkg/config/c2cc.go | New exported RenderC2CCDNSBlocks(resolved []ResolvedRemoteCluster) string emits CoreDNS server blocks only for entries with Domain != ""; internal formatDNSBlock(domain, dnsIP string) formats each block. |
| DNS Controller Wiring | pkg/components/controllers.go | startDNSController adds C2CCDNSBlocks to DNS template params and sets it to the output of RenderC2CCDNSBlocks(cfg.C2CC.Resolved) when C2CC is enabled. |
| CoreDNS ConfigMap Template | assets/components/openshift-dns/dns/configmap.yaml | Corefile template updated to include {{- .C2CCDNSBlocks }} at the top of the CoreDNS configuration. |
| Unit Tests | pkg/config/c2cc_test.go | Added tests validating DNSIP population for IPv4/IPv6 and TestRenderC2CCDNSBlocks verifying generated CoreDNS snippets and formatting; parseCIDR helper added. |
| Test Helpers | pkg/controllers/c2cc/helpers_test.go | testRemoteConfig now includes domain; added testRemoteWithDomain helper; test fixtures set Domain on resolved clusters. |
| Test Asset | test/assets/c2cc/hello-microshift.yaml | Replaced nc-based responder with an httpd-served CGI /cgi-bin/hello that reports source IP and pod IP; added MY_POD_IP env var from status.podIP. |
| Test Resources & Keywords | test/resources/c2cc.resource | Added Create Unique Namespace On Cluster, Verify Corefile Contains C2CC Server Block, and Verify Corefile Does Not Contain C2CC Server Block Robot keywords to create per-run namespaces and assert Corefile content. |
| Integration Tests — connectivity | test/suites/c2cc/connectivity.robot | Switched to per-cluster namespaces (&{NAMESPACES}), expect "Hello from", added 4 "Source IP Preserved" tests, added Get Curl Pod IP, and updated curl target to /cgi-bin/hello. |
| Integration Tests — DNS suite | test/suites/c2cc/dns.robot | New Robot Framework suite validating Corefile server blocks presence, DNS resolution via getent hosts, and HTTP access to remote service by DNS name; includes setup/teardown and retries for DNS readiness. |
| Test Execution | test/scenarios-bootc/el9/presubmits/el98-src@c2cc.sh | Added suites/c2cc/dns.robot to presubmit run list. |
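The controller wiring and template rows can be illustrated with a small `text/template` sketch. Only the `C2CCDNSBlocks` field name and the `{{- .C2CCDNSBlocks }}` placement come from the summary; the template body, the `dnsParams` struct, `renderCorefile`, and the `remote.example` domain are hypothetical stand-ins, and the real Corefile in configmap.yaml is considerably longer.

```go
package main

import (
	"fmt"
	"strings"
	"text/template"
)

// corefileTmpl is a heavily trimmed, hypothetical Corefile template.
// Only the placement of {{- .C2CCDNSBlocks }} at the top follows the PR summary.
const corefileTmpl = `{{- .C2CCDNSBlocks }}.:5353 {
    forward . /etc/resolv.conf
}
`

// dnsParams is an assumed parameter struct; the PR may pass params differently.
type dnsParams struct {
	C2CCDNSBlocks string
}

// renderCorefile executes the template with the pre-rendered C2CC blocks.
func renderCorefile(blocks string) (string, error) {
	tmpl, err := template.New("corefile").Parse(corefileTmpl)
	if err != nil {
		return "", err
	}
	var out strings.Builder
	if err := tmpl.Execute(&out, dnsParams{C2CCDNSBlocks: blocks}); err != nil {
		return "", err
	}
	return out.String(), nil
}

func main() {
	// With C2CC disabled the field is empty and the base Corefile is unchanged.
	base, _ := renderCorefile("")
	fmt.Print(base)

	// With C2CC enabled the remote-domain block is emitted before the default one.
	withRemote, _ := renderCorefile("remote.example:5353 {\n    forward . 10.45.0.10\n}\n")
	fmt.Print(withRemote)
}
```

Because the field is a plain string, an empty value leaves the rendered Corefile identical to the pre-C2CC output, which is what the "Does Not Contain C2CC Server Block" keyword would check for.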

Sequence Diagram

sequenceDiagram
    participant Config as C2CC Config
    participant Ctrl as DNS Controller
    participant CM as CoreDNS ConfigMap
    participant DNS as CoreDNS Server
    participant Client as Test Pod (curl)

    Config->>Config: validateRemoteCluster computes DNSIP from<br/>service network CIDR when Domain is set
    Config->>Ctrl: ResolvedRemoteCluster with Domain & DNSIP
    Ctrl->>Ctrl: RenderC2CCDNSBlocks(resolved)
    Ctrl->>CM: Inject C2CCDNSBlocks into Corefile template
    CM->>DNS: CoreDNS reloads updated Corefile with server blocks
    Client->>DNS: getent hosts remote.<domain>
    DNS->>Client: returns resolved IP
    Client->>Client: curl http://<resolved-ip>:8080/cgi-bin/hello
    Client->>Client: receives "Hello from" with source IP

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes

🚥 Pre-merge checks | ✅ 10 | ❌ 2

❌ Failed checks (2 warnings)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 12.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Test Structure And Quality | ⚠️ Warning | Go test assertions lack meaningful failure messages. Multiple assertions (e.g., assert.NoError(), assert.Empty()) omit context messages required for diagnosing test failures. | Add context messages to assertions. E.g., require.NoError(t, cfg.C2CC.validate(cfg), "failed to validate C2CC config") to help diagnose failures. |

✅ Passed checks (10 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'USHIFT-6796: C2CC: DNS forwarding between clusters' accurately describes the main change: implementing DNS forwarding between clusters. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Stable And Deterministic Test Names | ✅ Passed | All test names are stable and deterministic. Go tests use static subtitles, Robot Framework tests use static descriptive names. No dynamic values appear in test titles. |
| Microshift Test Compatibility | ✅ Passed | PR adds Go unit tests and Robot Framework tests, not Ginkgo e2e tests. Check is not applicable. |
| Single Node Openshift (Sno) Test Compatibility | ✅ Passed | This PR adds Robot Framework tests, not Ginkgo e2e tests. The custom check applies only to Ginkgo tests with patterns like It(), Describe(), Context(), When(). None are present. |
| Topology-Aware Scheduling Compatibility | ✅ Passed | Changes are DNS configuration (CoreDNS templates, DNS helpers) and test code. No pod affinity, topology constraints, nodeSelector targeting control-plane, or scheduling restrictions introduced. |
| Ote Binary Stdout Contract | ✅ Passed | No OTE Binary Stdout Contract violations found. PR adds DNS configuration code and tests with no process-level stdout writes, init() functions, or TestMain functions that would corrupt test listings. |
| Ipv6 And Disconnected Network Test Compatibility | ✅ Passed | The custom check targets Ginkgo e2e tests (It(), Describe(), etc.). This PR adds only Go unit tests (testing package) and Robot Framework tests, not Ginkgo e2e tests. Check is not applicable. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 golangci-lint (2.12.1)

level=warning msg="The linter 'gomodguard' is deprecated (since v2.12.0) due to: new major version. Replaced by gomodguard_v2."
level=warning msg="Suggested new configuration:\nlinters:\n enable:\n - gomodguard_v2\n"
level=error msg="Running error: context loading failed: failed to load packages: failed to load packages: failed to load with go/packages: err: exit status 1: stderr: go: inconsistent vendoring in :\n\tgithub.com/apparentlymart/go-cidr@v1.1.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/coreos/go-systemd@v0.0.0-20190321100706-95778dfbb74e: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/google/go-cmp@v0.7.0: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/miekg/dns@v1.1.63: is explicitly required in go.mod, but not marked as explicit in vendor/modules.txt\n\tgithub.com/openshift/api@v0.0.0-20260408092441-8b086e6b9eb9: is

... [truncated 31032 characters] ...

elet: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/metrics: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/mount-utils: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/pod-security-admission: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/sample-apiserver: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/sample-cli-plugin: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\tk8s.io/sample-controller: is replaced in go.mod, but not marked as replaced in vendor/modules.txt\n\n\tTo ignore the vendor directory, use -mod=readonly or -mod=mod.\n\tTo sync the vendor directory, run:\n\t\tgo mod vendor\n"


Comment @coderabbitai help to get the list of available commands and usage tips.

@openshift-ci openshift-ci Bot added the approved label (indicates a PR has been approved by an approver from all required OWNERS files) May 8, 2026
Contributor

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 1

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@test/suites/c2cc/dns.robot`:
- Around line 71-77: Make namespace creation idempotent: change the "Oc On
Cluster    ${alias}    oc create namespace ${NAMESPACE}" call in the "Deploy DNS
Test Workloads" block (and the similar calls at lines 88-93) so it doesn't fail
if the namespace already exists or is terminating — e.g., check for existence
before creating (using the same "Oc On Cluster" helper to run an "oc get
namespace ${NAMESPACE}" and only run create if absent) or replace create with an
idempotent operation; also ensure teardown ignores or handles delete errors
consistently.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 273f8e99-0d08-4cb7-bb27-08129a56d98f

📥 Commits

Reviewing files that changed from the base of the PR and between 987ae4d and 1a5258b.

📒 Files selected for processing (10)
  • assets/components/openshift-dns/dns/configmap.yaml
  • pkg/components/controllers.go
  • pkg/config/c2cc.go
  • pkg/config/c2cc_test.go
  • pkg/controllers/c2cc/helpers_test.go
  • test/assets/c2cc/hello-microshift.yaml
  • test/resources/c2cc.resource
  • test/scenarios-bootc/el9/presubmits/el98-src@c2cc.sh
  • test/suites/c2cc/connectivity.robot
  • test/suites/c2cc/dns.robot

Comment thread test/suites/c2cc/dns.robot Outdated
@pmtk
Member Author

pmtk commented May 8, 2026

/test verify

Avoid namespace collisions on reruns by generating a random namespace
per cluster instead of using a hardcoded name. Also flatten nested
validation logic in c2cc.go to satisfy the nestif linter.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Contributor

@coderabbitai coderabbitai Bot left a comment

🧹 Nitpick comments (1)
pkg/config/c2cc.go (1)

360-367: ⚡ Quick win

Skip domain blocks when DNSIP is empty.

If a ResolvedRemoteCluster has Domain but an empty DNSIP, the generated forward directive is invalid and can break Corefile rendering. Add a defensive guard here.

Suggested patch
 func RenderC2CCDNSBlocks(resolved []ResolvedRemoteCluster) string {
 	var blocks []string
 	for _, rc := range resolved {
-		if rc.Domain == "" {
+		if rc.Domain == "" || rc.DNSIP == "" {
 			continue
 		}
 		blocks = append(blocks, formatDNSBlock(rc.Domain, rc.DNSIP))
 	}
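The effect of the suggested guard can be seen in a small standalone sketch. `renderBlocks` and its `requireDNSIP` switch are hypothetical stand-ins for `RenderC2CCDNSBlocks`, and the server block layout (port 5353, a single `forward` line) is an assumption rather than the output of the PR's `formatDNSBlock`.

```go
package main

import (
	"fmt"
	"strings"
)

// ResolvedRemoteCluster is trimmed to the two fields this sketch needs.
type ResolvedRemoteCluster struct {
	Domain string
	DNSIP  string
}

// renderBlocks is a hypothetical stand-in for RenderC2CCDNSBlocks.
// requireDNSIP toggles the guard proposed in the review comment.
func renderBlocks(resolved []ResolvedRemoteCluster, requireDNSIP bool) string {
	var b strings.Builder
	for _, rc := range resolved {
		if rc.Domain == "" || (requireDNSIP && rc.DNSIP == "") {
			continue
		}
		// Assumed block layout; the PR's formatDNSBlock may emit more plugins.
		fmt.Fprintf(&b, "%s:5353 {\n    forward . %s\n}\n", rc.Domain, rc.DNSIP)
	}
	return b.String()
}

func main() {
	clusters := []ResolvedRemoteCluster{
		{Domain: "broken.example"}, // Domain set, DNSIP missing
		{Domain: "remote.example", DNSIP: "10.45.0.10"},
	}
	// Without the guard, the first entry yields a forward line with no upstream.
	fmt.Printf("unguarded:\n%s\n", renderBlocks(clusters, false))
	// With the guard, only the fully resolved cluster is rendered.
	fmt.Printf("guarded:\n%s", renderBlocks(clusters, true))
}
```

Since validation is described as populating DNSIP whenever Domain is set, the guard is purely defensive; it matters only if a resolved entry is built some other way, for example in a test fixture.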
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@pkg/config/c2cc.go` around lines 360 - 367, RenderC2CCDNSBlocks currently
appends DNS blocks for every ResolvedRemoteCluster with a Domain, but if
rc.DNSIP is empty the resulting forward directive is invalid; update
RenderC2CCDNSBlocks to skip entries where rc.DNSIP == "" (i.e., treat both
rc.Domain and rc.DNSIP as required) before calling formatDNSBlock, so only
clusters with non-empty DNSIP produce blocks.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Nitpick comments:
In `@pkg/config/c2cc.go`:
- Around line 360-367: RenderC2CCDNSBlocks currently appends DNS blocks for
every ResolvedRemoteCluster with a Domain, but if rc.DNSIP is empty the
resulting forward directive is invalid; update RenderC2CCDNSBlocks to skip
entries where rc.DNSIP == "" (i.e., treat both rc.Domain and rc.DNSIP as
required) before calling formatDNSBlock, so only clusters with non-empty DNSIP
produce blocks.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository YAML (base), Central YAML (inherited)

Review profile: CHILL

Plan: Enterprise

Run ID: 36664956-24a7-47cc-ae45-f4a7e5e341dc

📥 Commits

Reviewing files that changed from the base of the PR and between 1a5258b and e4d674b.

📒 Files selected for processing (4)
  • pkg/config/c2cc.go
  • test/resources/c2cc.resource
  • test/suites/c2cc/connectivity.robot
  • test/suites/c2cc/dns.robot
🚧 Files skipped from review as they are similar to previous changes (2)
  • test/resources/c2cc.resource
  • test/suites/c2cc/connectivity.robot

@pmtk
Member Author

pmtk commented May 8, 2026

/test verify

@pmtk pmtk marked this pull request as ready for review May 8, 2026 14:44
@openshift-ci openshift-ci Bot removed the do-not-merge/work-in-progress label (indicates that a PR should not merge because it is a work in progress) May 8, 2026
@openshift-ci openshift-ci Bot requested review from eslutsky and kasturinarra May 8, 2026 14:45
@openshift-ci
Contributor

openshift-ci Bot commented May 8, 2026

@pmtk: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

| Test name | Commit | Details | Required | Rerun command |
|---|---|---|---|---|
| ci/prow/e2e-aws-tests-bootc-el9 | e4d674b | link | true | /test e2e-aws-tests-bootc-el9 |
| ci/prow/e2e-aws-tests-bootc-arm-el9 | e4d674b | link | true | /test e2e-aws-tests-bootc-arm-el9 |

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

Labels

approved: Indicates a PR has been approved by an approver from all required OWNERS files.
jira/valid-reference: Indicates that this PR references a valid Jira ticket of any type.

2 participants