Add support of vsock for intra agents communications #2336

Open
lebauce wants to merge 8 commits into main from lebauce/vsock-support

Conversation

Contributor

@lebauce lebauce commented Nov 20, 2025

What does this PR do?

Kata Containers uses microVMs to run pods. In these environments, networking may not be available at all. vsock allows communication between the host and the guest through the hypervisor, without using TCP sockets.

DataDog/datadog-agent#39478 added support for vsock sockets in the agent.

This PR modifies the Helm Charts by:

  • setting the right configs (DD_VSOCK_ADDR, ...) to enable the feature
  • setting /etc/datadog-agent/auth as a hostPath volume so that it can be shared with the microVM through virtiofs
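
For reviewers less familiar with vsock addressing, here is a minimal sketch. A vsock endpoint is identified by a context ID (CID) and a port rather than an IP address. The CID constants are the well-known values from Linux's `<linux/vm_sockets.h>`. The `resolveCID` helper is hypothetical, and treating the value "host" as CID 2 is an assumption about how the Agent interprets DD_VSOCK_ADDR, not something this PR confirms.

```go
package main

import (
	"fmt"
	"strconv"
)

// Well-known vsock context IDs (CIDs), as defined in <linux/vm_sockets.h>.
const (
	cidHypervisor uint32 = 0
	cidLocal      uint32 = 1
	cidHost       uint32 = 2
	cidAny        uint32 = 0xFFFFFFFF
)

// resolveCID is a hypothetical helper that maps a DD_VSOCK_ADDR-style value
// to a CID. Mapping "host" to CID 2 is an assumption made for illustration.
func resolveCID(addr string) (uint32, error) {
	if addr == "host" {
		return cidHost, nil
	}
	// Otherwise, accept a numeric CID such as "3".
	n, err := strconv.ParseUint(addr, 10, 32)
	if err != nil {
		return 0, fmt.Errorf("invalid vsock address %q: %w", addr, err)
	}
	return uint32(n), nil
}

func main() {
	cid, err := resolveCID("host")
	if err != nil {
		panic(err)
	}
	fmt.Printf("guest would dial CID %d\n", cid)
}
```

Because the guest addresses the host by CID, the workload needs no network interface at all, which is the property the microVM case relies on.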

Motivation

Some environments do not allow workloads to have network interfaces. Using vsock sockets
allows these workloads to communicate with the Agent.

Additional Notes

Anything else we should know when reviewing?

Minimum Agent Versions

Are there minimum versions of the Datadog Agent and/or Cluster Agent required?

  • Agent: v7.74.0
  • Cluster Agent: v7.74.0

Describe your test plan

Write here any instructions and details you may have to test your PR.

Checklist

  • PR has at least one valid label: bug, enhancement, refactoring, documentation, tooling, and/or dependencies
  • PR has a milestone or the qa/skip-qa label

@lebauce lebauce requested review from a team as code owners November 20, 2025 14:37
@lebauce lebauce added the enhancement and qa/skip-qa labels Nov 20, 2025
@lebauce lebauce added this to the v1.22.0 milestone Nov 20, 2025

codecov-commenter commented Nov 20, 2025

Codecov Report

❌ Patch coverage is 42.85714% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 38.78%. Comparing base (2655785) to head (3356941).
⚠️ Report is 1 commits behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| internal/controller/datadogagent/common/volumes.go | 0.00% | 11 Missing ⚠️ |
| ...nal/controller/datadogagent/feature/cws/feature.go | 20.00% | 6 Missing and 2 partials ⚠️ |
| ...datadogagent/component/otelagentgateway/default.go | 0.00% | 1 Missing ⚠️ |
Additional details and impacted files

@@           Coverage Diff           @@
##             main    #2336   +/-   ##
=======================================
  Coverage   38.78%   38.78%           
=======================================
  Files         309      309           
  Lines       26852    26882   +30     
=======================================
+ Hits        10414    10426   +12     
- Misses      15658    15674   +16     
- Partials      780      782    +2     
Flag Coverage Δ
unittests 38.78% <42.85%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

| Files with missing lines | Coverage Δ |
|---|---|
| ...controller/datadogagent/component/agent/default.go | 44.68% <100.00%> (ø) |
| ...ler/datadogagent/component/clusteragent/default.go | 97.38% <100.00%> (ø) |
| internal/controller/datadogagent/global/agent.go | 85.56% <100.00%> (+1.84%) ⬆️ |
| ...datadogagent/component/otelagentgateway/default.go | 0.00% <0.00%> (ø) |
| ...nal/controller/datadogagent/feature/cws/feature.go | 75.62% <20.00%> (-3.03%) ⬇️ |
| internal/controller/datadogagent/common/volumes.go | 0.00% <0.00%> (ø) |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

@lebauce lebauce force-pushed the lebauce/vsock-support branch 2 times, most recently from d7d0987 to 1163533 Compare December 3, 2025 17:29
@levan-m levan-m modified the milestones: v1.22.0, v1.23.0 Dec 10, 2025
@levan-m levan-m modified the milestones: v1.23.0, v1.24.0 Jan 8, 2026
@levan-m levan-m modified the milestones: v1.24.0, v1.25.0 Feb 3, 2026
@lebauce lebauce force-pushed the lebauce/vsock-support branch 3 times, most recently from d4e009f to c7ed7b7 Compare February 23, 2026 19:22
@levan-m levan-m removed this from the v1.25.0 milestone Mar 3, 2026
@lebauce lebauce force-pushed the lebauce/vsock-support branch from a5dc055 to fa9b312 Compare March 18, 2026 21:35
@lebauce lebauce requested review from a team as code owners March 18, 2026 21:35
@lebauce lebauce requested a review from clamoriniere March 19, 2026 09:30
@tbavelier tbavelier added this to the v1.26.0 milestone Mar 19, 2026
Member

@tbavelier tbavelier left a comment

I'm wondering whether this communication mode applies only to CWS or a few specific features. If so, all the code could be moved into those features instead of spec.global. Features are applied after global, so they would take priority over the default for the auth volume, ensuring it is a hostPath instead of an emptyDir. That would make the change even cleaner and better scoped.
If it's meant to be a real global for all sub-Agents in a supported communication mode (they would all communicate with the core Agent using this socket), then it's fine to keep it in spec.global, but it should be moved as per the comments below: see #2786
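
To illustrate the ordering argument, here is a minimal sketch of "global first, features afterwards". The `volume` and `podSpec` types are simplified stand-ins written for this example, not the operator's real types, and the replace-by-name behavior of `AddVolume` is an assumption about how a later writer would take priority.

```go
package main

import "fmt"

// volume is a simplified stand-in for corev1.Volume, reduced to what the
// sketch needs.
type volume struct {
	Name   string
	Source string // "emptyDir" or "hostPath"
}

// podSpec is a hypothetical stand-in for the pod template manager.
type podSpec struct {
	volumes []volume
}

// AddVolume replaces any existing volume with the same name, so the last
// writer wins (assumed behavior, for illustration only).
func (p *podSpec) AddVolume(v volume) {
	for i := range p.volumes {
		if p.volumes[i].Name == v.Name {
			p.volumes[i] = v
			return
		}
	}
	p.volumes = append(p.volumes, v)
}

// buildPod applies the global default first, then the feature override.
func buildPod(vsockFeatureEnabled bool) podSpec {
	var pod podSpec
	// Step 1: global defaults use an emptyDir for the auth volume.
	pod.AddVolume(volume{Name: "auth", Source: "emptyDir"})
	// Step 2: features run afterwards and can override the default.
	if vsockFeatureEnabled {
		pod.AddVolume(volume{Name: "auth", Source: "hostPath"})
	}
	return pod
}

func main() {
	fmt.Println(buildPod(false).volumes[0].Source) // prints "emptyDir"
	fmt.Println(buildPod(true).volumes[0].Source)  // prints "hostPath"
}
```

With this shape, the defaults never need to know whether vsock is enabled; the later step simply overwrites the volume.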

Comment on lines +294 to +306
// Enable VSock communication between the Agent and containerized workloads if specified
if config.UseVSock != nil && *config.UseVSock {
	manager.EnvVar().AddEnvVar(&corev1.EnvVar{
		Name:  DDVSockAddr,
		Value: "host",
	})

	// Remote agent doesn't work with vsock yet
	manager.EnvVar().AddEnvVar(&corev1.EnvVar{
		Name:  DDRemoteAgentRegistryEnabled,
		Value: "false",
	})
}
Member

As I understand it, this applies solely to the node Agent, so it would be better placed in applyNodeAgentResources in internal/controller/datadogagent/global/agent.go. This also means we don't have to touch the signature of the default EDS/DS functions. See 6d158f7

Contributor Author

👍 Done

Member

If you look at the draft PR I made, we don't need the signature changes to defaultpodtemplatespec: the default shouldn't know what's configured in the global. We simply need to move the volume to global:

authVol := common.GetVolumeForAuth(true)
manager.Volume().AddVolume(&authVol)

My PR compared to your head now looks weird, but this simple change avoids changing all the defaults.
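
For context, here is a rough sketch of what a helper like GetVolumeForAuth could return. This is not the operator's actual implementation: the types are local stand-ins that mirror the shape of corev1.Volume, the volume name and the DirectoryOrCreate host path type are assumptions, and only the /etc/datadog-agent/auth path comes from the PR description.

```go
package main

import "fmt"

// Local stand-ins mirroring the shape of corev1.Volume so the sketch runs
// without Kubernetes dependencies.
type hostPathSource struct {
	Path string
	Type string
}

type emptyDirSource struct{}

type volumeSource struct {
	HostPath *hostPathSource
	EmptyDir *emptyDirSource
}

type volume struct {
	Name string
	volumeSource
}

const (
	authVolumeName = "datadog-agent-auth"      // assumed name, for illustration
	authHostPath   = "/etc/datadog-agent/auth" // path from the PR description
)

// getVolumeForAuth is a hypothetical version of common.GetVolumeForAuth:
// a hostPath volume when vsock is in use, an emptyDir otherwise.
func getVolumeForAuth(useHostPath bool) volume {
	if useHostPath {
		return volume{
			Name: authVolumeName,
			volumeSource: volumeSource{
				// "DirectoryOrCreate" is a real Kubernetes HostPathType;
				// using it here is an assumption.
				HostPath: &hostPathSource{Path: authHostPath, Type: "DirectoryOrCreate"},
			},
		}
	}
	return volume{
		Name:         authVolumeName,
		volumeSource: volumeSource{EmptyDir: &emptyDirSource{}},
	}
}

func main() {
	v := getVolumeForAuth(true)
	fmt.Println(v.Name, v.HostPath.Path)
}
```

Keeping the hostPath/emptyDir choice inside one helper is what lets the global code add the volume without any change to the default pod template functions.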

Contributor Author

lebauce commented Mar 20, 2026

> I'm wondering whether this communication mode applies only to CWS or a few specific features. If so, all the code could be moved into those features instead of spec.global. Features are applied after global, so they would take priority over the default for the auth volume, ensuring it is a hostPath instead of an emptyDir. That would make the change even cleaner and better scoped. If it's meant to be a real global for all sub-Agents in a supported communication mode (they would all communicate with the core Agent using this socket), then it's fine to keep it in spec.global, but it should be moved as per the comments below: see #2786

This is not specific to CWS (you may want to use the vsock mode without CWS at all), so I left it in spec.global as suggested.

@lebauce lebauce requested a review from tbavelier March 20, 2026 13:32
@lebauce lebauce force-pushed the lebauce/vsock-support branch from 07883db to 7d69a4e Compare March 20, 2026 13:32
@lebauce lebauce force-pushed the lebauce/vsock-support branch from 7d69a4e to 0093fe3 Compare March 20, 2026 13:33
7 participants