Add Fabric Manager Shared NVSwitch virtualization model support #538
Draft
mresvanis wants to merge 1 commit into NVIDIA:main
cdesiniotis reviewed Jan 28, 2026
The changes include:
- add the `FABRIC_MANAGER_FABRIC_MODE` env var that configures FM with
  either full-passthrough (0) or shared-nvswitch (1) fabric mode. It
  defaults to 0.
- when the fabric mode is set to 0, the flow is unchanged, i.e. the
  fabric manager daemon runs with its default configuration.
- when the fabric mode is set to 1:
  - edit the fabric manager configuration file and set `FABRIC_MODE=1`.
  - persist the mapping of physical GPU module IDs to their PCIe
    addresses by creating a JSON file on disk (the physical GPU module
    IDs are available through nvidia-smi).
  - disable `nvidia-persistenced`, as the GPU devices should be
    unbound from the NVIDIA driver and bound to vfio-pci (a step
    executed by the vfio-manager).
Signed-off-by: Michail Resvanis <mresvani@redhat.com>
This PR adds Fabric Manager (FM) Shared NVSwitch virtualization model support when NVSwitch devices are detected and the newly introduced `FABRIC_MANAGER_FABRIC_MODE` env var is set to `1` (shared-nvswitch).

No changes introduced when `FABRIC_MANAGER_FABRIC_MODE=0` (default FM mode - full-passthrough), which is the current flow when NVSwitch devices are detected.

Relates to: NVIDIA/gpu-operator#2045
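The mode handling described above could look roughly like the sketch below. It is written in Python for illustration only and is not the code in this PR. The helper names `resolve_fabric_mode` and `set_fabric_mode` are hypothetical, the sample config text is illustrative, and the sketch assumes the fabric manager configuration file uses plain `KEY=VALUE` lines.

```python
import re

# Modes named in the PR description; anything else is rejected.
VALID_MODES = {"0": "full-passthrough", "1": "shared-nvswitch"}


def resolve_fabric_mode(env):
    """Return the requested fabric mode as a string, defaulting to "0"."""
    mode = env.get("FABRIC_MANAGER_FABRIC_MODE", "0").strip()
    if mode not in VALID_MODES:
        raise ValueError(f"unsupported FABRIC_MANAGER_FABRIC_MODE: {mode!r}")
    return mode


def set_fabric_mode(config_text, mode):
    """Rewrite the FABRIC_MODE line, or append one if none exists.

    Assumption: the config file is made of KEY=VALUE lines.
    """
    line = f"FABRIC_MODE={mode}"
    pattern = re.compile(r"^[ \t]*FABRIC_MODE[ \t]*=.*$", re.MULTILINE)
    if pattern.search(config_text):
        return pattern.sub(line, config_text)
    # No existing entry: append one, adding a newline separator if needed.
    sep = "" if config_text.endswith("\n") or not config_text else "\n"
    return f"{config_text}{sep}{line}\n"


if __name__ == "__main__":
    sample = "LOG_LEVEL=4\nFABRIC_MODE=0\n"  # illustrative config content
    mode = resolve_fabric_mode({"FABRIC_MANAGER_FABRIC_MODE": "1"})
    # Mode 0 leaves the default configuration untouched.
    updated = set_fabric_mode(sample, mode) if mode == "1" else sample
    print(updated, end="")
```

Rejecting unknown values up front keeps a typo in the env var from silently falling back to full-passthrough.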
Changes
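- Add the `FABRIC_MANAGER_FABRIC_MODE` env var to control fabric manager `FABRIC_MODE` (defaults to `0` for `full-passthrough`, `1` for `shared-nvswitch`).
- Persist the mapping of physical GPU module IDs to their PCIe addresses in a JSON file on disk, using the module IDs available through `nvidia-smi`.
- Disable `nvidia-persistenced` since GPU devices should be bound to `vfio-pci` by the `vfio-manager` in the next step.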
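As a rough illustration of the module ID mapping step, the Python sketch below builds the JSON document from pairs of module ID and PCIe address. It is not the PR's implementation. The function names and the JSON layout (module ID as key, PCIe address as value) are assumptions, the addresses are made up, and the sketch deliberately takes already-collected pairs as input rather than guessing at the exact `nvidia-smi` invocation.

```python
import json


def build_module_map(rows):
    """Map GPU module ID -> PCIe address from (module_id, pci_address) pairs.

    Assumption: module IDs are integers and unique per GPU.
    """
    mapping = {}
    for module_id, pci_address in rows:
        key = str(int(module_id))  # JSON object keys must be strings
        if key in mapping:
            raise ValueError(f"duplicate GPU module ID: {key}")
        mapping[key] = pci_address.strip()
    return mapping


def dump_module_map(mapping):
    """Serialize the mapping as stable, human-readable JSON."""
    return json.dumps(mapping, indent=2, sort_keys=True)


if __name__ == "__main__":
    # Made-up sample values for illustration.
    rows = [("1", "0000:1b:00.0"), ("2", "0000:43:00.0")]
    print(dump_module_map(build_module_map(rows)))
```

Persisting the mapping before the GPUs are rebound matters because, once they are bound to `vfio-pci`, the NVIDIA driver can no longer report module IDs for them.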