Summary
The rdp role assumes a virtio-gpu VM throughout its GPU/acceleration logic and applies TCP networking sysctls unconditionally (no VM gating). On a physical machine this produces silent software (CPU) encoding and pushes VM-oriented network tuning onto bare metal.
Two asks, scoped to the rdp role only:
- Make GPU acceleration physical-/VM-/none-aware (detect the actual GPU and install the matching VA-API stack; degrade gracefully when there is no usable GPU).
- Never apply networking sysctl changes on a physical machine — gate them on the VM fact the repo already uses elsewhere.
Observed on a real host
Bare-metal HP Dragonfly G4 laptop (systemd-detect-virt → none) with an Intel Iris Xe iGPU (i915). After the role ran:
- vainfo failed: iHD_drv_video.so was absent. The role installed mesa-va-drivers (Gallium/VirGL state tracker, for virtio-gpu), not the Intel media driver, so VA-API never initialised.
- No GStreamer VA encoder element present → gnome-remote-desktop (GNOME 50, system "Remote Login" mode) fell back to software RFX. The Nice=-10 priority drop-in was masking that CPU cost.
- The TCP/MTU sysctl drop-ins (99-rdp-optimization.conf, 99-rdp-mtu.conf) were applied on bare metal.
Manually installing intel-media-va-driver-non-free + gstreamer1.0-plugins-bad, adding the grd user to render/video, and removing the sysctl drop-ins gave confirmed hardware H.264 encode (iGPU VCS engine active during a session).
Problem 1 — GPU/VA-API driver selection is virtio-gpu-only
ansible/roles/rdp/tasks/vaapi.yml installs only mesa-va-drivers + vainfo (Ubuntu) / mesa-va-drivers + libva-utils (Fedora). The file comment even states it's "the Gallium VA-API state tracker for virtio-gpu". That is correct for VirGL VMs and for physical AMD (radeonsi), but wrong for physical Intel (needs intel-media-va-driver / iHD) and for NVIDIA (needs nvidia-vaapi-driver).
There is also no install of the GStreamer VA encoder plugin that grd 50 needs (vah264enc / vah264lpenc, from gstreamer1.0-plugins-bad on Ubuntu). Without it, HW encode can't happen even with a working driver.
Proposed: detect GPU type and branch:
| Target | VA driver | Notes |
| --- | --- | --- |
| virtio-gpu (VM) | mesa-va-drivers | current behaviour |
| Intel (physical) | intel-media-va-driver[-non-free] | iHD; Iris Xe etc. |
| AMD (physical) | mesa-va-drivers | radeonsi/RADV |
| NVIDIA (physical) | nvidia-vaapi-driver | or document as unsupported |
| none / no /dev/dri | none | skip; software encode |
Plus install the GStreamer VA encoder plugin (gstreamer1.0-plugins-bad on Ubuntu) wherever a HW encoder is expected. Detection can use PCI vendor (lspci/sysfs vendor IDs: 0x8086 Intel, 0x1002 AMD, 0x10de NVIDIA, 0x1af4 virtio) combined with the existing ansible_virtualization_role.
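The branching in the table could be sketched as Ansible tasks along the following lines. This is a minimal sketch under stated assumptions, not the role's current code: the register name rdp_gpu_vendor_ids is hypothetical, the package names are the Ubuntu ones from the table, and reading /sys/class/drm/card*/device/vendor is only one assumed way to obtain the vendor ID. Combining the result with ansible_virtualization_role is left out for brevity.

```yaml
# Sketch only: rdp_gpu_vendor_ids is a hypothetical name; packages are the Ubuntu ones from the table.
- name: Read vendor IDs of all DRM cards (stdout stays empty when there is no GPU)
  ansible.builtin.shell: cat /sys/class/drm/card*/device/vendor 2>/dev/null || true
  register: rdp_gpu_vendor_ids
  changed_when: false

- name: Install the VA driver matching each detected GPU vendor
  ansible.builtin.package:
    name: "{{ item.packages }}"
    state: present
  loop:
    - { vendor: '0x1af4', packages: [mesa-va-drivers] }          # virtio-gpu
    - { vendor: '0x8086', packages: [intel-media-va-driver] }    # Intel (iHD)
    - { vendor: '0x1002', packages: [mesa-va-drivers] }          # AMD (radeonsi)
    - { vendor: '0x10de', packages: [nvidia-vaapi-driver] }      # NVIDIA
  when:
    - ansible_facts['distribution'] == 'Ubuntu'
    - item.vendor in rdp_gpu_vendor_ids.stdout_lines

- name: Install the GStreamer VA encoder plugin when a supported GPU was found
  ansible.builtin.package:
    name: gstreamer1.0-plugins-bad
    state: present
  when:
    - ansible_facts['distribution'] == 'Ubuntu'
    - rdp_gpu_vendor_ids.stdout_lines | intersect(['0x1af4', '0x8086', '0x1002', '0x10de']) | length > 0
```

With no matching vendor ID every install task is skipped, which corresponds to the "none" row: no VA driver, software encode.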
Problem 2 — HW-encode enablement targets the wrong systemd unit
ansible/roles/rdp/files/vaapi-check writes GRD_DEBUG=vkva-renderer into /etc/systemd/user/gnome-remote-desktop.service.d/vaapi.conf — the per-user service. Hosts using Remote Login run the system service (/usr/lib/systemd/system/gnome-remote-desktop.service), so this override never takes effect there.
Proposed: detect which mode is active (system Remote Login vs per-user Desktop Sharing) and write the override to the matching unit. Also re-evaluate whether GRD_DEBUG=vkva-renderer is still the right mechanism — recent grd auto-selects vah264enc when the VA encoder is present, so simply ensuring the driver+plugin exist may be sufficient.
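One possible way to pick the right unit is sketched below. It assumes that an enabled system-level gnome-remote-desktop.service is a good enough signal for Remote Login mode, which would need confirming on the target distributions. The names rdp_grd_system_unit and rdp_grd_dropin_dir are hypothetical.

```yaml
# Sketch only: treats "system unit is enabled" as the signal for Remote Login mode (an assumption).
- name: Check whether the system-level gnome-remote-desktop unit is enabled
  ansible.builtin.command: systemctl is-enabled gnome-remote-desktop.service
  register: rdp_grd_system_unit
  changed_when: false
  failed_when: false   # a non-zero exit just means "not enabled"

- name: Pick the drop-in directory of the unit that is actually in use
  ansible.builtin.set_fact:
    rdp_grd_dropin_dir: >-
      {{ '/etc/systemd/system/gnome-remote-desktop.service.d'
         if (rdp_grd_system_unit.stdout | default('')) == 'enabled'
         else '/etc/systemd/user/gnome-remote-desktop.service.d' }}
```

The resulting path could then be handed to vaapi-check instead of the hard-coded per-user directory.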
Problem 3 — GPU device access uses world-open 0666 instead of group membership
ansible/roles/rdp/tasks/gpu_groups.yml deploys /etc/udev/rules.d/99-gpu-open-access.rules with SUBSYSTEM=="drm", MODE="0666", making all DRM devices world read/write. That's a least-privilege/security smell, especially on physical multi-user machines.
Proposed: for the GPU-using service account (the grd system user in Remote Login mode), add it to the render/video groups instead of opening the device to the world; if a udev rule is still wanted, scope it to KERNEL=="renderD*" rather than all of drm.
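A minimal sketch of the group-based approach follows. The variable rdp_grd_user and the rule file name 99-gpu-render-group.rules are hypothetical; the account name has to be whatever the system gnome-remote-desktop service runs as on the target distribution.

```yaml
# Sketch only: rdp_grd_user and 99-gpu-render-group.rules are hypothetical names.
- name: Give the grd service account GPU access through group membership
  ansible.builtin.user:
    name: "{{ rdp_grd_user }}"
    groups: [render, video]
    append: true   # keep the account's existing supplementary groups

- name: Remove the world-writable DRM rule
  ansible.builtin.file:
    path: /etc/udev/rules.d/99-gpu-open-access.rules
    state: absent

- name: Optionally keep a udev rule, scoped to render nodes only
  ansible.builtin.copy:
    dest: /etc/udev/rules.d/99-gpu-render-group.rules
    content: |
      SUBSYSTEM=="drm", KERNEL=="renderD*", GROUP="render", MODE="0660"
    mode: "0644"
```

udev rule changes only take effect after the rules are reloaded and the devices re-triggered (or after a reboot), and new group membership only applies once the service is restarted.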
Problem 4 — networking sysctls applied to physical machines (no VM gating)
ansible/roles/rdp/tasks/tcp.yml (BBR, fq, 16 MB socket buffers, netdev_max_backlog=5000) and ansible/roles/rdp/tasks/mtu.yml (tcp_mtu_probing=1, ip_no_pmtu_disc=0) are gated only on:
when:
  - ansible_facts['distribution'] in ['Fedora', 'Ubuntu']
  - has_gnome
No virtualization check, so they land on bare metal. Networking sysctl changes must not be applied to physical machines. The repo already has the right idiom — vm_optimizer gates with ansible_virtualization_role == 'guest'.
Proposed: add - ansible_virtualization_role == 'guest' to tcp.yml and mtu.yml (or move these RDP network tunings into vm_optimizer). On physical hosts, skip them entirely.
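The gate could look roughly like this. The first task is a placeholder standing in for the existing task bodies in tcp.yml and mtu.yml, which are not reproduced here. The cleanup task assumes the drop-ins live in /etc/sysctl.d, which the role would need to confirm.

```yaml
# Sketch only: the debug task is a placeholder for the real tcp.yml / mtu.yml tasks.
- name: Apply RDP network tuning (VM guests only)
  ansible.builtin.debug:
    msg: "tcp.yml / mtu.yml tasks would run here"
  when:
    - ansible_facts['distribution'] in ['Fedora', 'Ubuntu']
    - has_gnome
    - ansible_virtualization_role == 'guest'

# Assumption: the drop-ins were deployed to /etc/sysctl.d.
- name: Remove previously deployed drop-ins on physical hosts
  ansible.builtin.file:
    path: "/etc/sysctl.d/{{ item }}"
    state: absent
  loop:
    - 99-rdp-optimization.conf
    - 99-rdp-mtu.conf
  when: ansible_virtualization_role != 'guest'
```

Deleting the drop-ins does not reset values already loaded into the running kernel; those persist until they are set back explicitly or the host reboots.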
Problem 5 — Nice=-10 priority drop-in is a software-encode band-aid
ansible/roles/rdp/files/grd-priority.conf (deployed by service.yml) compensates for CPU-bound software RFX. Once HW encode is correctly provisioned it's unnecessary.
Proposed: tie the priority boost to the software-encode/none path only; skip it when HW encode is active.
Acceptance criteria
- […] renderD*-scoped rule), not world-0666 on all DRM.
- tcp.yml and mtu.yml only run when ansible_virtualization_role == 'guest'; physical machines get no RDP networking sysctls.
- Nice=-10 priority applied only on the software-encode/none path.
Scope: rdp role only.