MVGAL Development Progress
Version: 0.2.2 | Updated: May 2026 | Completion: ~95%
Phase
Period
Status
Phase 1: Architecture & Research
Jan 2025
✅ Complete
Phase 2: Core Modules
Jan–Mar 2025
✅ Complete
Phase 3: API Interception
Mar–Apr 2025
✅ Complete
Phase 4: Testing & QA
Apr 2025
✅ Complete
Phase 5: Documentation
Apr 2025
✅ Complete
Phase 6: Execution Engine + Rust
Apr–May 2026
✅ Complete
Phase 7: Tools, Steam, UI, OpenGL
May 2026
✅ Complete
Phase 8: Root cleanup + CI
May 2026
✅ Complete
Phase 1: Architecture Research (100%)
Linux DRM/KMS subsystem analysis
AMD amdgpu, NVIDIA open-kernel-modules, Intel i915/xe, Moore Threads mtgpu-drv study
Multi-GPU Vulkan explicit API research (VK_KHR_device_group)
PCIe P2P DMA framework analysis
DMA-BUF mechanism and vendor support matrix
Vulkan layer development patterns
Documented in docs/ARCHITECTURE_RESEARCH.md, docs/research/
Phase 2: Core Modules (100%)
mvgal_core.c — DRM registration, /dev/mvgal0, 10 ioctls
mvgal_device.c — logical device, GPU enumeration, capability profile
mvgal_memory.c — DMA-BUF, unified virtual address space
mvgal_scheduler.c — 16-level priority queue, workload dispatch
mvgal_sync.c — cross-vendor fences, timeline semaphores
Vendor ops: AMD (TTM, DPM), NVIDIA (user-space shim), Intel (i915+xe), MTT
GPU manager: sysfs scan, PCI enumeration, health monitoring
Memory manager: DMA-BUF, P2P, allocator, sync (2,576 LOC)
Scheduler: 7 strategies, load balancer, workload splitter (2,275 LOC)
Core API: init, context, strategy, stats, fences, semaphores
Logging: 22 functions, thread-safe, color, syslog, file, callbacks
Config: INI format, defaults, validation, callbacks
IPC: Unix socket, magic header, 11 message types
Public API Headers
13 headers, 220+ public functions, fully documented
Phase 3: API Interception (100% for Vulkan, OpenCL, CUDA)
Full dispatch-chain layer (vk_layer.c, ~1,205 LOC)
17 intercepted functions across instance, device, queue, physical device
Physical device property caching for telemetry
Atomic submit counter, debug logging, log file support
Registered as implicit layer via JSON manifest
LD_PRELOAD wrapper (cl_intercept.c, ~600 LOC)
Platform + device interception
NDRange kernel partitioning across GPUs
Registered via /etc/OpenCL/vendors/mvgal.icd
LD_PRELOAD wrapper (cuda_wrapper.c, ~1,340 LOC)
40+ CUDA Driver and Runtime API functions
Kernel launch interception (cuLaunchKernel, cudaLaunchKernel)
Cross-GPU copy detection, memory tracking per GPU
6 distribution strategies
mvgal_gl_preload.c — intercepts glXSwapBuffers + eglSwapBuffers
Frame pacing telemetry injection
Works with Zink (Mesa OpenGL→Vulkan)
Suite
Pass/Total
C unit tests
5/5
C integration tests
1/1
Rust unit tests
10/10
Synthetic benchmarks
10/10
Real-world benchmarks
12/12
Stress benchmarks
9/10 (1 cosmetic)
Total
47/48
Phase 5: Documentation (100%)
23 markdown files in docs/
All public API headers documented with Doxygen-style comments
Architecture diagrams, data flow diagrams
Per-vendor driver integration guide
Gaming, Steam, Rust development guides
Phase 6: Execution Engine + Rust (100%)
Execution Engine (src/userspace/execution/)
Frame session lifecycle: begin → submit → present
Migration plan generation (DMA-BUF, P2P, staging, mirror)
Steam/Proton profile generation
Integration with scheduler and memory manager
Rust Safety Crates (safe/)
fence_manager: cross-device fence lifecycle, state machine
memory_safety: allocation tracking, ref counting, DMA-BUF association
capability_model: GPU capability normalization, JSON serialization
Cargo workspace, edition 2021, MSRV 1.75
Full C FFI interfaces, 10 unit tests
8 new API functions in mvgal_gpu.h
Background monitoring thread per GPU
Configurable thresholds (temp, utilization, memory)
Callback system for health alerts
Phase 7: Tools, Steam, UI, OpenGL (100%)
mvgal-info — GPU info, VRAM, temp, utilization, JSON output
mvgal-status — real-time bars, daemon check, --watch mode
mvgal-bench — memory BW, compute FLOPS, latency, sync overhead
mvgal-compat — system check + 15+ app compatibility database
All compile with -Wall -Wextra -Werror on GCC 16
mvgal_frame_pacer.c/h — vsync-aligned frame pacing, ring buffer depth 8
mvgal_steam_compat.sh — Steam compatibility tool entry point
toolmanifest.vdf + compatibilitytool.vdf — Steam registration
mvgal_dashboard.cpp/h — Qt5/Qt6, 4 tabs, per-GPU widgets
mvgal_rest_server.go — Go HTTP server, 5 REST endpoints
mvgal_gl_preload.c — LD_PRELOAD shim for frame pacing
Blender, Unreal Engine, PyTorch/TensorFlow/JAX, FFmpeg guides
Automated test scripts
Phase 8: Root Cleanup + CI (100%)
Moved generated/backup files to .archive/ (gitignored)
Both CI workflows set to workflow_dispatch (manual-only)
build/install.sh — generic installer using pkexec
build/cmake/toolchains/aarch64-linux-gnu.cmake — ARM64 cross-compile
All sudo replaced with pkexec in scripts
Item
Priority
Notes
D3D/Metal/WebGPU intercept logic
Medium
Skeletons exist, need implementation
Full Vulkan ICD (virtual VkPhysicalDevice)
Low
Phase 5 planned feature
Network-distributed GPU pooling
Future
Design accommodated
AI-driven scheduler
Future
Design accommodated
Language
LOC
Files
C (userspace)
~25,700
~28
C (kernel)
~2,000
~9
C++ (daemon)
~3,000
~16
Rust
~748
~6
Go
~372
~1
Shell
~500
~8
Total
~32,320
~68