QEMU Passes RTX 5090 Through macOS Host
Scott J. Goldman’s project adapts QEMU on macOS to pass a Thunderbolt-connected NVIDIA GPU through to a Linux VM on Apple Silicon, using a custom virtual PCI device and guest-side DMA handling to work around macOS and DART constraints. The post walks through the engineering, including PCI BAR mapping, DMA quirks, NVIDIA driver patching, and mapping coalescing, then backs it up with benchmarks for games like Cyberpunk 2077, Doom, and Crysis plus AI inference tests. The practical takeaway is that the setup is hacky and unstable, but it proves an Apple Silicon Mac can drive CUDA workloads through passthrough and, in some inference cases, outperform a higher-end Mac Studio.
This is a very cool proof-of-concept, but it reads more like “future infrastructure” than a product you can recommend to normal users.
- The core technical trick is a macOS-hosted QEMU fork plus a custom `apple-dma-pci` path to keep GPU DMA working within Apple Silicon limits.
- The AI angle is the most interesting part: CUDA inference on the passed-through RTX 5090 looks materially faster than native Metal on the M4 Air for the workloads tested.
- The gaming benchmarks are useful as validation, but they also underline how much glue code and driver patching this needs.
- Stability is a real limiter: entitlement delays, mapping fragmentation, Steam/FEX issues, and manual GPU reset/replug workflows make it a hobbyist-only setup for now.
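For readers unfamiliar with the term, "mapping coalescing" generally means merging adjacent DMA ranges so that fewer, larger mappings are needed, which is one way to reduce the mapping fragmentation noted above. The sketch below is not taken from the project; it is a generic interval merge, and the `(address, length)` representation and the `coalesce` name are assumptions made purely for illustration.

```python
from typing import List, Tuple

# Hypothetical representation of a DMA mapping: (start_address, length_in_bytes).
# The real project may track mappings very differently.
Range = Tuple[int, int]

def coalesce(ranges: List[Range]) -> List[Range]:
    """Merge ranges that touch or overlap into larger contiguous ranges."""
    merged: List[Range] = []
    for start, length in sorted(ranges):
        if merged:
            prev_start, prev_len = merged[-1]
            prev_end = prev_start + prev_len
            if start <= prev_end:
                # Extend the previous range instead of adding a new one.
                new_end = max(prev_end, start + length)
                merged[-1] = (prev_start, new_end - prev_start)
                continue
        merged.append((start, length))
    return merged

# Three adjacent 4 KiB pages plus one distant page.
pages = [(0x1000, 0x1000), (0x2000, 0x1000), (0x3000, 0x1000), (0x8000, 0x1000)]
print([(hex(s), hex(n)) for s, n in coalesce(pages)])
# prints [('0x1000', '0x3000'), ('0x8000', '0x1000')]
```

Here four page-sized mappings collapse into two, which is the kind of reduction that matters when the number of available mappings is limited.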
Discovered: 2026-05-08
Published: 2026-05-08
Author: scottjgo