PCI Passthrough for Virtual Machines
PCI passthrough allows a KVM virtual machine to have direct, exclusive access to a physical PCIe device (such as a GPU, NIC, or NVMe controller), bypassing the hypervisor for near-native performance. This guide covers enabling IOMMU, binding devices to VFIO drivers, configuring GPU passthrough in KVM, and tuning for performance.
Prerequisites
- CPU with Intel VT-d (Intel) or AMD-Vi (AMD) IOMMU support
- BIOS/UEFI with IOMMU/VT-d/AMD-Vi enabled
- Ubuntu 20.04/22.04 or CentOS/Rocky Linux 8+
- KVM/QEMU/libvirt installed
- The PCIe device you want to pass through (e.g., GPU, NIC)
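Before touching firmware settings, a quick sanity check (a suggested snippet, not tied to any particular tool) confirms the CPU advertises hardware virtualization at all. Note this checks VT-x/AMD-V, not the IOMMU itself, which is verified via dmesg later:

```shell
# Count CPU flags for hardware virtualization support.
# vmx = Intel VT-x, svm = AMD-V; 0 means unsupported or disabled in firmware.
grep -Ec 'vmx|svm' /proc/cpuinfo || true
```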
Enable IOMMU in BIOS and Kernel
Step 1: Enable in BIOS/UEFI
- Intel systems: Enable "VT-d" (Intel Virtualization Technology for Directed I/O)
- AMD systems: Enable "AMD-Vi" or "IOMMU" in the CPU configuration section
Step 2: Enable in kernel via GRUB:
# Edit GRUB configuration
sudo nano /etc/default/grub
# For Intel systems, add intel_iommu=on:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt"
# For AMD systems:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=on iommu=pt"
# iommu=pt (passthrough mode) reduces overhead for non-passthrough devices
# Apply the GRUB change
sudo update-grub # Ubuntu/Debian
sudo grub2-mkconfig -o /boot/grub2/grub.cfg # CentOS/Rocky
# Reboot
sudo reboot
Step 3: Verify IOMMU is active:
# Check IOMMU is enabled in kernel
sudo dmesg | grep -E "IOMMU|iommu"
# Should show: "DMAR: IOMMU enabled" (Intel) or "AMD-Vi: AMD IOMMUv2 loaded"
# Verify IOMMU groups exist
ls /sys/kernel/iommu_groups/ | wc -l
# Non-zero output means IOMMU is working
Identify IOMMU Groups
All endpoint devices in an IOMMU group must be passed through together (PCIe bridges in the group may remain on the host):
# Script to list all IOMMU groups with their devices
for d in /sys/kernel/iommu_groups/*/devices/*; do
    n=$(basename "$(dirname "$(dirname "$d")")")
    printf 'IOMMU Group %s:\t%s\n' "$n" "$(lspci -nns "${d##*/}")"
done | sort -V
# Output example:
# IOMMU Group 1: 00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3...
# IOMMU Group 14: 03:00.0 VGA compatible controller [0300]: NVIDIA GeForce RTX 3080
# IOMMU Group 14: 03:00.1 Audio device [0403]: NVIDIA Corporation GA102 HD Audio
Key point: If a GPU shares an IOMMU group with other endpoint devices (beyond its own functions), you must either pass them all through, move the card to a different PCIe slot, or use the ACS override patch (which weakens IOMMU isolation guarantees; use with care).
# Find the PCI address of your target device
lspci | grep -E "VGA|NVIDIA|AMD|Radeon|Network|NVMe"
# Get vendor:device IDs (needed for VFIO binding)
lspci -n -s 03:00.0 # Shows: 03:00.0 0300: 10de:2206 (rev a1)
# 10de:2206 = NVIDIA (10de) + RTX 3080 device ID (2206)
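Both function IDs of a multi-function card can be collected in one pass. A sketch, using a here-doc with the sample output above in place of the real `lspci -n -s 03:00` call:

```shell
# Build the vfio-pci "ids=" option from lspci -n style output.
# Sample data mirrors the example GPU (03:00.0) and its audio function (03:00.1);
# on a real system, replace the here-doc with: lspci -n -s 03:00
sample=$(cat <<'EOF'
03:00.0 0300: 10de:2206 (rev a1)
03:00.1 0403: 10de:1aef (rev a1)
EOF
)
ids=$(printf '%s\n' "$sample" | awk '{print $3}' | paste -sd, -)
echo "options vfio-pci ids=$ids"
# -> options vfio-pci ids=10de:2206,10de:1aef
```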
Bind Devices to VFIO
The VFIO driver replaces the device's normal driver, making it available for passthrough:
# Load VFIO modules
sudo modprobe vfio
sudo modprobe vfio_pci
sudo modprobe vfio_iommu_type1
# Make VFIO modules load on boot
echo -e "vfio\nvfio_pci\nvfio_iommu_type1" | \
sudo tee /etc/modules-load.d/vfio.conf
# Bind a specific device to VFIO using its vendor:device ID
# Method 1: via /etc/modprobe.d (persistent, applies at boot)
# Get the IDs: lspci -n -s 03:00.0 -> 10de:2206 and 10de:1aef (audio)
echo "options vfio-pci ids=10de:2206,10de:1aef" | \
sudo tee /etc/modprobe.d/vfio.conf
# Regenerate initramfs to include VFIO modules early
sudo update-initramfs -u # Ubuntu/Debian
sudo dracut -f # CentOS/Rocky Linux
sudo reboot
# After reboot, verify the device is bound to VFIO
lspci -k -s 03:00.0
# Kernel driver in use: vfio-pci <-- This is what you want
# NOT: nvidia or nouveau
# Method 2: Bind at runtime (temporary, no reboot needed)
# Replace current driver with VFIO
echo "10de 2206" | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
echo "0000:03:00.0" | sudo tee /sys/bus/pci/devices/0000:03:00.0/driver/unbind
echo "0000:03:00.0" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
Configure KVM Virtual Machine for PCI Passthrough
# Install KVM and virt-manager if not already installed
sudo apt install -y qemu-kvm libvirt-daemon-system virtinst virt-manager
# Enable and start libvirt
sudo systemctl enable libvirtd
sudo systemctl start libvirtd
# Add your user to the libvirt and kvm groups (log out and back in for this to take effect)
sudo usermod -aG libvirt,kvm $USER
# Edit an existing VM's configuration (the examples below use a VM named myvm)
virsh edit myvm
Add the PCI device to the VM XML configuration:
<!-- Add inside the <devices> section of your VM's XML -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</hostdev>

<!-- Also pass through the GPU's HDMI audio function -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
  </source>
</hostdev>
# Add PCI device using virsh command line (alternative to XML edit)
virsh attach-device myvm \
--file /tmp/gpu-passthrough.xml \
--persistent
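The `--file` argument expects a standalone device-XML fragment. A minimal sketch of `/tmp/gpu-passthrough.xml` (assuming the example GPU at bus 0x03; adjust the address for your device):

```shell
# Write a minimal hostdev fragment for virsh attach-device
cat > /tmp/gpu-passthrough.xml <<'EOF'
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>
EOF
```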
# Start the VM and verify the device is present inside it
virsh start myvm
virsh console myvm
# Inside the VM: lspci | grep NVIDIA
GPU Passthrough Configuration
GPU passthrough requires additional steps for NVIDIA and AMD GPUs:
# NVIDIA: Hide the KVM signature (needed by GeForce drivers older than 465,
# which refuse to initialize inside a detected VM; newer drivers dropped the check)
# Add to VM XML <features> section:
<features>
  <acpi/>
  <apic/>
  <kvm>
    <hidden state='on'/>
  </kvm>
  <ioapic driver='kvm'/>
</features>
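On GeForce drivers older than 465, hiding KVM alone may not be enough to avoid the driver's Code 43 error; spoofing a Hyper-V vendor ID is a commonly paired workaround (the value is arbitrary, up to 12 characters):

```xml
<!-- Add inside <features>, next to the <kvm> element shown above -->
<hyperv>
  <vendor_id state='on' value='1234567890ab'/>
</hyperv>
```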
# For the VM to use the GPU properly:
# 1. Remove the virtual display (QXL/VGA) from the VM
# 2. The GPU's physical output will be used instead
# Looking Glass can relay the guest GPU's framebuffer into a low-latency window on the host:
# https://looking-glass.io/
# Hugepages improve performance for GPU VMs
# Add to /etc/sysctl.conf:
echo "vm.nr_hugepages=4096" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
# Configure VM to use hugepages in the XML:
# <memoryBacking>
# <hugepages/>
# </memoryBacking>
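The figure 4096 assumes the default 2 MiB hugepage size and an 8 GiB guest; a quick sanity calculation:

```shell
# nr_hugepages = VM memory / hugepage size
# Assumes the default 2 MiB hugepages; check yours with: grep Hugepagesize /proc/meminfo
VM_MEM_MIB=8192        # 8 GiB guest
HUGEPAGE_MIB=2
echo $(( VM_MEM_MIB / HUGEPAGE_MIB ))
# -> 4096
```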
Performance Tuning
# CPU pinning: assign specific host CPUs to the VM
# First, identify NUMA topology
lscpu | grep NUMA
numactl --hardware
# In VM XML, pin vCPUs to physical cores:
<vcpu placement='static' cpuset='4-7'>4</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='4'/>
  <vcpupin vcpu='1' cpuset='5'/>
  <vcpupin vcpu='2' cpuset='6'/>
  <vcpupin vcpu='3' cpuset='7'/>
  <emulatorpin cpuset='2-3'/>
</cputune>
# Isolate CPUs from the host scheduler (kernel cmdline)
# Add isolcpus=4-7 to GRUB_CMDLINE_LINUX_DEFAULT
# This reserves CPUs 4-7 exclusively for the VM
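Putting the kernel parameters together, an example combined GRUB line for an Intel host (nohz_full and rcu_nocbs are optional companions to isolcpus that further reduce kernel noise on the isolated cores):

```shell
# /etc/default/grub -- combined example (Intel; use amd_iommu=on on AMD)
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt isolcpus=4-7 nohz_full=4-7 rcu_nocbs=4-7"
```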
# Set the CPU frequency governor to performance on every core
# (cpufreq-set only changes one CPU per invocation)
sudo apt install -y cpufrequtils
for c in $(seq 0 $(( $(nproc) - 1 ))); do sudo cpufreq-set -c "$c" -g performance; done
# Verify tuning inside VM
# Install drivers normally — the GPU appears as a native PCIe device
# Quick check: nvidia-smi in the guest should list the GPU; run a real workload to confirm near-native performance
Troubleshooting
"VFIO: No available IOMMU models":
# IOMMU is not enabled in the kernel
dmesg | grep -i iommu
# If empty, check BIOS settings and GRUB parameters
grep iommu /proc/cmdline
# Must contain intel_iommu=on or amd_iommu=on
Device still bound to original driver after VFIO config:
# Check if the original driver is loading before vfio-pci
# Add the original driver to the modprobe blacklist
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo "blacklist nvidia" | sudo tee /etc/modprobe.d/blacklist-nvidia.conf
sudo update-initramfs -u
sudo reboot
# Verify after reboot
lspci -k -s 03:00.0 | grep "driver in use"
# Should show: vfio-pci
VM fails to start with "Error starting domain: Unable to reset PCI device":
# The device's IOMMU group has devices still bound to host drivers
# Find all devices in the same IOMMU group
for d in /sys/kernel/iommu_groups/14/devices/*; do lspci -nns ${d##*/}; done
# All devices in the group must be bound to vfio-pci
# Add all vendor:device IDs to /etc/modprobe.d/vfio.conf
GPU not detected inside VM:
# Check libvirt logs for passthrough errors
sudo journalctl -u libvirtd | grep -i "vfio\|pci"
# Verify VFIO permissions
ls -la /dev/vfio/
# Should show a character device named after your IOMMU group (e.g. /dev/vfio/14)
# that is accessible to the user QEMU/libvirt runs as
Conclusion
PCI passthrough with VFIO enables virtual machines to achieve near-native performance with physical PCIe devices, making it ideal for GPU compute workloads, high-performance networking, and low-latency storage in virtualized environments. The key requirements are proper IOMMU grouping, early VFIO driver binding, and appropriate CPU pinning for performance-sensitive workloads. GPU passthrough in particular opens up use cases like running gaming VMs or CUDA workloads in isolated environments on the same hardware.


