PCI Passthrough for Virtual Machines

PCI passthrough allows a KVM virtual machine to have direct, exclusive access to a physical PCIe device (such as a GPU, NIC, or NVMe controller), bypassing the hypervisor for near-native performance. This guide covers enabling IOMMU, binding devices to VFIO drivers, configuring GPU passthrough in KVM, and tuning for performance.

Prerequisites

  • CPU and chipset with IOMMU support (Intel VT-d or AMD-Vi)
  • BIOS/UEFI with IOMMU/VT-d/AMD-Vi enabled
  • Ubuntu 20.04/22.04 or CentOS/Rocky Linux 8+
  • KVM/QEMU/libvirt installed
  • The PCIe device you want to pass through (e.g., GPU, NIC)
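Before touching any firmware settings, it is worth confirming that the CPU advertises hardware virtualization at all. A minimal sketch (the `check_virt_flags` helper name and its optional file argument are illustrative, not a standard tool):

```shell
#!/usr/bin/env bash
# Check CPU flags for hardware virtualization support.
# Reads /proc/cpuinfo by default; an alternate file can be passed for testing.
check_virt_flags() {
  local src="${1:-/proc/cpuinfo}"
  if grep -qw vmx "$src"; then
    echo "Intel VT-x present (enable VT-d in firmware for IOMMU)"
  elif grep -qw svm "$src"; then
    echo "AMD-V present (enable AMD-Vi/IOMMU in firmware)"
  else
    echo "No virtualization flags found"
  fi
}
check_virt_flags
```

Note that `vmx`/`svm` only confirm CPU virtualization (VT-x/AMD-V); the IOMMU itself (VT-d/AMD-Vi) is a separate firmware option verified later with dmesg.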

Enable IOMMU in BIOS and Kernel

Step 1: Enable in BIOS/UEFI

  • Intel systems: Enable "VT-d" (Intel Virtualization Technology for Directed I/O)
  • AMD systems: Enable "AMD-Vi" or "IOMMU" in the CPU configuration section

Step 2: Enable in kernel via GRUB:

# Edit GRUB configuration
sudo nano /etc/default/grub

# For Intel systems, add intel_iommu=on:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash intel_iommu=on iommu=pt"

# For AMD systems:
GRUB_CMDLINE_LINUX_DEFAULT="quiet splash amd_iommu=on iommu=pt"

# iommu=pt (passthrough mode) reduces overhead for non-passthrough devices

# Apply the GRUB change
sudo update-grub   # Ubuntu/Debian
sudo grub2-mkconfig -o /boot/grub2/grub.cfg  # CentOS/Rocky

# Reboot
sudo reboot

Step 3: Verify IOMMU is active:

# Check IOMMU is enabled in kernel
sudo dmesg | grep -E "IOMMU|iommu"
# Should show: "DMAR: IOMMU enabled" (Intel) or "AMD-Vi: AMD IOMMUv2 loaded"

# Verify IOMMU groups exist
ls /sys/kernel/iommu_groups/ | wc -l
# Non-zero output means IOMMU is working

Identify IOMMU Groups

Devices in the same IOMMU group must all be passed through together:

# Script to list all IOMMU groups with their devices
for d in /sys/kernel/iommu_groups/*/devices/*; do
  n=$(basename "$(dirname "$(dirname "$d")")")
  printf 'IOMMU Group %s:\t%s\n' "$n" "$(lspci -nns "${d##*/}")"
done | sort -V

# Output example:
# IOMMU Group 1:  00:01.0 PCI bridge [0604]: Intel Corporation Xeon E3... 
# IOMMU Group 14: 03:00.0 VGA compatible controller [0300]: NVIDIA GeForce RTX 3080
# IOMMU Group 14: 03:00.1 Audio device [0403]: NVIDIA Corporation GA102 HD Audio

Key point: If a GPU is in the same IOMMU group as other devices (like a PCIe bridge), you may need the ACS override patch or a different PCIe slot.

# Find the PCI address of your target device
lspci | grep -E "VGA|NVIDIA|AMD|Radeon|Network|NVMe"

# Get vendor:device IDs (needed for VFIO binding)
lspci -n -s 03:00.0   # Shows: 03:00.0 0300: 10de:2206 (rev a1)
# 10de:2206 = NVIDIA (10de) + RTX 3080 device ID (2206)
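Extracting the vendor:device field can be scripted; here it is shown against the sample `lspci -n` line above so the parsing is reproducible:

```shell
# Pull the vendor:device ID (third field) out of an `lspci -n -s <addr>` line.
# Sample line taken from the output shown above.
line="03:00.0 0300: 10de:2206 (rev a1)"
id=$(awk '{print $3}' <<<"$line")
echo "$id"   # -> 10de:2206
```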

Bind Devices to VFIO

The VFIO driver replaces the device's normal driver, making it available for passthrough:

# Load VFIO modules
sudo modprobe vfio
sudo modprobe vfio_pci
sudo modprobe vfio_iommu_type1

# Make VFIO modules load on boot
echo -e "vfio\nvfio_pci\nvfio_iommu_type1" | \
  sudo tee /etc/modules-load.d/vfio.conf

# Bind a specific device to VFIO using its vendor:device ID
# Method 1: via /etc/modprobe.d (persistent, applies at boot)
# Get the IDs: lspci -n -s 03:00.0 -> 10de:2206 and 10de:1aef (audio)
echo "options vfio-pci ids=10de:2206,10de:1aef" | \
  sudo tee /etc/modprobe.d/vfio.conf

# Ensure vfio-pci claims the GPU before the host GPU driver can
echo "softdep nouveau pre: vfio-pci" | \
  sudo tee -a /etc/modprobe.d/vfio.conf

# Regenerate initramfs to include VFIO modules early
sudo update-initramfs -u     # Ubuntu/Debian
sudo dracut -f               # CentOS/Rocky Linux

sudo reboot

# After reboot, verify the device is bound to VFIO
lspci -k -s 03:00.0
# Kernel driver in use: vfio-pci  <-- This is what you want
# NOT: nvidia or nouveau

# Method 2: Bind at runtime (temporary, no reboot needed)
# Replace current driver with VFIO
echo "10de 2206" | sudo tee /sys/bus/pci/drivers/vfio-pci/new_id
echo "0000:03:00.0" | sudo tee /sys/bus/pci/devices/0000:03:00.0/driver/unbind
echo "0000:03:00.0" | sudo tee /sys/bus/pci/drivers/vfio-pci/bind
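An alternative runtime approach uses the kernel's per-device `driver_override` attribute, which targets a single PCI address instead of every device sharing the same vendor:device ID. A sketch (the `bind_vfio` and `run` helper names are illustrative; `DRY_RUN=1`, the default here, previews the sysfs writes instead of performing them, which would require root):

```shell
#!/usr/bin/env bash
# Sketch: rebind one device to vfio-pci at runtime via driver_override.
# DRY_RUN=1 (the default) prints the writes; set DRY_RUN=0 and run as root
# to actually apply them.
run() { if [ "${DRY_RUN:-1}" = 1 ]; then echo "+ $*"; else sh -c "$*"; fi; }

bind_vfio() {
  local dev="$1"   # e.g. 0000:03:00.0
  # Tell the kernel which driver may claim this specific device
  run "echo vfio-pci > /sys/bus/pci/devices/$dev/driver_override"
  # Unbind the current driver, if one is attached
  if [ -e "/sys/bus/pci/devices/$dev/driver" ]; then
    run "echo $dev > /sys/bus/pci/devices/$dev/driver/unbind"
  fi
  # Re-probe: driver_override makes vfio-pci claim the device
  run "echo $dev > /sys/bus/pci/drivers_probe"
}

bind_vfio 0000:03:00.0   # prints the writes unless DRY_RUN=0
```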

Configure KVM Virtual Machine for PCI Passthrough

# Install KVM and virt-manager if not already installed
sudo apt install -y qemu-kvm libvirt-daemon-system virtinst virt-manager   # Ubuntu/Debian
sudo dnf install -y qemu-kvm libvirt virt-install virt-manager             # CentOS/Rocky

# Enable and start libvirt
sudo systemctl enable libvirtd
sudo systemctl start libvirtd

# Add your user to the libvirt and kvm groups (log out and back in to take effect)
sudo usermod -aG libvirt,kvm $USER

# Create or edit a VM using virsh
virsh edit myvm

Add the PCI device to the VM XML configuration:

<!-- Add inside <devices> section of your VM's XML -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
  <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/>
</hostdev>

<!-- Also pass through the GPU's HDMI audio device -->
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x1'/>
  </source>
</hostdev>

# Add PCI device using virsh command line (alternative to XML edit)
virsh attach-device myvm \
  --file /tmp/gpu-passthrough.xml \
  --persistent
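The `--file` argument above expects an XML fragment on disk, which the guide has not yet created. A minimal example matching the GPU at 03:00.0 from the earlier sections (the /tmp path is just the one used above):

```shell
# Write the hostdev fragment that `virsh attach-device --file` expects
cat > /tmp/gpu-passthrough.xml <<'EOF'
<hostdev mode='subsystem' type='pci' managed='yes'>
  <driver name='vfio'/>
  <source>
    <address domain='0x0000' bus='0x03' slot='0x00' function='0x0'/>
  </source>
</hostdev>
EOF
```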

# Start the VM and verify the device is present inside it
virsh start myvm
virsh console myvm
# Inside the VM: lspci | grep NVIDIA

GPU Passthrough Configuration

GPU passthrough requires additional steps for NVIDIA and AMD GPUs:

# NVIDIA: Hide the KVM signature (older GeForce drivers refuse to load when
# they detect a VM; NVIDIA driver 465+ officially supports passthrough and
# no longer requires this)
# Add to VM XML <features> section:
<features>
  <acpi/>
  <apic/>
  <hyperv>
    <vendor_id state='on' value='0123456789ab'/>  <!-- any string up to 12 chars -->
  </hyperv>
  <kvm>
    <hidden state='on'/>
  </kvm>
  <ioapic driver='kvm'/>
</features>

# For the VM to use the GPU properly:
# 1. Remove the virtual display (QXL/VGA) from the VM
# 2. The GPU's physical output will be used instead

# If using Looking Glass (software for sharing GPU output to host):
# https://looking-glass.io/

# Hugepages improve performance for GPU VMs
# Add to /etc/sysctl.conf:
echo "vm.nr_hugepages=4096" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p

# Configure VM to use hugepages in the XML:
# <memoryBacking>
#   <hugepages/>
# </memoryBacking>

Performance Tuning

# CPU pinning: assign specific host CPUs to the VM
# First, identify NUMA topology
lscpu | grep NUMA
numactl --hardware

# In VM XML, pin vCPUs to physical cores:
<vcpu placement='static' cpuset='4-7'>4</vcpu>
<cputune>
  <vcpupin vcpu='0' cpuset='4'/>
  <vcpupin vcpu='1' cpuset='5'/>
  <vcpupin vcpu='2' cpuset='6'/>
  <vcpupin vcpu='3' cpuset='7'/>
  <emulatorpin cpuset='2-3'/>
</cputune>

# Isolate CPUs from the host scheduler (kernel cmdline)
# Add isolcpus=4-7 to GRUB_CMDLINE_LINUX_DEFAULT
# This reserves CPUs 4-7 exclusively for the VM

# Enable hugepages for lower memory latency (skip if already configured above;
# tee -a appends, so running it twice duplicates the line)
grep -q '^vm.nr_hugepages' /etc/sysctl.conf || \
  echo "vm.nr_hugepages=4096" | sudo tee -a /etc/sysctl.conf

# Set CPU governor to performance on every core
# (cpufreq-set acts on CPU 0 only by default)
sudo apt install -y cpufrequtils
for c in $(seq 0 $(( $(nproc) - 1 ))); do
  sudo cpufreq-set -c "$c" -g performance
done

# Verify tuning inside VM
# Install drivers normally — the GPU appears as a native PCIe device
# Benchmark: nvidia-smi shows full GPU performance

Troubleshooting

"VFIO: No available IOMMU models":

# IOMMU is not enabled in the kernel
dmesg | grep -i iommu
# If empty, check BIOS settings and GRUB parameters

grep iommu /proc/cmdline
# Must contain intel_iommu=on or amd_iommu=on

Device still bound to original driver after VFIO config:

# Check if the original driver is loading before vfio-pci
# Add the original driver to the modprobe blacklist
echo "blacklist nouveau" | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
echo "blacklist nvidia" | sudo tee /etc/modprobe.d/blacklist-nvidia.conf

sudo update-initramfs -u
sudo reboot

# Verify after reboot
lspci -k -s 03:00.0 | grep "driver in use"
# Should show: vfio-pci

VM fails to start with "Error starting domain: Unable to reset PCI device":

# The device's IOMMU group has devices still bound to host drivers
# Find all devices in the same IOMMU group
for d in /sys/kernel/iommu_groups/14/devices/*; do lspci -nns ${d##*/}; done

# All devices in the group must be bound to vfio-pci
# Add all vendor:device IDs to /etc/modprobe.d/vfio.conf
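Collecting those IDs into the modprobe option line can be scripted; shown here against captured `lspci -nns` output so the result is reproducible (the `group_to_vfio_ids` helper name is illustrative):

```shell
# Turn `lspci -nns` lines for an IOMMU group into an `options vfio-pci ids=` line.
# The vendor:device pair is the [vvvv:dddd] bracket in each line.
group_to_vfio_ids() {
  grep -o '\[[0-9a-f]\{4\}:[0-9a-f]\{4\}\]' | tr -d '[]' |
    paste -sd, - | sed 's/^/options vfio-pci ids=/'
}

printf '%s\n' \
  "03:00.0 VGA compatible controller [0300]: NVIDIA GA102 [10de:2206] (rev a1)" \
  "03:00.1 Audio device [0403]: NVIDIA GA102 HD Audio [10de:1aef] (rev a1)" |
  group_to_vfio_ids
# -> options vfio-pci ids=10de:2206,10de:1aef
```

On a live system, feed it the real group: `for d in /sys/kernel/iommu_groups/14/devices/*; do lspci -nns ${d##*/}; done | group_to_vfio_ids`.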

GPU not detected inside VM:

# Check libvirt logs for passthrough errors
sudo journalctl -u libvirtd | grep -i "vfio\|pci"

# Verify VFIO permissions
ls -la /dev/vfio/
# Should show your IOMMU group number with libvirt access

Conclusion

PCI passthrough with VFIO enables virtual machines to achieve near-native performance with physical PCIe devices, making it ideal for GPU compute workloads, high-performance networking, and low-latency storage in virtualized environments. The key requirements are proper IOMMU grouping, early VFIO driver binding, and appropriate CPU pinning for performance-sensitive workloads. GPU passthrough in particular opens up use cases like running gaming VMs or CUDA workloads in isolated environments on the same hardware.