mirror of
https://github.com/qemu/qemu.git
synced 2026-02-04 02:24:38 +00:00
Starting with commitcab1398a60, SR-IOV VFs are realized as soon as pcie_sriov_pf_init() is called. Because pcie_sriov_pf_init() must be called before pcie_sriov_pf_init_vf_bar(), the VF BARs types won't be known when the VF realize function calls pcie_sriov_vf_register_bar(). This breaks the memory regions of the VFs (for instance with igbvf): $ lspci ... Region 0: Memory at 281a00000 (64-bit, prefetchable) [virtual] [size=16K] Region 3: Memory at 281a20000 (64-bit, prefetchable) [virtual] [size=16K] $ info mtree ... address-space: pci_bridge_pci_mem 0000000000000000-ffffffffffffffff (prio 0, i/o): pci_bridge_pci 0000000081a00000-0000000081a03fff (prio 1, i/o): igbvf-mmio 0000000081a20000-0000000081a23fff (prio 1, i/o): igbvf-msix and causes MMIO accesses to fail: Invalid write at addr 0x281A01520, size 4, region '(null)', reason: rejected Invalid read at addr 0x281A00C40, size 4, region '(null)', reason: rejected To fix this, VF BARs are now registered with pci_register_bar() which has a type parameter and pcie_sriov_vf_register_bar() is removed. Fixes:cab1398a60("pcie_sriov: Reuse SR-IOV VF device instances") Signed-off-by: Damien Bergamini <damien.bergamini@eviden.com> Signed-off-by: Clement Mathieu--Drif <clement.mathieu--drif@eviden.com> Reviewed-by: Akihiko Odaki <odaki@rsg.ci.i.u-tokyo.ac.jp> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Message-ID: <20250901151314.1038020-1-clement.mathieu--drif@eviden.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
114 lines
3.7 KiB
Plaintext
114 lines
3.7 KiB
Plaintext
PCI SR/IOV EMULATION SUPPORT
|
|
============================
|
|
|
|
Description
|
|
===========
|
|
SR/IOV (Single Root I/O Virtualization) is an optional extended capability
|
|
of a PCI Express device. It allows a single physical function (PF) to appear as multiple
|
|
virtual functions (VFs) for the main purpose of eliminating software
|
|
overhead in I/O from virtual machines.
|
|
|
|
QEMU now implements the basic common functionality to enable an emulated device
|
|
to support SR/IOV.
|
|
|
|
Implementation
|
|
==============
|
|
Implementing emulation of an SR/IOV capable device typically consists of
|
|
implementing support for two types of device classes; the "normal" physical device
|
|
(PF) and the virtual device (VF). From QEMU's perspective, the VFs are just
|
|
like other devices, except that some of their properties are derived from
|
|
the PF.
|
|
|
|
A virtual function is different from a physical function in that the BAR
|
|
space for all VFs are defined by the BAR registers in the PFs SR/IOV
|
|
capability. All VFs have the same BARs and BAR sizes.
|
|
|
|
Accesses to these virtual BARs then is computed as
|
|
|
|
<VF BAR start> + <VF number> * <BAR sz> + <offset>
|
|
|
|
From our emulation perspective this means that there is a separate call for
|
|
setting up a BAR for a VF.
|
|
|
|
1) To enable SR/IOV support in the PF, it must be a PCI Express device so
|
|
you would need to add a PCI Express capability in the normal PCI
|
|
capability list. You might also want to add an ARI (Alternative
|
|
Routing-ID Interpretation) capability to indicate that your device
|
|
supports functions beyond it's "own" function space (0-7),
|
|
which is necessary to support more than 7 functions, or
|
|
if functions extends beyond offset 7 because they are placed at an
|
|
offset > 1 or have stride > 1.
|
|
|
|
...
|
|
#include "hw/pci/pcie.h"
|
|
#include "hw/pci/pcie_sriov.h"
|
|
|
|
pci_your_pf_dev_realize( ... )
|
|
{
|
|
...
|
|
int ret = pcie_endpoint_cap_init(d, 0x70);
|
|
...
|
|
pcie_ari_init(d, 0x100);
|
|
...
|
|
|
|
/* Add and initialize the SR/IOV capability */
|
|
if (!pcie_sriov_pf_init(d, 0x200, "your_virtual_dev",
|
|
vf_devid, initial_vfs, total_vfs,
|
|
fun_offset, stride, errp)) {
|
|
return;
|
|
}
|
|
|
|
/* Set up individual VF BARs (parameters as for normal BARs) */
|
|
pcie_sriov_pf_init_vf_bar( ... )
|
|
...
|
|
}
|
|
|
|
For cleanup, you simply call:
|
|
|
|
pcie_sriov_pf_exit(device);
|
|
|
|
which will delete all the virtual functions and associated resources.
|
|
|
|
2) Similarly in the implementation of the virtual function, you need to
|
|
make it a PCI Express device and add a similar set of capabilities
|
|
except for the SR/IOV capability. Then you need to set up the VF BARs as
|
|
subregions of the PFs SR/IOV VF BARs by calling pci_register_bar():
|
|
|
|
pci_your_vf_dev_realize( ... )
|
|
{
|
|
...
|
|
int ret = pcie_endpoint_cap_init(d, 0x60);
|
|
...
|
|
pcie_ari_init(d, 0x100);
|
|
...
|
|
memory_region_init(mr, ... )
|
|
pci_register_bar(d, bar_nr, bar_type, mr);
|
|
...
|
|
}
|
|
|
|
Testing on Linux guest
|
|
======================
|
|
The easiest is if your device driver supports sysfs based SR/IOV
|
|
enabling. Support for this was added in kernel v.3.8, so not all drivers
|
|
support it yet.
|
|
|
|
To enable 4 VFs for a device at 01:00.0:
|
|
|
|
modprobe yourdriver
|
|
echo 4 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
|
|
|
|
You should now see 4 VFs with lspci.
|
|
To turn SR/IOV off again - the standard requires you to turn it off before you can enable
|
|
another VF count, and the emulation enforces this:
|
|
|
|
echo 0 > /sys/bus/pci/devices/0000:01:00.0/sriov_numvfs
|
|
|
|
Older drivers typically provide a max_vfs module parameter
|
|
to enable it at load time:
|
|
|
|
modprobe yourdriver max_vfs=4
|
|
|
|
To disable the VFs again then, you simply have to unload the driver:
|
|
|
|
rmmod yourdriver
|