[Bug 1892132] Re: Failure to get the correct UpLink Representor
Frode Nordahl
1892132 at bugs.launchpad.net
Thu Aug 26 16:30:25 UTC 2021
Proposed libvirt package on Focal system with original unmodified kernel and driver:
$ uname -a
Linux node-laveran 5.4.0-81-generic #91-Ubuntu SMP Thu Jul 15 19:09:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ cat /sys/class/net/enp129s0f0/device/driver/module/version
5.0-0
$ lspci -nnvv | grep Mellanox
03:00.0 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
Subsystem: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:0061]
03:00.1 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:1017]
Subsystem: Mellanox Technologies MT27800 Family [ConnectX-5] [15b3:0061]
03:00.2 Ethernet controller [0200]: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] [15b3:1018]
Subsystem: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function] [15b3:0061]
...
# Note that in addition to libvirt from -proposed the system has a test
# package for the in-flight os-vif changes installed.
$ dpkg -l |grep libvirt
ii libvirt-clients 6.0.0-0ubuntu8.13 amd64 Programs for the libvirt library
ii libvirt-daemon 6.0.0-0ubuntu8.13 amd64 Virtualization daemon
ii libvirt-daemon-driver-qemu 6.0.0-0ubuntu8.13 amd64 Virtualization daemon QEMU connection driver
ii libvirt-daemon-driver-storage-rbd 6.0.0-0ubuntu8.13 amd64 Virtualization daemon RBD storage driver
ii libvirt-daemon-system 6.0.0-0ubuntu8.13 amd64 Libvirt daemon configuration files
ii libvirt-daemon-system-systemd 6.0.0-0ubuntu8.13 amd64 Libvirt daemon configuration files (systemd)
ii libvirt0:amd64 6.0.0-0ubuntu8.13 amd64 library for interfacing with different virtualization systems
$ sudo grep -A6 hostdev /etc/libvirt/qemu/instance-00000001.xml
<interface type='hostdev' managed='yes'>
<mac address='fa:16:3e:1a:59:22'/>
<source>
<address type='pci' domain='0x0000' bus='0x03' slot='0x0b' function='0x4'/>
</source>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
$ openstack server list --long
...
| c575e200-74cf-45bd-801e-712d3405f460 | fnord-node-laveran-1 | ACTIVE | None | Running | network=10.42.2.217 | ubuntu | d9aa89af-9ad7-4770-83ba-194f03fec7dc | m1.large | 96afbb8a-697f-4de8-aa76-b8604bc01180 | nova | node-laveran.maas | |
...
$ ssh -i id_rsa fnord-node-laveran-1 lspci
...
00:03.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
...
<install OFED drivers, reboot and restart instances>
$ uname -a
Linux node-laveran 5.4.0-81-generic #91-Ubuntu SMP Thu Jul 15 19:09:17 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ cat /sys/class/net/enp129s0f0/device/driver/module/version
5.4-1.0.3
$ ssh -i id_rsa fnord-node-laveran-1 lspci
...
00:03.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
...
** Tags removed: verification-needed-focal
** Tags added: verification-done-focal
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to python-os-vif in Ubuntu.
https://bugs.launchpad.net/bugs/1892132
Title:
Failure to get the correct UpLink Representor
Status in os-vif:
Fix Released
Status in os-vif victoria series:
Fix Committed
Status in libvirt package in Ubuntu:
Fix Released
Status in python-os-vif package in Ubuntu:
Fix Released
Status in libvirt source package in Focal:
Fix Committed
Status in python-os-vif source package in Focal:
In Progress
Status in libvirt source package in Groovy:
Won't Fix
Status in python-os-vif source package in Groovy:
Won't Fix
Status in libvirt source package in Hirsute:
Fix Committed
Status in python-os-vif source package in Hirsute:
Fix Released
Status in libvirt source package in Impish:
Fix Released
Status in python-os-vif source package in Impish:
Fix Released
Bug description:
[Impact]
An update to the mlx5_core driver [1], which will reach users of stable releases through both HWE kernels and the DKMS packages provided by NVIDIA/Mellanox [2], exposes assumptions that OS-VIF and Libvirt make about the sysfs layout.
To allow users with this hardware to keep using their existing
systems with the most recent drivers, updates to os-vif and libvirt
are required.
Without this update these systems will stop functioning when upgrading
to the new mlx5_core driver.
[Test Plan]
Note: Hardware making use of the mlx5_core driver with support for HWOL is required to test these changes.
1. Deploy OpenStack on machines with HWOL enabled using a kernel without [1]
2. Create an instance using an HWOL port
3. Confirm the instance can start and that it has connectivity
4. Upgrade to a kernel with [1] and re-confirm
[Regression Potential]
For OS-VIF the changes are made to code paths used exclusively by consumers of this type of hardware with HWOL enabled. They are also made in a backward-compatible way, so that they work with both the old and the new driver.
For Libvirt the change is made in such a way that it will behave as
before when used to look up hardware that populates net/phys_port_id.
When used with hardware that does not populate net/phys_port_id but
uses net/phys_port_name instead, which is typical for the hardware in
question, the new behavior is used.
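The selection logic described above can be sketched roughly as follows. This is an illustration only, not the Libvirt or OS-VIF code: the Netdev tuple, the pick_netdev function and the VF_REP_RE pattern are hypothetical names, and the assumption that VF representors report a phys_port_name such as "pf0vf1" is not stated in this report.

```python
import re
from typing import NamedTuple, Optional, Sequence

# Assumption: VF representors report a phys_port_name such as "pf0vf1".
VF_REP_RE = re.compile(r"^(pf\d+)?vf\d+$")


class Netdev(NamedTuple):
    """Hypothetical view of one entry below <pci-device>/net/."""
    name: str
    phys_port_id: Optional[str]    # None when the attribute is not populated
    phys_port_name: Optional[str]  # None when the attribute is not populated


def pick_netdev(candidates: Sequence[Netdev],
                wanted_port_id: Optional[str] = None) -> Optional[str]:
    """Return the name of the netdev to use, or None if nothing matches."""
    for dev in candidates:
        if wanted_port_id is not None:
            # Previous behavior: match on phys_port_id when one is given.
            if dev.phys_port_id == wanted_port_id:
                return dev.name
            continue
        # No phys_port_id to compare against: skip anything whose
        # phys_port_name looks like a VF representor.
        if dev.phys_port_name and VF_REP_RE.match(dev.phys_port_name):
            continue
        return dev.name
    return None
```

With the new sysfs layout the PF is no longer guaranteed to be the only entry in the directory, which is why the sketch filters entries instead of returning the first one it finds.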
[Original Bug Description]
Due to a new kernel patch [1], the PF and VF representors are now linked to their parent PCI device.
Old Structure:
The physfn/net directory below the VF's PCI address contains only the PF of that VF
$ ls /sys/bus/pci/devices/<vf-pci-address>/physfn/net/
enp2s0f0
$ ls -l /sys/class/net
...
lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_0 -> ../../devices/virtual/net/enp2s0f0_0
lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_1 -> ../../devices/virtual/net/enp2s0f0_1
lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_2 -> ../../devices/virtual/net/enp2s0f0_2
lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_3 -> ../../devices/virtual/net/enp2s0f0_3
...
New Structure:
The physfn/net directory below the VF's PCI address contains the PF of that VF and the VF representors
$ ls /sys/bus/pci/devices/<vf-pci-address>/physfn/net/
enp3s0f0 enp3s0f0_0 enp3s0f0_1 enp3s0f0_2 enp3s0f0_3
$ ls -l /sys/class/net
...
lrwxrwxrwx. 1 root root 0 Aug 17 08:43 enp3s0f0_0 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_0
lrwxrwxrwx. 1 root root 0 Aug 17 08:43 enp3s0f0_1 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_1
lrwxrwxrwx. 1 root root 0 Aug 17 08:43 enp3s0f0_2 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_2
lrwxrwxrwx. 1 root root 0 Aug 17 08:43 enp3s0f0_3 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_3
...
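As a rough way to tell the two layouts apart on a running system, the symlink targets shown in the listings above can be inspected. The following is a minimal sketch; the function name representor_parent is hypothetical, and the check relies only on the "devices/virtual" path component visible in the old listing.

```python
import os


def representor_parent(class_net_dir: str, ifname: str) -> str:
    """Classify a netdev by where its /sys/class/net symlink points.

    Returns "virtual" when the link target goes through devices/virtual
    (old layout for representors) and "pci" otherwise (new layout).
    """
    target = os.readlink(os.path.join(class_net_dir, ifname))
    # Prefix a slash so the match is on whole path components.
    if "/devices/virtual/" in "/" + target:
        return "virtual"
    return "pci"
```

On a real host the first argument would be /sys/class/net and the second a representor name such as enp3s0f0_0.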
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f1
[2] https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed
To manage notifications about this bug go to:
https://bugs.launchpad.net/os-vif/+bug/1892132/+subscriptions