[Bug 1892132] Re: Failure to get the correct UpLink Representor
Frode Nordahl
1892132 at bugs.launchpad.net
Fri Sep 10 13:37:53 UTC 2021
os-vif focal/victoria
$ uname -a
Linux node-laveran 5.4.0-84-generic #94-Ubuntu SMP Thu Aug 26 20:27:37 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ dpkg -l | grep os-vif
ii python3-os-vif 2.2.0-0ubuntu1~cloud1 all Integration library between network and compute - Python 3.x
$ sudo grep -A6 hostdev /etc/libvirt/qemu/instance-00000004.xml
<interface type='hostdev' managed='yes'>
<mac address='fa:16:3e:a2:74:a8'/>
<source>
<address type='pci' domain='0x0000' bus='0x03' slot='0x0b' function='0x5'/>
</source>
<address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/>
</interface>
$ sudo ovs-vsctl find interface name=eth59
_uuid : 27596589-93b4-419c-ac0d-77b848116334
admin_state : up
bfd : {}
bfd_status : {}
cfm_fault : []
cfm_fault_status : []
cfm_flap_count : []
cfm_health : []
cfm_mpid : []
cfm_remote_mpids : []
cfm_remote_opstate : []
duplex : []
error : []
external_ids : {attached-mac="fa:16:3e:a2:74:a8", iface-id="bb8c0af4-6b0a-4ec0-87ee-2cb30eb76445", iface-status=active, vm-uuid="27787de4-e1f7-45b0-b32f-d49a81a0675f"}
ifindex : 135
ingress_policing_burst: 0
ingress_policing_rate: 0
lacp_current : []
link_resets : 0
link_speed : []
link_state : up
lldp : {}
mac : []
mac_in_use : "0e:e7:e4:a9:5c:d5"
mtu : 8942
mtu_request : []
name : eth59
ofport : 3
ofport_request : []
options : {}
other_config : {}
statistics : {collisions=0, rx_bytes=36121, rx_crc_err=0, rx_dropped=0, rx_errors=0, rx_frame_err=0, rx_missed_errors=0, rx_over_err=0, rx_packets=270, tx_bytes=49194, tx_dropped=0, tx_errors=0, tx_packets=175}
status : {driver_name=mlx5e_rep, driver_version="5.4.0-84-generic", firmware_version="16.31.1014 (MT_0000000183)"}
type : ""
$ sudo ls -l /sys/class/net/eth59/device
ls: cannot access '/sys/class/net/eth59/device': No such file or directory
$ sudo ls -l /sys/class/net/enp129s0f0/device/net/
total 0
drwxr-xr-x 6 root root 0 Sep 10 09:37 enp129s0f0
$ ssh fnord-node-laveran-1 lspci
00:03.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
$ sudo apt install --install-recommends linux-generic-hwe-20.04-edge-wip
Reading package lists... Done
Building dependency tree
Reading state information... Done
The following additional packages will be installed:
linux-headers-5.13.0-14-generic linux-headers-generic-hwe-20.04-edge-wip
linux-hwe-5.13-headers-5.13.0-14 linux-image-5.13.0-14-generic
linux-image-generic-hwe-20.04-edge-wip linux-modules-5.13.0-14-generic
linux-modules-extra-5.13.0-14-generic
Suggested packages:
fdutils linux-doc | linux-hwe-5.13-source-5.13.0 linux-hwe-5.13-tools
The following NEW packages will be installed:
linux-generic-hwe-20.04-edge-wip linux-headers-5.13.0-14-generic
linux-headers-generic-hwe-20.04-edge-wip linux-hwe-5.13-headers-5.13.0-14
linux-image-5.13.0-14-generic linux-image-generic-hwe-20.04-edge-wip
linux-modules-5.13.0-14-generic linux-modules-extra-5.13.0-14-generic
0 upgraded, 8 newly installed, 0 to remove and 64 not upgraded.
Need to get 81.1 MB of archives.
After this operation, 411 MB of additional disk space will be used.
Do you want to continue? [Y/n]
...
$ sudo shutdown -r now
...
$ uname -a
Linux node-laveran 5.13.0-14-generic #14~20.04.4-Ubuntu SMP Wed Aug 25 11:02:57 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
$ sudo ls -l /sys/class/net/eth59/device
lrwxrwxrwx 1 root root 0 Sep 10 13:32 /sys/class/net/eth59/device -> ../../../0000:03:00.1
$ sudo ls -l /sys/class/net/enp129s0f0/device/net/
total 0
drwxr-xr-x 6 root root 0 Sep 10 13:30 enp129s0f0
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth0
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth1
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth10
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth11
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth12
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth13
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth14
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth15
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth16
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth17
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth18
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth19
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth2
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth20
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth21
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth22
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth23
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth24
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth25
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth26
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth27
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth28
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth29
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth3
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth30
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth31
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth4
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth5
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth6
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth7
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth8
drwxr-xr-x 5 root root 0 Sep 10 13:31 eth9
$ ssh fnord-node-laveran-1 lspci
...
00:03.0 Ethernet controller: Mellanox Technologies MT27800 Family [ConnectX-5 Virtual Function]
...
** Tags removed: verification-victoria-needed
** Tags added: verification-victoria-done
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1892132
Title:
Failure to get the correct UpLink Representor
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive ussuri series:
Fix Committed
Status in Ubuntu Cloud Archive victoria series:
Fix Committed
Status in os-vif:
Fix Released
Status in os-vif victoria series:
Fix Committed
Status in libvirt package in Ubuntu:
Fix Released
Status in python-os-vif package in Ubuntu:
Fix Released
Status in libvirt source package in Focal:
Fix Released
Status in python-os-vif source package in Focal:
Fix Committed
Status in libvirt source package in Groovy:
Won't Fix
Status in python-os-vif source package in Groovy:
Won't Fix
Status in libvirt source package in Hirsute:
Fix Released
Status in python-os-vif source package in Hirsute:
Fix Released
Status in libvirt source package in Impish:
Fix Released
Status in python-os-vif source package in Impish:
Fix Released
Bug description:
[Impact]
An update to the mlx5_core driver [1], which will reach users of stable releases through both HWE kernels and the DKMS packages provided by NVIDIA/Mellanox [2], exposes assumptions that os-vif and libvirt make about the sysfs layout.
To let users of this hardware keep their existing systems working
with the most recent drivers, updates to os-vif and libvirt are
required.
Without this update these systems will stop functioning when upgrading
to the new mlx5_core driver.
[Test Plan]
Note: Hardware making use of the mlx5_core driver with support for HWOL is required to test these changes.
1. Deploy OpenStack on machines with HWOL enabled, using a kernel without [1]
2. Create an instance using an HWOL port
3. Confirm the instance can start and that it has connectivity
4. Upgrade to a kernel with [1] and re-confirm
[Regression Potential]
For os-vif, the changes are made to code paths used exclusively by consumers of this type of hardware with HWOL enabled. They are also made in a backward-compatible way, so the code works with both the old and the new driver.
For libvirt, the change is made in such a way that it will behave as
before when used to look up hardware that populates net/phys_port_id.
When used with hardware that does not populate net/phys_port_id but
uses net/phys_port_name instead, which is typical for the hardware in
question, the new behavior is used.
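The lookup order described above can be sketched roughly as follows. This is an illustration only, not libvirt's actual code (libvirt is written in C): the helper names pick_pf_netdev and read_attr are hypothetical, and the pattern used to recognise an uplink port name ("p0", "p1", ...) is an assumption about what the driver reports.

```python
import os
import re

# Assumption: an uplink/PF port reports a phys_port_name such as "p0",
# while VF representors report names such as "pf0vf3".
UPLINK_RE = re.compile(r"^p\d+$")


def read_attr(path):
    """Return a sysfs attribute's stripped contents, or None if empty/unreadable."""
    try:
        with open(path) as f:
            return f.read().strip() or None
    except OSError:
        return None


def pick_pf_netdev(net_dir, wanted_port_id=None):
    """Pick the PF netdev from a PCI device's net/ directory (hypothetical helper)."""
    for name in sorted(os.listdir(net_dir)):
        dev = os.path.join(net_dir, name)
        port_id = read_attr(os.path.join(dev, "phys_port_id"))
        if port_id is not None:
            # Hardware populates phys_port_id: keep the previous behaviour
            # and match on it (or take the first entry if no ID was asked for).
            if wanted_port_id in (None, port_id):
                return name
            continue
        # No phys_port_id: fall back to phys_port_name. Accept devices with
        # no port name at all or with an uplink-style name; skip everything
        # else, which includes VF representors.
        port_name = read_attr(os.path.join(dev, "phys_port_name"))
        if port_name is None or UPLINK_RE.match(port_name):
            return name
    return None
```

On a host this would be called with a path such as /sys/bus/pci/devices/<pci-address>/net; taking the directory as a parameter also lets the logic be exercised against a temporary directory tree.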
[Original Bug Description]
Due to a new kernel patch [1], the PF and VF representors are now linked to their parent PCI device.
Old Structure:
The physfn/net directory under the VF's PCI address contains only the PF of that VF
$ ls /sys/bus/pci/devices/<vf-pci-address>/physfn/net/
enp2s0f0
$ ls -l /sys/class/net
...
lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_0 -> ../../devices/virtual/net/enp2s0f0_0
lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_1 -> ../../devices/virtual/net/enp2s0f0_1
lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_2 -> ../../devices/virtual/net/enp2s0f0_2
lrwxrwxrwx 1 root root 0 Aug 17 11:11 enp2s0f0_3 -> ../../devices/virtual/net/enp2s0f0_3
...
New Structure:
The physfn/net directory under the VF's PCI address contains the PF of that VF and the VF representors
$ ls /sys/bus/pci/devices/<vf-pci-address>/physfn/net/
enp3s0f0 enp3s0f0_0 enp3s0f0_1 enp3s0f0_2 enp3s0f0_3
$ ls -l /sys/class/net
...
lrwxrwxrwx. 1 root root 0 Aug 17 08:43 enp3s0f0_0 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_0
lrwxrwxrwx. 1 root root 0 Aug 17 08:43 enp3s0f0_1 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_1
lrwxrwxrwx. 1 root root 0 Aug 17 08:43 enp3s0f0_2 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_2
lrwxrwxrwx. 1 root root 0 Aug 17 08:43 enp3s0f0_3 -> ../../devices/pci0000:00/0000:00:02.0/0000:03:00.0/net/enp3s0f0_3
...
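Because the new layout lists representors alongside the PF, code that assumed a single entry under physfn/net has to tell the two apart. A minimal sketch of one way to do that is shown here. It is not the os-vif implementation: the names split_netdevs and read_attr are hypothetical, and it assumes VF representors report a phys_port_name such as "pf0vf3" (or "vf3") while the PF does not.

```python
import os
import re

# Assumption: VF representors report a phys_port_name such as "pf0vf3"
# or "vf3"; any other entry under net/ is treated as a PF candidate.
VF_REP_RE = re.compile(r"^(?:pf\d+)?vf(\d+)$")


def read_attr(path):
    """Return a sysfs attribute's stripped contents, or None if unreadable."""
    try:
        with open(path) as f:
            return f.read().strip()
    except OSError:
        return None


def split_netdevs(net_dir):
    """Split the entries of a PCI device's net/ directory (hypothetical helper).

    Returns (pf_candidates, vf_representors), where vf_representors maps
    a VF number to the name of its representor netdev.
    """
    pf_candidates, vf_reps = [], {}
    for name in sorted(os.listdir(net_dir)):
        port_name = read_attr(os.path.join(net_dir, name, "phys_port_name"))
        match = VF_REP_RE.match(port_name) if port_name else None
        if match:
            vf_reps[int(match.group(1))] = name
        else:
            pf_candidates.append(name)
    return pf_candidates, vf_reps
```

With the old layout the directory holds only the PF, so the representor map comes back empty and the single PF candidate is returned, which keeps the sketch usable with both structures.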
[1] https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=123f0f53dd64b67e34142485fe866a8a581f12f1
[2] https://www.mellanox.com/products/infiniband-drivers/linux/mlnx_ofed
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1892132/+subscriptions