APPLIED[K]: [SRU][F][J][K][PATCH 0/2] iavf: SR-IOV VFs error with no traffic flow when MTU greater than 1500

Andrea Righi andrea.righi at canonical.com
Wed Oct 12 06:50:19 UTC 2022


On Tue, Oct 04, 2022 at 05:44:34PM +1300, Matthew Ruffell wrote:
> BugLink: https://bugs.launchpad.net/bugs/1983656
> 
> [Impact]
> 
> Virtual Machines with SR-IOV VFs from an Intel E810-XXV [8086:159b] get no 
> traffic flow and produce error messages in both the host and guest during
> network configuration.
> 
> Environment: Ubuntu OpenStack Focal-Ussuri with OVN
> Host Kernel: v5.15.0-41-generic 20.04 Focal-HWE
> Guest Kernels: v5.4.x Focal, v5.15.0-41-generic Jammy
> 
> Host Error Messages:
> ice 0000:98:00.1: VF 7 failed opcode 6, retval: -5
> 
> Guest Error Messages:
> iavf 0000:00:05.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
> 
> In the context of these errors "6" refers to the value of 
> VIRTCHNL_OP_CONFIG_VSI_QUEUES
> 
> It was found in these cases that the VM is able to successfully transmit packets
> but never receives any, and the RX packet drop counter for the VF in "ip link"
> on the host increases in step with the RX packet count.
> 
> There is a prior commit e6ba5273d4ede03d075d7a116b8edad1f6115f4d claiming to
> resolve this error in some cases. It is already included in 5.15.0-41-generic
> and did not resolve the issue.
> 
> The following conditions are required to trigger the bug:
> - A port VLAN must be assigned by the host
> - The MTU must be set >1500 by the guest
> 
> There is no workaround, Intel E810 SR-IOV VFs with MTU >1500 cannot be
> used without these patches.
> 
> [Fix]
> 
> iavf currently sets the maximum packet size to IAVF_MAX_RXBUFFER, but the ice
> PF driver reduces its maximum frame size by VLAN_HLEN to leave room for the
> port VLAN header. iavf makes no such adjustment, so it requests a packet size
> larger than the PF accepts, causing the IAVF_ERR_PARAM error.
> 
> The fix is to change the maximum packet size from IAVF_MAX_RXBUFFER to max_mtu
> received from the PF via GET_VF_RESOURCES msg.
> 
> Also pick up a necessary commit for i40e to announce the correct maximum packet
> size by GET_VF_RESOURCES msg.
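
The size mismatch described under [Fix] can be sketched in a few lines. This is
only an illustrative model, not the driver code: the numeric values (9728 for
IAVF_MAX_RXBUFFER and for the hardware frame limit, 4 for VLAN_HLEN), the
function names, and the clamping in vf_max_pkt_size_fixed() are assumptions made
for the example. It covers only the jumbo-frame case from the report.

```python
# Illustrative model of the PF-side size check; not the actual driver code.
# All constants below are assumptions for this sketch.
VLAN_HLEN = 4              # bytes taken by one 802.1Q VLAN tag
IAVF_MAX_RXBUFFER = 9728   # assumed value of the iavf constant
IAVF_ERR_PARAM = -5        # matches the "-5" in the logged errors


def pf_max_frame_size(hw_max_frame: int, port_vlan: bool) -> int:
    """Largest frame the PF accepts from a VF; a port VLAN reserves tag space."""
    return hw_max_frame - VLAN_HLEN if port_vlan else hw_max_frame


def pf_check_queue_config(max_pkt_size: int, hw_max_frame: int, port_vlan: bool) -> int:
    """Return 0 on success, or IAVF_ERR_PARAM if the VF requests too large a size."""
    if max_pkt_size > pf_max_frame_size(hw_max_frame, port_vlan):
        return IAVF_ERR_PARAM
    return 0


def vf_max_pkt_size_old() -> int:
    """Before the fix: the VF always requests IAVF_MAX_RXBUFFER."""
    return IAVF_MAX_RXBUFFER


def vf_max_pkt_size_fixed(max_mtu_from_pf: int) -> int:
    """After the fix (simplified): use the max_mtu reported via GET_VF_RESOURCES.

    Falling back to IAVF_MAX_RXBUFFER when the PF reports 0 or an oversized
    value is an assumption of this sketch.
    """
    if max_mtu_from_pf <= 0 or max_mtu_from_pf > IAVF_MAX_RXBUFFER:
        return IAVF_MAX_RXBUFFER
    return max_mtu_from_pf
```

With a port VLAN assigned, the old request exceeds the PF limit by VLAN_HLEN
and is rejected, while a request derived from the PF-reported value passes.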
> 
> This has been fixed by the following commits:
> 
> commit 399c98c4dc50b7eb7e9f24da7ffdda6f025676ef
> Author: Michal Jaron <michalx.jaron at intel.com>
> Date:   Tue Sep 13 15:38:35 2022 +0200
> Subject: iavf: Fix set max MTU size with port VLAN and jumbo frames
> Link: https://github.com/torvalds/linux/commit/399c98c4dc50b7eb7e9f24da7ffdda6f025676ef
> 
> commit 372539def2824c43b6afe2403045b140f65c5acc
> Author: Michal Jaron <michalx.jaron at intel.com>
> Date:   Tue Sep 13 15:38:36 2022 +0200
> Subject: i40e: Fix VF set max MTU size
> Link: https://github.com/torvalds/linux/commit/372539def2824c43b6afe2403045b140f65c5acc
> 
> A test kernel is available in the following ppa:
> 
> https://launchpad.net/~arif-ali/+archive/ubuntu/sf00343742
> 
> If you install the test kernel to a compute host and VM, when you attach a 
> VF and set the MTU to 9000, it succeeds, and traffic can flow.
> 
> [Test Plan]
> 
> Create a Focal VM and assign an Intel E810 (ice) SR-IOV VF with a port VLAN.
> 
> OpenStack works, as does creating a VM directly with uvtool/libvirt:
> 
> $ uvt-kvm create focal-test release=focal
> 
> The document linked below covers SR-IOV basics:
> 
> https://www.intel.com/content/www/us/en/developer/articles/technical/configure-sr-iov-network-virtual-functions-in-linux-kvm.html
> 
> The following command shows the bus info for all the network devices:
> 
> $ lshw -c network -businfo
> 
> Choose one, as shown below
> 
> pci at 0000:17:01.4  ens2f0v4     network        Ethernet Adaptive Virtual Function
> 
> We can then add the following into the XML definition via "virsh edit focal-test"
> 
> <interface type='hostdev' managed='yes'>
>       <source>
>         <address type='pci' domain='0x0000' bus='0x17' slot='0x01' function='0x4'/>
>       </source>
>      <vlan>
>         <tag id='998'/>
>       </vlan>
> </interface>
> 
> Then we stop and start the VM via "virsh shutdown focal-test" and then 
> "virsh start focal-test". We can then login to the VM using the command below
> 
> $ uvt-kvm ssh focal-test
> 
> Once you have logged in, run the following ip commands
> 
> $ sudo ip a a 192.168.1.7/24 dev enp7s0
> $ sudo ip link set up dev enp7s0
> $ sudo ip link set mtu 9000 dev enp7s0
> 
> Now check dmesg, and we will find the error
> 
> [   61.529605] iavf 0000:07:00.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
> 
> Setting the IP and bringing the link up
> 
> [   36.228877] iavf 0000:07:00.0 enp7s0: NIC Link is Up Speed is 25 Gbps Full Duplex
> [   36.228887] IPv6: ADDRCONF(NETDEV_CHANGE): enp7s0: link becomes ready
> [   45.740100] crng init done
> [   45.740102] random: 7 urandom warning(s) missed due to ratelimiting
> 
> Then setting the MTU
> 
> [   61.433706] iavf 0000:07:00.0: Received 16 queues, but can only have a max of 4
> [   61.433707] iavf 0000:07:00.0: Fixing by reducing queues to 4
> [   61.529605] iavf 0000:07:00.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6
> [   61.552890] iavf 0000:07:00.0 enp7s0: NIC Link is Up Speed is 25 Gbps Full Duplex
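
When running the test plan on several guests, the dmesg check above can be
automated. The following is a minimal sketch that scans captured dmesg text for
the iavf error line quoted in this report; the regular expression is derived
only from that sample line, and the function name find_iavf_errors is
hypothetical.

```python
import re

# Pattern built from the guest-side message shown in the report, e.g.
# "iavf 0000:07:00.0: PF returned error -5 (IAVF_ERR_PARAM) to our request 6"
IAVF_ERR_RE = re.compile(
    r"iavf (?P<pci>[0-9a-f:.]+): PF returned error (?P<err>-?\d+) "
    r"\((?P<name>\w+)\) to our request (?P<opcode>\d+)"
)


def find_iavf_errors(dmesg_text: str) -> list:
    """Return one dict per iavf 'PF returned error' line found in dmesg_text."""
    return [
        {
            "pci": m["pci"],
            "err": int(m["err"]),
            "name": m["name"],
            "opcode": int(m["opcode"]),
        }
        for m in IAVF_ERR_RE.finditer(dmesg_text)
    ]
```

Feeding it the output of "dmesg" from the guest should yield an entry with
opcode 6 on an affected kernel and an empty list once the MTU change succeeds.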
> 
> There is a test kernel available in the following ppa:
> 
> https://launchpad.net/~arif-ali/+archive/ubuntu/sf00343742
> 
> If you install the test kernel, setting the MTU to 9000 works as expected and
> traffic can flow.
> 
> [Where problems could occur]
> 
> We are changing how maximum MTU is calculated and applied to VFs in the iavf and
> i40e drivers. Currently, any MTU over 1500 does not work at all when a port
> VLAN is enabled, but if someone has somehow got their setup to work, they may
> see a difference in MTU with these patches applied.
> 
> The iavf and i40e drivers are popular, and if a regression were to
> occur, initialisation and bringup of these network devices and VFs might fail.
> 
> Most users currently using MTUs of 1500 are unlikely to see any difference or
> be at risk of regression.
> 
> [Other Info]
> 
> Both patches were developed by Intel, have been accepted into v6.0-rc7, and
> are already released in upstream stable v5.4.215, v5.15.71 and v5.19.12. These
> patches are well tested by the community and considered safe.

Applied to kinetic/linux.

Thanks,
-Andrea


