[Bug 1720126] Re: [ip link] Message truncated error for large number of passthrough VFs

Thu Oct 19 23:10:08 UTC 2017

On 19.10.2017 [09:35:19 -0000], Jan Gutter wrote:
> @nacc
> 
> Thanks so much for the explanation. I also found
> https://wiki.ubuntu.com/ServerTeam/KnowledgeBase#Merge_Proposals_and_Reviewing
> that details a bit more of the internal processes. As relative outsiders
> to the Ubuntu process, I'd appreciate it very much if you could handle
> that part for Monique's patches. I can be on hand to answer technical
> questions if required.

And to be clear, the MP-based workflow for the Git trees is brand new
and experimental :)

I'm happy to integrate the updated debdiffs (I'll reply to those
comments directly).

> Regarding the buffer size choice, it's very arbitrary as Phil said. I'm
> pretty sure we came to the same conclusion independently (libvirt and
> libnl had very similar issues) and the workaround is obvious. 32k seems
> to work for 64 VF's (our test case), but breaks with 128 VF's. Not a lot
> of machines can handle 128 concurrent VF's. I typed 64k "just because".
> libvirt+libnl allow message peeking. However, iproute2 uses netlink
> directly. So, implementing a similar idea would require an entirely new
> receive codepath with all the fun of finding out where new exception
> paths occur: something to be done on tip and not suitable for backport
> without thorough vetting.

Absolutely. My concern is that upstream is at 32k, as is Artful. I'm
hesitant to backport something different (64k) to X and T without also
ensuring Artful gets it (and BB when it opens), and presumably also
fixing it upstream.

So I see two routes forward:

1) File an upstream issue to request they bump to 64k, as you note 32k
is insufficient for 128 VFs. Link to that issue in this bug and we'll
fix AA, X and T with the suggested change (presuming upstream acks it).

2) Backport the upstream change as-is to X and T (AA already has the
necessary fix). This will be faster, of course, but does mean the 128 VF
case is broken. Given that it is less likely to be hit in the field,
perhaps that is ok -- and in the meantime, upstream can work on a
proper fix which, when available, we can backport accordingly (or decide
at that point, in any case).

I prefer 2), because I do not like diverging from upstream (or at least
not without an upstream bug report). If you and Monique are ok with 2),
I can update the debdiffs before sponsoring them.

> I'm sure it'll save a lot of time once the kinks have been worked out of
> the automation, backports are quite the double-edged sword.

Definitely :)

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to the bug report.
https://bugs.launchpad.net/bugs/1720126

Title:
  [ip link] Message truncated error for large number of passthrough VFs

Status in iproute2 package in Ubuntu:
  Fix Released
Status in iproute2 source package in Trusty:
  New
Status in iproute2 source package in Xenial:
  Confirmed
Status in iproute2 source package in Zesty:
  Fix Released
Status in iproute2 package in CentOS:
  Unknown

Bug description:
  [Impact]

  When querying a Physical Function netdev with a large number of VFs
  (more than 30), the reply can overflow the 16K netlink message
  buffer.

  This can be fixed by enabling message peeking on the socket and
  resizing the buffer on receive, or by simply enlarging the receive
  buffer.

  Since there's an upper limit to the number of VFs per PF, it's
  relatively sane to just enlarge the receive buffer. Please see the
  attached patch.
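
The difference between the two fixes can be sketched in Python. This is not iproute2 code: it assumes Linux, uses an AF_UNIX datagram socket pair as a stand-in for a netlink socket, and the helper name recv_whole_datagram is hypothetical. It peeks at the queued message to learn its real length, then receives it into a buffer that is large enough.

```python
import socket


def recv_whole_datagram(sock, initial_size=16 * 1024):
    """Receive one datagram without truncation (hypothetical helper).

    Assumes a Linux datagram-style socket where MSG_TRUNC reports the
    real message length (netlink and AF_UNIX datagram sockets do).
    """
    probe = bytearray(1)
    # MSG_PEEK leaves the message queued; MSG_TRUNC makes the kernel
    # return its full length even though the probe holds one byte.
    needed = sock.recv_into(probe, 1, socket.MSG_PEEK | socket.MSG_TRUNC)
    # Grow the buffer only when the default would be too small.
    buf = bytearray(max(needed, initial_size))
    received = sock.recv_into(buf)
    return bytes(buf[:received])
```

Simply enlarging the buffer, as the attached patch does, corresponds to raising initial_size and skipping the peek entirely.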

  [Test Case]

  # Set up 60 VFs on an SR-IOV device
  ip link show > /dev/null

  Observe the following:
  Message truncated
  Message truncated
  Message truncated
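
The check above could be automated along these lines. This is a rough sketch only: the function name count_truncation_errors is hypothetical, and it assumes the "Message truncated" lines are written to stderr, which is consistent with them still appearing when stdout is redirected to /dev/null.

```python
import subprocess


def count_truncation_errors(cmd):
    """Run cmd and count 'Message truncated' lines on its stderr."""
    result = subprocess.run(
        cmd,
        stdout=subprocess.DEVNULL,  # mirrors "> /dev/null" in the test case
        stderr=subprocess.PIPE,
        text=True,
        check=False,
    )
    return sum(
        1 for line in result.stderr.splitlines()
        if "Message truncated" in line
    )

# Intended use on an affected machine (not executed here):
#   count_truncation_errors(["ip", "link", "show"])
```

A result of zero after installing the fixed package would indicate that the buffer is large enough for the configured number of VFs.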

  [Regression Potential]

  1) Applications relying on the broken behaviour will need to be updated, but it would be a really dubious use case.
  2) Increasing the rx buffer size increases the memory footprint (but realistically, this is tiny).
  3) Extra processing time is now needed to parse the larger buffer. If a call to "ip link" is on the critical time path of an application (called multiple times in a tight loop, for example), it would affect load.

  [Other Info]

  Observed on Ubuntu kernel 4.4.0-93-generic on both 14.04 and 16.04

  =====================================================================================================
  Ubuntu16 system

  stack@cluster04:~$ lsb_release -a
  No LSB modules are available.
  Distributor ID:	Ubuntu
  Description:	Ubuntu 16.04.3 LTS
  Release:	16.04
  Codename:	xenial

  stack@cluster04:~$ uname -r
  4.4.0-93-generic

  stack@cluster04:~$ apt-cache policy iproute2
  iproute2:
    Installed: 4.3.0-1ubuntu3.16.04.1
  Version table:
  *** 4.3.0-1ubuntu3.16.04.1 500
          500 http://us.archive.ubuntu.com/ubuntu xenial-updates/main amd64 Packages
  =================================================================================================

  Ubuntu14 system:
  root@boomslang:~# lsb_release -a
  No LSB modules are available.
  Distributor ID:	Ubuntu
  Description:	Ubuntu 14.04.3 LTS
  Release:	14.04
  Codename:	trusty

  root@boomslang:~# uname -r
  4.4.0-96-generic

  root@boomslang:~# apt-cache policy iproute2
  iproute2:
    Installed: 3.12.0-2ubuntu1
    Version table:
   *** 3.12.0-2ubuntu1 0
          500 http://za.archive.ubuntu.com/ubuntu/ trusty-updates/main amd64 Packages
