[Bug 1577088] Re: OVS+DPDK segfault at the host, after running "ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=4" within a KVM Guest

Thiago Martins thiagocmartinsc at gmail.com
Fri Dec 2 04:22:36 UTC 2016


This is still a bug on Xenial with Open vSwitch 2.6 and DPDK 16.07 from
the Ubuntu Cloud Archive.

I just crashed OVS+DPDK running at the host, right after trying to
enable multiqueue inside a KVM Guest that is also running OVS+DPDK.

NOTE: Multiqueue was enabled at the host in advance, in both OVS and
the Libvirt VM XML.
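
For reference, "multiqueue in the Libvirt VM XML" means a queues attribute on the vhost-user interface's driver element. The fragment below is only an illustrative sketch; the socket path and queue count are examples, not copied from my actual XML:

```shell
# Illustrative: the Libvirt half of host-side multiqueue is a "queues"
# attribute on the vhost-user interface's <driver> element. The socket
# path and queue count below are examples.
xml="<interface type='vhostuser'>
  <source type='unix' path='/var/run/openvswitch/vhost-user1' mode='client'/>
  <model type='virtio'/>
  <driver queues='2'/>
</interface>"

echo "$xml"
```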

I also noticed that right after enabling 2 queues at the host, the speed
improved inside the running guest, without touching the guest at all.
But then, after trying to enable multiqueue at the KVM Guest, on its
OVS+DPDK on top of virtio, the host crashed.

After running on the KVM Guest:

---
root at ubuntu-ovs-dpdk-vm-1:~# ovs-vsctl set interface dpdk0 options:n_rxq=2 ; ovs-vsctl set interface dpdk1 options:n_rxq=2
---

At the host, ovs-vswitchd crashed:

---
root at ubuntu-ovs-dpdk-kvm-1:~# tail -F /var/log/openvswitch/ovs-vswitchd.log
......
2016-12-02T04:11:28.578Z|00127|dpdk(vhost_thread2)|INFO|State of queue 0 ( tx_qid 0 ) of vhost device '/var/run/openvswitch/vhost-user1'changed to 'enabled'
2016-12-02T04:11:28.578Z|00128|dpdk(vhost_thread2)|INFO|State of queue 2 ( tx_qid 1 ) of vhost device '/var/run/openvswitch/vhost-user1'changed to 'enabled'
2016-12-02T04:11:28.956Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 3841 died, killed (Segmentation fault), core dumped, restarting
---
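
The monitor line above carries both the crash counter and the fatal signal; a small sketch pulling them out (the log line is embedded here so the sketch is self-contained, but it is the same line quoted above):

```shell
# Illustrative: extract the crash count and signal from the
# daemon_unix(monitor) line quoted above.
logline='2016-12-02T04:11:28.956Z|00002|daemon_unix(monitor)|ERR|1 crashes: pid 3841 died, killed (Segmentation fault), core dumped, restarting'

crashes=$(echo "$logline" | sed -n 's/.*ERR|\([0-9][0-9]*\) crashes.*/\1/p')
signal=$(echo "$logline" | sed -n 's/.*killed (\([^)]*\)).*/\1/p')
echo "crashes=$crashes signal=$signal"
```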

I REALLY want to be able to use DPDK apps on top of OVS+DPDK at the
host, but it still looks unstable to me.

:-(

** Summary changed:

- OVS+DPDK segfault at the host, after running "ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=4" within a KVM Guest
+ OVS+DPDK segfault at the host, after running "ovs-vsctl set interface dpdk0 options:n_rxq=2 " within a KVM Guest

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to openvswitch in Ubuntu.
https://bugs.launchpad.net/bugs/1577088

Title:
  OVS+DPDK segfault at the host, after running "ovs-vsctl set interface
  dpdk0 options:n_rxq=2 " within a KVM Guest

Status in dpdk package in Ubuntu:
  Expired
Status in openvswitch package in Ubuntu:
  Expired

Bug description:
  Guys,

   It is possible to crash OVS+DPDK running at the host from inside of
  a KVM Guest!

   All you need to do is enable multi-queue; then, from a KVM Guest,
  you can kill OVS running at the host...

  
   * Hardware requirements (might be exaggerated but this is what I have):

   1 Dell Server with 2 dedicated 10G NICs, plus another one or two 1G NICs for management, apt-get, ssh, etc;
   1 IXIA Traffic Generator - 10G in both directions.

  
   * Steps to reproduce, at a glance:

  
   1- Deploy Ubuntu at the host;

   a. Grub options /etc/default/grub:

  -
  GRUB_CMDLINE_LINUX_DEFAULT="quiet splash iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=64"
  -
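
  As a sanity check, the hugepage parameters above pin 64 x 1G = 64G of RAM at boot; a sketch computing that from the grub line itself:

```shell
# Illustrative: parse the hugepage settings out of the kernel command
# line above and compute how much RAM they reserve at boot.
cmdline='quiet splash iommu=pt intel_iommu=on default_hugepagesz=1GB hugepagesz=1G hugepages=64'

pages=$(echo "$cmdline" | grep -o 'hugepages=[0-9]*' | head -n1 | cut -d= -f2)
size_gb=$(echo "$cmdline" | grep -o 'hugepagesz=[0-9]*G' | head -n1 | cut -d= -f2 | tr -d 'G')
total_gb=$((pages * size_gb))

echo "Reserved: ${pages} x ${size_gb}G = ${total_gb}G of hugepages"
```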

  
   2- Install OVS with DPDK;

  
   3- Configure DPDK, 1G Hugepages, PCI IDs and create the OVS bridges for a VM:

   a. /etc/default/openvswitch-switch:

  -
  DPDK_OPTS='--dpdk -c 0x1 -n 4 -m 2048,0 --vhost-owner libvirt-qemu:kvm --vhost-perm 0664'
  -

   b. /etc/dpdk/interfaces:

  -
  pci 0000:06:00.0 uio_pci_generic
  pci 0000:06:00.1 uio_pci_generic
  -

   NOTE: those PCI devices are located at NUMA Node 0.
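
  The NUMA node of a PCI NIC can be read from sysfs at /sys/bus/pci/devices/&lt;addr&gt;/numa_node; the sketch below uses a mock sysfs tree so it is self-contained (on a real host, point SYSFS at /sys):

```shell
# Illustrative: check which NUMA node a PCI device sits on. A mock sysfs
# tree is built here so the sketch runs anywhere; use SYSFS=/sys on a
# real host.
SYSFS=$(mktemp -d)
mkdir -p "$SYSFS/bus/pci/devices/0000:06:00.0"
echo 0 > "$SYSFS/bus/pci/devices/0000:06:00.0/numa_node"

node=$(cat "$SYSFS/bus/pci/devices/0000:06:00.0/numa_node")
echo "0000:06:00.0 is on NUMA node $node"
```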

   c. DPDK Hugepages /etc/dpdk/dpdk.conf:

  -
  NR_1G_PAGES=32
  -

   d. OVS Bridges:

  ovs-vsctl add-br br0 -- set bridge br0 datapath_type=netdev
  ovs-vsctl add-port br0 dpdk0 -- set Interface dpdk0 type=dpdk
  ovs-vsctl add-port br0 vhost-user1 -- set Interface vhost-user1 type=dpdkvhostuser

  ovs-vsctl add-br br1 -- set bridge br1 datapath_type=netdev
  ovs-vsctl add-port br1 dpdk1 -- set Interface dpdk1 type=dpdk
  ovs-vsctl add-port br1 vhost-user2 -- set Interface vhost-user2 type=dpdkvhostuser

  ip link set dev br0 up
  ip link set dev br1 up
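
  The two bridge setups above are identical except for names; a sketch generating the same command sequence from (bridge, dpdk-port, vhost-port) triples, the loop itself being purely illustrative:

```shell
# Illustrative: emit the same add-br/add-port sequence as above for each
# (bridge, dpdk port, vhost-user port) triple, instead of typing it twice.
gen_bridge() {
    br=$1; dpdk=$2; vhost=$3
    echo "ovs-vsctl add-br $br -- set bridge $br datapath_type=netdev"
    echo "ovs-vsctl add-port $br $dpdk -- set Interface $dpdk type=dpdk"
    echo "ovs-vsctl add-port $br $vhost -- set Interface $vhost type=dpdkvhostuser"
    echo "ip link set dev $br up"
}

cmds=$(gen_bridge br0 dpdk0 vhost-user1; gen_bridge br1 dpdk1 vhost-user2)
echo "$cmds"
```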

  
   4- At the host, enable multi-queue and add more CPU Cores to OVS+DPDK PMD threads:

  ovs-vsctl set Open_vSwitch . other_config:n-dpdk-rxqs=4
  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=FFFF

  
   5- Deploy Ubuntu at the VM, full Libvirt XML:

   a. ubuntu-16.01-1 XML:

   https://paste.ubuntu.com/16162857/

   b.  /etc/default/grub:

  -
  GRUB_CMDLINE_LINUX_DEFAULT="default_hugepagesz=1GB hugepagesz=1G hugepages=1" 
  -

  
   6- Install OVS with DPDK;

  
   7- Configure DPDK, 1G Hugepages, PCI IDs and create the OVS bridges within the VM:

   NOTE: Do NOT enable multi-queue inside of the VM yet; you'll see
  that, so far, it works!

   a. /etc/default/openvswitch-switch:

  -
  DPDK_OPTS='--dpdk -c 0x1 -n 4 -m 1024 --pci-blacklist 0000:00:03.0 --pci-blacklist 0000:00:04.0'
  -

   b. /etc/dpdk/interfaces:

  -
  pci 0000:00:05.0 uio_pci_generic
  pci 0000:00:06.0 uio_pci_generic
  -

   c. DPDK Hugepages /etc/dpdk/dpdk.conf:

  -
  NR_1G_PAGES=1
  -

   d. OVS Bridge:

  ovs-vsctl add-br ovsbr -- set bridge ovsbr datapath_type=netdev
  ovs-vsctl add-port ovsbr dpdk0 -- set Interface dpdk0 type=dpdk
  ovs-vsctl add-port ovsbr dpdk1 -- set Interface dpdk1 type=dpdk

  ip link set dev ovsbr up

   NOTE 1: So far, so good! But no multi-queue yet!

   NOTE 2: Sometimes, you can crash ovs-vswitchd at the host, right
  here!!!

  
   8- At the VM, add more CPU Cores to OVS+DPDK PMD threads:

   For 2 Cores:

  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=6

   or, for 4 Cores:

  ovs-vsctl set Open_vSwitch . other_config:pmd-cpu-mask=F
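
  The pmd-cpu-mask values are hex bitmasks where bit N selects core N, so 6 (binary 110) pins PMD threads to cores 1-2 and F (binary 1111) to cores 0-3; a sketch building the masks from core lists:

```shell
# Illustrative: pmd-cpu-mask is a hex bitmask in which bit N selects
# CPU core N. Build a mask from a list of core IDs and compare with the
# values used above.
mask_for_cores() {
    m=0
    for c in "$@"; do
        m=$((m | (1 << c)))
    done
    printf '%X\n' "$m"
}

mask2=$(mask_for_cores 1 2)      # cores 1 and 2  -> 6
mask4=$(mask_for_cores 0 1 2 3)  # cores 0 to 3   -> F
echo "2 cores -> $mask2, 4 cores -> $mask4"
```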

  
   9- Enable multi-queue before starting up DPDK and OVS; run this inside of the VM:

  systemctl disable dpdk
  systemctl disable openvswitch-switch

  reboot

  ethtool -L ens5 combined 4
  ethtool -L ens6 combined 4

  service dpdk start
  service openvswitch-switch start

  BOOM!!!
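
  For what it's worth, the queue count set by `ethtool -L` in step 9 can be double-checked with `ethtool -l` before starting the services; a sketch parsing the "Combined" value from sample output (the output text below is illustrative, not captured from this setup):

```shell
# Illustrative: verify the queue count after `ethtool -L ens5 combined 4`
# by parsing `ethtool -l ens5` output. A hypothetical sample is embedded
# here; on a real guest, pipe the real `ethtool -l ens5` output in instead.
sample_output='Channel parameters for ens5:
Pre-set maximums:
Combined: 4
Current hardware settings:
Combined: 4'

# The last "Combined:" line is the current hardware setting.
queues=$(echo "$sample_output" | awk '/Combined:/ {v=$2} END {print v}')
echo "ens5 currently has $queues combined queues"
```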

   10- Error log at the host (ovs-vswitchd + DPDK crashed):

   https://paste.ubuntu.com/16152614/

  
   IMPORTANT NOTES:

   * Sometimes, even without enabling multi-queue at the VM,
  ovs-vswitchd at the host crashes!

   ** Even weirder: I have a proprietary DPDK App (an L2 Bridge for
  DPI) that uses multi-queue automatically, and it does NOT crash the
  ovs-vswitchd running at the host! I can use my DPDK App with
  multi-queue, but I can't do the same with OVS+DPDK.

  
   So, if I replace "ubuntu16.01-1.qcow2" with my own qcow2 containing that proprietary DPDK App, I can use multi-queue, and OVS+DPDK at the host works just fine (slower than PCI passthrough, but acceptable, and much better than plain OVS).

  Cheers!
  Thiago

  ProblemType: Bug
  DistroRelease: Ubuntu 16.04
  Package: openvswitch-switch-dpdk 2.5.0-0ubuntu1
  ProcVersionSignature: Ubuntu 4.4.0-22.38-generic 4.4.8
  Uname: Linux 4.4.0-22-generic x86_64
  ApportVersion: 2.20.1-0ubuntu2
  Architecture: amd64
  Date: Sat Apr 30 18:04:16 2016
  SourcePackage: openvswitch
  UpgradeStatus: Upgraded to xenial on 2016-04-07 (23 days ago)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/dpdk/+bug/1577088/+subscriptions



More information about the Ubuntu-openstack-bugs mailing list