[Bug 1606940] Re: A single PCI read or write appears twice on the PCIe bus. This happens when using the SR-IOV feature with some PCI devices

ChristianEhrhardt 1606940 at bugs.launchpad.net
Mon Oct 17 12:07:12 UTC 2016


Hi,
I wanted to verify this to get it out of the way.

But IMHO the reproduction steps lack some information on the type of device you passed through.
With the hardware I could set up I can't reproduce the issue, as I lack that useful register that "acts like an adder in that every write adds to the previously written value minus anything the device has consumed".

Still, for whoever comes by, here is a summary of the whole SR-IOV setup
needed to (almost) get to the point of reproducing it.

I'll look at the X540 spec, but I'm not sure I'll find an equally suited
test register ...

1. Create matching setup:
 - set up a server machine with SR-IOV, running Trusty
  # put GRUB_CMDLINE_LINUX="intel_iommu=on" into /etc/default/grub
  # reboot (it could be on by default, but be on the safe side)
  $ sudo rmmod ixgbe
  $ sudo modprobe ixgbe max_vfs=7
  # or long term conf in /etc/modprobe.d/ixgbe.conf
  [  390.988873] ixgbevf: Intel(R) 10 Gigabit PCI Express Virtual Function Network Driver - version 2.12.1-k
  [  391.618065] ixgbevf 0000:04:10.1: Intel(R) X540 Virtual Function
  ...
  $ dmesg | grep -e DMAR -e IOMMU
   [    0.000000] ACPI: DMAR 0x000000007B7E7000 0001E4 (v01 HP     ProLiant 00000001 HP   00000001)
   [    0.000000] DMAR: IOMMU enabled
   [    1.015129] DMAR: Host address width 46
   [    1.016287] DMAR: DRHD base: 0x000000fbffc000 flags: 0x1
   [    1.018008] DMAR: dmar0: reg_base_addr fbffc000 ver 1:0 cap d2078c106f0466 ecap f020df
   [    1.020342] DMAR: RMRR base: 0x00000079173000 end: 0x00000079175fff
   [    1.022241] DMAR: RMRR base: 0x000000791ec000 end: 0x000000791effff
   [    1.024111] DMAR: RMRR base: 0x000000791dc000 end: 0x000000791ebfff
   [    1.026033] DMAR: RMRR base: 0x000000791c9000 end: 0x000000791d9fff
   [    1.028022] DMAR: RMRR base: 0x000000791da000 end: 0x000000791dbfff
   [    1.029917] DMAR-IR: IOAPIC id 8 under DRHD base  0xfbffc000 IOMMU 0
   [    1.031796] DMAR-IR: IOAPIC id 9 under DRHD base  0xfbffc000 IOMMU 0
   [    1.033675] DMAR-IR: HPET id 0 under DRHD base 0xfbffc000
   [    1.535267] DMAR-IR: Queued invalidation will be enabled to support x2apic and Intr-remapping.
   [    1.538291] DMAR-IR: Enabled IRQ remapping in x2apic mode
   [    7.763329] DMAR: No ATSR found
   [    7.764417] DMAR: dmar0: Using Queued invalidation
   [    7.765854] DMAR: Setting RMRR:
   [    7.766824] DMAR: Setting identity map for device 0000:01:00.0 [0x791da000 - 0x791dbfff]
   [    7.769324] DMAR: Setting identity map for device 0000:01:00.1 [0x791da000 - 0x791dbfff]
   [    7.771721] DMAR: Setting identity map for device 0000:01:00.2 [0x791da000 - 0x791dbfff]
   [    7.774105] DMAR: Setting identity map for device 0000:01:00.4 [0x791da000 - 0x791dbfff]
   [    7.776526] DMAR: Setting identity map for device 0000:03:00.0 [0x791da000 - 0x791dbfff]
   [    7.779011] DMAR: Setting identity map for device 0000:01:00.0 [0x791c9000 - 0x791d9fff]
   [    7.781416] DMAR: Setting identity map for device 0000:01:00.1 [0x791c9000 - 0x791d9fff]
   [    7.783799] DMAR: Setting identity map for device 0000:01:00.2 [0x791c9000 - 0x791d9fff]
   [    7.786268] DMAR: Setting identity map for device 0000:01:00.4 [0x791c9000 - 0x791d9fff]
   [    7.788757] DMAR: Setting identity map for device 0000:01:00.0 [0x791dc000 - 0x791ebfff]
   [    8.091077] DMAR: Setting identity map for device 0000:01:00.1 [0x791dc000 - 0x791ebfff]
   [    8.093472] DMAR: Setting identity map for device 0000:01:00.2 [0x791dc000 - 0x791ebfff]
   [    8.095891] DMAR: Setting identity map for device 0000:01:00.4 [0x791dc000 - 0x791ebfff]
   [    8.098408] DMAR: Setting identity map for device 0000:02:00.0 [0x791dc000 - 0x791ebfff]
   [    8.100806] DMAR: Setting identity map for device 0000:04:00.0 [0x791dc000 - 0x791ebfff]
   [    8.103418] DMAR: Setting identity map for device 0000:01:00.0 [0x791ec000 - 0x791effff]
   [    8.105813] DMAR: Setting identity map for device 0000:01:00.1 [0x791ec000 - 0x791effff]
   [    8.108323] DMAR: Setting identity map for device 0000:01:00.2 [0x791ec000 - 0x791effff]
   [    8.110727] DMAR: Setting identity map for device 0000:01:00.4 [0x791ec000 - 0x791effff]
   [    8.113114] DMAR: Setting identity map for device 0000:02:00.0 [0x791ec000 - 0x791effff]
   [    8.115524] DMAR: Setting identity map for device 0000:02:00.1 [0x791ec000 - 0x791effff]
   [    8.117991] DMAR: Setting identity map for device 0000:02:00.2 [0x791ec000 - 0x791effff]
   [    8.120408] DMAR: Setting identity map for device 0000:02:00.3 [0x791ec000 - 0x791effff]
   [    8.622705] DMAR: Setting identity map for device 0000:03:00.0 [0x791ec000 - 0x791effff]
   [    8.625123] DMAR: Setting identity map for device 0000:04:00.0 [0x791ec000 - 0x791effff]
   [    8.627635] DMAR: Setting identity map for device 0000:04:00.1 [0x791ec000 - 0x791effff]
   [    8.630055] DMAR: Setting identity map for device 0000:00:1a.0 [0x79173000 - 0x79175fff]
   [    8.632459] DMAR: Setting identity map for device 0000:00:1d.0 [0x79173000 - 0x79175fff]
   [    8.634859] DMAR: Prepare 0-16MiB unity mapping for LPC
   [    8.636428] DMAR: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]
   [    8.638675] DMAR: Intel(R) Virtualization Technology for Directed I/O
  $ ll /sys/bus/pci/devices/0000\:04\:00.0/virtfn*
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.0/virtfn0 -> ../0000:04:10.0/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.0/virtfn1 -> ../0000:04:10.2/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.0/virtfn2 -> ../0000:04:10.4/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.0/virtfn3 -> ../0000:04:10.6/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.0/virtfn4 -> ../0000:04:11.0/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.0/virtfn5 -> ../0000:04:11.2/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.0/virtfn6 -> ../0000:04:11.4/
  $ ll /sys/bus/pci/devices/0000\:04\:00.1/virtfn*
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.1/virtfn0 -> ../0000:04:10.1/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.1/virtfn1 -> ../0000:04:10.3/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.1/virtfn2 -> ../0000:04:10.5/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.1/virtfn3 -> ../0000:04:10.7/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.1/virtfn4 -> ../0000:04:11.1/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.1/virtfn5 -> ../0000:04:11.3/
  lrwxrwxrwx 1 root root 0 Oct 17 10:15 /sys/bus/pci/devices/0000:04:00.1/virtfn6 -> ../0000:04:11.5/
  $ sudo uvt-simplestreams-libvirt sync --source http://cloud-images.ubuntu.com/daily arch=amd64 label=daily release=trusty
  $ sudo uvt-kvm create --memory 8192 --cpu 4 --password=ubuntu trusty-test-sriov release=trusty arch=amd64 label=daily
  $ sudo virsh nodedev-dumpxml pci_0000_04_00_0
   <device>
     <name>pci_0000_04_00_0</name>
     <path>/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.0</path>
     <parent>pci_0000_00_02_2</parent>
     <driver>
       <name>ixgbe</name>
     </driver>
     <capability type='pci'>
       <domain>0</domain>
       <bus>4</bus>
       <slot>0</slot>
       <function>0</function>
       <product id='0x1528'>Ethernet Controller 10-Gigabit X540-AT2</product>
       <vendor id='0x8086'>Intel Corporation</vendor>
       <capability type='virt_functions'>
         <address domain='0x0000' bus='0x04' slot='0x10' function='0x0'/>
         <address domain='0x0000' bus='0x04' slot='0x10' function='0x2'/>
         <address domain='0x0000' bus='0x04' slot='0x10' function='0x4'/>
         <address domain='0x0000' bus='0x04' slot='0x10' function='0x6'/>
         <address domain='0x0000' bus='0x04' slot='0x11' function='0x0'/>
         <address domain='0x0000' bus='0x04' slot='0x11' function='0x2'/>
         <address domain='0x0000' bus='0x04' slot='0x11' function='0x4'/>
       </capability>
     </capability>
   </device>
  $ sudo virsh nodedev-dumpxml pci_0000_04_00_1
   <device>
     <name>pci_0000_04_00_1</name>
     <path>/sys/devices/pci0000:00/0000:00:02.2/0000:04:00.1</path>
     <parent>pci_0000_00_02_2</parent>
     <driver>
       <name>ixgbe</name>
     </driver>
     <capability type='pci'>
       <domain>0</domain>
       <bus>4</bus>
       <slot>0</slot>
       <function>1</function>
       <product id='0x1528'>Ethernet Controller 10-Gigabit X540-AT2</product>
       <vendor id='0x8086'>Intel Corporation</vendor>
       <capability type='virt_functions'>
         <address domain='0x0000' bus='0x04' slot='0x10' function='0x1'/>
         <address domain='0x0000' bus='0x04' slot='0x10' function='0x3'/>
         <address domain='0x0000' bus='0x04' slot='0x10' function='0x5'/>
         <address domain='0x0000' bus='0x04' slot='0x10' function='0x7'/>
         <address domain='0x0000' bus='0x04' slot='0x11' function='0x1'/>
         <address domain='0x0000' bus='0x04' slot='0x11' function='0x3'/>
         <address domain='0x0000' bus='0x04' slot='0x11' function='0x5'/>
       </capability>
     </capability>
   </device>
  # modify the guest to have these SR-IOV VFs
  # I was unsure about the overlapping part; I think with enough main memory one VF is enough, yet to be sure I added several
  # each of the six has this form:
    <interface type='hostdev' managed='yes'>
      <source>
        <address type='pci' domain='0' bus='04' slot='0x10' function='0'/>
      </source>
    </interface>
   # Restart the guest and check the VFs in the guest
   $ lspci | grep Virtua
     00:07.0 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)
     00:08.0 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)
     00:09.0 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)
     00:0a.0 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)
     00:0b.0 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)
     00:0c.0 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)
2. Trigger bug as-is
   # VF has two memory sections available
   $ lspci -vvv
     00:0c.0 Ethernet controller: Intel Corporation X540 Ethernet Controller Virtual Function (rev 01)
        Subsystem: Hewlett-Packard Company Device 192d
        Physical Slot: 12
        Control: I/O- Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
        Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
        Latency: 0
        Region 0: Memory at febb8000 (64-bit, non-prefetchable) [size=16K]
        Region 3: Memory at febbc000 (64-bit, non-prefetchable) [size=16K]
        Capabilities: <access denied>
        Kernel driver in use: ixgbevf
    # given the 16K size, the address from the initial example (0x10080) won't work; only offsets up to 0x3fff will do
    # Unable to trigger the bug at an address inside the scope of this device
    $ sudo ./pcimem /sys/bus/pci/devices/0000\:00\:07.0/resource3 0x3080 d
      /sys/bus/pci/devices/0000:00:07.0/resource3 opened.
      Target offset is 0x3080, page size is 4096
      mmap(0, 4096, 0x3, 0x1, 3, 0x3080)
      PCI Memory mapped to address 0x7ffa5a5c3000.
      Value at offset 0x3080 (0x7ffa5a5c3080): 0xdeadbeafdeadbeaf
    $ sudo ./pcimem /sys/bus/pci/devices/0000\:00\:07.0/resource3 0x3080 d 2048
      /sys/bus/pci/devices/0000:00:07.0/resource3 opened.
      Target offset is 0x3080, page size is 4096
      mmap(0, 4096, 0x3, 0x1, 3, 0x3080)
      PCI Memory mapped to address 0x7fdae5092000.
      Value at offset 0x3080 (0x7fdae5092080): 0xdeadbeafdeadbeaf
      Written 0x0000000000000800; readback 0xdeadbeafdeadbeaf
    $ sudo ./pcimem /sys/bus/pci/devices/0000\:00\:07.0/resource3 0x3080 d
      /sys/bus/pci/devices/0000:00:07.0/resource3 opened.
      Target offset is 0x3080, page size is 4096
      mmap(0, 4096, 0x3, 0x1, 3, 0x3080)
      PCI Memory mapped to address 0x7fa4c204e000.
      Value at offset 0x3080 (0x7fa4c204e080): 0xdeadbeafdeadbeaf
     # Well, writing to an unsupported region might be a bad test
     # Need to check what this kind of device has that could be used to detect a double write
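
A 16K BAR covers offsets 0x0 to 0x3fff, so an access has to fit below 0x4000. A minimal sketch of such a check, to be run before calling pcimem; the helper name bar_offset_ok is my own and not part of pcimem:

```shell
# bar_offset_ok SIZE OFFSET WIDTH   (hypothetical helper, not part of pcimem)
# Succeeds when an access of WIDTH bytes at OFFSET fits into a BAR of SIZE bytes.
# All arguments may be decimal or 0x-prefixed hex.
bar_offset_ok() (
    size=$(( $1 )); off=$(( $2 )); width=$(( $3 ))
    [ "$off" -ge 0 ] && [ $(( off + width )) -le "$size" ]
)

# 16K BAR as reported by lspci for the X540 VF, 8-byte accesses
bar_offset_ok 0x4000 0x3080 8  && echo "0x3080 fits"
bar_offset_ok 0x4000 0x10080 8 || echo "0x10080 is outside the BAR"
```

On a real guest the size could probably be taken from the sysfs resource file itself (for example with stat -c %s on resource3), but I have not verified that here.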

-- 
You received this bug notification because you are a member of Ubuntu
Sponsors Team, which is subscribed to a duplicate bug report (1563375).
https://bugs.launchpad.net/bugs/1606940

Title:
  A single PCI read or write appears twice on the PCIe bus. This
  happens when using the SR-IOV feature with some PCI devices

Status in qemu package in Ubuntu:
  Fix Released
Status in qemu source package in Trusty:
  Fix Committed

Bug description:
  [Impact]

   * Users of SRIOV devices in qemu on Trusty may encounter unstable
     behavior on pass-through PCI devices due to a bug in qemu's MMIO
     mapping to overlapping ram slots. When memory is accessed in
     subpage granularity where slots have overlapping regions, multiple
     invocations of the handler occur, which results in multiple PCI
     writes.

     This affects the qemu releases prior to qemu 2.5; it has been fixed in
     newer releases.

   * Backporting fixes from upstream release is required to allow
     certain PCI devices under SRIOV to function properly.

   * All patches applied are already accepted upstream. Xenial, Yakkety
     are OK, Wily -> Trusty are affected.

  [Test Case]

   * On a Trusty 14.04 system with affected SRIOV device.
      - boot system with sriov enabled
      - launch vm with sriov device passed through
        using guest XML attached (bug-1606940-trusty-guest.xml)
      - unpack pcimem tarball inside vm (pcimem.tar attached)
      - Read (note the pci path should point to the SRIOV device)
       ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d
      - Write
       ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d 2048
      - Read again
       ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d

      The value at 0x10080 should be the same for the first read
      and the second read, after the write.

      If the bug is hit, the second read will report double the value
      instead of the same value.

  [Regression Potential]

   * SR-IOV device drivers may have unknowingly relied on KVM multi-write
     behavior prior to this patch; that's highly unlikely since it would
     fail on physical hardware (which does not produce this effect). But
     there is a chance that devices only passed into the guest via SRIOV
     might break.

  [Original Description]
  Customer engineers are testing the SR-IOV feature with a new network card on x86 servers and ran into the issue described below.

  They are *not* seeing this issue on Intel 82599 NIC.

  We are testing a new device in EP mode with SRIOV.  With a CentOS7 VM
  running on the Ubuntu 14.04.2 host (using VFIO) we see that a single
  PCI read or write transaction targeting the device’s BAR0 issued from
  the VM appears twice on the PCIe bus. The same accesses work fine when
  the VF is accessed directly from the Ubuntu 14.04.2 host. These BAR0
  PCI accesses do not require a driver on the VM side. We can reproduce
  the problem using a simple user-space application to access the VF’s
  BAR0 registers.

  We do not see this problem when the VM runs within a CentOS 7 host or
  under an Ubuntu 12.04 host. This appears specific to the Ubuntu 14.04
  release. We would appreciate any clues or pointers regarding this
  behavior.

  This issue is also not happening with 16.04 beta.

  Steps to reproduce the bug with pcimem:

  Read:
  ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d

  Write:
  ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d 2048

  Read again:
  ./pcimem /sys/bus/pci/devices/0000\:04\:00.0/resource0 0x10080 d

  The value at 0x10080 should be the same for the first read and the
  second read, after the write.

  If the bug is hit, the second read will report double the value
  instead of the same value.

  The register should have read back the same value that was written.
  The register acts like an adder in that every write adds to the
  previously written value minus anything the device has consumed. We
  see that the second read returns double the value written in the
  single write. We captured a PCIe trace and found that each PCI
  operation accessing this register is seen twice on the PCIe bus. The 2
  writes cause the register value to double which has implications for
  normal operation. The PCIe trace is attached and has markers to
  identify the relevant transactions.
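
  The doubling described above can be illustrated with a toy model. This is only a sketch under simplifying assumptions: the register is a plain shell variable, the device consumes nothing between accesses, and the function names adder_write and guest_write are invented for the illustration.

```shell
# Toy model of an "adder" register: every write adds to the stored value.
# Assumption: the device consumes nothing in between.
reg=0
adder_write() { reg=$(( reg + $1 )); }

# Deliver one guest write to the device COPIES times
# (1 = expected behavior, 2 = the duplicated delivery seen in the trace).
guest_write() {
    value=$1; copies=$2; i=0
    while [ "$i" -lt "$copies" ]; do
        adder_write "$value"
        i=$(( i + 1 ))
    done
}

reg=0; guest_write 2048 1; echo "single delivery:     $reg"
reg=0; guest_write 2048 2; echo "duplicated delivery: $reg"
```

  With a single delivery the model ends at 2048; with the write delivered twice it ends at 4096, which matches the "double the value" symptom.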

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/qemu/+bug/1606940/+subscriptions


