[SRU][Xenial][Yakkety][PATCH 2/2] PCI: hv: Add explicit barriers to config space access

Joseph Salisbury joseph.salisbury at canonical.com
Fri May 27 18:34:21 UTC 2016


From: Vitaly Kuznetsov <vkuznets at redhat.com>

BugLink: http://bugs.launchpad.net/bugs/1581243

I'm trying to pass-through Broadcom BCM5720 NIC (Dell device 1f5b) on a
Dell R720 server.  Everything works fine when the target VM has only one
CPU, but SMP guests reboot when the NIC driver accesses PCI config space
with hv_pcifront_read_config()/hv_pcifront_write_config().  The reboot
appears to be induced by the hypervisor and no crash is observed.  Windows
event logs are not helpful at all ('Virtual machine ... has quit
unexpectedly').  The particular access point is always different and
putting debug between them (printk/mdelay/...) moves the issue further
away.  The server model affects the issue as well: on Dell R420 I'm able to
pass-through BCM5720 NIC to SMP guests without issues.

While I'm obviously failing to reveal the essence of the issue I was able
to come up with a (possible) solution: if explicit barriers are added to
hv_pcifront_read_config()/hv_pcifront_write_config() the issue goes away.
The essential minimum is rmb() at the end on _hv_pcifront_read_config() and
wmb() at the end of _hv_pcifront_write_config() but I'm not confident it
will be sufficient for all hardware.  I suggest the following barriers:

1) wmb()/mb() between choosing the function and writing to its space.
2) mb() before releasing the spinlock in both _hv_pcifront_read_config()/
   _hv_pcifront_write_config() to ensure that consecutive reads/writes to
  the space won't get re-ordered as drivers may count on that.

Config space access is not supposed to be performance-critical so these
explicit barriers should not cause any slowdown.

[bhelgaas: use Linux "barriers" terminology]
Signed-off-by: Vitaly Kuznetsov <vkuznets at redhat.com>
Signed-off-by: Bjorn Helgaas <bhelgaas at google.com>
Acked-by: Jake Oshins <jakeo at microsoft.com>
---
 drivers/pci/host/pci-hyperv.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/drivers/pci/host/pci-hyperv.c b/drivers/pci/host/pci-hyperv.c
index c17e792..7e9b2de 100644
--- a/drivers/pci/host/pci-hyperv.c
+++ b/drivers/pci/host/pci-hyperv.c
@@ -553,6 +553,8 @@ static void _hv_pcifront_read_config(struct hv_pci_dev *hpdev, int where,
 		spin_lock_irqsave(&hpdev->hbus->config_lock, flags);
 		/* Choose the function to be read. (See comment above) */
 		writel(hpdev->desc.win_slot.slot, hpdev->hbus->cfg_addr);
+		/* Make sure the function was chosen before we start reading. */
+		mb();
 		/* Read from that function's config space. */
 		switch (size) {
 		case 1:
@@ -565,6 +567,11 @@ static void _hv_pcifront_read_config(struct hv_pci_dev *hpdev, int where,
 			*val = readl(addr);
 			break;
 		}
+		/*
+		 * Make sure the write was done before we release the spinlock
+		 * allowing consecutive reads/writes.
+		 */
+		mb();
 		spin_unlock_irqrestore(&hpdev->hbus->config_lock, flags);
 	} else {
 		dev_err(&hpdev->hbus->hdev->device,
@@ -592,6 +599,8 @@ static void _hv_pcifront_write_config(struct hv_pci_dev *hpdev, int where,
 		spin_lock_irqsave(&hpdev->hbus->config_lock, flags);
 		/* Choose the function to be written. (See comment above) */
 		writel(hpdev->desc.win_slot.slot, hpdev->hbus->cfg_addr);
+		/* Make sure the function was chosen before we start writing. */
+		wmb();
 		/* Write to that function's config space. */
 		switch (size) {
 		case 1:
@@ -604,6 +613,11 @@ static void _hv_pcifront_write_config(struct hv_pci_dev *hpdev, int where,
 			writel(val, addr);
 			break;
 		}
+		/*
+		 * Make sure the write was done before we release the spinlock
+		 * allowing consecutive reads/writes.
+		 */
+		mb();
 		spin_unlock_irqrestore(&hpdev->hbus->config_lock, flags);
 	} else {
 		dev_err(&hpdev->hbus->hdev->device,
-- 
2.7.4





More information about the kernel-team mailing list