[PATCH 2/2] PCI/ERR: Update error status after reset_link()
Jay Vosburgh
jay.vosburgh at canonical.com
Sat Apr 18 00:30:12 UTC 2020
From: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy at linux.intel.com>
BugLink: https://bugs.launchpad.net/bugs/1873537
Commit bdb5ac85777d ("PCI/ERR: Handle fatal error recovery") uses
reset_link() to recover from fatal errors. But during fatal error
recovery, if the initial value of error status is PCI_ERS_RESULT_DISCONNECT
or PCI_ERS_RESULT_NO_AER_DRIVER, then even after a successful recovery
(using reset_link()) pcie_do_recovery() will report the recovery result as
a failure. Update the error status with the return value of reset_link().
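
The stale-status problem described above can be sketched as a small user-space model. This is only an illustration, not the kernel code: the enum, fake_reset_link(), and the two recovery_ok_*() helpers are hypothetical stand-ins, and the later mmio_enabled/slot_reset steps of pcie_do_recovery() are left out.

```c
/*
 * Hypothetical stand-in for the kernel's pci_ers_result_t.
 * The names mirror the kernel constants; the values are illustrative.
 */
enum ers_result {
	ERS_CAN_RECOVER,
	ERS_NEED_RESET,
	ERS_DISCONNECT,
	ERS_RECOVERED,
	ERS_NO_AER_DRIVER,
};

/* Assumption: the link reset always succeeds in this model. */
static enum ers_result fake_reset_link(void)
{
	return ERS_RECOVERED;
}

/* Before the patch: the reset result is checked but then thrown away. */
static int recovery_ok_before(enum ers_result detected)
{
	enum ers_result status = detected;	/* set by the error_detected walk */

	if (fake_reset_link() != ERS_RECOVERED)
		return 0;

	/* status still holds the value from the error_detected walk */
	return status == ERS_RECOVERED;
}

/* After the patch: the reset result replaces the earlier status. */
static int recovery_ok_after(enum ers_result detected)
{
	enum ers_result status = detected;	/* set by the error_detected walk */

	status = fake_reset_link();
	if (status != ERS_RECOVERED)
		return 0;

	return status == ERS_RECOVERED;
}
```

In this model, recovery_ok_before(ERS_DISCONNECT) and recovery_ok_before(ERS_NO_AER_DRIVER) both return 0 even though the reset succeeded, while the recovery_ok_after() variants return 1.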
You can reproduce this issue by triggering a SW DPC using the "DPC Software
Trigger" bit in the "DPC Control Register". You should see a recovery
failure in the dmesg log, as below:
pcieport 0000:00:16.0: DPC: containment event, status:0x1f27 source:0x0000
pcieport 0000:00:16.0: DPC: software trigger detected
pci 0000:04:00.0: AER: can't recover (no error_detected callback)
pcieport 0000:00:16.0: AER: device recovery failed
Fixes: bdb5ac85777d ("PCI/ERR: Handle fatal error recovery")
Link: https://lore.kernel.org/r/a255fcb3a3fdebcd90f84e08b555f1786eb8eba2.1585000084.git.sathyanarayanan.kuppuswamy@linux.intel.com
[bhelgaas: split pci_channel_io_frozen simplification to separate patch]
Signed-off-by: Kuppuswamy Sathyanarayanan <sathyanarayanan.kuppuswamy at linux.intel.com>
Signed-off-by: Bjorn Helgaas <bhelgaas at google.com>
Acked-by: Keith Busch <keith.busch at intel.com>
Cc: Ashok Raj <ashok.raj at intel.com>
(cherry picked from commit 6d2c89441571ea534d6240f7724f518936c44f8d)
Signed-off-by: Jay Vosburgh <jay.vosburgh at canonical.com>
---
drivers/pci/pcie/err.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/pci/pcie/err.c b/drivers/pci/pcie/err.c
index 9b2d30eebe4c..312ca5c92e85 100644
--- a/drivers/pci/pcie/err.c
+++ b/drivers/pci/pcie/err.c
@@ -201,7 +201,8 @@ void pcie_do_recovery(struct pci_dev *dev, enum pci_channel_state state,
 	pci_dbg(dev, "broadcast error_detected message\n");
 	if (state == pci_channel_io_frozen) {
 		pci_walk_bus(bus, report_frozen_detected, &status);
-		if (reset_link(dev, service) != PCI_ERS_RESULT_RECOVERED)
+		status = reset_link(dev, service);
+		if (status != PCI_ERS_RESULT_RECOVERED)
 			goto failed;
 	} else {
 		pci_walk_bus(bus, report_normal_detected, &status);
--
2.7.4