[PATCH Vivid SRU] powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()

tim.gardner at canonical.com tim.gardner at canonical.com
Wed Dec 2 20:33:57 UTC 2015


From: Gavin Shan <gwshan at linux.vnet.ibm.com>

BugLink: http://bugs.launchpad.net/bugs/1522071

The config space of some PCI devices can't be accessed when their
PEs are in frozen state. Otherwise, fenced PHB might be seen.
Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaing
EEH_PE_CFG_BLOCKED is set automatically when the PE is put to
frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores
PCI device BARs with eeh_pe_restore_bars(), which then calls
eeh_ops->restore_config() to reinitialize the PCI device in
(OPAL) firmware. eeh_ops->restore_config() produces PCI config
access that causes fenced PHB. The problem was reported on below
adapter:

   0001:01:00.0 0200: 14e4:168e (rev 10)
   0001:01:00.0 Ethernet controller: Broadcom Corporation \
                NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)

This fixes the issue by skipping eeh_pe_restore_bars() in
eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE.

Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE")
Cc: stable at vger.kernel.org # v4.0+
Reported-by: Manvanthara B. Puttashankar <mputtash at in.ibm.com>
Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
Signed-off-by: Michael Ellerman <mpe at ellerman.id.au>
(cherry picked from commit 259800135c654a098d9f0adfdd3d1f20eef1f231)
Signed-off-by: Tim Gardner <tim.gardner at canonical.com>
---
 arch/powerpc/kernel/eeh.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
index 3b2252e..79150db 100644
--- a/arch/powerpc/kernel/eeh.c
+++ b/arch/powerpc/kernel/eeh.c
@@ -306,11 +306,26 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity)
 	if (!(pe->type & EEH_PE_PHB)) {
 		if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG))
 			eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
+
+		/*
+		 * The config space of some PCI devices can't be accessed
+		 * when their PEs are in frozen state. Otherwise, fenced
+		 * PHB might be seen. Those PEs are identified with flag
+		 * EEH_PE_CFG_RESTRICTED, indicating EEH_PE_CFG_BLOCKED
+		 * is set automatically when the PE is put to EEH_PE_ISOLATED.
+		 *
+		 * Restoring BARs possibly triggers PCI config access in
+		 * (OPAL) firmware and then causes fenced PHB. If the
+		 * PCI config is blocked with flag EEH_PE_CFG_BLOCKED, it's
+		 * pointless to restore BARs and dump config space.
+		 */
 		eeh_ops->configure_bridge(pe);
-		eeh_pe_restore_bars(pe);
+		if (!(pe->state & EEH_PE_CFG_BLOCKED)) {
+			eeh_pe_restore_bars(pe);
 
-		pci_regs_buf[0] = 0;
-		eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen);
+			pci_regs_buf[0] = 0;
+			eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen);
+		}
 	}
 
 	eeh_ops->get_log(pe, severity, pci_regs_buf, loglen);
-- 
1.9.1





More information about the kernel-team mailing list