[Acked] [PATCH Vivid SRU] powerpc/eeh: Fix fenced PHB caused by eeh_slot_error_detail()

Andy Whitcroft apw at canonical.com
Fri Dec 4 11:57:32 UTC 2015


On Wed, Dec 02, 2015 at 01:33:57PM -0700, tim.gardner at canonical.com wrote:
> From: Gavin Shan <gwshan at linux.vnet.ibm.com>
> 
> BugLink: http://bugs.launchpad.net/bugs/1522071
> 
> The config space of some PCI devices can't be accessed when their
> PEs are in frozen state. Otherwise, fenced PHB might be seen.
> Those PEs are identified with flag EEH_PE_CFG_RESTRICTED, meaning
> EEH_PE_CFG_BLOCKED is set automatically when the PE is put to
> frozen state (EEH_PE_ISOLATED). eeh_slot_error_detail() restores
> PCI device BARs with eeh_pe_restore_bars(), which then calls
> eeh_ops->restore_config() to reinitialize the PCI device in
> (OPAL) firmware. eeh_ops->restore_config() produces PCI config
> access that causes fenced PHB. The problem was reported on the
> adapter below:
> 
>    0001:01:00.0 0200: 14e4:168e (rev 10)
>    0001:01:00.0 Ethernet controller: Broadcom Corporation \
>                 NetXtreme II BCM57810 10 Gigabit Ethernet (rev 10)
> 
> This fixes the issue by skipping eeh_pe_restore_bars() in
> eeh_slot_error_detail() when EEH_PE_CFG_BLOCKED is set for the PE.
> 
> Fixes: b6541db1 ("powerpc/eeh: Block PCI config access upon frozen PE")
> Cc: stable at vger.kernel.org # v4.0+
> Reported-by: Manvanthara B. Puttashankar <mputtash at in.ibm.com>
> Signed-off-by: Gavin Shan <gwshan at linux.vnet.ibm.com>
> Signed-off-by: Michael Ellerman <mpe at ellerman.id.au>
> (cherry picked from commit 259800135c654a098d9f0adfdd3d1f20eef1f231)
> Signed-off-by: Tim Gardner <tim.gardner at canonical.com>
> ---
>  arch/powerpc/kernel/eeh.c | 21 ++++++++++++++++++---
>  1 file changed, 18 insertions(+), 3 deletions(-)
> 
> diff --git a/arch/powerpc/kernel/eeh.c b/arch/powerpc/kernel/eeh.c
> index 3b2252e..79150db 100644
> --- a/arch/powerpc/kernel/eeh.c
> +++ b/arch/powerpc/kernel/eeh.c
> @@ -306,11 +306,26 @@ void eeh_slot_error_detail(struct eeh_pe *pe, int severity)
>  	if (!(pe->type & EEH_PE_PHB)) {
>  		if (eeh_has_flag(EEH_ENABLE_IO_FOR_LOG))
>  			eeh_pci_enable(pe, EEH_OPT_THAW_MMIO);
> +
> +		/*
> +		 * The config space of some PCI devices can't be accessed
> +		 * when their PEs are in frozen state. Otherwise, fenced
> +		 * PHB might be seen. Those PEs are identified with flag
> +		 * EEH_PE_CFG_RESTRICTED, indicating EEH_PE_CFG_BLOCKED
> +		 * is set automatically when the PE is put to EEH_PE_ISOLATED.
> +		 *
> +		 * Restoring BARs possibly triggers PCI config access in
> +		 * (OPAL) firmware and then causes fenced PHB. If the
> +		 * PCI config is blocked with flag EEH_PE_CFG_BLOCKED, it's
> +		 * pointless to restore BARs and dump config space.
> +		 */
>  		eeh_ops->configure_bridge(pe);
> -		eeh_pe_restore_bars(pe);
> +		if (!(pe->state & EEH_PE_CFG_BLOCKED)) {
> +			eeh_pe_restore_bars(pe);
>  
> -		pci_regs_buf[0] = 0;
> -		eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen);
> +			pci_regs_buf[0] = 0;
> +			eeh_pe_traverse(pe, eeh_dump_pe_log, &loglen);
> +		}
>  	}
>  
>  	eeh_ops->get_log(pe, severity, pci_regs_buf, loglen);


Looks to do what is claimed.  

Acked-by: Andy Whitcroft <apw at canonical.com>

-apw



