ACK/Cmnt: [SRU][F][PATCH 1/1] s390/pci: fix enabling a reserved PCI function

Stefan Bader stefan.bader at canonical.com
Tue Aug 25 08:33:31 UTC 2020


On 25.08.20 09:45, Frank Heimes wrote:
> Yeah, I noticed that conversation.
> 
> I thought I went a bit into that direction with:
> 
> * There is some regression risk with having code changes in the zPCI sub-system.  
> * zPCI is the PCI implementation on s390x, modifications here do not affect any
> other architecture.  
> * It could be that *PCI events do not work anymore* and *NVMe devices don't IPL
> (boot)* on s390x anymore.

Yeah, it probably is that. Maybe a bit obscured by the two bullet points before.
Maybe just reordering is enough...

> ...
> 
> I guess you probably want me to go even deeper into 'what may happen' in future ...
> 
> On Tue, Aug 25, 2020 at 9:39 AM Stefan Bader <stefan.bader at canonical.com
> <mailto:stefan.bader at canonical.com>> wrote:
> 
>     On 25.08.20 09:00, frank.heimes at canonical.com
>     <mailto:frank.heimes at canonical.com> wrote:
>     > From: Niklas Schnelle <schnelle at linux.ibm.com <mailto:schnelle at linux.ibm.com>>
>     >
>     > BugLink: https://bugs.launchpad.net/bugs/1891437
>     >
>     > In usual IPL or hot plug scenarios a zPCI function transitions directly
>     > from reserved (invisible to Linux) to configured state or is configured
>     > by Linux itself using an SCLP, however it can also first go from
>     > reserved to standby and then from standby to configured without
>     > Linux initiative.
>     > In this scenario we first get a PEC event 0x302 and then 0x301.  This may
>     > happen for example when the device is deconfigured at another LPAR and
>     > made available for this LPAR. It may also happen under z/VM when
>     > a device is attached while in some inconsistent state.
>     >
>     > However when we get the 0x301 the device is already known to zPCI
>     > so calling zpci_create() will add it twice resulting in the below
>     > BUG. Instead we should only enable the existing device and finally
>     > scan it through the PCI subsystem.
>     >
>     > list_add double add: new=00000000ed5a9008, prev=00000000ed5a9008,
>     next=0000000083502300.
>     > kernel BUG at lib/list_debug.c:31!
>     > Krnl PSW : 0704c00180000000 0000000082dc2db8 (__list_add_valid+0x70/0xa8)
>     > Call Trace:
>     >  [<0000000082dc2db8>] __list_add_valid+0x70/0xa8
>     > ([<0000000082dc2db4>] __list_add_valid+0x6c/0xa8)
>     >  [<00000000828ea920>] zpci_create_device+0x60/0x1b0
>     >  [<00000000828ef04a>] zpci_event_availability+0x282/0x2f0
>     >  [<000000008315f848>] chsc_process_crw+0x2b8/0xa18
>     >  [<000000008316735c>] crw_collect_info+0x254/0x348
>     >  [<00000000829226ea>] kthread+0x14a/0x168
>     >  [<000000008319d5c0>] ret_from_fork+0x24/0x2c
>     >
>     > Fixes: f606b3ef47c9 ("s390/pci: adapt events for zbus")
>     > Reported-by: Alexander Egorenkov <egorenar at linux.ibm.com
>     <mailto:egorenar at linux.ibm.com>>
>     > Tested-by: Alexander Egorenkov <egorenar at linux.ibm.com
>     <mailto:egorenar at linux.ibm.com>>
>     > Signed-off-by: Niklas Schnelle <schnelle at linux.ibm.com
>     <mailto:schnelle at linux.ibm.com>>
>     > Signed-off-by: Heiko Carstens <heiko.carstens at de.ibm.com
>     <mailto:heiko.carstens at de.ibm.com>>
>     > (cherry picked from commit 3047766bc6ec9c6bc9ece85b45a41ff401e8d988)
>     > Signed-off-by: Frank Heimes <frank.heimes at canonical.com
>     <mailto:frank.heimes at canonical.com>>
>     Acked-by: Stefan Bader <stefan.bader at canonical.com
>     <mailto:stefan.bader at canonical.com>>
>     > ---
> 
>     Not sure you saw me commenting on the "regression potential" section of other
>     submissions. Recently there has been a discussion by the SRU team about this and
>     that most submitters use it not the way they expect. They would like to see more
>     of a guess about what kinds of symptoms people might observe. So kind of a
>     literal "what could go wrong?". Maybe you could adjust yours accordingly.
> 
>     -Stefan
> 
>     >  arch/s390/pci/pci_event.c | 13 ++++++++++++-
>     >  1 file changed, 12 insertions(+), 1 deletion(-)
>     >
>     > diff --git a/arch/s390/pci/pci_event.c b/arch/s390/pci/pci_event.c
>     > index 08e1d619398e..fdebd286f402 100644
>     > --- a/arch/s390/pci/pci_event.c
>     > +++ b/arch/s390/pci/pci_event.c
>     > @@ -94,7 +94,18 @@ static void __zpci_event_availability(struct
>     zpci_ccdf_avail *ccdf)
>     >               }
>     >               zdev->fh = ccdf->fh;
>     >               zdev->state = ZPCI_FN_STATE_CONFIGURED;
>     > -             zpci_create_device(zdev);
>     > +             ret = zpci_enable_device(zdev);
>     > +             if (ret)
>     > +                     break;
>     > +
>     > +             pdev = pci_scan_single_device(zdev->zbus->bus, zdev->devfn);
>     > +             if (!pdev)
>     > +                     break;
>     > +
>     > +             pci_bus_add_device(pdev);
>     > +             pci_lock_rescan_remove();
>     > +             pci_bus_add_devices(zdev->zbus->bus);
>     > +             pci_unlock_rescan_remove();
>     >               break;
>     >       case 0x0302: /* Reserved -> Standby */
>     >               if (!zdev) {
>     >
> 
> 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20200825/efb3e8f2/attachment.sig>


More information about the kernel-team mailing list