[SRU][F][PATCH 0/1] NULL pointer dereference when configuring multi-function with devfn != 0 before devfn == 0 (LP: 1903682)
frank.heimes at canonical.com
Thu Nov 12 09:44:28 UTC 2020
BugLink: https://bugs.launchpad.net/bugs/1903682
SRU Justification:
[Impact]
* When handling multi-function devices in zPCI, the UID of the PCI function with function number 0 (which always exists according to the PCI spec) is used as the domain number.
* Therefore, if functions with a function number larger than 0 are hot-plugged before function 0, they need to be held in standby until the domain and bus have been created.
* This has been tested during development of this feature using a patched QEMU and in DPM, but unfortunately never in classic/traditional HMC mode.
* On a machine in classic/traditional mode with a multi-function device, after a hot plug ("Reassign I/O Path") of the FID of the second port of the LPAR, any additional hot plug (and even just deconfiguring a PCI device) will hang, and hot plug then makes the entire Linux instance unresponsive.
* The reason for this is a NULL pointer dereference when configuring a multi-function device with devfn != 0 before devfn == 0.
* This issue was introduced with the topology-aware PCI enumeration code.
[Fix]
* 0b2ca2c7d0c9e2731d01b6c862375d44a7e13923 0b2ca2c7d0c9 "s390/pci: fix hot-plug of PCI function missing bus"
[Test Case]
* IBM Z or LinuxONE hardware in classic/traditional mode, equipped with hot-pluggable, multi-function PCIe cards (for example RoCE Express 2 adapters).
* An Ubuntu OS running in an LPAR with a kernel that includes the topology-aware PCI enumeration code (for example 20.04.1 w/o further updates, or the 20.10 GA kernel).
* Now on a system that is in classic/traditional mode, hot plug ("Reassign I/O Path") a multi-function device, but using the FID of the second port.
[Regression Potential]
* There is at least some regression risk, but I consider it as low, because:
* Even if the modification is only a single if statement (that spans two lines) in 'zpci_event_availability', it could harm the zPCI event management even more. In the worst case it could break hot plug not only for systems in classic/traditional mode, but also in DPM mode (making the system hang), or for all ports.
* In such a case no enabling / disabling of devices would be possible.
* But the fix is very simple and straightforward: it checks zdev->zbus->bus for being NULL and in that case breaks out of the event handling instead of calling the PCI common code pci_scan_single_device() with the NULL pointer.
* PCIe devices are usually more optional devices on s390x (compared to CCW and OSA devices for network) and this affects the zPCI subsystem only, which is unique to s390x.
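The guard described above can be illustrated with a minimal user-space sketch. This is not the upstream patch: all sketch_* types and functions are hypothetical stand-ins for the kernel structures, and only the NULL check on zdev->zbus->bus is modeled, under the assumption that the bus pointer stays NULL until function 0 has created the bus.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; only the fields
 * needed to show the guard are modeled. */
struct sketch_pci_bus {
	int scanned_devfn;              /* last devfn "scanned" on this bus */
};

struct sketch_zbus {
	struct sketch_pci_bus *bus;     /* assumed NULL until function 0 appears */
};

struct sketch_zdev {
	struct sketch_zbus *zbus;
	int devfn;
};

enum sketch_result { SKETCH_DEFERRED, SKETCH_SCANNED };

/* Stand-in for pci_scan_single_device(): it dereferences the bus
 * pointer, which is what crashes when the bus does not exist yet. */
static void sketch_scan_single_device(struct sketch_pci_bus *bus, int devfn)
{
	bus->scanned_devfn = devfn;
}

/* Models the configure path: skip the scan while the bus is missing. */
static enum sketch_result sketch_configure_function(struct sketch_zdev *zdev)
{
	assert(zdev != NULL && zdev->zbus != NULL);

	/* The guard: without a bus, do not call into the scan code. */
	if (!zdev->zbus->bus)
		return SKETCH_DEFERRED;

	sketch_scan_single_device(zdev->zbus->bus, zdev->devfn);
	return SKETCH_SCANNED;
}
```

In this sketch, a function with devfn 1 configured before the bus exists is deferred rather than scanned; once the bus pointer is set, the same call performs the scan.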
[Other]
* The patch was accepted upstream with kernel v5.10-rc3, hence it will land sooner or later in Hirsute.
* The patch has also been tagged for the upstream stable v5.8 series, hence it will land in Groovy (based on the kernel team's regular 'Groovy update: v5.8.x upstream stable release' LP bug).
* Hence requesting this Kernel SRU for Focal only, since Ubuntu releases older than Focal do not have the topology-aware zPCI enumeration code.
Niklas Schnelle (1):
s390/pci: fix hot-plug of PCI function missing bus
arch/s390/pci/pci_event.c | 4 ++++
1 file changed, 4 insertions(+)
--
2.25.1