[J][PATCH 1/6] s390/pci: tolerate inconsistent handle in recover

patricia.domingues at canonical.com
Thu Mar 17 15:34:58 UTC 2022

From: Niklas Schnelle <schnelle at linux.ibm.com>

BugLink: https://bugs.launchpad.net/bugs/1959532

Since commit 8256adda1f44 ("s390/pci: handle FH state mismatch only on
disable") zpci_disable_device() returns -EINVAL when the platform
detects an attempt to disable a PCI function that it sees as already
disabled.

In most situations we want to abort whenever this happens and abort is
possible since it either means that the device vanished but we haven't
gotten an availability event yet, or the FH got out of sync which should
not happen.

Unfortunately there is an inconsistency between the LPAR and z/VM
hypervisors on whether error events for PCI functions contain an
enabled or a general handle. So under z/VM it can happen that our
most up to date function handle is enabled but trying to disable the
function results in the aforementioned error.

Since recover is designed to be used to recover functions from the error
state let's make it robust to this inconsistency by explicitly treating
it as a successful disable.
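
As an illustration only, the tolerance described above can be modelled in
userspace roughly as follows. This is a sketch, not the kernel code:
struct fake_zdev, fake_disable_device() and recover_disable_step() are
hypothetical stand-ins, and the stub merely assumes that the real
zpci_disable_device() reports -EINVAL for an already disabled function.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct zpci_dev; only what the sketch needs. */
struct fake_zdev {
	bool fh_enabled;       /* what our cached function handle claims */
	bool platform_enabled; /* what the platform actually sees */
};

/* Stub modelling the disable call: -EINVAL if already disabled. */
static int fake_disable_device(struct fake_zdev *zdev)
{
	if (!zdev->platform_enabled)
		return -EINVAL;
	zdev->platform_enabled = false;
	zdev->fh_enabled = false;
	return 0;
}

/* Disable step of a recover path that tolerates the handle mismatch. */
static int recover_disable_step(struct fake_zdev *zdev)
{
	int ret = 0;

	if (zdev->fh_enabled) {
		ret = fake_disable_device(zdev);
		/* Handle says enabled, platform says disabled: not an error. */
		if (ret == -EINVAL)
			ret = 0;
	}
	return ret;
}
```

With fh_enabled set and platform_enabled clear (the z/VM case), the step
returns 0 instead of propagating -EINVAL, so recovery can continue.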

Acked-by: Pierre Morel <pmorel at linux.ibm.com>
Signed-off-by: Niklas Schnelle <schnelle at linux.ibm.com>
Signed-off-by: Vasily Gorbik <gor at linux.ibm.com>
(cherry picked from commit 1c8174fdc798489159a79466fca782daa231219a)
Signed-off-by: Patricia Domingues <patricia.domingues at canonical.com>
---
 arch/s390/pci/pci_sysfs.c | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/arch/s390/pci/pci_sysfs.c b/arch/s390/pci/pci_sysfs.c
index 335c281811c7..cae280e5c047 100644
--- a/arch/s390/pci/pci_sysfs.c
+++ b/arch/s390/pci/pci_sysfs.c
@@ -90,6 +90,14 @@ static ssize_t recover_store(struct device *dev, struct device_attribute *attr,
 		if (zdev_enabled(zdev)) {
 			ret = zpci_disable_device(zdev);
+			/*
+			 * Due to a z/VM vs LPAR inconsistency in the error
+			 * state the FH may indicate an enabled device but
+			 * disable says the device is already disabled; don't
+			 * treat it as an error here.
+			 */
+			if (ret == -EINVAL)
+				ret = 0;
 			if (ret)
 				goto out;