ACK: [focal:linux-azure, bionic:linux-azure-4.15][PATCH 0/5] Fix kdump Over Network
Colin Ian King
colin.king at canonical.com
Thu Oct 8 11:28:23 UTC 2020
On 07/10/2020 22:16, Kelsey Skunberg wrote:
> BugLink: https://bugs.launchpad.net/bugs/1883261
>
> [Impact]
>
> Microsoft would like to request two kdump related fixes in all releases
> supported on Azure. The two commits are:
>
> c81992e7f4aa1 ("PCI: hv: Retry PCI bus D0 entry on invalid device
> state")
> 83cc3508ffaa6 ("PCI: hv: Fix the PCI HyperV probe failure path
> to release resource properly")
>
> These are in the virtual PCI driver for Hyper-V. The customer visible
> symptom is that the network is not functional in the kdump kernel, so
> the dump file must be stored on the local disk and cannot be written
> over the network.
>
> The problem only occurs when Accelerated Networking is enabled. It’s a
> relatively obscure scenario, which is why the problem has not surfaced
> before now. But we have an important customer who wants the
> “dump-file-over-the-network” functionality to work.
>
> For bionic/linux-azure-4.15, the following additional patch needs to be
> backported first to allow the requested patches to apply cleanly:
>
> a8e37506e79a ("PCI: hv: Reorganize the code in preparation of
> hibernation")
>
> [Test Case]
>
> - Apply requested patches and boot into updated kernel
> - Verify Accelerated Networking is enabled
> - Set up kdump
> - Configure kdump to use SSH
> - Test the crash dump mechanism and verify the kernel crash dump appears
> on the selected remote server
>
> Further details for setting up kdump through testing can be found here:
> https://ubuntu.com/server/docs/kernel-crash-dump
>
> [Regression Potential]
>
> Patches are only targeted to azure kernels.
>
> Patches are designed to release allocated resources remaining after
> error cases in hv_pci_probe() or PCI devices not being shut down
> properly. If those resources are still not correctly released, then
> entering D0 state in the kdump kernel could continue to fail.
>
> A regression could show up in how resources are freed, or as a
> continued failure to enter D0 state in the kdump kernel even after all
> resources have been released.
>
> Build & boot tested. Verified kdump works as intended over SSH after
> patches are applied.
>
> Both 5.4 and 4.15 test kernels were sent to Microsoft. Both kernels
> were signed off on and verified to resolve the problem.
>
>
> Changes for Bionic/linux-azure-4.15:
>
>
> Dexuan Cui (1):
> PCI: hv: Reorganize the code in preparation of hibernation
>
> Wei Hu (2):
> PCI: hv: Fix the PCI HyperV probe failure path to release resource
> properly
> PCI: hv: Retry PCI bus D0 entry on invalid device state
>
> drivers/pci/host/pci-hyperv.c | 101 +++++++++++++++++++++++++++-------
> 1 file changed, 81 insertions(+), 20 deletions(-)
>
>
> Changes for Focal/linux-azure:
>
> Wei Hu (2):
> PCI: hv: Fix the PCI HyperV probe failure path to release resource
> properly
> PCI: hv: Retry PCI bus D0 entry on invalid device state
>
> drivers/pci/controller/pci-hyperv.c | 60 ++++++++++++++++++++++++++---
> 1 file changed, 54 insertions(+), 6 deletions(-)
>
> --
> 2.25.1
>
Thanks Kelsey; backports look good to me, good test case and results, I
think the regression potential vs benefit looks sane, so..
Acked-by: Colin Ian King <colin.king at canonical.com>