APPLIED[L]: [SRU] [L/M/Unstable] [PATCH 0/2] Fix numerous AER related issues
Roxana Nicolescu
roxana.nicolescu at canonical.com
Fri Sep 1 09:21:13 UTC 2023
On 25/08/2023 10:19, Kai-Heng Feng wrote:
> BugLink: https://bugs.launchpad.net/bugs/2033025
>
> [Impact]
> Numerous issues are triggered by the AER/DPC services:
>
> - When AER is shared with PME, cutting power to the device can trigger
> an AER IRQ. Since the AER IRQ is shared with PME, it's treated like a
> wakeup source, preventing the system from entering sleep.
>
> - When the system resumes from S3, a device can reset itself and start
> sending PTM messages, triggering AER and resetting the entire
> hierarchy. Since the hardware/firmware starts before software, the
> kernel can never apply a band-aid soon enough.
>
> - Following on from the above, device firmware restarts before the
> kernel resumes. If DPC is then triggered, the device is gone with no
> way to recover it. We really want to prevent that from happening.
>
> [Fix]
> Disable and re-enable AER and DPC services on suspend and resume,
> respectively. Right now the PCI mailing list doesn't have a consensus
> on which PCI state (D3hot vs D3cold) the AER/DPC services should be
> disabled in, so re-instate the old workaround for now.
>
> [Test]
> Once the workaround is applied, the symptoms described above can no
> longer be observed.
>
> [Where problems could occur]
> Theoretically, some "real" issues could go unnoticed while AER is
> temporarily disabled, but the benefit far outweighs the downside.
>
> Kai-Heng Feng (2):
> UBUNTU: SAUCE: PCI/AER: Disable AER service during suspend, again
> UBUNTU: SAUCE: PCI/DPC: Disable DPC service during suspend, again
>
> drivers/pci/pcie/aer.c | 18 ++++++++++++++++
> drivers/pci/pcie/dpc.c | 49 ++++++++++++++++++++++++++++++++----------
> 2 files changed, 56 insertions(+), 11 deletions(-)
>
Applied to lunar:master-next. Thanks!
Roxana
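
For readers who have not seen the patches, the idea in the [Fix] section
of the cover letter can be illustrated with a small user-space model. This
is only a sketch and not the code from the series: `struct port_service`,
`SVC_IRQ_ENABLE`, `service_suspend()`, `service_resume()` and
`error_raises_irq()` are all hypothetical names, and a single integer
stands in for the real AER/DPC control registers. It shows why masking
error reporting before suspend keeps a power-off glitch from being seen
as an interrupt (and hence a wakeup), and why resume turns it back on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model, not kernel code: one control word per service. */
#define SVC_IRQ_ENABLE 0x1u /* illustrative "report errors via IRQ" bit */

struct port_service {
    const char *name; /* e.g. "aer" or "dpc" */
    uint32_t ctrl;    /* stand-in for the service's control register */
};

/* Suspend hook: stop the service from raising IRQs while the device
 * is being powered down. */
static void service_suspend(struct port_service *svc)
{
    assert(svc != NULL);
    svc->ctrl &= ~SVC_IRQ_ENABLE;
}

/* Resume hook: restore error reporting once the device is back. */
static void service_resume(struct port_service *svc)
{
    assert(svc != NULL);
    svc->ctrl |= SVC_IRQ_ENABLE;
}

/* True if an error event would currently raise an IRQ, which on a
 * line shared with PME would also look like a wakeup event. */
static bool error_raises_irq(const struct port_service *svc)
{
    assert(svc != NULL);
    return (svc->ctrl & SVC_IRQ_ENABLE) != 0;
}
```

In this model, calling `service_suspend()` before the device loses power
makes `error_raises_irq()` return false until `service_resume()` runs,
which is the behaviour the cover letter describes; the trade-off noted
under [Where problems could occur] is that a genuine error in that window
is not reported either.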