ACK: [SRU][F][PATCH 0/2] CVE-2024-26891
Thibault Ferrante
thibault.ferrante at canonical.com
Fri Sep 6 14:33:45 UTC 2024
Acked-by: Thibault Ferrante <thibault.ferrante at canonical.com>
On 06-09-2024 15:28, Koichiro Den wrote:
> [Impact]
>
> iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
>
> For endpoint devices connected to the system via hotplug-capable ports,
> users can request a hot reset of the device by flapping the device's link
> through the slot's link control register. In response to the DLLSC
> interrupt, pciehp_ist() unloads the device driver and then powers the
> device off. This causes an IOMMU device-TLB invalidation request (Intel
> VT-d spec; ATS Invalidation in PCIe spec r6.1) to be sent to a
> non-existent target device, and once the ITE fault is triggered the
> request is retried in an endless loop in interrupt context.
>
> This causes continuous hard lockup warnings and a system hang.
>
> Such an issue can be triggered by any kind of regular surprise-removal
> hotplug operation, for example:
>
> 1. pull EP(endpoint device) out directly.
> 2. turn off EP's power.
> 3. bring the link down.
> etc.
>
> This patch aims to work for both regular safe removal and surprise
> removal. The hot-unplug handling can fix the ATS Invalidation hang by
> calling pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() to
> check the target device state, which avoids sending a meaningless ATS
> Invalidation request to the IOMMU when the device is gone. (See
> IMPLEMENTATION NOTE in PCIe spec r6.1 section 10.3.1.)
>
> For safe removal, the device is not removed until the whole software
> handling process is done, so the hard lockup caused by an overly long
> ATS Invalidation timeout wait is not triggered. In the safe removal path,
> pciehp_unconfigure_device() checks the 'presence' parameter and does not
> set the device state to pci_channel_io_perm_failure, so
> pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() returns
> false and the function proceeds as before.
>
> For surprise removal, pciehp_unconfigure_device() sets the device state
> to pci_channel_io_perm_failure, meaning the device is already gone
> (disconnected). pci_dev_is_disconnected() in
> devtlb_invalidation_with_pasid() then returns true, and the function
> returns early instead of blindly sending an ATS Invalidation request to
> the disconnected device. This avoids triggering a further ITE fault,
> which would block all invalidation requests from being handled.
> Furthermore, retrying the timed-out request could trigger a hard lockup.
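
For reference, the guard described above can be modelled in user space roughly as follows. This is only a sketch under stated assumptions: the struct is a cut-down stand-in for the kernel's struct pci_dev, the function takes the PCI device directly instead of the real driver arguments, and ats_requests_sent is a hypothetical counter that stands in for actually queuing an invalidation descriptor.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the kernel types; only what the check needs. */
enum pci_channel_state {
	pci_channel_io_normal = 1,
	pci_channel_io_frozen = 2,
	pci_channel_io_perm_failure = 3,
};

struct pci_dev {
	enum pci_channel_state error_state;
};

/* Same test the helper performs once it is public in include/linux/pci.h. */
static inline bool pci_dev_is_disconnected(const struct pci_dev *dev)
{
	return dev->error_state == pci_channel_io_perm_failure;
}

/* Hypothetical counter: stands in for submitting a device-TLB invalidation. */
static unsigned int ats_requests_sent;

/* Simplified signature (assumption): the real function takes other arguments. */
static void devtlb_invalidation_with_pasid(struct pci_dev *pdev)
{
	/* Device is gone: do not send a request that can only time out. */
	if (pci_dev_is_disconnected(pdev))
		return;

	ats_requests_sent++;
}
```

With a device in pci_channel_io_normal the counter is incremented; with a device in pci_channel_io_perm_failure the call returns without sending anything.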
>
> safe removal (present) & surprise removal (not present)
>
> pciehp_ist()
>   pciehp_handle_presence_or_link_change()
>     pciehp_disable_slot()
>       remove_board()
>         pciehp_unconfigure_device(presence) {
>           if (!presence)
>             pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
>         }
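
The last step of the call chain above can be sketched in user space as below. The flat struct pci_bus, the reduced struct pci_dev, and the simplified pciehp_unconfigure_device() signature are assumptions made for illustration; the real pci_walk_bus() also descends into subordinate buses, and the real function goes on to unbind drivers and remove the devices.

```c
#include <stdbool.h>
#include <stddef.h>

enum pci_channel_state {
	pci_channel_io_normal = 1,
	pci_channel_io_frozen = 2,
	pci_channel_io_perm_failure = 3,
};

struct pci_dev {
	enum pci_channel_state error_state;
};

/* Hypothetical flat bus: an array of devices, no subordinate buses. */
struct pci_bus {
	struct pci_dev *devices;
	size_t count;
};

/* Callback: mark one device as permanently failed, i.e. disconnected. */
static int pci_dev_set_disconnected(struct pci_dev *dev, void *unused)
{
	(void)unused;
	dev->error_state = pci_channel_io_perm_failure;
	return 0;
}

/* Call cb on every device of the bus; stop if cb returns non-zero. */
static void pci_walk_bus(struct pci_bus *bus,
			 int (*cb)(struct pci_dev *, void *), void *userdata)
{
	for (size_t i = 0; i < bus->count; i++)
		if (cb(&bus->devices[i], userdata))
			break;
}

/* Simplified signature (assumption): takes the parent bus directly. */
static void pciehp_unconfigure_device(struct pci_bus *parent, bool presence)
{
	/* Surprise removal: the device is already gone, so flag it first. */
	if (!presence)
		pci_walk_bus(parent, pci_dev_set_disconnected, NULL);

	/* Driver unbind and device removal would follow here. */
}
```

Calling pciehp_unconfigure_device() with presence set to true leaves every error_state untouched, which is why the safe removal path still sends its invalidation requests.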
>
> This patch works for regular safe removal and surprise removal of
> ATS-capable endpoints on PCIe switch downstream ports.
>
> [Backport]
>
> To backport the main patch, the pci_dev_is_disconnected() helper needs
> to be made public. Thus, commit 39714fd73c6 ("PCI: Make
> pci_dev_is_disconnected() helper public for other drivers") was
> cherry-picked.
>
> Additionally, context adjustments were needed due to missing commit
> 672cf6df9b8a ("iommu/vt-d: Move Intel IOMMU driver into subdirectory").
>
> [Fix]
>
> Noble: fixed via stable
> Jammy: fixed via stable
> Focal: Backport - adjusted contexts due to missing commits, see [Backport]
> Bionic: not affected
> Xenial: not affected
> Trusty: not affected
>
> [Test Case]
>
> Compile and boot tested
>
> [Where problems could occur]
>
> This fix potentially impacts Intel architectures where an IOMMU capable
> of scalable mode (SM) address translation is active. An issue with this
> fix would result in device-TLB invalidation requests that never succeed
> against an endpoint that no longer exists after its surprise removal,
> leading to hard lockup and system hang.
>
>
> Ethan Zhao (2):
>   PCI: Make pci_dev_is_disconnected() helper public for other drivers
>   iommu/vt-d: Don't issue ATS Invalidation request when device is
>     disconnected
>
>  drivers/iommu/intel-pasid.c | 3 +++
>  drivers/pci/pci.h           | 5 -----
>  include/linux/pci.h         | 5 +++++
>  3 files changed, 8 insertions(+), 5 deletions(-)
>
--
Thibault