ACK: [SRU][F][PATCH 0/2] CVE-2024-26891

Thibault Ferrante thibault.ferrante at canonical.com
Fri Sep 6 14:33:45 UTC 2024


Acked-by: Thibault Ferrante <thibault.ferrante at canonical.com>


On 06-09-2024 15:28, Koichiro Den wrote:
> [Impact]
> 
> iommu/vt-d: Don't issue ATS Invalidation request when device is disconnected
> 
> For those endpoint devices connect to system via hotplug capable ports,
> users could request a hot reset to the device by flapping device's link
> through setting the slot's link control register, as pciehp_ist() DLLSC
> interrupt sequence response, pciehp will unload the device driver and
> then power it off. thus cause an IOMMU device-TLB invalidation (Intel
> VT-d spec, or ATS Invalidation in PCIe spec r6.1) request for non-existence
> target device to be sent and deadly loop to retry that request after ITE
> fault triggered in interrupt context.
> 
> That would cause continuous hard lockup warning and system hang.
> 
> Such issue could be triggered by all kinds of regular surprise removal
> hotplug operation. like:
> 
> 1. pull EP(endpoint device) out directly.
> 2. turn off EP's power.
> 3. bring the link down.
> etc.
> 
> this patch aims to work for regular safe removal and surprise removal
> unplug. these hot unplug handling process could be optimized for fix the
> ATS Invalidation hang issue by calling pci_dev_is_disconnected() in
> function devtlb_invalidation_with_pasid() to check target device state to
> avoid sending meaningless ATS Invalidation request to iommu when device is
> gone. (see IMPLEMENTATION NOTE in PCIe spec r6.1 section 10.3.1)
> 
> For safe removal, device wouldn't be removed until the whole software
> handling process is done, it wouldn't trigger the hard lock up issue
> caused by too long ATS Invalidation timeout wait. In safe removal path,
> device state isn't set to pci_channel_io_perm_failure in
> pciehp_unconfigure_device() by checking 'presence' parameter, calling
> pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() will return
> false there, wouldn't break the function.
> 
> For surprise removal, device state is set to pci_channel_io_perm_failure in
> pciehp_unconfigure_device(), means device is already gone (disconnected)
> call pci_dev_is_disconnected() in devtlb_invalidation_with_pasid() will
> return true to break the function not to send ATS Invalidation request to
> the disconnected device blindly, thus avoid to trigger further ITE fault,
> and ITE fault will block all invalidation request to be handled.
> furthermore retry the timeout request could trigger hard lockup.
> 
> safe removal (present) & surprise removal (not present)
> 
> pciehp_ist()
>     pciehp_handle_presence_or_link_change()
>       pciehp_disable_slot()
>         remove_board()
>           pciehp_unconfigure_device(presence) {
>             if (!presence)
>                  pci_walk_bus(parent, pci_dev_set_disconnected, NULL);
>             }
> 
> this patch works for regular safe removal and surprise removal of ATS
> capable endpoint on PCIe switch downstream ports.
> 
> [Backport]
> 
> To backport the main patch, the pci_dev_is_disconnected() helper needs
> to be made public. Thus, cherry-picked commit 39714fd73c6 ("PCI: Make
> pci_dev_is_disconnected() helper public for other drivers").
> 
> Additionally, context adjustment were needed due to missing commit
> 672cf6df9b8a ("iommu/vt-d: Move Intel IOMMU driver into subdirectory")
> 
> [Fix]
> 
> Noble:  fixed via stable
> Jammy:  fixed via stable
> Focal:  Backport - adjusted contexts due to missing commits, see [Backport]
> Bionic: not affected
> Xenial: not affected
> Trusty: not affected
> 
> [Test Case]
> 
> Compile and boot tested
> 
> [Where problems could occur]
> 
> This fix potentially impacts intel architectures where an IOMMU capable
> of SM address translation is active, an issue with this fix would induce
> never succeeding device-TLB invalidation against no longer existing
> endpoint after its surprise removal, leading to hard lockup and system
> hang.
> 
> 
> Ethan Zhao (2):
>    PCI: Make pci_dev_is_disconnected() helper public for other drivers
>    iommu/vt-d: Don't issue ATS Invalidation request when device is
>      disconnected
> 
>   drivers/iommu/intel-pasid.c | 3 +++
>   drivers/pci/pci.h           | 5 -----
>   include/linux/pci.h         | 5 +++++
>   3 files changed, 8 insertions(+), 5 deletions(-)
> 


-- 
--
Thibault



More information about the kernel-team mailing list