[PATCH 0/1][Jammy linux-azure] [Azure][CVM] Fix swiotlb_max_mapping_size() for potential bounce buffer allocation failure in storvsc
Tim Gardner
tim.gardner at canonical.com
Thu May 12 12:07:05 UTC 2022
BugLink: https://bugs.launchpad.net/bugs/1973169
SRU Justification
[Impact]
Description of problem:
When the v5.15 linux-azure kernel is used for CVM on Azure, it uses swiotlb for
bounce buffering. We recently found an issue in swiotlb_max_mapping_size(),
which is used by the SCSI subsystem APIs that the hv_storvsc driver relies on.
The issue is: currently swiotlb_max_mapping_size() always reports 256KB (i.e.
128 bounce buffer slots), but swiotlb_tbl_map_single() is unable to allocate a
bounce buffer for an unaligned 256KB request, and eventually it can get stuck,
producing the call trace below. (The trace was obtained from a SLES VM, but I
believe the issue exists in all distro kernels supporting CVM; Tianyu was able
to reproduce it in an Ubuntu CVM when trying to mount an XFS file system.)
[ 186.458666][ C1] swiotlb_tbl_map_single+0x396/0x920
[ 186.458669][ C1] swiotlb_map+0xaa/0x2d0
[ 186.458674][ C1] dma_direct_map_sg+0xee/0x2c0
[ 186.458677][ C1] __dma_map_sg_attrs+0x30/0x70
[ 186.458680][ C1] dma_map_sg_attrs+0xa/0x20
[ 186.458681][ C1] scsi_dma_map+0x35/0x40
[ 186.458684][ C1] storvsc_queuecommand+0x20b/0x890
[ 186.458696][ C1] scsi_queue_rq+0x606/0xb80
[ 186.458698][ C1] __blk_mq_try_issue_directly+0x149/0x1c0
[ 186.458702][ C1] blk_mq_try_issue_directly+0x15/0x50
[ 186.458704][ C1] blk_mq_submit_bio+0x4b6/0x620
[ 186.458706][ C1] __submit_bio+0xe8/0x160
[ 186.458708][ C1] submit_bio_noacct_nocheck+0xf0/0x2b0
[ 186.458713][ C1] submit_bio+0x42/0xd0
[ 186.458714][ C1] submit_bio_wait+0x54/0xb0
[ 186.458718][ C1] xfs_rw_bdev+0x180/0x1b0 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458769][ C1] xlog_do_io+0x8d/0x140 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458819][ C1] xlog_bread+0x1f/0x40 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458859][ C1] xlog_find_verify_cycle+0xc8/0x180 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458899][ C1] xlog_find_head+0x2ae/0x3a0 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458937][ C1] xlog_find_tail+0x44/0x360 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.458978][ C1] xlog_recover+0x2b/0x170 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.459056][ C1] xfs_log_mount+0x15b/0x270 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.459098][ C1] xfs_mountfs+0x49e/0x830 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.459224][ C1] xfs_fs_fill_super+0x5c2/0x7c0 [xfs 172cb9b0bc08b0ee82c7c88dc584daeab1b34d46]
[ 186.459303][ C1] get_tree_bdev+0x163/0x260
[ 186.459307][ C1] vfs_get_tree+0x25/0xc0
[ 186.459309][ C1] path_mount+0x704/0x9c0
Details: for example, the original physical address from the SCSI layer can be
0x1_0903_f200 with size = 256KB. When swiotlb_tbl_map_single() calls
swiotlb_find_slots(), it passes "alloc_size + offset" (i.e. 256KB + 0x200), and
swiotlb_find_slots() then calculates "nslots = nr_slots(alloc_size) ==> 129"
and fails to allocate a bounce buffer, because the maximum allowable number of
contiguous slabs to map is IO_TLB_SEGSIZE (128).
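The following is a minimal user-space sketch of that slot arithmetic (not the
kernel code itself); the constants mirror kernel/dma/swiotlb.c and the address
is the example above:

#include <stdio.h>

#define IO_TLB_SHIFT     11
#define IO_TLB_SIZE      (1UL << IO_TLB_SHIFT)  /* 2048-byte bounce buffer slot */
#define IO_TLB_SEGSIZE   128                    /* max contiguous slots per mapping */
#define HV_HYP_PAGE_SIZE 4096UL                 /* storvsc's min_align_mask is this - 1 */

/* DIV_ROUND_UP(val, IO_TLB_SIZE), as nr_slots() does in the kernel */
static unsigned long nr_slots(unsigned long val)
{
	return (val + IO_TLB_SIZE - 1) >> IO_TLB_SHIFT;
}

int main(void)
{
	unsigned long orig_addr      = 0x10903f200UL;   /* example address from above */
	unsigned long alloc_size     = 256 * 1024;      /* request size from the SCSI layer */
	unsigned long min_align_mask = HV_HYP_PAGE_SIZE - 1;

	/* like swiotlb_align_offset(): low bits that must be preserved */
	unsigned long offset = orig_addr & min_align_mask & (IO_TLB_SIZE - 1);
	unsigned long nslots = nr_slots(alloc_size + offset);

	printf("offset = 0x%lx, nslots = %lu, limit = %d\n",
	       offset, nslots, IO_TLB_SEGSIZE);
	/* prints: offset = 0x200, nslots = 129, limit = 128 -> can never succeed */
	return 0;
}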
The issue affects the hv_storvsc driver, as it calls
dma_set_min_align_mask(&device->device, HV_HYP_PAGE_SIZE - 1);
dma_set_min_align_mask() is also called by hv_netvsc, but netvsc is not affected
because it never calls swiotlb_tbl_map_single() with a size close to 256KB.
dma_set_min_align_mask() is also called by the NVMe driver, but since we don't
currently support PCI device assignment for CVM, NVMe is not affected either.
Tianyu Lan has posted a fix, which is currently under review:
https://lwn.net/ml/linux-kernel/20220510142109.777738-1-ltykernel%40gmail.com/
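For reference, a rough sketch of the idea behind that patch (simplified; the
exact code under review may differ): have swiotlb_max_mapping_size() subtract
the worst-case alignment padding implied by the device's min_align_mask, so the
SCSI layer never issues a request that needs more than IO_TLB_SEGSIZE slots
once the offset is added:

/* Sketch only -- not the final code under review. */
size_t swiotlb_max_mapping_size(struct device *dev)
{
	unsigned int min_align_mask = dma_get_min_align_mask(dev);
	size_t min_align = 0;

	/*
	 * swiotlb_find_slots() adds an alignment offset of up to
	 * min_align_mask bytes to the allocation, so reserve room for it;
	 * with storvsc's 0xFFF mask this reports 252KB instead of 256KB.
	 */
	if (min_align_mask)
		min_align = roundup(min_align_mask, IO_TLB_SIZE);

	return ((size_t)IO_TLB_SIZE) * IO_TLB_SEGSIZE - min_align;
}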
Note: the linux-azure-cvm v5.4 kernel doesn't need the fix, as that kernel uses
a private vmbus bounce buffering implementation (drivers/hv/hv_bounce.c) rather
than swiotlb.
[Test Case]
Microsoft tested
[Where things could go wrong]
Bounce buffers may fail to allocate.
[Other Info]
SF: #00336634