[PATCH 0/4][jammy/linux-azure] swiotlb patch needed for CVM

Tim Gardner tim.gardner at canonical.com
Thu May 5 12:39:45 UTC 2022


BugLink: https://bugs.launchpad.net/bugs/1971701

SRU Justification

[Impact]

[Azure][CVM] Include the swiotlb patch to increase the disk/network performance

Description
As we discussed, there will be new CVM-supporting linux-azure kernels that are based on
v5.13 and v5.15. Here I'm requesting that the patch below be included in the two kernels
because it can significantly improve the disk/network performance:

swiotlb: Split up single swiotlb lock: https://github.com/intel/tdx/commit/4529b5784c141782c72ec9bd9a92df2b68cb7d45

We have tested the patch with the upstream 5.16-rc8.
BTW, the patch is unlikely to be in the mainline kernel, as the community is trying
to resolve the lock contention issue in the swiotlb code using a different per-device
per-queue implementation, which would need quite some time to be finalized -- before
that happens, we need this out-of-tree patch to achieve good disk/network performance
for CVM GA on Azure.

(BTW, the v5.4-based linux-azure-cvm kernel does not need the patch, because it uses
a private bounce buffer implementation: drivers/hv/hv_bounce.c, which doesn’t have
the I/O performance issue caused by lock contention in the mainline kernel’s swiotlb code.)
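
For readers unfamiliar with the technique, the general idea behind splitting a single
lock can be sketched in a few lines. This is only a conceptual illustration in Python,
not the patch's actual code (the patch is C code in the kernel's swiotlb
implementation). The class name StripedSlotPool, its methods, and the "hint" parameter
are all hypothetical and are assumed purely for the example: a pool of bounce-buffer
slots is divided into areas, each guarded by its own lock, so that callers starting in
different areas do not contend on one global lock.

```python
import threading


class StripedSlotPool:
    """Toy slot pool split into independently locked areas (hypothetical)."""

    def __init__(self, total_slots, num_areas):
        if total_slots % num_areas != 0:
            raise ValueError("total_slots must be a multiple of num_areas")
        self.num_areas = num_areas
        self.slots_per_area = total_slots // num_areas
        # One lock and one free list per area, instead of one global lock.
        self.locks = [threading.Lock() for _ in range(num_areas)]
        self.free = [
            list(range(a * self.slots_per_area, (a + 1) * self.slots_per_area))
            for a in range(num_areas)
        ]

    def alloc(self, hint):
        """Return a free slot index, or None if every area is exhausted.

        `hint` stands in for something like a CPU number: callers with
        different hints start in different areas and take different locks.
        """
        start = hint % self.num_areas
        for i in range(self.num_areas):
            area = (start + i) % self.num_areas
            with self.locks[area]:
                if self.free[area]:
                    return self.free[area].pop()
        return None

    def release(self, slot):
        """Return a slot to the area it belongs to, under that area's lock."""
        area = slot // self.slots_per_area
        with self.locks[area]:
            self.free[area].append(slot)


# Two callers with different hints are served from different areas.
pool = StripedSlotPool(total_slots=8, num_areas=4)
first = pool.alloc(hint=0)
second = pool.alloc(hint=1)
```

With a single global lock, every alloc/release from every CPU would serialize on that
one lock, which is the kind of contention described above. With per-area locks, a
caller only falls back to another area's lock when its own area is full.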

[Test Case]

[Microsoft tested]

I tried the April-27 amd64 test kernel and it worked great for me:
1. The test kernel booted up successfully with 256 virtual CPUs + 100 GB memory.
2. The kernel worked when I changed the MTU of the NetVSC NIC.
3. The Hyper-V HeartBeat/TimeSync/ShutDown VMBus devices also worked as expected.
4. I did some quick disk I/O and network stress tests and found no issue.

When I did the above tests, I changed the low MMIO size to 3GB (which is the setting
for a VM on Azure today) by "set-vm decui-u2004-cvm -LowMemoryMappedIoSpace 3GB".

Our test team will do more testing, including performance tests. We expect the
performance of this v5.15 test kernel to be on par with the v5.4 linux-azure-cvm kernel.

[Where things could go wrong]

Networking could fail or continue to suffer from poor performance.

[Other Info]

SF: #00332721



