ACK/Cmnt: [PATCH 0/2] [FH:linux-azure] hv: vmbus: Fix duplicate CPU assignments within a device

Stefan Bader stefan.bader at canonical.com
Tue Aug 31 13:33:11 UTC 2021


On 20.08.21 21:00, Tim Gardner wrote:
> BugLink: https://bugs.launchpad.net/bugs/1937078
> 
> SRU Justification
> 
> [Impact]
> 
> Description of issue and solution:
> 
> The vmbus module uses a rotational algorithm to assign target CPUs to
> a device's channels. Depending on the timing of different device's channel
> offers, different channels of a device may be assigned to the same CPU.
> 
> For example on a VM with 2 CPUs, if NIC A and B's channels are offered
> in the following order, NIC A will have both channels on CPU0, and
> NIC B will have both channels on CPU1 -- see below. This kind of
> assignment causes RSS load that is spreading across different channels
> to end up on the same CPU.
> 
> Timing of channel offers:
> NIC A channel 0
> NIC B channel 0
> NIC A channel 1
> NIC B channel 1
> 
> VMBUS ID 14: Class_ID = {f8615163-df3e-46c5-913f-f2d2f965ed0e} - Synthetic network adapter
> Device_ID = {cab064cd-1f31-47d5-a8b4-9d57e320cccd}
> Sysfs path: /sys/bus/vmbus/devices/cab064cd-1f31-47d5-a8b4-9d57e320cccd
> Rel_ID=14, target_cpu=0
> Rel_ID=17, target_cpu=0
> 
> VMBUS ID 16: Class_ID = {f8615163-df3e-46c5-913f-f2d2f965ed0e} - Synthetic network adapter
> Device_ID = {244225ca-743e-4020-a17d-d7baa13d6cea}
> Sysfs path: /sys/bus/vmbus/devices/244225ca-743e-4020-a17d-d7baa13d6cea
> Rel_ID=16, target_cpu=1
> Rel_ID=18, target_cpu=1
> 
> Update the vmbus CPU assignment algorithm to avoid duplicate CPU
> assignments within a device.
> 
> The new algorithm iterates num_online_cpus + 1 times.
> The existing rotational algorithm to find "next NUMA & CPU" is still here.
> But if the resulting CPU is already used by the same device, it will try
> the next CPU.
> In the last iteration, it assigns the channel to the next available CPU
> like the existing algorithm. This is not normally expected, because
> during device probe, we limit the number of channels of a device to
> be <= number of online CPUs.
> 
> [Fix]
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/commit/?h=hyperv-fixes&id=7c9ff3deeee61b253715dcf968a6307af148c9b2
> 
> [Test Plan]
> 
> Microsoft tested both patches. No regressions detected.
> Performance criteria satisfied.
> 
> [Where problems could occur]
> 
> Network performance issues could persist.
> 
> [Other Info]
> 
> SF: #00315347
> 
> 

I assume those should go to the custom kernels only (the base kernels might have 
the same files but likely deviate in that area). I have adjusted the bug report 
accordingly.

Acked-by: Stefan Bader <stefan.bader at canonical.com>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20210831/8c07ce80/attachment-0001.sig>


More information about the kernel-team mailing list