ACK/Cmnt: [PATCH 0/2] [FH:linux-azure] hv: vmbus: Fix duplicate CPU assignments within a device
Tim Gardner
tim.gardner at canonical.com
Tue Aug 31 14:08:18 UTC 2021
Correct - these patches are unique to Hyper-V. I've noticed that when I
create a bug and assign a package, LP doesn't always retain the
specified package; sometimes it falls back to linux when I really
wanted a derivative (like linux-azure).
On 8/31/21 7:33 AM, Stefan Bader wrote:
> On 20.08.21 21:00, Tim Gardner wrote:
>> BugLink: https://bugs.launchpad.net/bugs/1937078
>>
>> SRU Justification
>>
>> [Impact]
>>
>> Description of issue and solution:
>>
>> The vmbus module uses a rotational algorithm to assign target CPUs to
>> a device's channels. Depending on the timing of different devices'
>> channel offers, different channels of a device may be assigned to the
>> same CPU.
>>
>> For example on a VM with 2 CPUs, if NIC A and B's channels are offered
>> in the following order, NIC A will have both channels on CPU0, and
>> NIC B will have both channels on CPU1 -- see below. This kind of
>> assignment causes RSS load that should be spread across different
>> channels to end up on the same CPU.
>>
>> Timing of channel offers:
>> NIC A channel 0
>> NIC B channel 0
>> NIC A channel 1
>> NIC B channel 1
>>
>> VMBUS ID 14: Class_ID = {f8615163-df3e-46c5-913f-f2d2f965ed0e} - Synthetic network adapter
>> Device_ID = {cab064cd-1f31-47d5-a8b4-9d57e320cccd}
>> Sysfs path: /sys/bus/vmbus/devices/cab064cd-1f31-47d5-a8b4-9d57e320cccd
>> Rel_ID=14, target_cpu=0
>> Rel_ID=17, target_cpu=0
>>
>> VMBUS ID 16: Class_ID = {f8615163-df3e-46c5-913f-f2d2f965ed0e} - Synthetic network adapter
>> Device_ID = {244225ca-743e-4020-a17d-d7baa13d6cea}
>> Sysfs path: /sys/bus/vmbus/devices/244225ca-743e-4020-a17d-d7baa13d6cea
>> Rel_ID=16, target_cpu=1
>> Rel_ID=18, target_cpu=1
>>
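
The duplicate assignment shown above can be reproduced with a small
simulation. This is only a sketch, not the kernel code: it assumes a
single global rotating counter and ignores NUMA nodes, and the names
assign_round_robin and offers are made up for illustration.

```python
def assign_round_robin(offers, num_cpus):
    """Give each channel offer the next CPU in rotation.

    The device that owns the channel is ignored, which is the
    simplification that lets one device land twice on the same CPU.
    """
    next_cpu = 0
    assignment = {}
    for device, channel in offers:
        assignment[(device, channel)] = next_cpu
        next_cpu = (next_cpu + 1) % num_cpus
    return assignment


# Offer order taken from the example above (2 CPUs, 2 NICs, 2 channels each).
offers = [("A", 0), ("B", 0), ("A", 1), ("B", 1)]
rr_result = assign_round_robin(offers, num_cpus=2)

for (device, channel), cpu in rr_result.items():
    print(f"NIC {device} channel {channel} -> CPU{cpu}")
```

With this interleaved offer order, both channels of NIC A land on CPU0
and both channels of NIC B land on CPU1, matching the target_cpu values
listed above.
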
>> Update the vmbus CPU assignment algorithm to avoid duplicate CPU
>> assignments within a device.
>>
>> The new algorithm iterates num_online_cpus + 1 times. The existing
>> rotational algorithm to find the "next NUMA & CPU" is still used, but
>> if the resulting CPU is already used by the same device, it tries the
>> next CPU. In the last iteration, it assigns the channel to the next
>> available CPU, as the existing algorithm does. Reaching the last
>> iteration is not normally expected, because during device probe the
>> number of channels of a device is limited to be <= the number of
>> online CPUs.
>>
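
A minimal sketch of the retry logic described above, under the same
simplifying assumptions (one global rotating counter, no NUMA handling).
It is not the code from the upstream commit; assign_avoiding_duplicates
and used_by_device are hypothetical names chosen for this illustration.

```python
def assign_avoiding_duplicates(offers, num_cpus):
    """Rotate through CPUs, skipping CPUs the same device already uses.

    Each offer tries at most num_cpus + 1 candidates. The final attempt
    accepts whatever CPU comes next, so an assignment always succeeds
    even if a device has more channels than there are CPUs.
    """
    next_cpu = 0
    used_by_device = {}  # device -> set of CPUs already assigned to it
    assignment = {}
    for device, channel in offers:
        used = used_by_device.setdefault(device, set())
        for attempt in range(num_cpus + 1):
            candidate = next_cpu
            next_cpu = (next_cpu + 1) % num_cpus
            last_attempt = attempt == num_cpus
            if candidate not in used or last_attempt:
                break
        used.add(candidate)
        assignment[(device, channel)] = candidate
    return assignment


# Same interleaved offer order as in the problem description.
nic_offers = [("A", 0), ("B", 0), ("A", 1), ("B", 1)]
fixed_result = assign_avoiding_duplicates(nic_offers, num_cpus=2)

# Fallback case: three channels on two CPUs forces one duplicate.
overflow_result = assign_avoiding_duplicates(
    [("A", 0), ("A", 1), ("A", 2)], num_cpus=2
)
```

In this sketch each NIC ends up with its two channels on different
CPUs. The three-channel case only exercises the last-iteration
fallback, which the description says is not normally reached.
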
>> [Fix]
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/commit/?h=hyperv-fixes&id=7c9ff3deeee61b253715dcf968a6307af148c9b2
>>
>>
>> [Test Plan]
>>
>> Microsoft tested both patches. No regressions detected.
>> Performance criteria satisfied.
>>
>> [Where problems could occur]
>>
>> Network performance issues could persist.
>>
>> [Other Info]
>>
>> SF: #00315347
>>
>>
>
> I assume those should go to the custom kernels only (the base kernels
> might have the same files but likely deviate in that area). I have
> adjusted the bug report accordingly.
>
> Acked-by: Stefan Bader <stefan.bader at canonical.com>
>
>
--
-----------
Tim Gardner
Canonical, Inc