ACK/Cmnt: [PATCH 0/2] [FH:linux-azure] hv: vmbus: Fix duplicate CPU assignments within a device

Tim Gardner tim.gardner at canonical.com
Tue Aug 31 14:08:18 UTC 2021


Correct, these patches are unique to Hyper-V. I've noticed that when I
create a bug and assign a package, LP doesn't always retain the
specified package; instead it falls back to linux when I really wanted
a derivative (like linux-azure).

On 8/31/21 7:33 AM, Stefan Bader wrote:
> On 20.08.21 21:00, Tim Gardner wrote:
>> BugLink: https://bugs.launchpad.net/bugs/1937078
>>
>> SRU Justification
>>
>> [Impact]
>>
>> Description of issue and solution:
>>
>> The vmbus module uses a rotational algorithm to assign target CPUs to
>> a device's channels. Depending on the timing of different devices'
>> channel offers, different channels of a device may be assigned to the
>> same CPU.
>>
>> For example on a VM with 2 CPUs, if NIC A and B's channels are offered
>> in the following order, NIC A will have both channels on CPU0, and
>> NIC B will have both channels on CPU1 -- see below. This kind of
>> assignment causes RSS load that is spreading across different channels
>> to end up on the same CPU.
>>
>> Timing of channel offers:
>> NIC A channel 0
>> NIC B channel 0
>> NIC A channel 1
>> NIC B channel 1
>>
>> VMBUS ID 14: Class_ID = {f8615163-df3e-46c5-913f-f2d2f965ed0e} - Synthetic network adapter
>> Device_ID = {cab064cd-1f31-47d5-a8b4-9d57e320cccd}
>> Sysfs path: /sys/bus/vmbus/devices/cab064cd-1f31-47d5-a8b4-9d57e320cccd
>> Rel_ID=14, target_cpu=0
>> Rel_ID=17, target_cpu=0
>>
>> VMBUS ID 16: Class_ID = {f8615163-df3e-46c5-913f-f2d2f965ed0e} - Synthetic network adapter
>> Device_ID = {244225ca-743e-4020-a17d-d7baa13d6cea}
>> Sysfs path: /sys/bus/vmbus/devices/244225ca-743e-4020-a17d-d7baa13d6cea
>> Rel_ID=16, target_cpu=1
>> Rel_ID=18, target_cpu=1
>>
>> Update the vmbus CPU assignment algorithm to avoid duplicate CPU
>> assignments within a device.
>>
>> The new algorithm iterates num_online_cpus + 1 times.
>> The existing rotational algorithm to find "next NUMA & CPU" is still
>> here. But if the resulting CPU is already used by the same device, it
>> will try the next CPU.
>> In the last iteration, it assigns the channel to the next available CPU
>> like the existing algorithm. This is not normally expected, because
>> during device probe, we limit the number of channels of a device to
>> be <= number of online CPUs.
>>
>> [Fix]
>>
>> https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/commit/?h=hyperv-fixes&id=7c9ff3deeee61b253715dcf968a6307af148c9b2 
>>
>>
>> [Test Plan]
>>
>> Microsoft tested both patches. No regressions detected.
>> Performance criteria satisfied.
>>
>> [Where problems could occur]
>>
>> Network performance issues could persist.
>>
>> [Other Info]
>>
>> SF: #00315347
>>
>>
> 
> I assume those should go to the custom kernels only (the base kernels 
> might have the same files but likely deviate in that area). I have 
> adjusted the bug report accordingly.
> 
> Acked-by: Stefan Bader <stefan.bader at canonical.com>
> 
> 
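
For readers following the quoted description above, here is a minimal
Python sketch of the two assignment strategies. It is a toy model, not
the kernel code: it assumes a single NUMA node and a plain round-robin
cursor, and the names RotationalAssigner and simulate are hypothetical
and do not appear in the patch.

```python
from collections import defaultdict


class RotationalAssigner:
    """Toy model of channel-to-CPU assignment (single NUMA node assumed)."""

    def __init__(self, num_online_cpus, avoid_duplicates):
        self.num_online_cpus = num_online_cpus
        self.avoid_duplicates = avoid_duplicates
        self.next_cpu = 0  # rotation cursor shared by all devices
        self.used = defaultdict(set)  # device -> CPUs already used by it

    def _rotate(self):
        # Hand out the current CPU and advance the shared cursor.
        cpu = self.next_cpu
        self.next_cpu = (self.next_cpu + 1) % self.num_online_cpus
        return cpu

    def assign(self, device):
        # Old behaviour: one attempt. New behaviour: num_online_cpus + 1
        # attempts, where the final attempt accepts whatever CPU comes up.
        attempts = self.num_online_cpus + 1 if self.avoid_duplicates else 1
        for i in range(attempts):
            cpu = self._rotate()
            is_last = i == attempts - 1
            if is_last or cpu not in self.used[device]:
                break
        self.used[device].add(cpu)
        return cpu


def simulate(offers, num_online_cpus, avoid_duplicates):
    """Replay channel offers in order; return device -> list of target CPUs."""
    assigner = RotationalAssigner(num_online_cpus, avoid_duplicates)
    result = defaultdict(list)
    for device in offers:
        result[device].append(assigner.assign(device))
    return dict(result)


# Offer order from the description: A ch0, B ch0, A ch1, B ch1 on 2 CPUs.
offers = ["A", "B", "A", "B"]
print(simulate(offers, 2, avoid_duplicates=False))
print(simulate(offers, 2, avoid_duplicates=True))
```

In this model the first call reproduces the reported layout (both of
NIC A's channels on CPU0, both of NIC B's on CPU1), while the second
spreads each NIC's channels across both CPUs. The extra final iteration
only matters if a device has more channels than online CPUs, which the
description says is not normally expected.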

-- 
-----------
Tim Gardner
Canonical, Inc