ACK: [PATCH][focal:linux-azure] Drivers: hv: vmbus: Fix duplicate CPU assignments within a device
Stefan Bader
stefan.bader at canonical.com
Wed Nov 3 08:18:00 UTC 2021
On 01.11.21 15:20, Tim Gardner wrote:
> From: Haiyang Zhang <haiyangz at microsoft.com>
>
> BugLink: https://bugs.launchpad.net/bugs/1937078
>
> commit 7c9ff3deeee61b253715dcf968a6307af148c9b2 upstream
>
> The vmbus module uses a rotational algorithm to assign target CPUs to
> a device's channels. Depending on the timing of different device's channel
> offers, different channels of a device may be assigned to the same CPU.
>
> For example on a VM with 2 CPUs, if NIC A and B's channels are offered
> in the following order, NIC A will have both channels on CPU0, and
> NIC B will have both channels on CPU1 -- see below. This kind of
> assignment causes RSS load that is spreading across different channels
> to end up on the same CPU.
>
> Timing of channel offers:
> NIC A channel 0
> NIC B channel 0
> NIC A channel 1
> NIC B channel 1
>
> VMBUS ID 14: Class_ID = {f8615163-df3e-46c5-913f-f2d2f965ed0e} - Synthetic network adapter
> Device_ID = {cab064cd-1f31-47d5-a8b4-9d57e320cccd}
> Sysfs path: /sys/bus/vmbus/devices/cab064cd-1f31-47d5-a8b4-9d57e320cccd
> Rel_ID=14, target_cpu=0
> Rel_ID=17, target_cpu=0
>
> VMBUS ID 16: Class_ID = {f8615163-df3e-46c5-913f-f2d2f965ed0e} - Synthetic network adapter
> Device_ID = {244225ca-743e-4020-a17d-d7baa13d6cea}
> Sysfs path: /sys/bus/vmbus/devices/244225ca-743e-4020-a17d-d7baa13d6cea
> Rel_ID=16, target_cpu=1
> Rel_ID=18, target_cpu=1
>
> Update the vmbus CPU assignment algorithm to avoid duplicate CPU
> assignments within a device.
>
> The new algorithm iterates num_online_cpus + 1 times.
> The existing rotational algorithm to find "next NUMA & CPU" is still here.
> But if the resulting CPU is already used by the same device, it will try
> the next CPU.
> In the last iteration, it assigns the channel to the next available CPU
> like the existing algorithm. This is not normally expected, because
> during device probe, we limit the number of channels of a device to
> be <= number of online CPUs.
>
> Signed-off-by: Haiyang Zhang <haiyangz at microsoft.com>
> Reviewed-by: Michael Kelley <mikelley at microsoft.com>
> Tested-by: Michael Kelley <mikelley at microsoft.com>
> Link: https://lore.kernel.org/r/1626459673-17420-1-git-send-email-haiyangz@microsoft.com
> Signed-off-by: Wei Liu <wei.liu at kernel.org>
> (backported from commit 7c9ff3deeee61b253715dcf968a6307af148c9b2)
> Signed-off-by: Tim Gardner <tim.gardner at canonical.com>
> [ rtg - backported by Haiyang Zhang and has been
> verified to fix the bug on A2 and D32 instance types. ]
Acked-by: Stefan Bader <stefan.bader at canonical.com>
> ---
>
> v2 - v1 of this patch caused boot hangs on certain instance types and was dropped from
> the SRU cycle (https://bugs.launchpad.net/ubuntu/+source/linux-azure/+bug/1937078/comments/6).
> This backport was developed by the original author who likely has a better handle on
> things then I do. From SF, "Haiyang backported the patch to azure-5.4-next (attached). He
> also reproduced the bug on 5.4 kernel with VM size A2 and D32, and verified the patch fixes the bug."
>
> ---
> drivers/hv/channel_mgmt.c | 33 +++++++++++++++++++++++++++++++++
> 1 file changed, 33 insertions(+)
>
> diff --git a/drivers/hv/channel_mgmt.c b/drivers/hv/channel_mgmt.c
> index 2b5d20965b1bf..ceac42e48f3a2 100644
> --- a/drivers/hv/channel_mgmt.c
> +++ b/drivers/hv/channel_mgmt.c
> @@ -628,6 +628,32 @@ static void vmbus_process_offer(struct vmbus_channel *newchannel)
> queue_work(wq, &newchannel->add_channel_work);
> }
>
> +/*
> + * Check if CPU is used by other channels of the same device.
> + * It should only be called by init_vp_index().
> + */
> +static bool hv_cpuself_used(u32 cpu, struct vmbus_channel *chn)
> +{
> + struct vmbus_channel *primary = chn->primary_channel;
> + struct vmbus_channel *sc;
> + bool self_used = false;
> + unsigned long flags;
> +
> + if (!primary)
> + return false;
> +
> + if (primary->target_cpu == cpu)
> + return true;
> +
> + spin_lock_irqsave(&primary->lock, flags);
> + list_for_each_entry(sc, &primary->sc_list, sc_list)
> + if (sc != chn && sc->target_cpu == cpu)
> + self_used = true;
> + spin_unlock_irqrestore(&primary->lock, flags);
> +
> + return self_used;
> +}
> +
> /*
> * We use this state to statically distribute the channel interrupt load.
> */
> @@ -654,6 +680,7 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
> u32 cur_cpu;
> bool perf_chn = vmbus_devs[dev_type].perf_device;
> struct vmbus_channel *primary = channel->primary_channel;
> + u32 cnt = 1, ncpu = num_online_cpus();
> int next_node;
> cpumask_var_t available_mask;
> struct cpumask *alloced_mask;
> @@ -676,6 +703,7 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
>
> spin_lock(&bind_channel_to_cpu_lock);
>
> +retry:
> /*
> * Based on the channel affinity policy, we will assign the NUMA
> * nodes.
> @@ -755,6 +783,11 @@ static void init_vp_index(struct vmbus_channel *channel, u16 dev_type)
> }
> }
>
> + if (channel->affinity_policy == HV_BALANCED &&
> + channel->offermsg.offer.sub_channel_index < ncpu &&
> + cnt++ < ncpu + 1 && hv_cpuself_used(cur_cpu, channel))
> + goto retry;
> +
> channel->target_cpu = cur_cpu;
> channel->target_vp = hv_cpu_number_to_vp_number(cur_cpu);
>
>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: OpenPGP_signature
Type: application/pgp-signature
Size: 833 bytes
Desc: OpenPGP digital signature
URL: <https://lists.ubuntu.com/archives/kernel-team/attachments/20211103/d02d9ea9/attachment.sig>
More information about the kernel-team
mailing list