ACK: [SRU][F][PATCH 0/1] net/mlx5: Avoid processing commands before cmdif is ready (LP: 1987287)

Tim Gardner tim.gardner at canonical.com
Thu Sep 1 17:53:49 UTC 2022


On 9/1/22 10:53, frank.heimes at canonical.com wrote:
> BugLink: https://bugs.launchpad.net/bugs/1987287
> 
> SRU Justification:
> 
> [Impact]
> 
>   * If the mlx5 driver is reloaded while the recovery flow is running,
>     and it receives new commands before the command interface is up
>     again, this can lead to a NULL pointer dereference when the
>     uninitialized command structures are accessed.
> 
>   * So it's required to avoid processing commands before the command
>     interface is up again.
> 
>   * This is accomplished by a new cmdif state that helps to avoid
>     processing commands while cmdif is not ready.
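
For readers who want the idea in code: the state gate described above
can be sketched as follows. This is a minimal user-space illustration,
not the code from the patch. All identifiers (cmdif_state, cmd_iface,
cmdif_set_state, cmd_exec) are hypothetical names chosen for this
example, and the sketch ignores the locking a real driver would need.

```c
#include <errno.h>

/* Hypothetical state names; not the identifiers used by the mlx5 patch. */
enum cmdif_state {
	CMDIF_STATE_UNINITIALIZED,	/* command structures not set up yet */
	CMDIF_STATE_UP,			/* safe to submit commands */
	CMDIF_STATE_DOWN,		/* torn down, e.g. on unload or recovery */
};

/* Hypothetical command interface; a real one would also hold queues. */
struct cmd_iface {
	enum cmdif_state state;
	int last_opcode;	/* last opcode that was accepted */
	int submitted;		/* number of accepted commands */
};

/* Called by the init and teardown paths to publish the current state. */
static void cmdif_set_state(struct cmd_iface *cmd, enum cmdif_state state)
{
	cmd->state = state;
}

/*
 * Reject the command early unless the interface is up, so that no
 * uninitialized command structure is ever touched.
 */
static int cmd_exec(struct cmd_iface *cmd, int opcode)
{
	if (cmd->state != CMDIF_STATE_UP)
		return -EIO;

	cmd->last_opcode = opcode;
	cmd->submitted++;
	return 0;
}
```

In this sketch a command issued before the state is set to UP, or after
it is set to DOWN, fails with -EIO instead of reaching the command
structures.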
> 
> [Fix]
> 
>   * backport of f7936ddd35d8b849daf0372770c7c9dbe7910fca
>     "net/mlx5: Avoid processing commands before cmdif is ready"
> 
> [Test Plan]
> 
>   * An Ubuntu Server for s390x 18.04 or 20.04 LPAR or z/VM installation
>     is needed that has Mellanox cards (RoCE Express 2.1) assigned,
>     configured and enabled and that runs a 5.4 kernel (on bionic hwe-5.4).
> 
>   * Now trigger a recovery (presumably this can be done at the
>     Support Element) and reload the driver at the same time.
> 
>   * Make sure the module/driver mlx5 is loaded and in use
>     (otherwise it can't be removed/unloaded).
> 
>   * Now remove/unload the module with:
>     sudo modprobe -r mlx5
>     and (re-)load it again with:
>     sudo modprobe mlx5
> 
>   * Due to the lack of RoCE Express 2.1 hardware,
>     IBM needs to do the verification.
> 
> [Where problems could occur]
> 
>   * If there is an issue with 'cmdif', it might not have the correct
>     interface state, in which case:
>     - either commands are not properly blocked
>       and the situation is similar to before
>     - or commands are always blocked,
>       which renders the hardware useless
>     - or commands are blocked in the wrong situation,
>       which will cause unexpected issues and broken behavior.
> 
>   * Since the patch was accepted upstream in v5.7-rc7, it's
>     not new to the kernel, was already part of groovy (and above)
>     and is therefore already in use by newer Ubuntu releases.
> 
> [Other Info]
>   
>   * Because the patch has been upstream since v5.7-rc7,
>     it's already included in jammy and kinetic.
> 
>   * Since the upstream patch includes the line:
>     Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox
>     Connect-IB adapters") it looks to me as if marking the patch
>     for upstream stable updates was forgotten.
> 
>   * Such SRUs for focal's 5.4 will automatically land in bionic's
>     hwe-5.4, too. But since this was specifically requested for
>     bionic's hwe-5.4, I wanted to mention it here.
> 
> Eran Ben Elisha (1):
>    net/mlx5: Avoid processing commands before cmdif is ready
> 
>   drivers/net/ethernet/mellanox/mlx5/core/cmd.c  | 10 ++++++++++
>   drivers/net/ethernet/mellanox/mlx5/core/main.c |  4 ++++
>   include/linux/mlx5/driver.h                    |  9 +++++++++
>   3 files changed, 23 insertions(+)
> 
Acked-by: Tim Gardner <tim.gardner at canonical.com>

-- 
-----------
Tim Gardner
Canonical, Inc


