ACK: [SRU][F][PATCH 0/1] net/mlx5: Avoid processing commands before cmdif is ready (LP: 1987287)
Tim Gardner
tim.gardner at canonical.com
Thu Sep 1 17:53:49 UTC 2022
On 9/1/22 10:53, frank.heimes at canonical.com wrote:
> BugLink: https://bugs.launchpad.net/bugs/1987287
>
> SRU Justification:
>
> [Impact]
>
> * If the mlx5 driver is reloaded while the recovery flow is running
> and it receives new commands before the command interface is up
> again, this can lead to a null pointer dereference when the driver
> tries to access uninitialized command structures.
>
> * So it's required to avoid processing commands before the command
> interface is up again.
>
> * This is accomplished by a new cmdif state that helps to avoid
> processing commands while cmdif is not ready.
>
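For readers unfamiliar with the approach, the gating idea described above can be modeled in a few lines of userspace C. This is only a rough sketch: the names (cmdif_state, cmd_iface, cmdif_set_state, cmd_exec) and the choice of -ENXIO as the error code are assumptions made for illustration, not the identifiers or return values used by the backported patch.

```c
#include <errno.h>

/* Hypothetical userspace model of a command-interface state gate.
 * All names here are illustrative assumptions, not the driver's code. */
enum cmdif_state {
    CMDIF_UNINITIALIZED,  /* command structures not set up yet */
    CMDIF_UP,             /* safe to process commands */
    CMDIF_DOWN            /* interface torn down, e.g. during reload */
};

struct cmd_iface {
    enum cmdif_state state;
    int executed;         /* counts commands that were actually processed */
};

/* Record the new interface state. */
static void cmdif_set_state(struct cmd_iface *c, enum cmdif_state s)
{
    c->state = s;
}

/* Reject the command early unless the interface is up, so the
 * (possibly uninitialized) command structures are never touched. */
static int cmd_exec(struct cmd_iface *c)
{
    if (c->state != CMDIF_UP)
        return -ENXIO;
    c->executed++;
    return 0;
}
```

In this model, a command issued while the state is CMDIF_UNINITIALIZED or CMDIF_DOWN fails fast instead of reaching the command structures.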
> [Fix]
>
> * backport of f7936ddd35d8 f7936ddd35d8b849daf0372770c7c9dbe7910fca "net/mlx5: Avoid processing commands before cmdif is ready"
>
> [Test Plan]
>
> * An Ubuntu Server for s390x 18.04 or 20.04 LPAR or z/VM installation
> is needed that has Mellanox cards (RoCE Express 2.1) assigned,
> configured and enabled and that runs a 5.4 kernel (on bionic hwe-5.4).
>
> * Now trigger a recovery (guess that can be done at the Support Element)
> and reload the driver at the same time.
>
> * Make sure the module/driver mlx5 is loaded and in use
> (otherwise it can't be removed/unloaded).
>
> * Now remove/unload the module with:
> sudo modprobe -r mlx5
> and (re-)load it again with:
> sudo modprobe mlx5
>
> * Due to the lack of RoCE Express 2.1 hardware,
> IBM needs to do the verification.
>
> [Where problems could occur]
>
> * If there is an issue with 'cmdif', it might not have the correct
> interface state, in which case:
> - either commands are not properly blocked
> and the situation is similar to before
> - or commands are always blocked,
> which renders the hardware useless
> - or commands are blocked in the wrong situation,
> which will cause unexpected issues and broken behavior.
>
> * Since the patch was accepted upstream in v5.7-rc7, it's
> not new to the kernel, was already part of groovy (and above)
> and is therefore already in use by newer Ubuntu releases.
>
> [Other Info]
>
> * Since the patch is upstream since v5.7-rc7,
> it's already included in jammy and kinetic.
>
> * Since the upstream patch includes the line:
> Fixes: e126ba97dba9 ("mlx5: Add driver for Mellanox
> Connect-IB adapters") it looks to me as if marking the patch
> for upstream stable updates was forgotten.
>
> * Such SRUs for focal's 5.4 will automatically land in bionic's
> hwe-5.4, too. But since this was especially requested for
> bionic's hwe-5.4, I wanted to mention this here.
>
> Eran Ben Elisha (1):
> net/mlx5: Avoid processing commands before cmdif is ready
>
> drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 10 ++++++++++
> drivers/net/ethernet/mellanox/mlx5/core/main.c | 4 ++++
> include/linux/mlx5/driver.h | 9 +++++++++
> 3 files changed, 23 insertions(+)
>
Acked-by: Tim Gardner <tim.gardner at canonical.com>
--
-----------
Tim Gardner
Canonical, Inc