[SRU][F][PATCH 2/2] net/mlx5: Fix handling of entry refcount when command is not issued to FW
frank.heimes at canonical.com
Wed Jun 28 10:04:07 UTC 2023
From: Moshe Shemesh <moshe at nvidia.com>
BugLink: https://bugs.launchpad.net/bugs/2019011
In case the command interface is down, or the command is not allowed, the
driver did not increment the entry refcount, but might have decremented it
as part of forced completion handling.
Fix that by always incrementing and decrementing the refcount to make it
symmetric for all flows.
Fixes: 50b2412b7e78 ("net/mlx5: Avoid possible free of command entry while timeout comp handler")
Signed-off-by: Eran Ben Elisha <eranbe at nvidia.com>
Signed-off-by: Moshe Shemesh <moshe at nvidia.com>
Reported-by: Jack Wang <jinpu.wang at ionos.com>
Tested-by: Jack Wang <jinpu.wang at ionos.com>
Signed-off-by: Saeed Mahameed <saeedm at nvidia.com>
(cherry picked from commit aaf2e65cac7f2e1ae729c2fbc849091df9699f96)
Signed-off-by: Frank Heimes <frank.heimes at canonical.com>
---
drivers/net/ethernet/mellanox/mlx5/core/cmd.c | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
index b9d45d6442b9..f6cc1c008eee 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/cmd.c
@@ -959,6 +959,7 @@ static void cmd_work_handler(struct work_struct *work)
cmd_ent_get(ent);
set_bit(MLX5_CMD_ENT_STATE_PENDING_COMP, &ent->state);
+ cmd_ent_get(ent); /* for the _real_ FW event on completion */
/* Skip sending command to fw if internal error */
if (mlx5_cmd_is_down(dev) || !opcode_allowed(&dev->cmd, ent->op)) {
u8 status = 0;
@@ -972,7 +973,6 @@ static void cmd_work_handler(struct work_struct *work)
return;
}
- cmd_ent_get(ent); /* for the _real_ FW event on completion */
/* ring doorbell after the descriptor is valid */
mlx5_core_dbg(dev, "writing 0x%x to command doorbell\n", 1 << ent->idx);
wmb();
@@ -1586,8 +1586,8 @@ static void mlx5_cmd_comp_handler(struct mlx5_core_dev *dev, u64 vec, bool force
cmd_ent_put(ent); /* timeout work was canceled */
if (!forced || /* Real FW completion */
- pci_channel_offline(dev->pdev) || /* FW is inaccessible */
- dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR)
+ mlx5_cmd_is_down(dev) || /* No real FW completion is expected */
+ !opcode_allowed(cmd, ent->op))
cmd_ent_put(ent);
ent->ts2 = ktime_get_ns();
--
2.25.1