NAK: [SRU][Yakkety][PATCH 1/1] net/mlx4_core: Avoid delays during VF driver device shutdown

Joshua R. Poulson jrp at pun.org
Mon Apr 3 12:11:34 UTC 2017


We're engaged to test the driver in the intended scenario and code
path. We have a stable repro for the bug that this fixes.

Thanks, --jrp

On Mon, Apr 3, 2017 at 2:26 AM, Thadeu Lima de Souza Cascardo
<cascardo at canonical.com> wrote:
> On Thu, Mar 30, 2017 at 04:35:59PM -0400, Joseph Salisbury wrote:
>> From: Jack Morgenstein <jackm at dev.mellanox.co.il>
>>
>> BugLink: http://bugs.launchpad.net/bugs/1672785
>>
>> Some Hypervisors detach VFs from VMs by instantly causing an FLR event
>> to be generated for a VF.
>>
>> In the mlx4 case, this will cause that VF's comm channel to be disabled
>> before the VM has an opportunity to invoke the VF device's "shutdown"
>> method.
>>
>> For such Hypervisors, there is a race condition between the VF's
>> shutdown method and its internal-error detection/reset thread.
>>
>> The internal-error detection/reset thread (which runs every 5 seconds) also
>> detects a disabled comm channel. If the internal-error detection/reset
>> flow wins the race, we still get delays (while that flow tries repeatedly
>> to detect comm-channel recovery).
>>
>> The cited commit fixed the command timeout problem when the
>> internal-error detection/reset flow loses the race.
>>
>> This commit avoids the unneeded delays when the internal-error
>> detection/reset flow wins.
>>
>> Fixes: d585df1c5ccf ("net/mlx4_core: Avoid command timeouts during VF driver device shutdown")
>> Signed-off-by: Jack Morgenstein <jackm at dev.mellanox.co.il>
>> Reported-by: Simon Xiao <sixiao at microsoft.com>
>> Signed-off-by: Tariq Toukan <tariqt at mellanox.com>
>> Signed-off-by: David S. Miller <davem at davemloft.net>
>> (backported from commit 4cbe4dac82e423ecc9a0ba46af24a860853259f4)
>> Signed-off-by: Joseph Salisbury <joseph.salisbury at canonical.com>
>> ---
>>  drivers/net/ethernet/mellanox/mlx4/cmd.c  | 11 +++++++++++
>>  drivers/net/ethernet/mellanox/mlx4/main.c | 11 +++++++++++
>>  include/linux/mlx4/device.h               |  1 +
>>  3 files changed, 23 insertions(+)
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/cmd.c b/drivers/net/ethernet/mellanox/mlx4/cmd.c
>> index f04a423..be6906b 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/cmd.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/cmd.c
>> @@ -2278,6 +2278,17 @@ static int sync_toggles(struct mlx4_dev *dev)
>>               rd_toggle = swab32(readl(&priv->mfunc.comm->slave_read));
>>               if (wr_toggle == 0xffffffff || rd_toggle == 0xffffffff) {
>>                       /* PCI might be offline */
>> +
>> +                     /* If device removal has been requested,
>> +                      * do not continue retrying.
>> +                      */
>> +                     if (dev->persist->interface_state &
>> +                         MLX4_INTERFACE_STATE_NOWAIT) {
>> +                             mlx4_warn(dev,
>> +                                       "communication channel is offline\n");
>> +                             return -EIO;
>> +                     }
>> +
>>                       msleep(100);
>>                       wr_toggle = swab32(readl(&priv->mfunc.comm->
>>                                          slave_write));
>> diff --git a/drivers/net/ethernet/mellanox/mlx4/main.c b/drivers/net/ethernet/mellanox/mlx4/main.c
>> index 7183ac4..ea5e362 100644
>> --- a/drivers/net/ethernet/mellanox/mlx4/main.c
>> +++ b/drivers/net/ethernet/mellanox/mlx4/main.c
>> @@ -1916,6 +1916,14 @@ static int mlx4_comm_check_offline(struct mlx4_dev *dev)
>>                              (u32)(1 << COMM_CHAN_OFFLINE_OFFSET));
>>               if (!offline_bit)
>>                       return 0;
>> +
>> +             /* If device removal has been requested,
>> +              * do not continue retrying.
>> +              */
>> +             if (dev->persist->interface_state &
>> +                 MLX4_INTERFACE_STATE_NOWAIT)
>> +                     break;
>> +
>>               /* There are cases as part of AER/Reset flow that PF needs
>>                * around 100 msec to load. We therefore sleep for 100 msec
>>                * to allow other tasks to make use of that CPU during this
>> @@ -3930,6 +3938,9 @@ static void mlx4_remove_one(struct pci_dev *pdev)
>>       struct devlink *devlink = priv_to_devlink(priv);
>>       int active_vfs = 0;
>>
>> +     if (mlx4_is_slave(dev))
>> +             persist->interface_state |= MLX4_INTERFACE_STATE_NOWAIT;
>> +
>>       mutex_lock(&persist->interface_state_mutex);
>>       persist->interface_state |= MLX4_INTERFACE_STATE_DELETION;
>>       mutex_unlock(&persist->interface_state_mutex);
>> diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
>> index 42da355..d49f11d 100644
>> --- a/include/linux/mlx4/device.h
>> +++ b/include/linux/mlx4/device.h
>> @@ -468,6 +468,7 @@ enum {
>>       MLX4_INTERFACE_STATE_UP         = 1 << 0,
>>       MLX4_INTERFACE_STATE_DELETION   = 1 << 1,
>>       MLX4_INTERFACE_STATE_SHUTDOWN   = 1 << 2,
>> +     MLX4_INTERFACE_STATE_NOWAIT     = 1 << 2,
>
> So, NOWAIT and SHUTDOWN are defined to the same value, so it might be
> necessary to review it that is going to work as expected. Maybe doing
> the shutdown removal is a path to consider (commit
> b4353708f5a1c084fd73f1b6fd243b142157b173).
>
> Cascardo.
>
>>  };
>>
>>  #define MSTR_SM_CHANGE_MASK (MLX4_EQ_PORT_INFO_MSTR_SM_SL_CHANGE_MASK | \
>> --
>> 2.7.4
>>
>>
>> --
>> kernel-team mailing list
>> kernel-team at lists.ubuntu.com
>> https://lists.ubuntu.com/mailman/listinfo/kernel-team
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team




More information about the kernel-team mailing list