ACK: [SRU][F:linux-bluefield][PATCH v1 0/1] UBUNTU: SAUCE: mlxbf_gige: syncup with v1.23 content

Tim Gardner tim.gardner at canonical.com
Mon May 24 15:27:01 UTC 2021


Acked-by: Tim Gardner <tim.gardner at canonical.com>

On 5/21/21 12:45 PM, David Thompson wrote:
> BugLink: https://bugs.launchpad.net/bugs/1928852
> 
> SRU Justification:
> 
> [Impact]
> * Certain file transfers over the oob_net0 interface, which is
>    managed by the mlxbf_gige driver, can fail in one of these ways:
>    1) Transfer fails due to lost connection, e.g. SCP of a large (~1GB)
>       file from a server into the BlueField-2 OOB interface can fail
>       and return "lost connection" status
>    2) Transfer fails due to kernel crash, e.g. issuing SCP on BlueField-2
>       platform to retrieve a file from a server and copy it into an NFS
>       mounted directory on the BlueField-2 platform can fail and crash the
>       Linux kernel with a page fault, printing this message:
>       "Unable to handle kernel paging request at virtual address XXX"
> 
> [Fix]
> * This delivery provides a set of changes to add stability to
> the mlxbf_gige driver transmit and receive processing:
> 
> Changes to mlxbf_gige_rx_packet()
> ---------------------------------
> 1) Changed logic to remove the assumption that there's at
>     least one packet to process. Instead, at the start of
>     the routine, check the RX CQE polarity bit and exit if
>     it is not the expected value.
> 
> 2) Moved call to "dma_unmap_single()" to within the path
>     where packet status is OK. Previously, when an errored
>     packet was received, the SKB was unmapped but no SKB was
>     allocated to fill that same index.
> 
> 3) Deferred call to "netif_receive_skb()" to the end of the
>     routine since this call can trigger more processing, even
>     packet transmissions, in the networking stack.
> 
> Changes to mlxbf_gige_start_xmit()
> ----------------------------------
> 1) Added logic to drop oversized packets
> 
> 2) Added logic to use a spin lock when accessing priv->tx_pi
>     since this index is also accessed by the transmit
>     completion logic.
> 
> Changes to mlxbf_gige_handle_tx_complete()
> ------------------------------------------
> 1) Added call to "mb()" to flush prev_tx_ci updates
> 
> [Test Case]
> * #1 After booting the platform, verify that file transfers of large files (~1GB) from a
>    server into the BlueField-2 platform's /tmp directory over the oob_net0 interface succeed
> * #2 Configure an NFS mounted directory on BlueField-2 platform and transfer
>    large files over the oob_net0 interface into this directory. It is important to
>    ensure that the oob_net0 is used for the NFS mount, and no other active interface
>    will be involved. In the below example, the <peer-ip> is the IP address of the
>    server interface that is the peer to the BlueField-2 OOB.
>     1) Configure NFS server on a remote server
>     2) Configure NFS client on BlueField-2 platform
>        a) mkdir /mnt/share
>        b) mount -t nfs <peer-ip>:<nfs-server-mount> /mnt/share
>     3) Exercise file transfers over oob_net0 interface
>        a) cd /mnt/share
>        b) scp <user>@<peer-ip>:/tmp/<large-file> <local-file>
> 
> [Regression Potential]
> * These changes have been well tested, but there's a chance that certain file
>    transfers could still experience problems (hung transfer, lost connection)
> 
> [Other]
> * The mlxbf_gige driver will display v1.23 in modinfo after these changes.
> 
> David Thompson (1):
>    UBUNTU: SAUCE: mlxbf_gige: syncup with v1.23 content
> 
>   .../ethernet/mellanox/mlxbf_gige/mlxbf_gige.h |  2 +
>   .../mellanox/mlxbf_gige/mlxbf_gige_main.c     | 27 +++++++----
>   .../mellanox/mlxbf_gige/mlxbf_gige_rx.c       | 45 +++++++++++++------
>   .../mellanox/mlxbf_gige/mlxbf_gige_tx.c       | 20 +++++----
>   4 files changed, 64 insertions(+), 30 deletions(-)
> 
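
For anyone reading the archive, the RX polarity check described in item 1 of
the quoted mlxbf_gige_rx_packet() changes can be sketched in user-space C
roughly as below. This is not the driver code. The ring size, the bit position
in CQE_VALID_MASK, the struct layout and the helper names hw_post() and
rx_poll_one() are all assumptions made up for illustration. The ring is assumed
to start zeroed with valid_polarity set to true.

```c
#include <stdbool.h>
#include <stdint.h>

#define RING_ENTRIES   4u            /* hypothetical ring size */
#define CQE_VALID_MASK (1ull << 11)  /* hypothetical position of the polarity bit */
#define CQE_LEN_MASK   (CQE_VALID_MASK - 1)

struct rx_ring {
    uint64_t cqe[RING_ENTRIES];  /* completion entries written by "hardware" */
    unsigned int ci;             /* consumer index, always < RING_ENTRIES */
    bool valid_polarity;         /* polarity expected on the current lap */
};

/* Simulated hardware: post a completion for free-running producer index pi.
 * The polarity bit is set on even laps and cleared on odd laps. */
static void hw_post(struct rx_ring *r, unsigned int pi, uint64_t len)
{
    bool polarity = ((pi / RING_ENTRIES) % 2u) == 0u;
    uint64_t cqe = len & CQE_LEN_MASK;

    if (polarity)
        cqe |= CQE_VALID_MASK;
    r->cqe[pi % RING_ENTRIES] = cqe;
}

/* Consume at most one completion. Returns false without touching the ring
 * state when the polarity bit does not match, i.e. there is no new packet. */
static bool rx_poll_one(struct rx_ring *r, uint64_t *len)
{
    uint64_t cqe = r->cqe[r->ci];

    /* Check polarity first: do not assume a packet is waiting. */
    if (((cqe & CQE_VALID_MASK) != 0) != r->valid_polarity)
        return false;

    *len = cqe & CQE_LEN_MASK;
    r->ci = (r->ci + 1u) % RING_ENTRIES;
    if (r->ci == 0u)
        r->valid_polarity = !r->valid_polarity;  /* ring wrapped: flip expectation */
    return true;
}
```

The point of the sketch is that a stale entry left over from the previous lap
carries the old polarity, so the poll routine exits early instead of
processing it a second time.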

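Similarly, the two quoted mlxbf_gige_start_xmit() changes (dropping oversized
packets, and serializing access to the producer index that the completion path
also reads) might look like this in a user-space model. A C11 atomic_flag spin
loop stands in for the kernel spin lock. TX_ENTRIES, TX_MAX_LEN, the struct
fields and every function name here are hypothetical and not taken from the
patch.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define TX_ENTRIES 8u     /* hypothetical ring size */
#define TX_MAX_LEN 2048u  /* hypothetical largest packet the ring accepts */

enum xmit_status { XMIT_OK, XMIT_DROPPED, XMIT_BUSY };

struct tx_ring {
    atomic_flag lock;       /* stand-in for the driver's spin lock */
    unsigned int pi;        /* free-running producer index (xmit path) */
    unsigned int ci;        /* free-running consumer index (completion path) */
    unsigned long dropped;  /* count of oversized packets dropped */
};

static void tx_ring_init(struct tx_ring *t)
{
    atomic_flag_clear(&t->lock);
    t->pi = 0;
    t->ci = 0;
    t->dropped = 0;
}

static void ring_lock(struct tx_ring *t)
{
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

static void ring_unlock(struct tx_ring *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Transmit path: drop oversized packets, otherwise claim one ring slot. */
static enum xmit_status tx_xmit(struct tx_ring *t, size_t len)
{
    enum xmit_status st = XMIT_OK;

    ring_lock(t);
    if (len > TX_MAX_LEN) {
        t->dropped++;
        st = XMIT_DROPPED;
    } else if (t->pi - t->ci == TX_ENTRIES) {
        st = XMIT_BUSY;  /* ring full, caller should retry later */
    } else {
        t->pi++;
    }
    ring_unlock(t);
    return st;
}

/* Completion path: retire one slot if any is outstanding. */
static bool tx_complete(struct tx_ring *t)
{
    bool retired = false;

    ring_lock(t);
    if (t->ci != t->pi) {
        t->ci++;
        retired = true;
    }
    ring_unlock(t);
    return retired;
}
```

Both paths take the same lock before reading or updating pi, which is the
property the quoted change is after. The real driver may hold the lock over a
different region than this model does.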
-- 
-----------
Tim Gardner
Canonical, Inc


