APPLIED: [SRU][F:linux-bluefield][PATCH v1 0/1] UBUNTU: SAUCE: mlxbf_gige: syncup with v1.23 content
Kelsey Skunberg
kelsey.skunberg at canonical.com
Sat May 29 00:49:28 UTC 2021
Applied to F/bluefield master-next. Thank you!
-Kelsey
On 2021-05-21 14:45:49, David Thompson wrote:
> BugLink: https://bugs.launchpad.net/bugs/1928852
>
> SRU Justification:
>
> [Impact]
> * Certain file transfers over the oob_net0 interface, which is
> managed by the mlxbf_gige driver, can fail in one of these ways:
> 1) Transfer fails due to lost connection, e.g. SCP of a large (~1GB)
> file from a server into the BlueField-2 OOB interface can fail
> and return "lost connection" status
> 2) Transfer fails due to kernel crash, e.g. issuing SCP on the BlueField-2
> platform to retrieve a file from a server and copy it into an
> NFS-mounted directory on the BlueField-2 platform can fail and crash the
> Linux kernel with a page fault, issuing this message:
> "Unable to handle kernel paging request at virtual address XXX"
>
> [Fix]
> * This delivery provides a set of changes to add stability to
> the mlxbf_gige driver transmit and receive processing:
>
> Changes to mlxbf_gige_rx_packet()
> ---------------------------------
> 1) Changed logic to remove the assumption that there's at
> least one packet to process. Instead, at the start of the
> routine, check the RX CQE polarity bit, and exit if it is
> not the expected value.
>
> 2) Moved the call to "dma_unmap_single()" to within the path
> where packet status is OK. Without this change, if an
> errored packet is received, the SKB is unmapped but no
> SKB is allocated to fill that same index.
>
> 3) Deferred the call to "netif_receive_skb()" to the end of
> the routine, since this call can trigger more processing,
> even packet transmissions, in the networking stack.
>
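As an illustration of item 1 above, here is a minimal user-space sketch of a polarity-bit check on a completion ring. It is not taken from the patch: the ring size, the valid-bit position, and every sketch_* name are hypothetical, and the assumption that the expected polarity flips on each wrap of the ring is mine, chosen only to show why a consumer can detect "nothing to process" without assuming a packet is pending.

```c
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_RING_ENTRIES 4u
#define SKETCH_CQE_VALID    (1ull << 11)  /* hypothetical valid-bit position */

/* Hypothetical stand-in for the driver's RX completion state. */
struct sketch_rx_ring {
	uint64_t cqe[SKETCH_RING_ENTRIES];
	unsigned int ci;         /* free-running consumer index */
	bool valid_polarity;     /* expected valid-bit value on this lap */
};

static void sketch_rx_init(struct sketch_rx_ring *r)
{
	for (unsigned int i = 0; i < SKETCH_RING_ENTRIES; i++)
		r->cqe[i] = 0;
	r->ci = 0;
	r->valid_polarity = true;
}

/* Simulated hardware: write a completion at producer index pi.
 * The valid bit written alternates on every lap around the ring. */
static void sketch_hw_complete(struct sketch_rx_ring *r, unsigned int pi,
			       uint64_t len)
{
	bool lap_polarity = ((pi / SKETCH_RING_ENTRIES) % 2u) == 0;

	r->cqe[pi % SKETCH_RING_ENTRIES] =
		len | (lap_polarity ? SKETCH_CQE_VALID : 0);
}

/* Consume one completion. Returns false, touching nothing, when the
 * valid bit does not match the expected polarity (no packet pending). */
static bool sketch_rx_poll_one(struct sketch_rx_ring *r, uint64_t *len)
{
	uint64_t cqe = r->cqe[r->ci % SKETCH_RING_ENTRIES];
	bool valid = (cqe & SKETCH_CQE_VALID) != 0;

	if (valid != r->valid_polarity)
		return false;

	*len = cqe & ~SKETCH_CQE_VALID;
	r->ci++;
	if (r->ci % SKETCH_RING_ENTRIES == 0)
		r->valid_polarity = !r->valid_polarity;  /* wrapped: flip */
	return true;
}
```

In this sketch a stale entry left over from the previous lap carries the old polarity, so the early exit also keeps the consumer from processing the same slot twice.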
> Changes to mlxbf_gige_start_xmit()
> ----------------------------------
> 1) Added logic to drop oversized packets
>
> 2) Added logic to use a spin lock when accessing priv->tx_pi
> since this index is also accessed by the transmit
> completion logic.
>
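The two transmit-side changes above can be pictured with a small user-space sketch. It is not the driver code: a C11 atomic_flag stands in for the kernel spin lock, the 2048-byte size limit is an assumed value (the cover letter does not state the threshold), and all sketch_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stddef.h>

#define SKETCH_TX_ENTRIES 4u
#define SKETCH_MAX_FRAME  2048u  /* assumed limit, not from the patch */

enum sketch_tx_status { SKETCH_TX_OK, SKETCH_TX_DROPPED, SKETCH_TX_BUSY };

/* Hypothetical stand-in for the driver's TX ring state. */
struct sketch_tx_ring {
	atomic_flag lock;        /* user-space substitute for a spin lock */
	unsigned int pi;         /* producer index, shared with completion */
	unsigned int ci;         /* consumer index */
	unsigned long dropped;   /* touched only by the submit path here */
};

static void sketch_lock(atomic_flag *l)
{
	while (atomic_flag_test_and_set_explicit(l, memory_order_acquire))
		;  /* spin until the flag was previously clear */
}

static void sketch_unlock(atomic_flag *l)
{
	atomic_flag_clear_explicit(l, memory_order_release);
}

static void sketch_tx_init(struct sketch_tx_ring *r)
{
	atomic_flag_clear(&r->lock);
	r->pi = 0;
	r->ci = 0;
	r->dropped = 0;
}

/* Submit path: reject oversized frames before touching the ring,
 * then update pi only while holding the lock. */
static enum sketch_tx_status sketch_tx_submit(struct sketch_tx_ring *r,
					      size_t len)
{
	enum sketch_tx_status st = SKETCH_TX_OK;

	if (len > SKETCH_MAX_FRAME) {
		r->dropped++;
		return SKETCH_TX_DROPPED;
	}

	sketch_lock(&r->lock);
	if (r->pi - r->ci >= SKETCH_TX_ENTRIES)
		st = SKETCH_TX_BUSY;     /* ring full, caller retries later */
	else
		r->pi++;
	sketch_unlock(&r->lock);
	return st;
}

/* Completion path: reads pi under the same lock before advancing ci. */
static void sketch_tx_complete_one(struct sketch_tx_ring *r)
{
	sketch_lock(&r->lock);
	if (r->ci != r->pi)
		r->ci++;
	sketch_unlock(&r->lock);
}
```

Because both paths take the same lock, the completion side never observes a half-updated producer index, which is the hazard item 2 describes.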
> Changes to mlxbf_gige_handle_tx_complete
> ----------------------------------------
> 1) Added call to "mb()" to flush prev_tx_ci updates
>
> [Test Case]
> * #1 After booting platform, verify that file transfers of large files (~1GB) from a
> server into the BlueField-2 platform's /tmp directory over the oob_net0 interface succeed
> * #2 Configure an NFS-mounted directory on the BlueField-2 platform and transfer
> large files over the oob_net0 interface into this directory. It is important to
> ensure that the oob_net0 interface is used for the NFS mount, and that no other
> active interface is involved. In the example below, <peer-ip> is the IP address
> of the server interface that is the peer to the BlueField-2 OOB.
>   1) Configure NFS server on a remote server
>   2) Configure NFS client on BlueField-2 platform
>      a) mkdir /mnt/share
>      b) mount -t nfs <peer-ip>:<nfs-server-mount> /mnt/share
>   3) Exercise file transfers over oob_net0 interface
>      a) cd /mnt/share
>      b) scp <user>@<peer-ip>:/tmp/<large-file> <local-file>
>
> [Regression Potential]
> * These changes have been well tested, but there's a chance that certain file
> transfers could still experience problems (hung transfer, lost connection)
>
> [Other]
> * The mlxbf_gige driver will display v1.23 in modinfo after these changes.
>
> David Thompson (1):
> UBUNTU: SAUCE: mlxbf_gige: syncup with v1.23 content
>
> .../ethernet/mellanox/mlxbf_gige/mlxbf_gige.h | 2 +
> .../mellanox/mlxbf_gige/mlxbf_gige_main.c | 27 +++++++----
> .../mellanox/mlxbf_gige/mlxbf_gige_rx.c | 45 +++++++++++++------
> .../mellanox/mlxbf_gige/mlxbf_gige_tx.c | 20 +++++----
> 4 files changed, 64 insertions(+), 30 deletions(-)
>
> --
> 2.30.1
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team