[SRU][F:linux-bluefield][PATCH v1 0/1] UBUNTU: SAUCE: mlxbf_gige: syncup with v1.23 content
David Thompson
davthompson at nvidia.com
Fri May 21 18:45:49 UTC 2021
BugLink: https://bugs.launchpad.net/bugs/1928852
SRU Justification:
[Impact]
* Certain file transfers over the oob_net0 interface, which is
managed by the mlxbf_gige driver, can fail in one of these ways:
1) Transfer fails due to lost connection, e.g. SCP of a large (~1GB)
file from a server into the BlueField-2 OOB interface can fail
and return "lost connection" status
2) Transfer fails due to kernel crash, e.g. issuing SCP on BlueField-2
platform to retrieve a file from a server and copy it into an NFS
mounted directory on the BlueField-2 platform can fail and crash the
Linux kernel with a page fault, which logs this message:
"Unable to handle kernel paging request at virtual address XXX"
[Fix]
* This delivery provides a set of changes to add stability to
the mlxbf_gige driver transmit and receive processing:
Changes to mlxbf_gige_rx_packet()
---------------------------------
1) Changed logic to remove the assumption that there's at
least one packet to process. Instead, at the start of the
routine, check the RX CQE polarity bit, and if it is not
the expected value then exit.
2) Moved call to "dma_unmap_single()" to within the path
where packet status is OK. Previously, if an errored
packet was received, the SKB was unmapped but no SKB was
allocated to fill that same index.
3) Deferred call to "netif_receive_skb()" to end of routine
since this call can trigger more processing, even
packet transmissions, in the networking stack.
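The polarity check in item 1 can be illustrated with a small user-space
sketch. This is not the driver code: the ring size, the position of the
valid bit, and the names rx_ring, rx_poll_one() and hw_post() are all
assumptions made for illustration. It only shows the general idea that the
expected value of the bit flips on every wrap of the ring, so a stale entry
from the previous pass is never mistaken for a new packet.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical values for illustration; not the driver's real layout. */
#define RING_SIZE     4u                    /* entries in the completion ring */
#define CQE_VALID_BIT (UINT64_C(1) << 63)   /* assumed polarity/valid bit */

struct rx_ring {
    uint64_t cqe[RING_SIZE];  /* completion entries written by "hardware" */
    uint32_t ci;              /* free-running consumer index */
};

/* Expected polarity is "set" on the first pass and flips on every wrap. */
static bool expected_polarity(uint32_t index)
{
    return ((index / RING_SIZE) & 1u) == 0;
}

/* Returns true and stores the payload only if a new completion is present. */
static bool rx_poll_one(struct rx_ring *r, uint64_t *payload)
{
    uint64_t cqe = r->cqe[r->ci % RING_SIZE];
    bool polarity = (cqe & CQE_VALID_BIT) != 0;

    /* Do not assume a packet is waiting: exit before touching any buffer. */
    if (polarity != expected_polarity(r->ci))
        return false;

    *payload = cqe & ~CQE_VALID_BIT;
    r->ci++;
    return true;
}

/* Stand-in for hardware posting a completion at producer index pi. */
static void hw_post(struct rx_ring *r, uint32_t pi, uint64_t payload)
{
    uint64_t bit = expected_polarity(pi) ? CQE_VALID_BIT : 0;

    r->cqe[pi % RING_SIZE] = (payload & ~CQE_VALID_BIT) | bit;
}
```

With a zero-initialized ring, rx_poll_one() returns false until hw_post()
has written an entry at the current consumer index, and it returns false
again once the consumer has caught up with the producer.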
Changes to mlxbf_gige_start_xmit()
----------------------------------
1) Added logic to drop oversized packets
2) Added logic to use a spin lock when accessing priv->tx_pi
since this index is also accessed by the transmit
completion logic.
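The two transmit changes can be sketched in user space as follows. This is
a minimal model under stated assumptions, not the driver implementation:
the ring size, the MAX_FRAME_LEN limit, and the names tx_ring, tx_xmit()
and tx_complete() are hypothetical, and a C11 atomic_flag stands in for the
kernel spin lock. The point is that the producer index is only read or
written while the lock is held, because the completion path reads it too.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TX_RING_SIZE  8u      /* hypothetical number of TX descriptors */
#define MAX_FRAME_LEN 2048u   /* hypothetical largest frame the ring accepts */

struct tx_ring {
    atomic_flag lock;   /* stand-in for the kernel spin lock */
    uint32_t pi;        /* producer index, advanced by the transmit path */
    uint32_t ci;        /* consumer index, advanced by the completion path */
};

enum tx_result { TX_OK, TX_DROPPED, TX_BUSY };

static void tx_ring_init(struct tx_ring *r)
{
    atomic_flag_clear(&r->lock);
    r->pi = 0;
    r->ci = 0;
}

static void ring_lock(struct tx_ring *r)
{
    while (atomic_flag_test_and_set_explicit(&r->lock, memory_order_acquire))
        ;   /* spin until the other path releases the lock */
}

static void ring_unlock(struct tx_ring *r)
{
    atomic_flag_clear_explicit(&r->lock, memory_order_release);
}

/* Transmit path: drop oversized frames, then claim a slot under the lock. */
static enum tx_result tx_xmit(struct tx_ring *r, size_t len)
{
    enum tx_result res = TX_OK;

    if (len > MAX_FRAME_LEN)
        return TX_DROPPED;      /* never consumes a descriptor */

    ring_lock(r);
    if (r->pi - r->ci == TX_RING_SIZE)
        res = TX_BUSY;          /* ring full, caller must retry later */
    else
        r->pi++;
    ring_unlock(r);

    return res;
}

/* Completion path: retire up to 'done' descriptors under the same lock. */
static uint32_t tx_complete(struct tx_ring *r, uint32_t done)
{
    uint32_t inflight;

    ring_lock(r);
    inflight = r->pi - r->ci;
    if (done > inflight)
        done = inflight;
    r->ci += done;
    ring_unlock(r);

    return done;
}
```

In this model an oversized frame returns TX_DROPPED without advancing the
producer index, and a full ring returns TX_BUSY until tx_complete() frees
at least one slot.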
Changes to mlxbf_gige_handle_tx_complete()
------------------------------------------
1) Added call to "mb()" to flush prev_tx_ci updates
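As a rough user-space analogue of this ordering requirement, the sketch
below uses a C11 sequentially consistent fence where the kernel code uses
"mb()". The structure tx_state, the queue_stopped flag and the function
tx_complete_and_wake() are assumptions for illustration only; the cover
letter does not say which later accesses the barrier is ordered against.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

struct tx_state {
    _Atomic uint32_t prev_tx_ci;  /* last consumer index seen by completion */
    atomic_bool queue_stopped;    /* hypothetical "TX queue stopped" flag */
};

/* Record the new consumer index, then decide whether to restart the queue. */
static bool tx_complete_and_wake(struct tx_state *s, uint32_t new_ci)
{
    atomic_store_explicit(&s->prev_tx_ci, new_ci, memory_order_relaxed);

    /* Full fence: the index store above is ordered before the flag load
     * below, similar to how mb() orders accesses on either side of it. */
    atomic_thread_fence(memory_order_seq_cst);

    if (atomic_load_explicit(&s->queue_stopped, memory_order_relaxed)) {
        atomic_store_explicit(&s->queue_stopped, false, memory_order_relaxed);
        return true;    /* queue was stopped and has now been woken */
    }
    return false;
}
```

Without the fence, the store and the later load are both relaxed, so the
compiler or CPU would be free to reorder them.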
[Test Case]
* #1 After booting the platform, verify that file transfers of large files (~1GB) from a
server into the BlueField-2 platform's /tmp directory over the oob_net0 interface succeed
* #2 Configure an NFS mounted directory on BlueField-2 platform and transfer
large files over the oob_net0 interface into this directory. It is important to
ensure that oob_net0 is used for the NFS mount and that no other active interface
is involved. In the example below, <peer-ip> is the IP address of the
server interface that is the peer to the BlueField-2 OOB.
1) Configure NFS server on a remote server
2) Configure NFS client on BlueField-2 platform
a) mkdir /mnt/share
b) mount -t nfs <peer-ip>:<nfs-server-mount> /mnt/share
3) Exercise file transfers over oob_net0 interface
a) cd /mnt/share
b) scp <user>@<peer-ip>:/tmp/<large-file> <local-file>
[Regression Potential]
* These changes have been well tested, but there's a chance that certain file
transfers could still experience problems (hung transfer, lost connection)
[Other]
* The mlxbf_gige driver will display v1.23 in modinfo after these changes.
David Thompson (1):
UBUNTU: SAUCE: mlxbf_gige: syncup with v1.23 content
.../ethernet/mellanox/mlxbf_gige/mlxbf_gige.h | 2 +
.../mellanox/mlxbf_gige/mlxbf_gige_main.c | 27 +++++++----
.../mellanox/mlxbf_gige/mlxbf_gige_rx.c | 45 +++++++++++++------
.../mellanox/mlxbf_gige/mlxbf_gige_tx.c | 20 +++++----
4 files changed, 64 insertions(+), 30 deletions(-)
--
2.30.1