mlx5_core reports hardware checksum error for padded packets on Mellanox NICs

Matthew Ruffell matthew.ruffell at canonical.com
Tue Dec 10 23:24:51 UTC 2019

BugLink: https://bugs.launchpad.net/bugs/1854842


On machines equipped with Mellanox NICs, in this particular case Mellanox
series 5 NICs using the mlx5_core driver, there is a kernel splat when sending
large IP packets that have padding at the end:

enp6s0f0: hw csum failure
CPU: 19 PID: 0 Comm: swapper/19 Not tainted 4.15.0-72-generic
Call Trace:
icmp_error+0x27d/0x310 [nf_conntrack_ipv4]
nf_conntrack_in+0x15a/0x510 [nf_conntrack]
? __skb_checksum+0x68/0x330
ipv4_conntrack_in+0x1c/0x20 [nf_conntrack_ipv4]
? skb_send_sock+0x50/0x50
? inet_del_offload+0x40/0x40
? __netif_receive_skb+0x18/0x60
mlx5e_handle_rx_cqe+0x48d/0x5e0 [mlx5_core]
? enqueue_task_rt+0x1b4/0x2e0
mlx5e_poll_rx_cq+0xd1/0x8c0 [mlx5_core]
mlx5e_napi_poll+0x9d/0x290 [mlx5_core]
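
As background on why tail padding matters for a checksum supplied by the
device, here is a minimal Python sketch of the 16-bit ones-complement
arithmetic used by IP-style checksums. It is not the driver code: the function
names and the sample bytes are made up for illustration. It only shows that a
sum taken over a packet and a sum taken over the packet plus its tail padding
differ by the padding's own sum, so one can be converted to the other by adding
or subtracting that contribution.

```python
def ones_complement_sum(data: bytes) -> int:
    """16-bit ones-complement sum of data (not inverted)."""
    if len(data) % 2:
        data += b"\x00"  # an odd trailing byte is treated as the high byte of a word
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    # Fold any carries back into the low 16 bits.
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)
    return total


def csum_add(a: int, b: int) -> int:
    """Ones-complement addition of two 16-bit partial sums."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def csum_sub(a: int, b: int) -> int:
    """Ones-complement subtraction: add the bitwise complement of b."""
    return csum_add(a, ~b & 0xFFFF)


# Hypothetical data: a 20-byte "packet" followed by 6 bytes of non-zero padding.
# The packet length is even, so the padding starts on a 16-bit word boundary.
packet = bytes(range(1, 21))
padding = bytes([0x11, 0x22, 0x33, 0x44, 0x55, 0x66])

payload_only = ones_complement_sum(packet)
whole = ones_complement_sum(packet + padding)
pad = ones_complement_sum(padding)

print(hex(payload_only), hex(whole), hex(pad))
# The two sums disagree until the padding contribution is accounted for.
print(whole != payload_only)
print(csum_sub(whole, pad) == payload_only)
```

If the padding started at an odd byte offset, its partial sum would need to be
byte-swapped before being added or subtracted; the sketch avoids that case by
using an even-length packet.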

This bug is a further attempt to fix these splats, as there have been previous
fixes in LP #1840854 and a series of commits which landed in 4.15.0-67
(LP #1847155) as part of upstream -stable patches.

This bug will also fix the same problems on the new Mellanox CX6 and Bluefield
hardware, which has already been enabled via previous upstream -stable patches
that landed in LP #1847155.


This particular issue was fixed for Mellanox series 5 drivers in the following
commit:

commit 0aa1d18615c163f92935b806dcaff9157645233a
Author: Saeed Mahameed <saeedm at mellanox.com>
Date:   Tue Mar 12 00:24:52 2019 -0700
Subject: net/mlx5e: Rx, Fixup skb checksum for packets with tail padding

This commit required a minor backport.

This commit was selected for upstream -stable in 4.19.76 and 5.0.10.
This commit appears to have been omitted from "Bionic update: upstream stable
patchset 2019-10-07", which is LP #1847155, probably because it requires a
backport.

commit db849faa9bef993a1379dc510623f750a72fa7ce
Author: Saeed Mahameed <saeedm at mellanox.com>
Date:   Fri May 3 13:14:59 2019 -0700
Subject: net/mlx5e: Rx, Fix checksum calculation for new hardware

This commit required a minor backport.

This commit was selected for upstream -stable in 5.1.21 and 5.2.4.
This commit has already been applied to the disco kernel, as part of stable 


The following scapy script will reproduce this issue. Run from the machine with 
the Mellanox series 5 NIC:

1) a=Ether(dst='ff:ff:ff:ff:ff:ff')/IP(dst='')/ICMP()/

2) sendp(a, iface='enp6s0f0')

3) Check dmesg on the receiver side. The example uses localhost, so check dmesg
on the same machine.
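
Step 3 can be automated with a small helper. The following is a minimal
sketch, assuming the message format shown in the splat above
("<interface>: hw csum failure"); the function name, the regular expression,
and the timestamp in the sample text are my own illustration, not something
taken from the kernel or from the test kernels.

```python
import re
from typing import List

# Matches lines such as "enp6s0f0: hw csum failure", optionally preceded by a
# dmesg-style "[ seconds.micros ]" timestamp.
CSUM_FAILURE_RE = re.compile(
    r"^(?:\[\s*\d+\.\d+\]\s*)?(?P<iface>\S+): hw csum failure\s*$"
)


def find_csum_failures(log_text: str) -> List[str]:
    """Return the interface name for every 'hw csum failure' line in log_text."""
    ifaces = []
    for line in log_text.splitlines():
        match = CSUM_FAILURE_RE.match(line)
        if match:
            ifaces.append(match.group("iface"))
    return ifaces


# Sample log text; the timestamps are hypothetical.
sample = (
    "[  123.456789] enp6s0f0: hw csum failure\n"
    "[  123.456790] CPU: 19 PID: 0 Comm: swapper/19 Not tainted 4.15.0-72-generic\n"
)

print(find_csum_failures(sample))
```

To check a live system, pass the output of dmesg to find_csum_failures(), for
example by reading it with subprocess.run(["dmesg"], capture_output=True,
text=True).stdout. An empty list after running the reproducer suggests the
splat did not occur.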

I have built some test kernels, which are available here:

This kernel contains 0aa1d18615c163f92935b806dcaff9157645233a.


This kernel contains db849faa9bef993a1379dc510623f750a72fa7ce.

If you install the test kernels, the issue is resolved.

[Regression Potential]

The changes are limited to the mlx5_core driver, and only modify how packet 
checksums are calculated when padding is involved.

Both patches have been accepted and published by upstream -stable, and are 
widely accepted by the community.

Because of this, I believe the risk of regression is low.

Saeed Mahameed (2):
  net/mlx5e: Rx, Fixup skb checksum for packets with tail padding
  net/mlx5e: Rx, Fix checksum calculation for new hardware

 drivers/net/ethernet/mellanox/mlx5/core/en.h  |  1 +
 .../net/ethernet/mellanox/mlx5/core/en_main.c |  5 ++
 .../net/ethernet/mellanox/mlx5/core/en_rx.c   | 85 +++++++++++++++----
 .../ethernet/mellanox/mlx5/core/en_stats.c    |  4 +
 .../ethernet/mellanox/mlx5/core/en_stats.h    |  4 +
 include/linux/mlx5/mlx5_ifc.h                 |  3 +-
 6 files changed, 85 insertions(+), 17 deletions(-)


