APPLIED: [PATCH 0/1][Focal] Backport mlx5e fix for tunnel offload

Kleber Souza kleber.souza at canonical.com
Fri Apr 9 10:59:04 UTC 2021


On 05.04.21 15:31, Tim Gardner wrote:
> [SRU Justification]
> 
> We've discovered an issue on Ubuntu 20.04 when used with Kubernetes CNIs that
> perform offloading using Geneve that causes the kernel to panic on Azure
> instances with accelerated networking with the following errors:
> 
> [ 307.561223] mlx5_core 0001:00:02.0 enP1s1: Error cqe on cqn 0x200, ci 0x3d4, sqn 0x2c5, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
> [ 307.573864] mlx5_core 0001:00:02.0 enP1s1: ERR CQE on SQ: 0x2c5
> [ 307.764902] mlx5_core 0001:00:02.0 enP1s1: Error cqe on cqn 0x200, ci 0x3d7, sqn 0x2c5, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
> [ 307.777332] mlx5_core 0001:00:02.0 enP1s1: ERR CQE on SQ: 0x2c5
> [ 322.814393] mlx5_core 0001:00:02.0 enP1s1: Error cqe on cqn 0x218, ci 0x1a7, sqn 0x2bd, opcode 0xd, syndrome 0x2, vendor syndrome 0x68
> [ 322.826685] mlx5_core 0001:00:02.0 enP1s1: ERR CQE on SQ: 0x2bd
> 
> NVIDIA fixed this issue in https://github.com/torvalds/linux/commit/5ccc0ecda9e8a67add654d93d7e0ac4346c0fa22,
> so we're looking to have this backported to at least the linux-azure package.
> 
> [Test Plan]
> https://bugs.launchpad.net/ubuntu/bionic/+source/linux-azure/+bug/1921769/comments/6
> (waiting on response, but now I've seen a 2nd request in LP#1922472)
> 
> [Where problems could occur]
> Released in stable kernels:
> linux-5.10.y
> linux-5.11.y
> 
> [Other Info]
> None
> 
> 

Applied to focal/linux.

Thanks,
Kleber


