NAK: [SRU][J:linux-bluefield][PATCH v1 0/9] Kernel panic in restart driver after configuring IPsec full offload

Tim Gardner tim.gardner at canonical.com
Thu Jan 4 16:17:44 UTC 2024


On 12/24/23 11:20 PM, Tony Duan wrote:
> BugLink: https://bugs.launchpad.net/bugs/2044427
> 
> SRU Justification:
> 
> [Impact]
> 
> * This patch series ports several xfrm fixes to avoid a crash in some cases
> 
> [Fix]
> 
> * cherry-pick afa8cc09c0effbc6532b4a6d89027c63a4f4dfa2 afa8cc0 net: xfrm: Fix xfrm_address_filter OOB read
>    cherry-pick 027657f5b0e5786fb4a3f81f0c56807128c38e8d 027657f xfrm: add forgotten nla_policy for XFRMA_MTIMER_THRESH
>    cherry-pick e2cfb0384b887db477b969e998c53c4745513f92 e2cfb03 xfrm: Silence warnings triggerable by bad packets
>    cherry-pick 7cbe43787657bc3d6edd175ba3e486980a89afdf 7cbe437 xfrm: Remove inner/outer modes from input path
>    cherry-pick 7e4e5880259f9e85d322969577a36f61d98deff4 7e4e588 net: xfrm: Amend XFRMA_SEC_CTX nla_policy structure
>    cherry-pick 92ad4f000093dcb14dd131a2fd7bf7d59ae956c0 92ad4f0 net: af_key: fix sadb_x_filter validation
>    cherry-pick 4c8893c6d1f25a9d04740afc27ce0166d1662609 4c8893c xfrm: Flush xfrm state synchronously on netdev close or unregister
>    backport 1a18e06a37ae5c0eb83f47bdc91a3923a7c21c6f 1a18e06 xfrm: get global statistics from the offloaded device
>    backport aabb407c261858f1b772eb1f4fa92bc38a203098 aabb407 xfrm: generalize xdo_dev_state_update_curlft to allow statistics update
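A quick consistency check on the [Fix] list can be sketched as follows. This is a minimal sketch, not part of the submission: the helper name `parse_fix_list` and the regular expression are illustrative assumptions. It only confirms that each abbreviated SHA is a prefix of the full SHA on the same line and splits cherry-picks from backports; it does not consult any git tree.

```python
import re

# Entries copied verbatim from the [Fix] list: action, full SHA, short SHA, subject.
FIX_LIST = """\
cherry-pick afa8cc09c0effbc6532b4a6d89027c63a4f4dfa2 afa8cc0 net: xfrm: Fix xfrm_address_filter OOB read
cherry-pick 027657f5b0e5786fb4a3f81f0c56807128c38e8d 027657f xfrm: add forgotten nla_policy for XFRMA_MTIMER_THRESH
cherry-pick e2cfb0384b887db477b969e998c53c4745513f92 e2cfb03 xfrm: Silence warnings triggerable by bad packets
cherry-pick 7cbe43787657bc3d6edd175ba3e486980a89afdf 7cbe437 xfrm: Remove inner/outer modes from input path
cherry-pick 7e4e5880259f9e85d322969577a36f61d98deff4 7e4e588 net: xfrm: Amend XFRMA_SEC_CTX nla_policy structure
cherry-pick 92ad4f000093dcb14dd131a2fd7bf7d59ae956c0 92ad4f0 net: af_key: fix sadb_x_filter validation
cherry-pick 4c8893c6d1f25a9d04740afc27ce0166d1662609 4c8893c xfrm: Flush xfrm state synchronously on netdev close or unregister
backport 1a18e06a37ae5c0eb83f47bdc91a3923a7c21c6f 1a18e06 xfrm: get global statistics from the offloaded device
backport aabb407c261858f1b772eb1f4fa92bc38a203098 aabb407 xfrm: generalize xdo_dev_state_update_curlft to allow statistics update
"""

# Hypothetical pattern for one list entry (an assumption about the layout above).
ENTRY_RE = re.compile(
    r"^(cherry-pick|backport)\s+([0-9a-f]{40})\s+([0-9a-f]{7,})\s+(.+)$"
)


def parse_fix_list(text):
    """Return a list of dicts, raising ValueError on malformed or inconsistent lines."""
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = ENTRY_RE.match(line)
        if match is None:
            raise ValueError(f"unrecognised entry: {line!r}")
        action, full_sha, short_sha, subject = match.groups()
        # The abbreviated SHA must be a prefix of the full SHA on the same line.
        if not full_sha.startswith(short_sha):
            raise ValueError(f"short SHA {short_sha} does not match {full_sha}")
        entries.append(
            {"action": action, "sha": full_sha, "short": short_sha, "subject": subject}
        )
    return entries


if __name__ == "__main__":
    for entry in parse_fix_list(FIX_LIST):
        print(f'{entry["action"]:11} {entry["short"]} {entry["subject"]}')
```

The split matters for review because the backported entries were modified to apply, so they deserve a closer look than the clean cherry-picks.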
> 
> [Test Plan]
> 
> * Restarting the driver while IPsec full offload is configured in transparent mode causes a kernel panic.
> Kernel version: linux-bluefield 5.15
> 
> Test step:
> 1) configure xfrm rules
> 2) configure VF
> 3) configure FW steering mode
> 4) restart driver
> 5) check dmesg
> 
> Test result:
>   [ 937.989359] ------------[ cut here ]------------
>   [ 937.989786] WARNING: CPU: 11 PID: 60463 at /tmp/23.10-0.1.8/6.5.0-rc6_mlnx/fedora_32/mlnx-ofa_kernel/BUILD/mlnx-ofa_kernel-23.10/obj/default/drivers/net/ethernet/mellanox/mlx5/core/en_accel/ipsec_fs.c:1828 mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core]
>   [ 937.991698] fuse virtio_net net_failover failover [last unloaded: vdpa]
>   [ 937.999155] CPU: 11 PID: 60463 Comm: modprobe Tainted: G OE 6.5.0-rc6_mlnx #1
>   [ 937.999891] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.13.0-0-gf21b5a4aeb02-prebuilt.qemu.org 04/01/2014
>   [ 938.000823] RIP: 0010:mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core]
>   [ 938.001459] Code: f6 45 31 c0 48 89 ea 31 ff e8 d4 d5 df ff 59 e9 8c fe ff ff c3 0f 0b e9 3b fe ff ff 0f 0b e9 e8 fd ff ff 0f 0b e9 07 fe ff ff <0f> 0b e9 65 fe ff ff 0f 0b e9 82 fe ff ff 66 2e 0f 1f 84 00 00 00
>   [ 938.002949] RSP: 0018:ffffc90001183c08 EFLAGS: 00010202
>   [ 938.003418] RAX: 0000000000000000 RBX: ffff8882f3869c00 RCX: 0000000000000001
>   [ 938.004024] RDX: ffffffff82a305c0 RSI: 0000000000000002 RDI: ffff888103aa2b30
>   [ 938.004624] RBP: ffff888103aa2d80 R08: 0000000000000001 R09: ffff888100042800
>   [ 938.005238] R10: 0000000000000002 R11: ffffc90001183ba8 R12: ffff8881312e6800
>   [ 938.005836] R13: ffff8881127401a0 R14: ffff8881312e6800 R15: ffff888148bbd160
>   [ 938.006444] FS: 00007fd22b82c740(0000) GS:ffff88885fac0000(0000) knlGS:0000000000000000
>   [ 938.009456] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>   [ 938.009970] CR2: 00007f26ca697000 CR3: 000000012e73f003 CR4: 0000000000770ee0
>   [ 938.010568] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
>   [ 938.011173] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
>   [ 938.011772] PKRU: 55555554
>   [ 938.012065] Call Trace:
>   [ 938.012333]
>   [ 938.012583] ? __warn+0x7d/0x120
>   [ 938.012921] ? mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core]
>   [ 938.013494] ? report_bug+0xf1/0x1c0
>   [ 938.013850] ? handle_bug+0x44/0x70
>   [ 938.014201] ? exc_invalid_op+0x13/0x60
>   [ 938.014568] ? asm_exc_invalid_op+0x16/0x20
>   [ 938.014970] ? mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core]
>   [ 938.015532] ? mlx5e_accel_ipsec_fs_cleanup+0xf2/0x2b0 [mlx5_core]
>   [ 938.016093] mlx5e_ipsec_cleanup+0x1e/0x100 [mlx5_core]
>   [ 938.016594] mlx5e_detach_netdev+0x46/0x80 [mlx5_core]
>   [ 938.017098] mlx5e_vport_rep_unload+0x147/0x1a0 [mlx5_core]
>   [ 938.017623] mlx5_eswitch_unregister_vport_reps+0x13e/0x190 [mlx5_core]
>   [ 938.018221] auxiliary_bus_remove+0x18/0x30
>   [ 938.018616] device_release_driver_internal+0xaa/0x130
>   [ 938.019076] bus_remove_device+0xc3/0x130
>   [ 938.019451] device_del+0x157/0x380
>   [ 938.019792] ? kobject_put+0xb3/0x200
>   [ 938.020153] delete_drivers+0x72/0xa0 [mlx5_core]
>   [ 938.020608] mlx5_unregister_device+0x34/0x70 [mlx5_core]
>   [ 938.021113] mlx5_uninit_one+0x25/0x130 [mlx5_core]
>   [ 938.021572] remove_one+0x72/0xc0 [mlx5_core]
>   [ 938.022002] pci_device_remove+0x31/0xb0
>   [ 938.022376] device_release_driver_internal+0xaa/0x130
>   [ 938.022827] driver_detach+0x3f/0x80
>   [ 938.023181] bus_remove_driver+0x69/0xe0
>   [ 938.023553] pci_unregister_driver+0x22/0x90
>   [ 938.023957] mlx5_cleanup+0xc/0x4c [mlx5_core]
>   [ 938.024384] __x64_sys_delete_module+0x157/0x280
>   [ 938.024806] do_syscall_64+0x34/0x80
>   [ 938.025163] entry_SYSCALL_64_after_hwframe+0x46/0xb0
>   [ 938.025616] RIP: 0033:0x7fd22b93812b
>   [ 938.025969] Code: 73 01 c3 48 8b 0d 6d 0d 0c 00 f7 d8 64 89 01 48 83 c8 ff c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa b8 b0 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 3d 0d 0c 00 f7 d8 64 89 01 48
>   [ 938.027458] RSP: 002b:00007ffce1ea2658 EFLAGS: 00000206 ORIG_RAX: 00000000000000b0
>   [ 938.028129] RAX: ffffffffffffffda RBX: 000055b5a4efb3b0 RCX: 00007fd22b93812b
>   [ 938.028719] RDX: 0000000000000000 RSI: 0000000000000800 RDI: 000055b5a4efb418
>   [ 938.029327] RBP: 000055b5a4efb3b0 R08: 1999999999999999 R09: 0000000000000000
>   [ 938.029932] R10: 00007fd22b9acac0 R11: 0000000000000206 R12: 0000000000000000
>   [ 938.030529] R13: 000055b5a4efb418 R14: 000055b5a4efe350 R15: 000055b5a4efb150
>   [ 938.031134]
>   [ 938.031388] ---[ end trace 0000000000000000 ]---
> 
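Step 5 of the test plan ("check dmesg") could be automated along these lines. This is a minimal sketch under stated assumptions: the function name `find_warnings` and the regular expression are illustrative, and the pattern only targets the `WARNING: CPU: ... PID: ... at <file>:<line> <function>+0x...` header format seen in the trace above. Other kernel splat formats (BUG, Oops, panic) are not handled.

```python
import re

# Hypothetical pattern for the WARNING header line shown in the trace above.
WARNING_RE = re.compile(
    r"WARNING: CPU: (\d+) PID: (\d+) at (\S+):(\d+) (\S+?)\+0x"
)

# The WARNING line copied from the test result, used as sample input.
SAMPLE = (
    "[ 937.989786] WARNING: CPU: 11 PID: 60463 at "
    "/tmp/23.10-0.1.8/6.5.0-rc6_mlnx/fedora_32/mlnx-ofa_kernel/BUILD/"
    "mlnx-ofa_kernel-23.10/obj/default/drivers/net/ethernet/mellanox/"
    "mlx5/core/en_accel/ipsec_fs.c:1828 "
    "mlx5e_accel_ipsec_fs_cleanup+0x298/0x2b0 [mlx5_core]"
)


def find_warnings(log_text):
    """Return one dict per WARNING header found in the given dmesg text."""
    hits = []
    for line in log_text.splitlines():
        match = WARNING_RE.search(line)
        if match is None:
            continue
        cpu, pid, path, lineno, function = match.groups()
        hits.append(
            {
                "cpu": int(cpu),
                "pid": int(pid),
                "file": path,
                "line": int(lineno),
                "function": function,
            }
        )
    return hits


if __name__ == "__main__":
    for hit in find_warnings(SAMPLE):
        print(f'{hit["function"]} at {hit["file"]}:{hit["line"]}')
```

In practice the input would be the captured dmesg output after the driver restart; an empty result from `find_warnings` would indicate that this particular warning did not fire.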
> [Where problems could occur]
> 
> * Without these patches, a kernel panic trace appears in dmesg when the driver is restarted.
> 
> [Other Info]
> 
> * nothing
> 
> Herbert Xu (2):
>    xfrm: Remove inner/outer modes from input path
>    xfrm: Silence warnings triggerable by bad packets
> 
> Jianbo Liu (1):
>    xfrm: Flush xfrm state synchronously on netdev close or unregister
> 
> Leon Romanovsky (2):
>    xfrm: generalize xdo_dev_state_update_curlft to allow statistics
>      update
>    xfrm: get global statistics from the offloaded device
> 
> Lin Ma (4):
>    net: af_key: fix sadb_x_filter validation
>    net: xfrm: Amend XFRMA_SEC_CTX nla_policy structure
>    xfrm: add forgotten nla_policy for XFRMA_MTIMER_THRESH
>    net: xfrm: Fix xfrm_address_filter OOB read
> 
>   Documentation/networking/xfrm_device.rst |  4 +-
>   include/linux/netdevice.h                |  2 +-
>   include/net/xfrm.h                       | 14 +++---
>   net/key/af_key.c                         |  4 +-
>   net/xfrm/xfrm_compat.c                   |  2 +-
>   net/xfrm/xfrm_input.c                    | 78 +++++++++++---------------------
>   net/xfrm/xfrm_proc.c                     |  1 +
>   net/xfrm/xfrm_state.c                    | 19 ++++++--
>   net/xfrm/xfrm_user.c                     | 14 +++++-
>   9 files changed, 69 insertions(+), 69 deletions(-)
> 

Cap the thread in anticipation of a v2.
-- 
-----------
Tim Gardner
Canonical, Inc



