[SRU][F/J:linux-bluefield][PATCH v1 0/1] UBUNTU: SAUCE: mlxbf-gige: Fix kernel panic at shutdown
Asmaa Mnebhi
asmaa at nvidia.com
Fri Jun 2 17:04:24 UTC 2023
BugLink: https://bugs.launchpad.net/bugs/2022370
SRU Justification:
[Impact]
We occasionally hit a race condition (roughly once every 350 reboots) in which NAPI
(mlxbf_gige_poll) is still running after a shutdown has been initiated through "reboot".
Because mlxbf_gige_poll is still running during shutdown, it dereferences a NULL
pointer, which causes a kernel panic.
[Fix]
The fix is to explicitly disable NAPI and dequeue it during shutdown.
mlxbf_gige_remove already does this through the following call chain:
unregister_netdev->unregister_netdevice->unregister_netdevice_queue->
rollback_registered->rollback_registered_many->dev_close_many->
__dev_close_many->ndo_stop->mlxbf_gige_stop, which stops NAPI.
So use mlxbf_gige_remove in place of the existing shutdown logic.
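
The reasoning above can be illustrated with a small userspace model. This is
only a sketch, not driver code: every model_* name is hypothetical, the "old"
shutdown shape is an assumption (the existing shutdown logic is not shown in
this cover letter), and the race is made deterministic by calling the poll
routine by hand after shutdown.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private state. */
struct model_gige {
	bool napi_enabled;	/* true while the poll routine may still run */
	int *rx_ring;		/* resource the poll routine dereferences */
};

/* Bring the modeled device up: NAPI enabled, one packet in the ring. */
static struct model_gige model_open(void)
{
	struct model_gige priv = { .napi_enabled = true, .rx_ring = NULL };

	priv.rx_ring = calloc(1, sizeof(*priv.rx_ring));
	assert(priv.rx_ring != NULL);
	priv.rx_ring[0] = 1;
	return priv;
}

/*
 * Models mlxbf_gige_poll. Returns the number of packets seen, 0 if NAPI is
 * disabled, or -1 at the point where the kernel would hit a NULL pointer.
 */
static int model_poll(struct model_gige *priv)
{
	if (!priv->napi_enabled)
		return 0;
	if (priv->rx_ring == NULL)
		return -1;
	return priv->rx_ring[0];
}

/* Models ndo_stop (mlxbf_gige_stop): stop NAPI first, then free resources. */
static void model_stop(struct model_gige *priv)
{
	priv->napi_enabled = false;
	free(priv->rx_ring);
	priv->rx_ring = NULL;
}

/* Models mlxbf_gige_remove: unregister_netdev eventually reaches ndo_stop. */
static void model_remove(struct model_gige *priv)
{
	model_stop(priv);
}

/* Assumed old shape: resources go away while NAPI is left enabled. */
static void model_shutdown_old(struct model_gige *priv)
{
	free(priv->rx_ring);
	priv->rx_ring = NULL;
}

/* Fixed shape: shutdown reuses the remove path, so NAPI is stopped first. */
static void model_shutdown_fixed(struct model_gige *priv)
{
	model_remove(priv);
}
```

In this model, a poll issued after model_shutdown_old reaches the NULL ring
and returns -1, while a poll issued after model_shutdown_fixed returns 0
because NAPI was disabled before the ring was released.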
[Test Case]
* Issue at least 1000 reboots from Linux and make sure that no panic is caused by the mlxbf-gige driver.
[Regression Potential]
* Since this issue is hard to reproduce, the fix has not been tested thoroughly yet, so it needs several reboot loops to validate it.