[SRU][F/J:linux-bluefield][PATCH v1 0/1] UBUNTU: SAUCE: mlxbf-gige: Fix kernel panic at shutdown

Asmaa Mnebhi asmaa at nvidia.com
Fri Jun 2 17:04:24 UTC 2023


BugLink: https://bugs.launchpad.net/bugs/2022370

SRU Justification:

[Impact]

We occasionally hit a race condition (roughly once every 350 reboots) where NAPI
is still running (mlxbf_gige_poll) after a shutdown has been initiated through
"reboot". Because mlxbf_gige_poll is still running, it dereferences a NULL
pointer, which causes a kernel panic.

[Fix]

The fix is to explicitly disable NAPI and dequeue it during shutdown.
mlxbf_gige_remove already calls:
unregister_netdev->unregister_netdevice->unregister_netdevice_queue->
rollback_registered->rollback_registered_many->dev_close_many->
__dev_close_many->ndo_stop->mlxbf_gige_stop, which stops NAPI.

So use mlxbf_gige_remove in place of the existing shutdown logic.
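
To illustrate the idea only, here is a minimal userspace sketch of the ordering the fix relies on. It is not the driver code: struct mock_gige and every mock_* function are hypothetical stand-ins invented for this example, and the real patch works through the kernel's NAPI and netdev APIs. The sketch assumes the stop path disables polling before releasing the memory that the poll routine dereferences, and that shutdown simply reuses the remove path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names exist in the driver. */
struct mock_gige {
    bool registered;    /* stands in for a registered netdev */
    bool napi_enabled;  /* stands in for the NAPI enabled/disabled state */
    int *rx_ring;       /* stands in for memory the poll routine dereferences */
};

/* Models probe + open: allocate the ring, then enable polling. */
static int mock_probe(struct mock_gige *p)
{
    p->rx_ring = calloc(4, sizeof(*p->rx_ring));
    if (!p->rx_ring)
        return -1;
    p->rx_ring[0] = 1;
    p->registered = true;
    p->napi_enabled = true;
    return 0;
}

/* Models the poll routine: it only touches the ring while polling is enabled. */
static int mock_poll(struct mock_gige *p)
{
    if (!p->napi_enabled)
        return 0;
    assert(p->rx_ring != NULL);
    return p->rx_ring[0];
}

/* Models the stop path: disable polling first, then release the ring. */
static void mock_stop(struct mock_gige *p)
{
    p->napi_enabled = false;
    free(p->rx_ring);
    p->rx_ring = NULL;
}

/* Models remove: unregistering closes the device, which runs the stop path. */
static void mock_remove(struct mock_gige *p)
{
    if (!p->registered)
        return;
    mock_stop(p);
    p->registered = false;
}

/* Models the new shutdown hook: reuse remove instead of separate logic. */
static void mock_shutdown(struct mock_gige *p)
{
    mock_remove(p);
}
```

In this model, a poll that runs after mock_shutdown returns early and never touches the freed ring, which is the property the real fix aims for by going through mlxbf_gige_stop.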

[Test Case]

* Issue at least 1000 reboots from Linux and verify that the mlxbf-gige driver causes no kernel panic.

[Regression Potential]

* Since this issue is hard to reproduce, the fix has not been tested thoroughly yet; it needs several reboot loops to validate it.


