APPLIED: [SRU][F/J:linux-bluefield][PATCH v1 0/1] UBUNTU: SAUCE: mlxbf-gige: Fix kernel panic at shutdown
Bartlomiej Zolnierkiewicz
bartlomiej.zolnierkiewicz at canonical.com
Fri Jun 23 11:58:44 UTC 2023
Applied to focal:linux-bluefield/master-next and
jammy:linux-bluefield/master-next. Thanks.
--
Best regards,
Bartlomiej
On Fri, Jun 2, 2023 at 7:05 PM Asmaa Mnebhi <asmaa at nvidia.com> wrote:
>
> BugLink: https://bugs.launchpad.net/bugs/2022370
>
> SRU Justification:
>
> [Impact]
>
> We occasionally see a race condition (roughly once every 350 reboots) where
> NAPI is still running (mlxbf_gige_poll) after a shutdown has been initiated
> through "reboot". Because mlxbf_gige_poll is still running, it dereferences a
> NULL pointer, which causes a kernel panic.
>
> [Fix]
>
> The fix is to explicitly disable NAPI and dequeue it during shutdown.
> mlxbf_gige_remove already calls:
> unregister_netdev -> unregister_netdevice -> unregister_netdevice_queue ->
> rollback_registered -> rollback_registered_many -> dev_close_many ->
> __dev_close_many -> ndo_stop -> mlxbf_gige_stop, which stops NAPI.
>
> So use mlxbf_gige_remove in place of the existing shutdown logic.
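
The ordering argument in [Fix] can be illustrated with a small user-space
model. This is only a sketch and is not the driver code: every model_* name
is hypothetical, the "ring" pointer stands in for whatever the poll routine
dereferences, and the racy shutdown shown here is an assumed shape of the old
logic, not a copy of it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private state (not the real struct). */
struct model_priv {
    bool napi_enabled;  /* models NAPI still being enabled/scheduled */
    int *ring;          /* models a resource the poll routine dereferences */
};

enum poll_result { POLL_OK, POLL_SKIPPED, POLL_NULL_DEREF };

/* Models bringing the interface up: allocate the resource, enable polling. */
static void model_open(struct model_priv *p)
{
    p->ring = calloc(4, sizeof(*p->ring));
    assert(p->ring != NULL);
    p->napi_enabled = true;
}

/* Models the poll routine. Instead of crashing, it reports what would happen. */
static enum poll_result model_poll(const struct model_priv *p)
{
    if (!p->napi_enabled)
        return POLL_SKIPPED;      /* polling was stopped first: nothing runs */
    if (p->ring == NULL)
        return POLL_NULL_DEREF;   /* the real kernel would panic here */
    return POLL_OK;
}

/* Models the stop path: disable polling BEFORE releasing the resource. */
static void model_stop(struct model_priv *p)
{
    p->napi_enabled = false;
    free(p->ring);
    p->ring = NULL;
}

/* Models the remove path, which reaches the stop path via unregistration. */
static void model_remove(struct model_priv *p)
{
    model_stop(p);
}

/* Assumed shape of a racy shutdown: tears down without disabling polling. */
static void model_shutdown_racy(struct model_priv *p)
{
    free(p->ring);
    p->ring = NULL;
}

/* Shutdown that reuses the remove path, as the cover letter proposes. */
static void model_shutdown_fixed(struct model_priv *p)
{
    model_remove(p);
}
```

In this model, a poll that runs after model_shutdown_racy reports
POLL_NULL_DEREF, while a poll that runs after model_shutdown_fixed reports
POLL_SKIPPED, because polling is disabled before the resource goes away.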
>
> [Test Case]
>
> * Issue at least 1000 reboots from Linux and confirm that the mlxbf-gige driver causes no panic.
>
> [Regression Potential]
>
> * Since this issue is hard to reproduce, the fix has not been tested thoroughly yet, so it needs several reboot loops to validate it.