ACK/CMT: [SRU][J/I/H/F][PATCH 0/2] Fix crash on ipmi module unload

Ian May ian.may at canonical.com
Thu Dec 16 20:42:25 UTC 2021


We should look at upstreaming a fix for the ordering problem in the init that was called out by Cascardo.

Acked-by: Ian May <ian.may at canonical.com>

On 2021-12-16 12:21:43 , Ioanna Alifieraki wrote:
> BugLink: https://bugs.launchpad.net/bugs/1950666
> 
> [IMPACT]
> 
> Commit 3b9a907223d7 ("ipmi: fix sleep-in-atomic in free_user at cleanup SRCU user->release_barrier")
> pushes the removal of an ipmi_user onto the system workqueue.
> 
> Whenever an ipmi_user struct is about to be removed, its removal is scheduled as work on the system
> workqueue so that the free operation is guaranteed not to run in atomic context. When the work
> executes, free_user_work() is invoked, which frees the ipmi_user.
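> 
> As a rough illustration, here is a simplified sketch of that deferred-free pattern (not the
> exact driver source; the ipmi_user struct is abbreviated and only the relevant calls are shown):
> 
>     #include <linux/workqueue.h>
>     #include <linux/kref.h>
>     #include <linux/slab.h>
> 
>     struct ipmi_user {
>             struct kref refcount;
>             /* set up with INIT_WORK(&user->remove_work, free_user_work) at creation */
>             struct work_struct remove_work;
>             /* ... other fields elided ... */
>     };
> 
>     static void free_user_work(struct work_struct *work)
>     {
>             struct ipmi_user *user = container_of(work, struct ipmi_user,
>                                                   remove_work);
> 
>             /* The actual free runs here, in process context. */
>             kfree(user);
>     }
> 
>     static void free_user(struct kref *ref)
>     {
>             struct ipmi_user *user = container_of(ref, struct ipmi_user, refcount);
> 
>             /*
>              * The final kref_put() may happen in atomic context, so defer
>              * the (possibly sleeping) cleanup to the system workqueue.
>              */
>             queue_work(system_wq, &user->remove_work);
>     }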
> 
> When the ipmi_msghandler module is removed, cleanup_ipmi() does not check whether any of this
> work is still pending.
> This opens a race condition: an ipmi_user is scheduled for removal and, shortly afterwards, the
> ipmi_msghandler module is unloaded.
> If the scheduled work is delayed for any reason and the module is removed first, then when the
> work finally runs the pages containing free_user_work() are gone and the system crashes with the following:
> 
> BUG: unable to handle page fault for address: ffffffffc05c3450
> #PF: supervisor instruction fetch in kernel mode
> #PF: error_code(0x0010) - not-present page
> PGD 635420e067 P4D 635420e067 PUD 6354210067 PMD 4711e51067 PTE 0
> Oops: 0010 [#1] SMP PTI
> CPU: 19 PID: 29646 Comm: kworker/19:1 Kdump: loaded Not tainted 5.4.0-77-generic #86~18.04.1-Ubuntu
> Hardware name: Ciara Technologies ORION RS610-G4-DTH4S/MR91-FS1-Y9, BIOS F29 05/23/2019
> Workqueue: events 0xffffffffc05c3450
> RIP: 0010:0xffffffffc05c3450
> Code: Bad RIP value.
> RSP: 0018:ffffb721333c3e88 EFLAGS: 00010286
> RAX: ffffffffc05c3450 RBX: ffff92a95f56a740 RCX: ffffb7221cfd14e8
> RDX: 0000000000000001 RSI: ffff92616040d4b0 RDI: ffffb7221cf404e0
> RBP: ffffb721333c3ec0 R08: 000073746e657665 R09: 8080808080808080
> R10: ffffb721333c3de0 R11: fefefefefefefeff R12: ffff92a95f570700
> R13: ffff92a0a40ece40 R14: ffffb7221cf404e0 R15: 0ffff92a95f57070
> FS: 0000000000000000(0000) GS:ffff92a95f540000(0000) knlGS:0000000000000000
> CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> CR2: ffffffffc05c3426 CR3: 00000081e9bfc005 CR4: 00000000007606e0
> DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> PKRU: 55555554
> Call Trace:
> ? process_one_work+0x20f/0x400
> worker_thread+0x34/0x410
> kthread+0x121/0x140
> ? process_one_work+0x400/0x400
> ? kthread_park+0x90/0x90
> ret_from_fork+0x35/0x40
> Modules linked in: xt_REDIRECT xt_owner ipt_rpfilter xt_CT xt_multiport xt_set ip_set_hash_ip veth xt_statistic ipt_REJECT
> ... megaraid_sas ahci libahci wmi [last unloaded: ipmi_msghandler]
> CR2: ffffffffc05c3450
> 
> [TEST CASE]
> 
> The user who reported the issue can reproduce it reliably by stopping the ipmi-related services and then removing the ipmi modules.
> I could only reproduce the issue by turning the normal 'work' into delayed work.
> 
> [WHERE PROBLEMS COULD OCCUR]
> 
> The fixing patch creates a dedicated workqueue for the remove_work of ipmi_user when the ipmi_msghandler
> module is loaded and destroys that workqueue when the module is removed (see the sketch below). Any potential
> regression would therefore show up during these two operations, or when scheduling work on the dedicated workqueue.
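> 
> For illustration, a simplified sketch of the shape of the fix (not the exact patch; function
> and queue names follow the upstream commits where known, details are abbreviated, and it
> continues the sketch in [IMPACT] above):
> 
>     static struct workqueue_struct *remove_work_wq;
> 
>     static void free_user(struct kref *ref)
>     {
>             struct ipmi_user *user = container_of(ref, struct ipmi_user, refcount);
> 
>             /* Queue on the module's own workqueue instead of system_wq. */
>             queue_work(remove_work_wq, &user->remove_work);
>     }
> 
>     static int __init ipmi_init_msghandler_mod(void)
>     {
>             /* Created when the module is loaded. */
>             remove_work_wq = create_singlethread_workqueue("ipmi-msghandler-remove-wq");
>             if (!remove_work_wq)
>                     return -ENOMEM;
> 
>             /* ... rest of the init ... */
>             return 0;
>     }
> 
>     static void __exit cleanup_ipmi(void)
>     {
>             /* ... rest of the cleanup ... */
> 
>             /*
>              * destroy_workqueue() drains any still-pending remove_work before
>              * the module text goes away, closing the race.
>              */
>             destroy_workqueue(remove_work_wq);
>     }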
> 
> [OTHER]
> 
> Upstream patches:
> 1d49eb91e86e ("ipmi: Move remove_work to dedicated workqueue")
> 5a3ba99b62d8 ("ipmi: msghandler: Make symbol 'remove_work_wq' static")
> 
> 
> Ioanna Alifieraki (1):
>   ipmi: Move remove_work to dedicated workqueue
> 
> Wei Yongjun (1):
>   ipmi: msghandler: Make symbol 'remove_work_wq' static
> 
>  drivers/char/ipmi/ipmi_msghandler.c | 13 ++++++++++++-
>  1 file changed, 12 insertions(+), 1 deletion(-)
> 
> -- 
> 2.17.1
> 
> 
> -- 
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team


