ACK: [SRU D][PATCH 0/2] Two crashes on raid0 error path (during a member device removal)

Connor Kuehl connor.kuehl at canonical.com
Fri Jul 19 15:50:40 UTC 2019


On 7/16/19 3:15 PM, Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1836806
> 
> [Impact]
> 
> * During raid0 error path testing, by removing one member of the array, we've
> noticed after kernel 4.18 we can trigger a crash depending if there's I/O in-flight
> during the array removal. When debugging the issue, a second problem was found,
> that could cause a different crash.
> 
> * For the first and more relevant problem, commit cd4a4ae4683d
> ("block: don't use blocking queue entered for recursive bio submits") introduced
> the flag BIO_QUEUE_ENTERED in order BIOs that were split do bypass the blocking
> queue entering routine and use the live non-blocking version. What happens with
> md/raid0 though is that their BIOs have their underlying device changed to the
> physical disk (array member). If we remove this physical disk (or if it fails),
> we could have one BIO that had the flag changed to BIO_QUEUE_ENTERED and had the
> device changed to the removed array member (before its removal); this bio then
> skips a lot of checks in generic_make_request_checks(), triggering the following crash:
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000155
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> RIP: 0010:blk_throtl_bio+0x45/0x970
> [...]
> Call Trace:
>  generic_make_request_checks+0x1bf/0x690
>  generic_make_request+0x64/0x3f0
>  raid0_make_request+0x184/0x620 [raid0]
>  ? raid0_make_request+0x184/0x620 [raid0]
>  md_handle_request+0x126/0x1a0
>  md_make_request+0x7b/0x180
>  generic_make_request+0x19e/0x3f0
>  submit_bio+0x73/0x140
> [...]
> 
> * When debugging the above issue, by rebuilding the kernel with CONFIG_BLK_CGROUP=n
> we've noticed a different crash. Commit 37f9579f4c31 ("blk-mq: Avoid that submitting
> a bio concurrently with device removal triggers a crash") introduced a NULL pointer
> dereference in generic_make_request(), that manifests as:
> 
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> RIP: 0010:generic_make_request+0x32b/0x400
> Call Trace:
>  submit_bio+0x73/0x140
>  ext4_io_submit+0x4d/0x60
>  ext4_writepages+0x626/0xe90
>  do_writepages+0x4b/0xe0
> [...]
> 
> * For both the issues, we have simple patches that are present in linux-stable
> but not in Linus tree.
> ## For issue 1 (md removal crash):
> 869eec894663 ("md/raid0: Do not bypass blocking queue entered for raid0 bios")
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=869eec894663
> 
> ## For issue 2 (generic_make_request() NULL dereference):
> c9d8d3e9d7a0 ("block: Fix a NULL pointer dereference in generic_make_request()")
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c9d8d3e9d7a0
> 
> The reasoning for both patches not being present in Linus tree is explained in
> the commit messages, but in summary Ming Lei submitted a major clean-up series
> at the same time I've submitted both patches, it wouldn't make sense to accept
> my patches to soon after remove the code paths with his clean-up. But Ming's
> series rely on legacy I/O path removal, and so it's very hard to backport.
> Hence maintainers suggested me to submit my small fixes to stable tree only.
> 
> [Test case]
> 
> For both cases, the test is the same, the only change being a kernel config option.
> To reproduce issue 1 (md removal crash), a regular Ubuntu kernel config is enough.
> For the issue 2, a kernel rebuild with CONFIG_BLK_CGROUP=n is necessary.
> 
> Steps to reproduce:
> 
> a) Create a raid0 md array with 2 NVMe devices as members, and mount it
> with an ext4 filesystem.
> 
> b) Run the following oneliner (supposing the raid0 is mounted in /mnt):
> (dd of=/mnt/tmp if=/dev/zero bs=1M count=999 &); sleep 0.3;\
> echo 1 > /sys/block/nvme1n1/device/device/remove
> (whereas nvme1n1 is the 2nd array member)
> 
> [Regression potential]
> 
> The fixes are self-contained and small, both validated by a great number of
> subsystem maintainers (including block, raid and stable). Commit c9d8d3e9d7a0 was
> also validated by the author of the offender patch it fixes, and has no functional
> change. Commit 869eec894663 has only raid0 driver as scope, and fall-backs raid0
> to a previous behavior before the introduction of BIO_QUEUE_ENTERED flag (which
> indeed increases the amount of checks performed in BIOs), so the regression
> potential is low and restricted to raid0.
> 
> Guilherme G. Piccoli (2):
>   block: Fix a NULL pointer dereference in generic_make_request()
>   md/raid0: Do not bypass blocking queue entered for raid0 bios
> 
>  block/blk-core.c   | 5 ++---
>  drivers/md/raid0.c | 2 ++
>  2 files changed, 4 insertions(+), 3 deletions(-)
> 

Acked-by: Connor Kuehl <connor.kuehl at canonical.com>




More information about the kernel-team mailing list