APPLIED: [SRU D][PATCH 0/2] Two crashes on raid0 error path (during a member device removal)
Khaled Elmously
khalid.elmously at canonical.com
Tue Jul 23 05:25:31 UTC 2019
On 2019-07-16 19:15:37 , Guilherme G. Piccoli wrote:
> BugLink: https://bugs.launchpad.net/bugs/1836806
>
> [Impact]
>
> * During raid0 error path testing, by removing one member of the array, we've
> noticed after kernel 4.18 we can trigger a crash depending if there's I/O in-flight
> during the array removal. When debugging the issue, a second problem was found,
> that could cause a different crash.
>
> * For the first and more relevant problem, commit cd4a4ae4683d
> ("block: don't use blocking queue entered for recursive bio submits") introduced
> the flag BIO_QUEUE_ENTERED in order BIOs that were split do bypass the blocking
> queue entering routine and use the live non-blocking version. What happens with
> md/raid0 though is that their BIOs have their underlying device changed to the
> physical disk (array member). If we remove this physical disk (or if it fails),
> we could have one BIO that had the flag changed to BIO_QUEUE_ENTERED and had the
> device changed to the removed array member (before its removal); this bio then
> skips a lot of checks in generic_make_request_checks(), triggering the following crash:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000155
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> RIP: 0010:blk_throtl_bio+0x45/0x970
> [...]
> Call Trace:
> generic_make_request_checks+0x1bf/0x690
> generic_make_request+0x64/0x3f0
> raid0_make_request+0x184/0x620 [raid0]
> ? raid0_make_request+0x184/0x620 [raid0]
> md_handle_request+0x126/0x1a0
> md_make_request+0x7b/0x180
> generic_make_request+0x19e/0x3f0
> submit_bio+0x73/0x140
> [...]
>
> * When debugging the above issue, by rebuilding the kernel with CONFIG_BLK_CGROUP=n
> we've noticed a different crash. Commit 37f9579f4c31 ("blk-mq: Avoid that submitting
> a bio concurrently with device removal triggers a crash") introduced a NULL pointer
> dereference in generic_make_request(), that manifests as:
>
> BUG: unable to handle kernel NULL pointer dereference at 0000000000000078
> PGD 0 P4D 0
> Oops: 0000 [#1] SMP PTI
> RIP: 0010:generic_make_request+0x32b/0x400
> Call Trace:
> submit_bio+0x73/0x140
> ext4_io_submit+0x4d/0x60
> ext4_writepages+0x626/0xe90
> do_writepages+0x4b/0xe0
> [...]
>
> * For both the issues, we have simple patches that are present in linux-stable
> but not in Linus tree.
> ## For issue 1 (md removal crash):
> 869eec894663 ("md/raid0: Do not bypass blocking queue entered for raid0 bios")
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=869eec894663
>
> ## For issue 2 (generic_make_request() NULL dereference):
> c9d8d3e9d7a0 ("block: Fix a NULL pointer dereference in generic_make_request()")
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?id=c9d8d3e9d7a0
>
> The reasoning for both patches not being present in Linus tree is explained in
> the commit messages, but in summary Ming Lei submitted a major clean-up series
> at the same time I've submitted both patches, it wouldn't make sense to accept
> my patches to soon after remove the code paths with his clean-up. But Ming's
> series rely on legacy I/O path removal, and so it's very hard to backport.
> Hence maintainers suggested me to submit my small fixes to stable tree only.
>
> [Test case]
>
> For both cases, the test is the same, the only change being a kernel config option.
> To reproduce issue 1 (md removal crash), a regular Ubuntu kernel config is enough.
> For the issue 2, a kernel rebuild with CONFIG_BLK_CGROUP=n is necessary.
>
> Steps to reproduce:
>
> a) Create a raid0 md array with 2 NVMe devices as members, and mount it
> with an ext4 filesystem.
>
> b) Run the following oneliner (supposing the raid0 is mounted in /mnt):
> (dd of=/mnt/tmp if=/dev/zero bs=1M count=999 &); sleep 0.3;\
> echo 1 > /sys/block/nvme1n1/device/device/remove
> (whereas nvme1n1 is the 2nd array member)
>
> [Regression potential]
>
> The fixes are self-contained and small, both validated by a great number of
> subsystem maintainers (including block, raid and stable). Commit c9d8d3e9d7a0 was
> also validated by the author of the offender patch it fixes, and has no functional
> change. Commit 869eec894663 has only raid0 driver as scope, and fall-backs raid0
> to a previous behavior before the introduction of BIO_QUEUE_ENTERED flag (which
> indeed increases the amount of checks performed in BIOs), so the regression
> potential is low and restricted to raid0.
>
> Guilherme G. Piccoli (2):
> block: Fix a NULL pointer dereference in generic_make_request()
> md/raid0: Do not bypass blocking queue entered for raid0 bios
>
> block/blk-core.c | 5 ++---
> drivers/md/raid0.c | 2 ++
> 2 files changed, 4 insertions(+), 3 deletions(-)
>
> --
> 2.22.0
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
More information about the kernel-team
mailing list