Possible io_uring regression with QEMU on Ubuntu's kernel

Kamal Mostafa kamal at canonical.com
Fri Jul 9 19:02:56 UTC 2021


Thanks very much for your detailed analysis here, Juhyung.  Pending further
understanding of this, we'll go ahead and revert "block: don't ignore
REQ_NOWAIT for direct IO" from the Ubuntu v5.8 kernels.  (That will be
tracked in the Launchpad bug you filed:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1935017).

 -Kamal

On Thu, Jul 8, 2021 at 3:02 AM Juhyung Park <qkrwngud825 at gmail.com> wrote:

> Hi Kamal.
>
> On Sat, Jul 3, 2021 at 2:33 AM Kamal Mostafa <kamal at canonical.com> wrote:
> >
> > Hi Juhyung-
> > [trimmed the cc: list for now]
>
> Let me add Jens and io-uring list back, juuust in case this affects
> mainline too in a way that I didn't notice.
>
> >
> > We don't doubt it.  Before we ask you to start trying all the
> > intervening kernels, let's try one more targeted shot.  Here's another
> > test kernel which is 5.8.0-59 with a set of md/raid patches reverted.
> > Those patches -- backports targeting the bug "raid10: Block discard is
> > very slow" https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1896578
> > -- landed in 5.8.0-56.63_20.04.1.
> >
> > https://kernel.ubuntu.com/~kamal/uring-mdrevert1/
> >
> > TEST KERNEL 5.8.0-59.66~20.04.1+mdrevert1
> >
> > Revert "md: add md_submit_discard_bio() for submitting discard bio"
> >
> > Revert "md/raid10: extend r10bio devs to raid disks"
> >
> > Revert "md/raid10: pull the code that wait for blocked dev into one function"
> >
> > Revert "md/raid10: improve raid10 discard request"
> >
> > Revert "md/raid10: improve discard request for far layout"
> >
> > Revert "dm raid: remove unnecessary discard limits for raid0 and raid10"
> >
>
> The 3950X machine that also had this issue didn't pass an md device
> to QEMU; it simply used a partition on an NVMe device, so it was
> unlikely that an md patch would cause the issue.
>
> I've set up a kernel build environment and manually bisected the issue
> (hence the delayed reply, apologies).
>
> It was the commit 87c9cfe0fa1fb ("block: don't ignore REQ_NOWAIT for
> direct IO").
> (Upstream commit f8b78caf21d5bc3fcfc40c18898f9d52ed1451a5)
>
> I've double-checked by resetting the Git tree to
> Ubuntu-hwe-5.8-5.8.0-59.66_20.04.1 and reverting that patch alone.
> It fixes the issue.
>
> It seems like this patch was backported to multiple stable trees, so
> I'm not exactly sure why only Canonical's 5.8 is affected.
> FWIW, 5.8.0-61 is also affected.
>
> >
> > Also (regardless of the outcome of that test kernel), we would like to
> > start tracking this with a Launchpad.net bug.  If you'd be so kind as to
> > file one via https://bugs.launchpad.net/ubuntu/+source/linux/+filebug it
> > would be much appreciated.
> >
>
> Yep, will do this as well.
>
> Thanks.
>
> >  -Kamal
> >
> >
> >>
> >> On Fri, Jul 2, 2021 at 2:50 AM Kamal Mostafa <kamal at canonical.com> wrote:
> >> >
> >> > Hi-
> >> >
> >> > Thanks very much for reporting this.  We picked up that patch
> >> > ("io_uring: don't mark S_ISBLK async work as unbounded") for our Ubuntu
> >> > v5.8 kernel from linux-stable/v5.10.31.  Since it's not clear that it's
> >> > appropriate for v5.8 (or even v5.10-stable?) we'll revert it from Ubuntu
> >> > v5.8 if you can confirm that actually fixes the problem.
> >> >
> >> > Here's a test build of that (5.8.0-59 with that commit reverted).
> >> > The full set of packages is provided, but you probably only actually
> >> > need to install the linux-image and linux-modules[-extra] debs. We'll
> >> > stand by for your results:
> >> > https://kernel.ubuntu.com/~kamal/uringrevert0/
> >> >
> >> > Thanks again,
> >> >
> >> >  -Kamal Mostafa (Canonical Kernel Team)
> >> >
> >> > On Wed, Jun 30, 2021 at 1:47 AM Juhyung Park <qkrwngud825 at gmail.com> wrote:
> >> >>
> >> >> Hi everyone.
> >> >>
> >> >> With the latest Ubuntu 20.04's HWE kernel 5.8.0-59, I'm noticing some
> >> >> weirdness when using QEMU/libvirt with the following storage
> >> >> configuration:
> >> >>
> >> >> <disk type="block" device="disk">
> >> >>   <driver name="qemu" type="raw" cache="none" io="io_uring" discard="unmap" detect_zeroes="unmap"/>
> >> >>   <source dev="/dev/disk/by-id/md-uuid-df271a1e:9dfb7edb:8dc4fbb8:c43e652f-part1" index="1"/>
> >> >>   <backingStore/>
> >> >>   <target dev="vda" bus="virtio"/>
> >> >>   <alias name="virtio-disk0"/>
> >> >>   <address type="pci" domain="0x0000" bus="0x07" slot="0x00" function="0x0"/>
> >> >> </disk>
> >> >>
> >> >> QEMU version is 5.2+dfsg-9ubuntu3 and libvirt version is 7.0.0-2ubuntu2.
> >> >>
> >> >> The guest VM is unable to handle I/O properly with io_uring, and
> >> >> nuking io="io_uring" fixes the issue.
> >> >> On one machine (EPYC 7742), the partition table cannot be read and on
> >> >> another (Ryzen 9 3950X), ext4 detects weirdness with journaling and
> >> >> ultimately remounts the guest disk to R/O:
> >> >>
> >> >> [    2.712321] virtio_blk virtio5: [vda] 3906519775 512-byte logical blocks (2.00 TB/1.82 TiB)
> >> >> [    2.714054] vda: detected capacity change from 0 to 2000138124800
> >> >> [    2.963671] blk_update_request: I/O error, dev vda, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.964909] Buffer I/O error on dev vda, logical block 0, async page read
> >> >> [    2.966021] blk_update_request: I/O error, dev vda, sector 1 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.967177] Buffer I/O error on dev vda, logical block 1, async page read
> >> >> [    2.968330] blk_update_request: I/O error, dev vda, sector 2 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.969504] Buffer I/O error on dev vda, logical block 2, async page read
> >> >> [    2.970767] blk_update_request: I/O error, dev vda, sector 3 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.971624] Buffer I/O error on dev vda, logical block 3, async page read
> >> >> [    2.972170] blk_update_request: I/O error, dev vda, sector 4 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.972728] Buffer I/O error on dev vda, logical block 4, async page read
> >> >> [    2.973308] blk_update_request: I/O error, dev vda, sector 5 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.973920] Buffer I/O error on dev vda, logical block 5, async page read
> >> >> [    2.974496] blk_update_request: I/O error, dev vda, sector 6 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.975093] Buffer I/O error on dev vda, logical block 6, async page read
> >> >> [    2.975685] blk_update_request: I/O error, dev vda, sector 7 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.976295] Buffer I/O error on dev vda, logical block 7, async page read
> >> >> [    2.980074] blk_update_request: I/O error, dev vda, sector 0 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.981104] Buffer I/O error on dev vda, logical block 0, async page read
> >> >> [    2.981786] blk_update_request: I/O error, dev vda, sector 1 op 0x0:(READ) flags 0x0 phys_seg 1 prio class 0
> >> >> [    2.982083] ixgbe 0000:06:00.0: Multiqueue Enabled: Rx Queue count = 63, Tx Queue count = 63 XDP Queue count = 0
> >> >> [    2.982442] Buffer I/O error on dev vda, logical block 1, async page read
> >> >> [    2.983642] ldm_validate_partition_table(): Disk read failed.
> >> >>
> >> >> Kernel 5.8.0-55 is fine, and the only io_uring-related change between
> >> >> 5.8.0-55 and 5.8.0-59 is the commit 4b982bd0f383 ("io_uring: don't
> >> >> mark S_ISBLK async work as unbounded").
> >> >>
> >> >> The weird thing is that this commit was first introduced with v5.12,
> >> >> but neither mainline v5.12.0 nor v5.13.0 is affected by this issue.
> >> >>
> >> >> I guess one of these commits following the backported commit from
> >> >> v5.12 fixes the issue, but that's just a guess. It might be another
> >> >> earlier commit:
> >> >> c7d95613c7d6 io_uring: fix early sqd_list removal sqpoll hangs
> >> >> 9728463737db io_uring: fix rw req completion
> >> >> 6ad7f2332e84 io_uring: clear F_REISSUE right after getting it
> >> >> e82ad4853948 io_uring: fix !CONFIG_BLOCK compilation failure
> >> >> 230d50d448ac io_uring: move reissue into regular IO path
> >> >> 07204f21577a io_uring: fix EIOCBQUEUED iter revert
> >> >> 696ee88a7c50 io_uring/io-wq: protect against sprintf overflow
> >> >>
> >> >> It would be much appreciated if Jens could give pointers to Canonical
> >> >> developers on how to fix the issue, and hopefully a suggestion to
> >> >> prevent this from happening again.
> >> >>
> >> >> Thanks,
> >> >> Regards
>
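
For anyone repeating the bisection described in this thread, checking a guest
boot log for the failure signature can be scripted. The following is only a
minimal sketch, not what was actually used here (the bisection above was done
manually). The function names failed_sectors and bisect_exit_code are
hypothetical, and the script assumes the guest's dmesg output has already been
captured as text by some other means. It relies on the documented
"git bisect run" convention that exit status 0 marks a revision good and 1
marks it bad.

```python
import re

# Matches guest kernel lines such as:
#   blk_update_request: I/O error, dev vda, sector 0 op 0x0:(READ) ...
# Only the part up to the sector number is needed, so mail-wrapped
# log lines still match.
IO_ERROR = re.compile(
    r"blk_update_request: I/O error, dev (?P<dev>[^\s,]+), sector (?P<sector>\d+)"
)


def failed_sectors(log_text, dev="vda"):
    """Return the sorted, de-duplicated sectors that hit I/O errors on dev."""
    sectors = set()
    for match in IO_ERROR.finditer(log_text):
        if match.group("dev") == dev:
            sectors.add(int(match.group("sector")))
    return sorted(sectors)


def bisect_exit_code(log_text, dev="vda"):
    """Map a guest boot log to an exit status: 0 (good) or 1 (bad)."""
    return 1 if failed_sectors(log_text, dev) else 0
```

A wrapper script could read the captured log, call bisect_exit_code() on it and
pass the result to sys.exit(). Building and booting each candidate kernel and
collecting the guest log are left out, since those steps depend on the local
setup.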