ACK: [PATCH 0/1] [focal:linux, groovy:linux, hirsute:linux, impish:linux] block: return the correct bvec when checking for gaps
Guilherme Piccoli
gpiccoli at canonical.com
Tue Jul 6 18:57:56 UTC 2021
On Tue, Jul 6, 2021 at 2:15 PM Tim Gardner <tim.gardner at canonical.com> wrote:
>
> [Impact]
>
> There is a bug in the part of the Linux block layer responsible for merging
> BIOs that cross a page boundary. This bug was introduced in Linux 5.1, when
> the block layer's BIO page tracking was enhanced to support multiple pages.
>
> Without this patch, data corruption can occur. The Linux 5.1 change to the
> kernel block layer altered the way multiple pages are merged into a single block
> I/O descriptor, and how contiguous block I/O descriptors are merged with previous
> descriptors.
>
> If contiguous block I/O requests cross a page boundary of 4k, defined by the hv_storvsc
> driver, the new block merge process can create two pages of block I/O requests (the
> latter page with an offset) that refer to the same physical sector on disk. This page list
> is then assembled for the SCSI generic driver.
>
> In the above scenario, when the block I/O request size is 512 bytes, the Azure LIS driver
> (hv_storvsc module) is not able to correctly parse the page array from the SCSI generic
> driver due to this bug in the Linux block layer. This creates a potential overflow of
> offset I/O requests and corruption of data on disk.
>
> Filesystems with a 4k block size have been shown to mitigate the data loss. When
> block I/O requests are 4k or a multiple of 4k in size, they are page aligned in
> memory and are not affected by the block I/O merging algorithm introduced in Linux
> 5.1. Most modern file systems use a 4k I/O block size by default, thus mitigating
> this problem.
>
> An upstream patch fixes this bug: commit c9c9762d4d44dcb1b2ba90cfb4122dc11ceebf31
> ("block: return the correct bvec when checking for gaps")
>
> Please include this patch in any supported kernels that are 5.1 or later.
>
> [Test Plan]
>
> stress-ng --sequential 8 --class io -t 5m --times
>
> [Where problems could occur]
>
> Different incorrect pages could be written to disk.
>
> [Other Info]
>
> This patch has already been released in all [FGHI] Azure kernels.
>
Thanks Tim, seems like a solid fix to me!
Acked-by: Guilherme G. Piccoli <gpiccoli at canonical.com>
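
For anyone reading along, here is a minimal sketch (in Python, not kernel C) of
the two conditions the [Impact] text relies on: whether a request is 4k page
aligned, and whether joining two segments would leave a gap at the 4k boundary.
The function names and the 4096-byte constant are assumptions made for
illustration. This only models the general idea of a boundary gap check; it is
not the code touched by the upstream commit.

```python
# Illustrative model only: names are hypothetical, not kernel symbols.
PAGE_SIZE = 4096                # the 4k boundary named in the [Impact] text
BOUNDARY_MASK = PAGE_SIZE - 1   # low bits that must be zero on a boundary


def is_page_aligned(offset: int, length: int) -> bool:
    """True when a request starts on a 4k boundary and spans whole 4k pages."""
    return length > 0 and offset % PAGE_SIZE == 0 and length % PAGE_SIZE == 0


def has_gap(prev_offset: int, prev_length: int, next_offset: int) -> bool:
    """True when two segments cannot be joined without a hole at the boundary.

    The previous segment must end exactly on a 4k boundary and the next
    segment must start on one; otherwise the pair is treated as having a gap.
    """
    prev_end = prev_offset + prev_length
    return bool(next_offset & BOUNDARY_MASK) or bool(prev_end & BOUNDARY_MASK)


if __name__ == "__main__":
    print(is_page_aligned(0, 4096))   # whole page starting on a boundary
    print(is_page_aligned(512, 512))  # 512-byte request inside a page
    print(has_gap(0, 512, 512))       # previous segment ends mid-page
    print(has_gap(0, 4096, 0))        # previous segment ends on a boundary
```

In this model, 4k-sized, 4k-aligned requests never report a gap, which matches
the mitigation described above, while 512-byte segments can.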