[B/D/E] [PATCH 0/1] md raid0/linear doesn't show error state if an array member is removed and allows successful writes

Guilherme G. Piccoli gpiccoli at canonical.com
Sun Oct 13 23:47:05 UTC 2019


BugLink: https://bugs.launchpad.net/bugs/1847773

[Impact]

* Currently, mounted raid0/md-linear arrays give no indication or warning when one
or more members are removed or suffer a non-recoverable error condition.

* Given that, the arrays remain mounted; data written to them goes through the
page cache and appears to have been successfully written to the devices, even
though the writeback threads cannot actually write it out. For users this can
potentially cause data corruption, since even the "sync" command returns success
despite the data never reaching the disks. Kernel messages do show the I/O
errors, though (see the illustration below).
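
As an illustration (paths are examples only; this assumes one member of the
array was already removed and the filesystem is mounted at /mnt/md0), on an
unpatched kernel all of the following still report success:

  dd if=/dev/zero of=/mnt/md0/testfile bs=1M count=16   # completes without error
  sync; echo $?                                          # prints 0 despite the missing member
  dmesg | tail                                           # I/O errors only show up here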

* The patch proposed in this SRU addresses the issue on two levels: first, it
fast-fails write I/Os to raid0/md-linear arrays that have one or more failed
members. Second, it introduces the "broken" state, which is analogous to
"clean" but indicates that the array is not in a good/correct state. A message
shown in dmesg helps clarify when such an array has a member removed or failed.
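
For instance (assuming the array is /dev/md0), the new state can be observed
through the md array_state sysfs attribute after a member is removed:

  cat /sys/block/md0/md/array_state    # expected to report "broken" instead of "clean"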

* The commit proposed here, available in Linus' tree as 62f7b1989c02
("md raid0/linear: Mark array as 'broken' and fail BIOs if a member is gone")
[http://git.kernel.org/linus/62f7b1989c02], was thoroughly discussed upstream and
received a good amount of review/analysis from both the current md maintainer
and a former maintainer.

* One important note: this patch requires a counter-part in the mdadm tool to be
fully functional, which was SRUed in LP: #1847924. The kernel patch works fine
without that counter-part, but for broken arrays the "mdadm --detail" command
won't show "broken" and will report "clean, FAILED" instead.

* We hereby ask the kernel team for an exception to have this backported to the
4.15 kernel *only in Bionic*, and not in Xenial. The reason is that the mdadm
code changed too much between those releases and we didn't want to risk
introducing a regression in the Xenial version of that tool, so we only
backported the mdadm counter-part of this patch to Bionic, Disco and Eoan -
hence we'd like the kernel backports to match.

[Test case]

* To test this patch, create a raid0 or linear md array on Linux using mdadm,
for example:
"mdadm --create md0 --level=0 --raid-devices=2 /dev/nvme0n1 /dev/nvme1n1";

* Format the array using a filesystem of your choice (for example ext4) and
mount the array;

* Remove one member of the array, for example using sysfs interface (for nvme:
echo 1 > /sys/block/nvme0nX/device/device/remove, for scsi:
echo 1 > /sys/block/sdX/device/delete);

* Without this patch, the array partition can still be written successfully, and
"mdadm --detail" will show a clean state; with the patch, writes fail fast and
the array is marked broken (the full sequence is sketched below).

[Regression potential]

* There is not much regression potential here; we fail write I/Os to broken
arrays and show a message/status accordingly, exposing the array's broken state.
We believe the most likely "issue" to be reported from this patch is if a
userspace tool relies on the success of write I/Os or on the "clean" state of an
array - after this patch, such a tool can potentially observe a different
behavior in the case of a broken array (see the example below).
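
For example, a tool that previously issued writes and only checked their return
codes would keep "succeeding" on a degraded raid0; after this patch the write
path is expected to fail fast, which a direct write makes easy to observe
(paths are illustrative):

  dd if=/dev/zero of=/mnt/md0/file bs=1M count=1 oflag=direct   # expected to fail with an I/O error on a broken array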

Guilherme G. Piccoli (1):
  md raid0/linear: Mark array as 'broken' and fail BIOs if a member is gone

 drivers/md/md-linear.c |  5 +++++
 drivers/md/md.c        | 22 ++++++++++++++++++----
 drivers/md/md.h        | 16 ++++++++++++++++
 drivers/md/raid0.c     |  6 ++++++
 4 files changed, 45 insertions(+), 4 deletions(-)

-- 
2.23.0



