ACK/Cmnt: [B][PATCH 00/11] LP#1829563 bcache: risk of data loss on I/O errors in backing or caching devices
andrea.righi at canonical.com
Fri Jul 12 06:25:22 UTC 2019
On Sun, Jul 07, 2019 at 09:50:27PM -0300, Mauricio Faria de Oliveira wrote:
> BugLink: https://bugs.launchpad.net/bugs/1829563
> Note: the patchset is relatively large because the support
> for error detection/handling is mostly non-existent in 4.15.
> All patches are in Cosmic/4.18 except PATCH 11 (other series).
> * The bcache code in Bionic lacks several fixes to handle
> I/O errors in both backing devices and caching devices.
> * Partial or permanent errors in backing or caching devices,
> especially in writeback mode, can lead to data loss and/or
> leave the application unaware of failed I/O requests.
> * The bcache device might remain available for I/O requests
> even if the backing device is offline, so the outcome of
> writes is undefined.
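To make the failure mode above concrete, here is a minimal user-space sketch of the general idea: count I/O errors, disable the device once a limit is reached, and always return the error to the caller. This is not the bcache code. The struct, the function names and the error_limit field are hypothetical; only the io_disable name is borrowed from the patch titles in this series.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a backing device (not the kernel struct). */
struct backing_dev_model {
    unsigned int io_errors;   /* I/O errors observed so far */
    unsigned int error_limit; /* assumed threshold before I/O is disabled */
    bool io_disable;          /* once set, every further request is rejected */
};

/* Account for one completed request; a failure is always reported as -EIO. */
static int complete_io(struct backing_dev_model *dev, bool failed)
{
    assert(dev != NULL);
    if (!failed)
        return 0;

    dev->io_errors++;
    if (dev->io_errors >= dev->error_limit)
        dev->io_disable = true; /* stop accepting I/O from now on */

    return -EIO; /* never swallow the error silently */
}

/* Submit one request; device_fails simulates an error from the device. */
static int submit_io(struct backing_dev_model *dev, bool device_fails)
{
    assert(dev != NULL);
    if (dev->io_disable)
        return -EIO; /* device already considered failed */

    return complete_io(dev, device_fails);
}
```

With error_limit set to 2, the second failed request sets io_disable, and every later request fails with -EIO even if the simulated device would have succeeded, so a caller cannot keep writing to a device that is effectively gone.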
> [Test Case]
> * Detailed test cases/steps for the behavior of many patches
> with code logic changes are provided in bug comments.
> * The patchset has been tested for regressions on each cache
> mode (writethrough, writeback, writearound, none) with the
> xfstests test suite (on ext4) and fio (sequential & random
> [Regression Potential]
> * The patchset is relatively large and touches several areas
> in bcache code, however, synthetic testing of the patches
> has been performed, and extensive regression/stress tests
> were run (as mentioned in Test Case section).
> * Many patches in the patchset are 'Fixes' patches to other
> patches, and no further 'Fixes' currently exist upstream.
> [Other Info]
> * Canonical Field Eng. deploys bcache+writeback extensively
> (e.g., BootStack, UA cloud, except rare all-flash cases).
> [Original Bug Description]
> This is a request for a backport of the following upstream patch from 4.18:
> "bcache: stop bcache device when backing device is offline"
> Field engineering uses bcache quite extensively and it would be good to have this in the GA/bionic kernel.
> Coly Li (9):
>   bcache: add CACHE_SET_IO_DISABLE to struct cache_set flags
>   bcache: add stop_when_cache_set_failed option to backing device
>   bcache: add backing_request_endio() for bi_end_io
>   bcache: add io_disable to struct cached_dev
>   bcache: store disk name in struct cache and struct cached_dev
>   bcache: count backing device I/O error for writeback I/O
>   bcache: add wait_for_kthread_stop() in bch_allocator_thread()
>   bcache: set dc->io_disable to true in conditional_stop_bcache_device()
>   bcache: stop bcache device when backing device is offline
>
> Tang Junhui (2):
>   bcache: fix inaccurate io state for detached bcache devices
>   bcache: fix ioctl in flash device
> drivers/md/bcache/alloc.c | 8 +-
> drivers/md/bcache/bcache.h | 54 +++++++++
> drivers/md/bcache/btree.c | 11 +-
> drivers/md/bcache/debug.c | 3 +-
> drivers/md/bcache/io.c | 20 +++-
> drivers/md/bcache/journal.c | 4 +-
> drivers/md/bcache/request.c | 186 +++++++++++++++++++++++++-----
> drivers/md/bcache/super.c | 207 +++++++++++++++++++++++++++++-----
> drivers/md/bcache/sysfs.c | 50 +++++++-
> drivers/md/bcache/util.h | 6 -
> drivers/md/bcache/writeback.c | 37 ++++--
> 11 files changed, 497 insertions(+), 89 deletions(-)
I had to backport the same commits for a separate bug
(https://bugs.launchpad.net/bugs/1796292) and I produced the same
patches with positive test results, therefore:
Acked-by: Andrea Righi <andrea.righi at canonical.com>