APPLIED: [C][PATCH 0/1] Fix write()/fsync() deadlock in write_cache_pages()
Khaled Elmously
khalid.elmously at canonical.com
Tue Apr 23 05:14:14 UTC 2019
On 2019-04-15 13:32:09, Mauricio Faria de Oliveira wrote:
> BugLink: https://bugs.launchpad.net/bugs/1824827
>
> [Impact]
>
> * Tasks of a multi-threaded workload doing write() and fsync()
> might deadlock in write_cache_pages(), preventing progress.
>
> * The fix addresses a corner case in write_cache_pages() on
> the range_cyclic implementation which allows the deadlock.
>
> * Patch:
> - commit 64081362e8ff4587b4554087f3cfc73d3e0a4cd7
> ("mm/page-writeback.c: fix range_cyclic writeback vs
> writepages deadlock"), present in v4.20-rc1~92^2~19.
>
> [Test Case]
>
> * This issue was originally hit by the 'perforce' (p4d)
> tool on an XFS filesystem, but it occurs rarely and is hard to trigger.
>
> * We've written a userspace program + kernel module (kprobes-based)
> to reproduce this problem and verify the test kernel/patch.
>
> * The kprobes are strictly tied to particular kernel versions
> because of the assembly instruction offsets. We'll provide
> updated versions for -updates and -proposed for verification.
>
> * Steps
> (see output examples in comments):
>
> - Userspace part:
> $ gcc -o test test.c -pthread
>
> - Kernel part:
> $ touch Makefile
> $ make -C /lib/modules/$(uname -r)/build M=$(pwd) obj-m=kprobe-test.o clean
> $ make -C /lib/modules/$(uname -r)/build M=$(pwd) obj-m=kprobe-test.o modules
>
> - Shorter hung task timeout and higher console logging level
> to notice the deadlocked tasks sooner, and watch progress:
> $ echo 10 | sudo tee /proc/sys/kernel/hung_task_timeout_secs
> $ echo 9 | sudo tee /proc/sys/kernel/printk
>
> - Load module / Run userspace part (logging to kernel log) in XFS:
> $ sudo insmod kprobe-test.ko
> $ cd /path/to/xfs-mountpoint && sudo sh -c 'stdbuf -oL /path/to/test >/dev/kmsg'
> $ sudo rmmod kprobe-test
>
> With the original (unpatched) kernel, 'test' doesn't finish, so you may need to press Ctrl-Z.
>
> - Check kernel log or watch the system console:
> $ dmesg
>
> Check threads in D state.
> $ ps -eLo pid,tid,state,comm | grep D | grep -e test -e kworker
>
>
> [Regression Potential]
>
> * The patch is small but changes core writeback infrastructure,
> so there's a chance it may _affect_ (though not necessarily
> _break_) some behavior that our regression testing has not
> covered. The regression testing performed is described below.
>
> * This has been verified with 'xfstests' (not only for XFS fs,
> despite its original name), used by major Linux filesystems
> for regression testing during development. It's been tested
> on systems with 24 and 4 CPUs (to exercise differences in
> scalability, parallelism, and workload) and XFS and ext4
> (reporter's environment + Ubuntu's default).
> No regressions were observed (the set of failed tests is
> the same in each system and tests failed in the same way).
>
> * This has also been verified with 'iozone' write-intensive
> tests, to exercise the writeback mechanism, and no errors were
> observed.
>
> * The reporter has been running the test kernel with the patch
> for weeks and has not observed any other issues/regressions.
>
> [Other Info]
>
> * This is only required in Cosmic (for the Bionic HWE kernel),
> and is already applied in Disco.
>
> Dave Chinner (1):
> mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock
>
> mm/page-writeback.c | 33 +++++++++++++++------------------
> 1 file changed, 15 insertions(+), 18 deletions(-)
>
> --
> 2.17.1
>
>
> --
> kernel-team mailing list
> kernel-team at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/kernel-team
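
For the "Check threads in D state" step in the quoted test case, a stricter filter may be useful, since a plain `grep D` also matches any line whose command name happens to contain a capital D. The following is only a minimal sketch, assuming the same `ps -eLo pid,tid,state,comm` column order as above; the function name `filter_dstate` is hypothetical and is not part of the original test case.

```shell
#!/bin/sh
# Hypothetical helper (not from the original test case).
# Reads `ps -eLo pid,tid,state,comm` style lines on stdin and keeps only
# threads whose state column is exactly "D" (uninterruptible sleep) and
# whose command name mentions "test" or "kworker".
filter_dstate() {
    # $3 is the state column, $4 is the first word of the command name.
    awk '$3 == "D" && $4 ~ /test|kworker/'
}

# Usage on a live system:
#   ps -eLo pid,tid,state,comm | filter_dstate
```

Matching on the state column rather than on the whole line keeps the header line and unrelated sleeping tasks out of the output.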