APPLIED: [C][PATCH 0/1] Fix write()/fsync() deadlock in write_cache_pages()

Khaled Elmously khalid.elmously at canonical.com
Tue Apr 23 05:14:14 UTC 2019


On 2019-04-15 13:32:09, Mauricio Faria de Oliveira wrote:
> BugLink: https://bugs.launchpad.net/bugs/1824827
> 
> [Impact] 
> 
>  * Tasks of a multi-threaded workload doing write() and fsync()
>    might deadlock in write_cache_pages(), preventing progress.
> 
>  * The fix addresses a corner case in write_cache_pages() on
>    the range_cyclic implementation which allows the deadlock.
> 
>  * Patch:
>    - commit 64081362e8ff4587b4554087f3cfc73d3e0a4cd7
>      ("mm/page-writeback.c: fix range_cyclic writeback vs
>      writepages deadlock"), present in v4.20-rc1~92^2~19.
> 
> [Test Case]
> 
>  * This issue was originally hit by the 'perforce' (p4d)
>    tool on an XFS filesystem, but it occurs rarely and is
>    difficult to reproduce.
> 
>  * We've written a userspace program + kernel module (kprobes-based)
>    to reproduce this problem and verify the test kernel/patch.
> 
>  * The kprobes are strictly tied to particular kernel versions
>    because of the assembly instruction offsets.  We'll provide
>    updated versions for -updates and -proposed for verification.
> 
>  * Steps 
>    (see output examples in comments):
> 
>    - Userspace part:
>    $ gcc -o test test.c -pthread
> 
>    - Kernel part:
>    $ touch Makefile 
>    $ make -C /lib/modules/$(uname -r)/build M=$(pwd) obj-m=kprobe-test.o clean
>    $ make -C /lib/modules/$(uname -r)/build M=$(pwd) obj-m=kprobe-test.o modules 
> 
>    - Shorter hung task timeout and higher console logging level
>      to notice the deadlocked tasks sooner, and watch progress:
>    $ echo 10 | sudo tee /proc/sys/kernel/hung_task_timeout_secs
>    $ echo 9 | sudo tee /proc/sys/kernel/printk 
> 
>    - Load module / Run userspace part (logging to kernel log) in XFS:
>    $ sudo insmod kprobe-test.ko
>    $ cd /path/to/xfs-mountpoint && sudo sh -c 'stdbuf -oL /path/to/test >/dev/kmsg'
>    $ sudo rmmod kprobe-test
> 
>    You may need to press Ctrl-Z with the original kernel, as 'test' doesn't finish.
> 
>    - Check kernel log or watch the system console:
>    $ dmesg
> 
>    - Check for threads in D state (uninterruptible sleep):
>    $ ps -eLo pid,tid,state,comm | grep D | grep -e test -e kworker
> 
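
The steps above lower hung_task_timeout_secs and raise the console log
level, and both settings stay changed until reboot. A minimal sketch for
saving and restoring them around the test run follows. The function names
and the SYSCTL_DIR variable are hypothetical helpers, not part of the
reproducer, and the writes need root on a real system.

```shell
# Hypothetical override so the helpers can be tried on a scratch directory;
# defaults to the real sysctl location used in the steps above.
SYSCTL_DIR="${SYSCTL_DIR:-/proc/sys/kernel}"

# Print the current values as "name=value" lines (printk holds several
# whitespace-separated numbers; they are kept as one value).
save_debug_sysctls() {
    for name in hung_task_timeout_secs printk; do
        printf '%s=%s\n' "$name" "$(cat "$SYSCTL_DIR/$name")"
    done
}

# Read "name=value" lines on stdin and write each value back.
restore_debug_sysctls() {
    while IFS='=' read -r name value; do
        printf '%s\n' "$value" > "$SYSCTL_DIR/$name"
    done
}

# Usage (as root):
#   save_debug_sysctls > /tmp/sysctl.saved
#   ... run the test steps ...
#   restore_debug_sysctls < /tmp/sysctl.saved
```
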
> 
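
The 'grep D' in the last step matches a capital D anywhere on the line,
including in command names. A stricter variant is sketched here, assuming
the ps columns stay in the order pid,tid,state,comm, so that only the
state column is tested. filter_dstate is a hypothetical helper name.

```shell
# Keep only rows whose state column (3rd field) starts with "D" and whose
# command name (4th field) contains "test" or "kworker".
# Expects `ps -eLo pid,tid,state,comm` output on stdin.
filter_dstate() {
    awk '$3 ~ /^D/ && $4 ~ /test|kworker/'
}

# Usage on a live system:
#   ps -eLo pid,tid,state,comm | filter_dstate
```
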
> [Regression Potential] 
> 
>  * The patch is small but changes core writeback infrastructure,
>    so it may _affect_ (though not necessarily _break_) some
>    behavior that our regression testing has not validated.
>    Our regression testing is described below.
> 
>  * This has been verified with 'xfstests' (not only for XFS fs,
>    despite its original name), used by major Linux filesystems
>    for regression testing during development. It's been tested
>    on systems with 24 and 4 CPUs (to exercise differences in
>    scalability, parallelism, and workload) and XFS and ext4
>    (reporter's environment + Ubuntu's default).
>    No regressions were observed (the set of failed tests is
>    the same in each system and tests failed in the same way).
>    
>  * This has also been verified with 'iozone' write-intensive
>    tests, which exercise the writeback mechanism; no errors
>    were observed.
> 
>  * The reporter has been running the test kernel with the patch
>    for weeks and has not observed any other issues/regressions.
> 
> [Other Info]
>  
>  * This is only required in Cosmic (for the Bionic HWE kernel),
>    and is already applied in Disco.
> 
> Dave Chinner (1):
>   mm/page-writeback.c: fix range_cyclic writeback vs writepages deadlock
> 
>  mm/page-writeback.c | 33 +++++++++++++++------------------
>  1 file changed, 15 insertions(+), 18 deletions(-)
> 
> -- 
> 2.17.1
> 


