NAK: [SRU][K][PATCH 0/6] sbitmap: fix IO hung due to waiters not waken up

Tim Gardner tim.gardner at canonical.com
Tue Jun 6 12:47:57 UTC 2023


On 6/6/23 4:26 AM, Gerald Yang wrote:
> I'm really sorry about this, let me resend the patches
> 
> On Tue, Jun 6, 2023 at 5:40 PM Stefan Bader <stefan.bader at canonical.com 
> <mailto:stefan.bader at canonical.com>> wrote:
> 
>     On 06.06.23 09:25, Gerald Yang wrote:
>      > BugLink: https://bugs.launchpad.net/bugs/2022318
>     <https://bugs.launchpad.net/bugs/2022318>
>      >
>      > SRU Justification:
>      >
>      > [ Impact ]
>      >
>      > When running fio on a NVME on an AWS test instance with 5.19
>     kernel, IOs get stuck and fio never ends
>      >
>      > fio command:
>      > sudo fio --name=read_iops_test --filename=/dev/nvme1n1
>     --filesize=50G --time_based --ramp_time=2s --runtime=1m
>     --ioengine=libaio --direct=1 --verify=0 --randrepeat=0 --bs=16K
>     --iodepth=256 --rw=randread
>      > read_iops_test: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W)
>     16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=256
>      > fio-3.28
>      > Starting 1 process
>      > Jobs: 1 (f=0): [/(1)][-.-%][eta 01m:02s]
>      >
>      > IOs completely get stuck, after a while kernel log shows:
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.230970] INFO: task
>     fio:2545 blocked for more than 120 seconds.
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.232878] Not tainted
>     5.19.0-43-generic #44~22.04.1-Ubuntu
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.234738] "echo 0 >
>     /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237053] task:fio
>     state:D stack: 0 pid: 2545 ppid: 2495 flags:0x00000002
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237057] Call Trace:
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237058] <TASK>
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237061]
>     __schedule+0x257/0x5d0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237066]
>     schedule+0x68/0x110
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237068]
>     io_schedule+0x46/0x80
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237069]
>     blk_mq_get_tag+0x117/0x300
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237072] ?
>     destroy_sched_domains_rcu+0x40/0x40
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237076]
>     __blk_mq_alloc_requests+0xc4/0x1e0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237079]
>     blk_mq_get_new_requests+0xf6/0x1a0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237080]
>     blk_mq_submit_bio+0x1eb/0x440
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237082]
>     __submit_bio+0x109/0x1a0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237085]
>     submit_bio_noacct_nocheck+0xc2/0x120
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237087]
>     submit_bio_noacct+0x209/0x590
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237088]
>     submit_bio+0x40/0xf0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237090]
>     __blkdev_direct_IO_async+0x146/0x1f0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237092]
>     blkdev_direct_IO.part.0+0x40/0xa0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237093]
>     blkdev_read_iter+0x9f/0x1a0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237094]
>     aio_read+0xec/0x1d0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237097] ?
>     __io_submit_one.constprop.0+0x113/0x200
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237099]
>     __io_submit_one.constprop.0+0x113/0x200
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237100] ?
>     __io_submit_one.constprop.0+0x113/0x200
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237101]
>     io_submit_one+0xe8/0x3d0
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237103]
>     __x64_sys_io_submit+0x84/0x190
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237104] ?
>     do_syscall_64+0x69/0x90
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237106] ?
>     do_syscall_64+0x69/0x90
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237107]
>     do_syscall_64+0x59/0x90
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237108] ?
>     syscall_exit_to_user_mode+0x2a/0x50
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237110] ?
>     do_syscall_64+0x69/0x90
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237111]
>     entry_SYSCALL_64_after_hwframe+0x63/0xcd
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237113] RIP:
>     0033:0x7f44f351ea3d
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237116] RSP:
>     002b:00007fff1dcfe558 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237117] RAX:
>     ffffffffffffffda RBX: 00007f44f2272b68 RCX: 00007f44f351ea3d
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237118] RDX:
>     000056315d9ad828 RSI: 0000000000000001 RDI: 00007f44f224f000
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237119] RBP:
>     00007f44f224f000 R08: 00007f44e9430000 R09: 00000000000002d8
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237120] R10:
>     0000000000000000 R11: 0000000000000246 R12: 0000000000000001
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237120] R13:
>     0000000000000000 R14: 000056315d9ad828 R15: 000056315d9e1830
>      > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237122] </TASK>
>      >
>      > This issue can not be reproduced on 5.15 and 6.2 kernels
>      >
>      >  From call trace, it got stuck more than 120 seconds waiting on
>     previous IOs to complete for freeing some tags, so new IO requests
>     are able to obtain tags
>      > But in fact, not all previous IOs got stuck, at least some of
>     previous IOs should have completed, but the waiters were not waken up
>      >
>      > This issue is fixed by the upstream commit below which has been
>     merged in kernel 6.1
>      > commit 4acb83417cadfdcbe64215f9d0ddcf3132af808e
>      > Author: Keith Busch <kbusch at kernel.org <mailto:kbusch at kernel.org>>
>      > Date: Fri Sep 9 11:40:22 2022 -0700
>      >
>      >      sbitmap: fix batched wait_cnt accounting
>      >
>      >      Batched completions can clear multiple bits, but we're only
>     decrementing
>      >      the wait_cnt by one each time. This can cause waiters to
>     never be woken,
>      >      stalling IO. Use the batched count instead.
>      >
>      >      Link: https://bugzilla.kernel.org/show_bug.cgi?id=215679
>     <https://bugzilla.kernel.org/show_bug.cgi?id=215679>
>      >      Signed-off-by: Keith Busch <kbusch at kernel.org
>     <mailto:kbusch at kernel.org>>
>      >      Link:
>     https://lore.kernel.org/r/20220909184022.1709476-1-kbusch@fb.com
>     <https://lore.kernel.org/r/20220909184022.1709476-1-kbusch@fb.com>
>      >      Signed-off-by: Jens Axboe <axboe at kernel.dk
>     <mailto:axboe at kernel.dk>>
>      >
>      > This commit can not be cherry picked cleanly, so we also need to
>     SRU all its dependencies and one further fix below:
>      >
>      > 30514bd2dd4e sbitmap: fix lockup while swapping
>      > 4acb83417cad sbitmap: fix batched wait_cnt accounting
>      > c35227d4e8cb sbitmap: Use atomic_long_try_cmpxchg in
>     __sbitmap_queue_get_batch
>      > 48c033314f37 sbitmap: Avoid leaving waitqueue in invalid state in
>     __sbq_wake_up()
>      > ddbfc34fcf5d sbitmap: remove unnecessary code in
>     __sbitmap_queue_get_batch
>      > 040b83fcecfb sbitmap: fix possible io hung due to lost wakeup
>      >
>      > [ Test Plan ]
>      >
>      > This can be simply reproduced by launching an instance on AWS EC2
>      > And run the fio command on a nvme device for few hours to make
>     sure IOs don’t get stuck
>      >
>      > I’ve built a test kernel with above commits on 5.19.0-43 generic
>     kernel here:
>      >
>     https://launchpad.net/~gerald-yang-tw/+archive/ubuntu/361041-generic
>     <https://launchpad.net/~gerald-yang-tw/+archive/ubuntu/361041-generic>
>      >
>      > With this test kernel, fio has been running for few hours without
>     any issue
>      >
>      > [ Where problems could occur ]
>      >
>      > The sbitmap is mainly used in blk-mq in block layer, scsi drivers
>     and fungible ethernet driver
>      > If there is any issue happens in sbitmap, the symptom should be
>     IO hung, or packets get stuck in fungible driver
>      >
>      >
>      > Hugh Dickins (1):
>      >    sbitmap: fix lockup while swapping
>      >
>      > Jan Kara (1):
>      >    sbitmap: Avoid leaving waitqueue in invalid state in
>     __sbq_wake_up()
>      >
>      > Keith Busch (1):
>      >    sbitmap: fix batched wait_cnt accounting
>      >
>      > Liu Song (1):
>      >    sbitmap: remove unnecessary code in __sbitmap_queue_get_batch
>      >
>      > Uros Bizjak (1):
>      >    sbitmap: Use atomic_long_try_cmpxchg in __sbitmap_queue_get_batch
>      >
>      > Yu Kuai (1):
>      >    sbitmap: fix possible io hung due to lost wakeup
>      >
>      >   block/blk-mq-tag.c      |   2 +-
>      >   include/linux/sbitmap.h |   3 +-
>      >   lib/sbitmap.c           | 109
>     ++++++++++++++++++++++++++--------------
>      >   3 files changed, 73 insertions(+), 41 deletions(-)
>      >
> 
>     I can see no patches related to this in my inbox...
> 
>     -Stefan
> 
> 

Tie off this thread
-- 
-----------
Tim Gardner
Canonical, Inc




More information about the kernel-team mailing list