<div dir="ltr">I'm really sorry about this, let me resend the patches</div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Tue, Jun 6, 2023 at 5:40 PM Stefan Bader <<a href="mailto:stefan.bader@canonical.com">stefan.bader@canonical.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">On 06.06.23 09:25, Gerald Yang wrote:<br>
> > BugLink: https://bugs.launchpad.net/bugs/2022318
> >
> > SRU Justification:
> >
> > [ Impact ]
> >
> > When running fio against an NVMe device on an AWS test instance with the 5.19 kernel, IOs get stuck and fio never finishes.
> >
> > fio command:
> > sudo fio --name=read_iops_test --filename=/dev/nvme1n1 --filesize=50G --time_based --ramp_time=2s --runtime=1m --ioengine=libaio --direct=1 --verify=0 --randrepeat=0 --bs=16K --iodepth=256 --rw=randread
> > read_iops_test: (g=0): rw=randread, bs=(R) 16.0KiB-16.0KiB, (W) 16.0KiB-16.0KiB, (T) 16.0KiB-16.0KiB, ioengine=libaio, iodepth=256
> > fio-3.28
> > Starting 1 process
> > Jobs: 1 (f=0): [/(1)][-.-%][eta 01m:02s]
> >
> > IOs get completely stuck, and after a while the kernel log shows:
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.230970] INFO: task fio:2545 blocked for more than 120 seconds.
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.232878] Not tainted 5.19.0-43-generic #44~22.04.1-Ubuntu
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.234738] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237053] task:fio state:D stack: 0 pid: 2545 ppid: 2495 flags:0x00000002
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237057] Call Trace:
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237058] <TASK>
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237061] __schedule+0x257/0x5d0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237066] schedule+0x68/0x110
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237068] io_schedule+0x46/0x80
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237069] blk_mq_get_tag+0x117/0x300
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237072] ? destroy_sched_domains_rcu+0x40/0x40
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237076] __blk_mq_alloc_requests+0xc4/0x1e0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237079] blk_mq_get_new_requests+0xf6/0x1a0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237080] blk_mq_submit_bio+0x1eb/0x440
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237082] __submit_bio+0x109/0x1a0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237085] submit_bio_noacct_nocheck+0xc2/0x120
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237087] submit_bio_noacct+0x209/0x590
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237088] submit_bio+0x40/0xf0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237090] __blkdev_direct_IO_async+0x146/0x1f0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237092] blkdev_direct_IO.part.0+0x40/0xa0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237093] blkdev_read_iter+0x9f/0x1a0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237094] aio_read+0xec/0x1d0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237097] ? __io_submit_one.constprop.0+0x113/0x200
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237099] __io_submit_one.constprop.0+0x113/0x200
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237100] ? __io_submit_one.constprop.0+0x113/0x200
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237101] io_submit_one+0xe8/0x3d0
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237103] __x64_sys_io_submit+0x84/0x190
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237104] ? do_syscall_64+0x69/0x90
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237106] ? do_syscall_64+0x69/0x90
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237107] do_syscall_64+0x59/0x90
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237108] ? syscall_exit_to_user_mode+0x2a/0x50
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237110] ? do_syscall_64+0x69/0x90
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237111] entry_SYSCALL_64_after_hwframe+0x63/0xcd
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237113] RIP: 0033:0x7f44f351ea3d
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237116] RSP: 002b:00007fff1dcfe558 EFLAGS: 00000246 ORIG_RAX: 00000000000000d1
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237117] RAX: ffffffffffffffda RBX: 00007f44f2272b68 RCX: 00007f44f351ea3d
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237118] RDX: 000056315d9ad828 RSI: 0000000000000001 RDI: 00007f44f224f000
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237119] RBP: 00007f44f224f000 R08: 00007f44e9430000 R09: 00000000000002d8
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237120] R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000001
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237120] R13: 0000000000000000 R14: 000056315d9ad828 R15: 000056315d9e1830
> > Jun 1 03:57:52 ip-172-31-39-141 kernel: [ 370.237122] </TASK>
> >
> > This issue cannot be reproduced on the 5.15 or 6.2 kernels.
> >
> > From the call trace, the task was stuck for more than 120 seconds waiting for in-flight IOs to complete and free their tags, so that new IO requests could obtain tags.
> > But in fact not all in-flight IOs were stuck: at least some of them should have completed and released their tags, yet the waiters were never woken up.
> >
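> > (To make the accounting mismatch concrete, here is a minimal userspace
> > model of the wake-up counter; this is illustrative only, not the kernel
> > code, and all names are made up. A batched completion frees several
> > tags at once, but the wake-up threshold only moves by one:)
> >
> >     #include <stdio.h>
> >
> >     int main(void)
> >     {
> >             int wake_batch = 8;        /* waiters are woken when wait_cnt hits 0 */
> >             int wait_cnt = wake_batch;
> >             int freed = 8;             /* one batched completion frees 8 tags */
> >
> >             wait_cnt -= 1;             /* buggy: one decrement per completion event */
> >             printf("freed %d tags, wait_cnt=%d -> waiters not woken\n",
> >                    freed, wait_cnt);
> >             return 0;
> >     }
> >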
> > This issue is fixed by the upstream commit below, which was merged in kernel 6.1:
> > commit 4acb83417cadfdcbe64215f9d0ddcf3132af808e
> > Author: Keith Busch <kbusch@kernel.org>
> > Date:   Fri Sep 9 11:40:22 2022 -0700
> >
> >     sbitmap: fix batched wait_cnt accounting
> >
> >     Batched completions can clear multiple bits, but we're only decrementing
> >     the wait_cnt by one each time. This can cause waiters to never be woken,
> >     stalling IO. Use the batched count instead.
> >
> >     Link: https://bugzilla.kernel.org/show_bug.cgi?id=215679
> >     Signed-off-by: Keith Busch <kbusch@kernel.org>
> >     Link: https://lore.kernel.org/r/20220909184022.1709476-1-kbusch@fb.com
> >     Signed-off-by: Jens Axboe <axboe@kernel.dk>
> >
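> > (In effect, the fix makes the wake-up path consume the whole batched
> > count instead of a single unit. A simplified model of the fixed
> > accounting follows; it is a sketch of the idea, not the upstream diff,
> > and the names are illustrative:)
> >
> >     /* consume a batch of 'nr' freed tags against the wake threshold */
> >     static void wake_account(int *wait_cnt, int wake_batch, int nr)
> >     {
> >             while (nr > 0) {
> >                     int sub = nr < *wait_cnt ? nr : *wait_cnt;
> >
> >                     *wait_cnt -= sub;
> >                     nr -= sub;
> >                     if (*wait_cnt == 0) {
> >                             /* kernel: wake up a batch of waiters here */
> >                             *wait_cnt = wake_batch;  /* rearm the threshold */
> >                     }
> >             }
> >     }
> >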
> > This commit cannot be cherry-picked cleanly, so we also need to SRU all of its dependencies and one further fix, listed below:
> >
> > 30514bd2dd4e sbitmap: fix lockup while swapping
> > 4acb83417cad sbitmap: fix batched wait_cnt accounting
> > c35227d4e8cb sbitmap: Use atomic_long_try_cmpxchg in __sbitmap_queue_get_batch
> > 48c033314f37 sbitmap: Avoid leaving waitqueue in invalid state in __sbq_wake_up()
> > ddbfc34fcf5d sbitmap: remove unnecessary code in __sbitmap_queue_get_batch
> > 040b83fcecfb sbitmap: fix possible io hung due to lost wakeup
> >
> > [ Test Plan ]
> >
> > This can be reproduced simply by launching an instance on AWS EC2
> > and running the fio command above against an NVMe device for a few hours to verify that IOs don't get stuck.
> >
> > I've built a test kernel with the above commits on top of the 5.19.0-43 generic kernel, available here:
> > https://launchpad.net/~gerald-yang-tw/+archive/ubuntu/361041-generic
> >
> > With this test kernel, fio has been running for a few hours without any issues.
> >
> > [ Where problems could occur ]
> >
> > The sbitmap code is mainly used by blk-mq in the block layer, by SCSI drivers, and by the Fungible ethernet driver.
> > If any issue were introduced in sbitmap, the symptom would be hung IO, or packets getting stuck in the Fungible driver.
> >
> >
> > Hugh Dickins (1):
> >   sbitmap: fix lockup while swapping
> >
> > Jan Kara (1):
> >   sbitmap: Avoid leaving waitqueue in invalid state in __sbq_wake_up()
> >
> > Keith Busch (1):
> >   sbitmap: fix batched wait_cnt accounting
> >
> > Liu Song (1):
> >   sbitmap: remove unnecessary code in __sbitmap_queue_get_batch
> >
> > Uros Bizjak (1):
> >   sbitmap: Use atomic_long_try_cmpxchg in __sbitmap_queue_get_batch
> >
> > Yu Kuai (1):
> >   sbitmap: fix possible io hung due to lost wakeup
> >
> >  block/blk-mq-tag.c      |   2 +-
> >  include/linux/sbitmap.h |   3 +-
> >  lib/sbitmap.c           | 109 ++++++++++++++++++++++++++--------------
> >  3 files changed, 73 insertions(+), 41 deletions(-)
> >
>
> I can see no patches related to this in my inbox...
>
> -Stefan