[Bug 1798212] Re: systemd stuck in uninterruptible sleep state
George Wilson
1798212 at bugs.launchpad.net
Tue Dec 11 22:40:57 UTC 2018
We just hit a similar issue on Ubuntu 18.04. From the crash dump, I'm
able to see this stack:
[12929.914040] kworker/u4:5 D 0 27285 2 0x80000080
[12929.914053] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[12929.914055] Call Trace:
[12929.914075] __schedule+0x291/0x8a0
[12929.914078] schedule+0x2c/0x80
[12929.914081] schedule_timeout+0x1cf/0x350
[12929.914086] ? select_idle_sibling+0x29/0x410
[12929.914089] ? __enqueue_entity+0x5c/0x60
[12929.914091] ? enqueue_entity+0x10e/0x6b0
[12929.914093] wait_for_completion+0xba/0x140
[12929.914094] ? wake_up_q+0x80/0x80
[12929.914099] __synchronize_srcu.part.13+0x85/0xb0
[12929.914101] ? trace_raw_output_rcu_utilization+0x50/0x50
[12929.914103] ? ttwu_do_activate+0x77/0x80
[12929.914105] synchronize_srcu+0x66/0xe0
[12929.914107] ? synchronize_srcu+0x66/0xe0
[12929.914109] fsnotify_mark_destroy_workfn+0x7b/0xe0
[12929.914113] process_one_work+0x1de/0x410
[12929.914115] worker_thread+0x228/0x410
[12929.914118] kthread+0x121/0x140
[12929.914119] ? process_one_work+0x410/0x410
[12929.914121] ? kthread_create_worker_on_cpu+0x70/0x70
Several other systemd processes are also hung. I was able to issue an
NMI to generate a crash dump so I can pull out more information if that
would be helpful.
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1798212
Title:
systemd stuck in uninterruptible sleep state
Status in systemd package in Ubuntu:
Confirmed
Bug description:
Ubuntu 16.04 systemd 229 may get stuck in an uninterruptible sleep
state. Any subsequent systemctl command times out with "Failed to
execute operation: Connection timed out" error message. gdb and strace
is unable to attach to systemd process 1.
$ systemd --version
systemd 229
+PAM +AUDIT +SELINUX +IMA +APPARMOR +SMACK +SYSVINIT +UTMP +LIBCRYPTSETUP +GCRYPT +GNUTLS +ACL +XZ -LZ4 +SECCOMP +BLKID +ELFUTILS +KMOD -IDN
$ cat /proc/1/stack
[<0>] flush_work+0x129/0x1e0
[<0>] flush_delayed_work+0x3f/0x50
[<0>] fsnotify_wait_marks_destroyed+0x15/0x20
[<0>] fsnotify_destroy_group+0x48/0xd0
[<0>] inotify_release+0x1e/0x50
[<0>] __fput+0xea/0x220
[<0>] ____fput+0xe/0x10
[<0>] task_work_run+0x8a/0xb0
[<0>] exit_to_usermode_loop+0xc4/0xd0
[<0>] do_syscall_64+0xf4/0x130
[<0>] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[<0>] 0xffffffffffffffff
$ cat /proc/1/status
Name: systemd
Umask: 0000
State: D (disk sleep)
Tgid: 1
Ngid: 0
Pid: 1
PPid: 0
TracerPid: 9724
Uid: 0 0 0 0
Gid: 0 0 0 0
FDSize: 128
Groups:
NStgid: 1
NSpid: 1
NSpgid: 1
NSsid: 1
VmPeak: 251136 kB
VmSize: 185728 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 6620 kB
VmRSS: 6264 kB
RssAnon: 2380 kB
RssFile: 3884 kB
RssShmem: 0 kB
VmData: 18752 kB
VmStk: 132 kB
VmExe: 1392 kB
VmLib: 3692 kB
VmPTE: 128 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
Threads: 1
SigQ: 1/257098
SigPnd: 0000000000040000
ShdPnd: 0000000000010000
SigBlk: 7be3c0fe28014a03
SigIgn: 0000000000001000
SigCgt: 00000001800004ec
CapInh: 0000000000000000
CapPrm: 0000003fffffffff
CapEff: 0000003fffffffff
CapBnd: 0000003fffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Speculation_Store_Bypass: vulnerable
Cpus_allowed: ffffffff
Cpus_allowed_list: 0-31
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000003
Mems_allowed_list: 0-1
voluntary_ctxt_switches: 716968
nonvoluntary_ctxt_switches: 9565
$ dmesg
[...]
[1255076.993707] INFO: task systemd:1 blocked for more than 120 seconds.
[1255077.000295] Not tainted 4.15.0-32-generic #35~16.04.1-Ubuntu
[1255077.006804] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1255077.014996] systemd D 0 1 0 0x00000000
[1255077.015000] Call Trace:
[1255077.015011] __schedule+0x3d6/0x8b0
[1255077.015019] ? enqueue_entity+0x112/0x670
[1255077.015021] schedule+0x36/0x80
[1255077.015025] schedule_timeout+0x1db/0x370
[1255077.015028] ? check_preempt_curr+0x54/0x90
[1255077.015030] ? ttwu_do_wakeup+0x1e/0x150
[1255077.015033] wait_for_completion+0xb4/0x140
[1255077.015035] ? wake_up_q+0x70/0x70
[1255077.015040] flush_work+0x129/0x1e0
[1255077.015043] ? worker_detach_from_pool+0xb0/0xb0
[1255077.015045] flush_delayed_work+0x3f/0x50
[1255077.015052] fsnotify_wait_marks_destroyed+0x15/0x20
[1255077.015055] fsnotify_destroy_group+0x48/0xd0
[1255077.015058] inotify_release+0x1e/0x50
[1255077.015063] __fput+0xea/0x220
[1255077.015065] ____fput+0xe/0x10
[1255077.015068] task_work_run+0x8a/0xb0
[1255077.015074] exit_to_usermode_loop+0xc4/0xd0
[1255077.015077] do_syscall_64+0xf4/0x130
[1255077.015080] entry_SYSCALL_64_after_hwframe+0x3d/0xa2
[1255077.015084] RIP: 0033:0x7f29c9c3457d
[1255077.015085] RSP: 002b:00007ffec956fe30 EFLAGS: 00000293 ORIG_RAX: 0000000000000003
[1255077.015088] RAX: 0000000000000000 RBX: 0000000000000010 RCX: 00007f29c9c3457d
[1255077.015090] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000010
[1255077.015092] RBP: 00007f29cb36d700 R08: 00005625f0043420 R09: 0000000000000001
[1255077.015093] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000000b
[1255077.015095] R13: 00005625eeed6040 R14: 00005625eff179b8 R15: 00005625eeed7460
[1255077.015421] INFO: task kworker/u66:2:108669 blocked for more than 120 seconds.
[1255077.022960] Not tainted 4.15.0-32-generic #35~16.04.1-Ubuntu
[1255077.029469] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[1255077.037601] kworker/u66:2 D 0 108669 2 0x80000080
[1255077.037613] Workqueue: events_unbound fsnotify_mark_destroy_workfn
[1255077.037615] Call Trace:
[1255077.037621] __schedule+0x3d6/0x8b0
[1255077.037627] ? __update_load_avg_blocked_se.isra.37+0xc5/0x140
[1255077.037629] schedule+0x36/0x80
[1255077.037633] schedule_timeout+0x1db/0x370
[1255077.037636] ? add_timer+0x125/0x280
[1255077.037639] wait_for_completion+0xb4/0x140
[1255077.037641] ? wake_up_q+0x70/0x70
[1255077.037646] __synchronize_srcu.part.13+0x85/0xb0
[1255077.037650] ? trace_raw_output_rcu_utilization+0x50/0x50
[1255077.037653] synchronize_srcu+0xd3/0xe0
[1255077.037656] ? synchronize_srcu+0xd3/0xe0
[1255077.037660] fsnotify_mark_destroy_workfn+0x7c/0xe0
[1255077.037664] process_one_work+0x14d/0x410
[1255077.037678] worker_thread+0x22b/0x460
[1255077.037682] kthread+0x105/0x140
[1255077.037691] ? process_one_work+0x410/0x410
[1255077.037701] ? kthread_destroy_worker+0x50/0x50
[1255077.037712] ret_from_fork+0x35/0x40
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1798212/+subscriptions
More information about the foundations-bugs
mailing list