[Bug 1958284] Re: shutdown hangs at "Waiting for process: ..." for 90s, ignoring DefaultTimeoutStopSec
Launchpad Bug Tracker
1958284 at bugs.launchpad.net
Wed Mar 23 15:25:12 UTC 2022
** Merge proposal linked:
https://code.launchpad.net/~enr0n/ubuntu/+source/systemd/+git/systemd/+merge/417577
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1958284
Title:
shutdown hangs at "Waiting for process: ..." for 90s, ignoring
DefaultTimeoutStopSec
Status in systemd package in Ubuntu:
Confirmed
Status in systemd source package in Focal:
Confirmed
Bug description:
[Impact]
The systemd shutdown sequence does not honor systemd-system.conf
settings when waiting for remaining processes. This means that, for
example, if a systemd service specifies KillMode=process and a process
remaining from that service does not properly handle SIGTERM, then the
remaining process will not be killed until after the compiled-in
default value of DefaultTimeoutStopSec (90s), even if the user has
changed the setting of DefaultTimeoutStopSec. In such cases, this
impacts users by significantly increasing the time required for
shutdown/reboot.
[Test Plan]
* Create a new script, /usr/local/bin/loop-ignore-sigterm:
```
#!/bin/bash
loop_forever() {
while true; do sleep 1; done
}
(
trap 'echo Ignoring SIGTERM...' SIGTERM
loop_forever
)
loop_forever
```
This script will spawn a subshell which will loop forever and ignore
SIGTERM. This will force systemd to wait for the subprocess at
reboot/shutdown, and eventually send SIGKILL after TimeoutStopSec
(DefaultTimeoutStopSec in this case).
* Make the script executable:
$ chmod +x /usr/local/bin/loop-ignore-sigterm
* Create a systemd service for this script. Add the following to
/etc/systemd/system/loop-ignore-sigterm.service:
```
[Service]
KillMode=process
ExecStart=/usr/local/bin/loop-ignore-sigterm
```
* Start the service:
$ systemctl start loop-ignore-sigterm.service
* Edit /etc/systemd/system.conf, and uncomment the
'DefaultTimeoutStopSec=90s' line. Modify 90s to something much shorter,
e.g. 20s.
* Re-exec the daemon so this new default takes effect:
$ systemctl daemon-reexec
* Reboot, and monitor the logs. Observe that systemd-shutdown will wait
for the loop-ignore-sigterm process for 90s, instead of the 20s
configured earlier.
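The system.conf edit in the steps above can also be scripted. The following is only a minimal sketch, assuming GNU sed and a file in the systemd-system.conf format; the helper name set_default_timeout_stop is hypothetical and not part of the bug report or of systemd. It is demonstrated on a scratch file rather than on the live /etc/systemd/system.conf.

```shell
#!/bin/bash
# Hypothetical helper (not from the bug report): set DefaultTimeoutStopSec
# in a systemd-system.conf style file, uncommenting the line if needed.
set_default_timeout_stop() {
    local conf="$1" value="$2"
    if grep -Eq '^[#[:space:]]*DefaultTimeoutStopSec=' "$conf"; then
        # Replace the existing (possibly commented-out) line in place.
        sed -i -E "s|^[#[:space:]]*DefaultTimeoutStopSec=.*|DefaultTimeoutStopSec=${value}|" "$conf"
    else
        # No such line: append it. Assumes [Manager] is the last section.
        printf 'DefaultTimeoutStopSec=%s\n' "$value" >> "$conf"
    fi
}

# Demonstrate on a scratch file that mimics the stock commented-out default.
scratch="$(mktemp)"
printf '[Manager]\n#DefaultTimeoutStopSec=90s\n' > "$scratch"
set_default_timeout_stop "$scratch" 20s
cat "$scratch"
rm -f "$scratch"
```

To apply this for real, the helper would be run as root against /etc/systemd/system.conf, followed by `systemctl daemon-reexec` as in the test plan. The value the manager actually picked up can then be checked with `systemctl show --property=DefaultTimeoutStopUSec`.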
[Where problems could occur]
The patch moves the reset_arguments() call to the end of main, which
means reset_arguments() is no longer called before daemon re-execution
(if that branch is taken). If anything in that code path relied on
reset_arguments() being called before re-executing, those assumptions
could be broken. Any such problems would potentially be seen during
daemon re-execution, e.g. when calling systemctl daemon-reexec.
[ Original Description ]
With systemd v245 as shipped with 20.04, the shutdown sequence does
not use the value of `DefaultTimeoutStopSec` to wait for remaining
processes; it instead uses the compiled-in default of 90s.
This is most visible with services that use `KillMode=process`
(docker, k8s, k3s, etc...), especially if the remaining processes do
not handle `SIGTERM` or choose to ignore it.
For example:
```
[ OK ] Finished Reboot.
[ OK ] Reached target Reboot.
[ 243.652848 ] systemd-shutdown[1]: Waiting for process: containerd-shim, containerd-shim, containerd-shim, fluent-bit
--- hangs here for 90s even if DefaultTimeoutStopSec is set to a lower
value ---
```
The bug has been fixed upstream here:
https://github.com/systemd/systemd/commit/7d9eea2bd3d4f83668c7a78754d201b22
Marc was kind enough to package the patch for 20.04 so I could test it
(https://launchpad.net/~mdeslaur/+archive/ubuntu/testing/+sourcepub/13210617/+listing-archive-extra)
and with that package, I can confirm that it indeed fixes the issue.
Here are a few GitHub issues I stumbled upon while trying to debug this,
along with a short writeup of the workaround I ended up using:
- https://github.com/moby/moby/issues/41831
- https://github.com/k3s-io/k3s/issues/2400
- https://github.com/systemd/systemd/issues/16991
- https://raby.sh/debugging-90s-hangs-during-shutdown-on-ubuntu-2004.html
Of course, it would be much better if all the processes would properly
handle `SIGTERM`, but having a way to enforce a maximum wait time at
shutdown is a decent workaround.
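For contrast with the reproducer in the test plan, here is a rough sketch of a shell worker that does handle `SIGTERM` and exits promptly. It is only an illustration: the function name worker and the variable names are hypothetical, and real services (containerd-shim, fluent-bit, etc.) are not shell scripts and handle signals in their own way.

```shell
#!/bin/bash
# Hypothetical worker loop that stops as soon as SIGTERM arrives.
worker() {
    terminated=no
    # Traps are not inherited by background subshells, so set it here.
    trap 'terminated=yes' TERM
    while [ "$terminated" = no ]; do
        # Sleep in the background and wait on it: unlike a foreground
        # sleep, the wait builtin is interrupted by a trapped signal.
        sleep 0.2 &
        wait "$!"
    done
    echo "worker: got SIGTERM, exiting cleanly"
}

worker &
pid=$!
sleep 1                 # give the worker time to install its trap
kill -TERM "$pid"
wait "$pid"
status=$?
echo "worker exit status: $status"
```

A process written this way is gone well before any stop timeout expires, so the wait in systemd-shutdown never comes into play for it.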
Given that the patch is relatively simple, would it be possible to add
it to the package for 20.04?
Thanks
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1958284/+subscriptions