[Bug 1949089] Re: systemd randomly fails to activate mount units in Ubuntu Core 18
Alberto Mardegan
1949089 at bugs.launchpad.net
Thu Nov 11 15:26:56 UTC 2021
An update to the comment above. We have found out a sequence of commands
that seems to reproduce the issue reliably: in the spread session
started by the command from the comment above, if you type this:
systemctl stop 'snap-disabled\x2dsvcs\x2dkept-x1.mount'
systemctl daemon-reload
# The two commands above are just to ensure that the unit is in the unmounted state
systemctl start 'snap-disabled\x2dsvcs\x2dkept-x1.mount'
systemctl daemon-reload
systemctl stop 'snap-disabled\x2dsvcs\x2dkept-x1.mount'
systemctl start 'snap-disabled\x2dsvcs\x2dkept-x1.mount'
The last command will hang for 90 seconds, after which the job will fail
and `journalctl -xe` will show that it failed because systemd timed out
while activating a loop device. In reality, the loop device is already
there and is available, but somehow the `daemon-reload` operation broke
systemd's internal state.
If you remove the line with `systemctl daemon-reload` (or, conversely,
add another such line after the `stop` command), then everything
proceeds normally. I'm EOD now, but tomorrow I'll check whether this
also happens on classic.
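For convenience, the sequence above could be wrapped in a small script along
these lines. This is only a sketch: the `DRY_RUN` switch and the `run` and
`repro` helper names are made up for illustration and are not part of snapd or
the spread tests. By default it only prints the commands; set `DRY_RUN=0` to
actually execute them (as root, inside the spread session).

```shell
#!/bin/sh
# Sketch of the reproduction sequence from this comment.
# UNIT is the mount unit used above; DRY_RUN and the helper names are
# illustrative assumptions, not part of any existing tool.
UNIT=${UNIT:-'snap-disabled\x2dsvcs\x2dkept-x1.mount'}
DRY_RUN=${DRY_RUN:-1}

# Print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

repro() {
    # Make sure the unit starts out in the unmounted state.
    run systemctl stop "$UNIT"
    run systemctl daemon-reload
    # Start, reload, stop, start: the last start is the one that hangs.
    run systemctl start "$UNIT"
    run systemctl daemon-reload
    run systemctl stop "$UNIT"
    run systemctl start "$UNIT"
}

repro
```

With `DRY_RUN=0`, the final `systemctl start` is the step expected to hang
until the job times out.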
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1949089
Title:
systemd randomly fails to activate mount units in Ubuntu Core 18
Status in systemd package in Ubuntu:
New
Status in systemd source package in Bionic:
New
Bug description:
For the past month or so, we've been seeing random failures in our snapd
spread tests where systemd could not start the mount unit associated
with a snap because of a failed dependency.
The issue is described in the comments to PR
https://github.com/snapcore/snapd/pull/10935, but I'll summarize it
here.
When starting a snap, snapd creates a mount unit to mount the snap's
squashfs (the template is
https://github.com/snapcore/snapd/blob/release/2.53/systemd/systemd.go#L1186-L1205).
Then snapd asks systemd to reload the configuration, and starts the
mount unit.
The failure we've observed is that sometimes systemd decides to stop
our mount unit (search for "Unmounting Mount unit for
test-snapd-svc-flip-flop" in the attached log), and then tries to
reactivate it, and at that point it fails.
When I asked for help, Lukas pointed out that the latest update
contains a patch that is related to reload handling and mount units:
http://launchpadlibrarian.net/555420796/systemd_237-3ubuntu10.51_237-3ubuntu10.52.diff.gz
(the patch itself is better visible at
https://github.com/systemd/systemd/commit/f0831ed2a03fcef582660be1c3b1a9f3e267e656).
When looking at the systemd git log, though, I noticed another patch
that was applied shortly after this one, which also seems related but
was not backported:
https://github.com/systemd/systemd/commit/04eb582acc203eab0bc5c2cc5e13986f16e09df0
Since the stopping of our mount unit happens immediately after a
systemd reload, it actually seems very likely that the inclusion of
f0831ed2a03fcef582660be1c3b1a9f3e267e656 in the systemd update is what
causes our woes (though, indeed, the issue is not reliably
reproducible, so we cannot be sure).
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1949089/+subscriptions