[Bug 1949089] Re: systemd randomly fails to activate mount units in Ubuntu Core 18

Alberto Mardegan 1949089 at bugs.launchpad.net
Thu Nov 11 15:26:56 UTC 2021


An update to the comment above: we have found a sequence of commands
that seems to reproduce the issue reliably. In the spread session
started by the command from the comment above, run the following:

    systemctl stop 'snap-disabled\x2dsvcs\x2dkept-x1.mount'
    systemctl daemon-reload

    # The two commands above are just to ensure that the unit is in the unmounted state
    systemctl start 'snap-disabled\x2dsvcs\x2dkept-x1.mount'
    systemctl daemon-reload
    systemctl stop 'snap-disabled\x2dsvcs\x2dkept-x1.mount'
    systemctl start 'snap-disabled\x2dsvcs\x2dkept-x1.mount'

The last command will hang for 90 seconds, after which the job fails;
`journalctl -xe` then shows that systemd timed out while activating a
loop device. In reality, the loop device is already there and available,
but the `daemon-reload` operation somehow broke systemd's internal
state.

If you remove the `systemctl daemon-reload` line between `start` and
`stop` (or, conversely, add another one after the `stop` command), then
everything proceeds normally. I'm EOD now, but tomorrow I'll check
whether this happens on classic too.
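
For convenience, the sequence above can be wrapped in a small shell
function. This is only a sketch: `repro_sequence` and the `SYSTEMCTL`
override are names made up for this example, not part of snapd or
systemd. The override exists so the steps can be dry-run before touching
a real system.

```shell
#!/bin/sh
# Sketch only: replays the command sequence from this comment.
# SYSTEMCTL is a hypothetical override (not a systemd feature) so that the
# steps can be printed instead of executed, e.g. with SYSTEMCTL=echo.
SYSTEMCTL="${SYSTEMCTL:-systemctl}"

repro_sequence() {
    unit="$1"

    # Make sure the unit starts out in the unmounted state.
    "$SYSTEMCTL" stop "$unit"
    "$SYSTEMCTL" daemon-reload

    # The sequence that triggers the problem.
    "$SYSTEMCTL" start "$unit"
    "$SYSTEMCTL" daemon-reload
    "$SYSTEMCTL" stop "$unit"

    # On affected systems this last start hangs for about 90 seconds and
    # then fails; its exit status becomes the function's return value.
    "$SYSTEMCTL" start "$unit"
}
```

Calling `repro_sequence 'snap-disabled\x2dsvcs\x2dkept-x1.mount'` as root
runs the real commands; setting `SYSTEMCTL=echo` first only prints them.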

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1949089

Title:
  systemd randomly fails to activate mount units in Ubuntu Core 18

Status in systemd package in Ubuntu:
  New
Status in systemd source package in Bionic:
  New

Bug description:
  For about a month, we've been seeing random failures in our snapd
  spread tests where systemd could not start the mount unit associated
  with a snap because of a failed dependency.

  The issue is described in the comments to PR
  https://github.com/snapcore/snapd/pull/10935, but I'll summarize it
  here.

  When starting a snap, snapd creates a mount unit to mount the snap's
  squashfs (the template is
  https://github.com/snapcore/snapd/blob/release/2.53/systemd/systemd.go#L1186-L1205).
  snapd then asks systemd to reload its configuration and starts the
  mount unit.
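
The flow described above (write a mount unit, reload, start) can be
sketched in shell as follows. This is NOT snapd's template, which lives
at the systemd.go link above; the helper names and the unit contents are
assumptions made for illustration. The one firm rule it relies on is that
systemd requires a mount unit to be named after its mount point, which is
why `/snap/disabled-svcs-kept/x1` becomes
`snap-disabled\x2dsvcs\x2dkept-x1.mount`. The real tool for that
conversion is `systemd-escape --path --suffix=mount`; the function below
is a simplified stand-in that only handles "/" and "-".

```shell
#!/bin/sh
# Illustrative only: not snapd's actual template or code.

# Simplified stand-in for `systemd-escape --path --suffix=mount`.
# Only "/" and "-" are handled; other special characters are not escaped.
mount_unit_name() {
    escaped=$(printf '%s\n' "$1" |
        sed -e 's|^/*||' -e 's|/*$||' -e 's|-|\\x2d|g' -e 's|/|-|g')
    printf '%s.mount\n' "$escaped"
}

# Hypothetical helper: write a minimal squashfs mount unit.
#   $1 = directory for the unit file
#   $2 = image to mount (What=)
#   $3 = mount point (Where=)
# Prints the path of the unit file it wrote.
write_mount_unit() {
    unit_file="$1/$(mount_unit_name "$3")"
    printf '%s\n' \
        '[Unit]' \
        "Description=Example mount unit for $3" \
        '' \
        '[Mount]' \
        "What=$2" \
        "Where=$3" \
        'Type=squashfs' \
        '' \
        '[Install]' \
        'WantedBy=multi-user.target' > "$unit_file"
    printf '%s\n' "$unit_file"
}

# The remaining two steps would then be roughly:
#   systemctl daemon-reload
#   systemctl start "$(mount_unit_name /snap/name/rev)"
```

The `daemon-reload` between writing the unit and starting it is the step
that matters for this bug, since the unexpected unmount is seen right
after a reload.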

  The failure we've observed is that systemd sometimes decides to stop
  our mount unit (search for "Unmounting Mount unit for
  test-snapd-svc-flip-flop" in the attached log) and then tries to
  reactivate it, at which point it fails.

  When I asked for help, Lukas pointed out that the latest update
  contains a patch that is related to reload handling and mount units:
  http://launchpadlibrarian.net/555420796/systemd_237-3ubuntu10.51_237-3ubuntu10.52.diff.gz
  (the patch itself is easier to read at
  https://github.com/systemd/systemd/commit/f0831ed2a03fcef582660be1c3b1a9f3e267e656).
  When looking at the systemd git log, though, I noticed another patch
  that was applied shortly after this one, which also seems related but
  was not backported:
  https://github.com/systemd/systemd/commit/04eb582acc203eab0bc5c2cc5e13986f16e09df0

  Since our mount unit is stopped immediately after a systemd reload,
  it seems very likely that the inclusion of
  f0831ed2a03fcef582660be1c3b1a9f3e267e656 in the systemd update is what
  causes our woes (though the issue is not reliably reproducible, so we
  cannot be sure).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1949089/+subscriptions



