[Bug 1949089] Re: systemd randomly fails to activate mount units in Ubuntu Core 18

Michael Vogt 1949089 at bugs.launchpad.net
Mon Nov 8 11:00:30 UTC 2021


Thanks Lukas for providing this revert of
f0831ed2a03fcef582660be1c3b1a9f3e267e656. Using
https://people.ubuntu.com/~slyon/uc18/core18_20211105_amd64.snap I can
no longer reproduce the isue.

To make it easier to reproduce what I did:
"""
$ git status
On branch tests-use-core18-from-gce
Your branch is up to date with 'sergiocazzolato/tests-use-core18-from-gce'.
...
$ git diff
diff --git a/tests/lib/prepare.sh b/tests/lib/prepare.sh
index e6b984c4d0..5c58a55b2f 100755
--- a/tests/lib/prepare.sh
+++ b/tests/lib/prepare.sh
@@ -973,7 +973,7 @@ EOF
     fi
 
     if os.query is-core18; then
-        curl -s -o core18.snap https://storage.googleapis.com/snapd-spread-tests/snaps/core18_20211102_amd64.snap
+        curl -s -o core18.snap https://people.ubuntu.com/~slyon/uc18/core18_20211105_amd64.snap
         EXTRA_FUNDAMENTAL="$EXTRA_FUNDAMENTAL --snap $PWD/core18.snap"
     fi
 
$ $GOPATH/bin/spread -repeat 100   google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy
...
2021-11-08 11:39:41 Preparing google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081028-580805)...
2021-11-08 11:39:53 Executing google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081028-580805) (1/1)...
2021-11-08 11:40:49 Restoring google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081028-580805)...
2021-11-08 11:41:00 Preparing google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081028-580805)...
...

"""

To validate my findings I'm also running the same test against core18 in stable and it fails very quickly:
"""
$ git status
HEAD detached at upstream/master
$ git describe 
2.53.1-480-g2c39794030
$ $GOPATH/bin/spread -repeat 100   google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy
....
2021-11-08 11:58:46 Error executing google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081048-716592) : 
-----
+ . /home/gopath/src/github.com/snapcore/snapd/tests/lib/disabled-svcs.sh
++ SVC_MISSING_ERR_MSG='state.json is missing last-active-disabled-services in it:'
+ echo 'CASE 1'
CASE 1
+ echo 'Install the snap'
Install the snap
+ snap install --dangerous disabled-svcs-kept_1.0_all.snap
disabled-svcs-kept 1.0 installed
+ echo 'Check that state.json doesn'\''t contain last-active-disabled-services'
Check that state.json doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'CASE 2'
CASE 2
+ echo 'Disable a service in the snap'
Disable a service in the snap
+ snap stop --disable disabled-svcs-kept.svc
Stopped.
+ echo 'Check that it was actually disabled'
Check that it was actually disabled
+ retry -n 10 --wait 1 sh -c 'snap services disabled-svcs-kept | MATCH "disabled-svcs-kept\\.svc\\s+disabled\\s+inactive"'
+ echo 'Check that state.json still doesn'\''t contain last-active-disabled-services'
Check that state.json still doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'Disable the whole snap'
Disable the whole snap
+ snap disable disabled-svcs-kept
disabled-svcs-kept disabled
+ echo 'Check that state.json DOES contain last-active-disabled-services'
Check that state.json DOES contain last-active-disabled-services
+ check_state_json_yes_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' '!=' null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'Enable the whole snap'
Enable the whole snap
+ snap enable disabled-svcs-kept
disabled-svcs-kept enabled
+ echo 'Check that the service is still disabled'
Check that the service is still disabled
+ MATCH 'disabled-svcs-kept\.svc\s+disabled\s+inactive'
+ snap services disabled-svcs-kept
+ echo 'Check that state.json still doesn'\''t contain last-active-disabled-services'
Check that state.json still doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'CASE 3'
CASE 3
+ echo 'Refresh the snap'
Refresh the snap
+ snap install --dangerous disabled-svcs-kept_1.0_all.snap
disabled-svcs-kept 1.0 installed
+ echo 'Check that the service is still disabled'
Check that the service is still disabled
+ MATCH 'disabled-svcs-kept\.svc\s+disabled\s+inactive'
+ snap services disabled-svcs-kept
+ echo 'Check that state.json still doesn'\''t contain last-active-disabled-services'
Check that state.json still doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'CASE 4'
CASE 4
+ echo 'Revert the snap'
Revert the snap
+ snap revert disabled-svcs-kept --revision=x1
disabled-svcs-kept reverted to 1.0
+ echo 'Check that the service is still disabled'
Check that the service is still disabled
+ MATCH 'disabled-svcs-kept\.svc\s+disabled\s+inactive'
+ snap services disabled-svcs-kept
+ echo 'Check that state.json still doesn'\''t contain last-active-disabled-services'
Check that state.json still doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'Refresh back to the new revision to unmark it as blacklisted'
Refresh back to the new revision to unmark it as blacklisted
+ snap refresh disabled-svcs-kept --revision=x2
disabled-svcs-kept 1.0 refreshed
+ echo 'CASE 5'
CASE 5
+ echo 'Refresh to a new revision with a snap service renamed'
Refresh to a new revision with a snap service renamed
+ snap install --dangerous disabled-svcs-kept_2.0_all.snap
error: cannot perform the following tasks:
- Setup snap "disabled-svcs-kept" (unset) security profiles (cannot find installed snap "disabled-svcs-kept" at revision x3: missing file /snap/disabled-svcs-kept/x3/meta/snap.yaml)
-----
...
Nov 08 10:58:45 nov081048-716592 snapd[2817]: taskrunner.go:439: DEBUG: Running task 202 on Do: Make current revision for snap "disabled-svcs-kept" unavailable
Nov 08 10:58:45 nov081048-716592 systemd[1]: Reloading.
Nov 08 10:58:45 nov081048-716592 systemd[1]: Unmounting Mount unit for disabled-svcs-kept, revision x3...
Nov 08 10:58:45 nov081048-716592 systemd-udevd[645]: Network interface NamePolicy= disabled on kernel command line, ignoring.
Nov 08 10:58:45 nov081048-716592 systemd[1]: Unmounted Mount unit for disabled-svcs-kept, revision x3.
Nov 08 10:58:45 nov081048-716592 systemd[1]: Reloading.
Nov 08 10:58:45 nov081048-716592 snapd[2817]: taskrunner.go:439: DEBUG: Running task 203 on Do: Copy snap "disabled-svcs-kept" data
Nov 08 10:58:45 nov081048-716592 snapd[2817]: snapmgr.go:303: cannot read snap info of snap "disabled-svcs-kept" at revision x3: cannot find installed snap "disabled-svcs-kept" at revision x3: missing file /snap/disabled-svcs-kept/x3/meta/snap.yaml
Nov 08 10:58:45 nov081048-716592 snapd[2817]: taskrunner.go:439: DEBUG: Running task 204 on Do: Setup snap "disabled-svcs-kept" (unset) security profiles
Nov 08 10:58:45 nov081048-716592 snapd[2817]: task.go:337: DEBUG: 2021-11-08T10:58:45Z ERROR cannot find installed snap "disabled-svcs-kept" at revision x3: missing file /snap/disabled-svcs-kept/x3/meta/snap.yaml
Nov 08 10:58:45 nov081048-716592 snapd[2817]: taskrunner.go:271: [change 18 "Setup snap \"disabled-svcs-kept\" (unset) security profiles" task] failed: cannot find installed snap "disabled-svcs-kept" at revision x3: missing file /snap/disabled-svcs-kept/x3/meta/snap.yaml
"""

-- 
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1949089

Title:
  systemd randomly fails to activate mount units in Ubuntu Core 18

Status in systemd package in Ubuntu:
  New
Status in systemd source package in Bionic:
  New
Status in systemd source package in Jammy:
  New

Bug description:
  Since a month or so, we've been seeing random failures in our snapd
  spread tests where systemd could not start the mount unit associated
  with a snap because of a failed dependency.

  The issue is described in the comments to PR
  https://github.com/snapcore/snapd/pull/10935, but I'll summarize it
  here.

  When starting a snap, snapd creates a mount unit to mount the snap's
  squashfs (the template is
  https://github.com/snapcore/snapd/blob/release/2.53/systemd/systemd.go#L1186-L1205).
  The snapd asks systemd to reload the configuration, and starts the
  mount unit.

  The failure we've observed is that sometimes systemd decides to stop
  our mount unit (search for "Unmounting Mount unit for test-snapd-svc-
  flip-flop" in the attached log), and then tries to reactivate it
  again, and at that point it fails.

  When I asked for help, Lukas pointed out that the latest update
  contains a patch that is related to reload handling and mount units:
  http://launchpadlibrarian.net/555420796/systemd_237-3ubuntu10.51_237-3ubuntu10.52.diff.gz
  (the patch itself is better visible at
  https://github.com/systemd/systemd/commit/f0831ed2a03fcef582660be1c3b1a9f3e267e656).
  When looking at the systemd git log, though, I noticed another patch
  that was applied shortly after this one, which also seems related but
  was not backported:
  https://github.com/systemd/systemd/commit/04eb582acc203eab0bc5c2cc5e13986f16e09df0

  Since the stopping of our mount unit happens immediately after a
  systemd reload, it actually seems very likely that the inclusion of
  f0831ed2a03fcef582660be1c3b1a9f3e267e656 in the systemd update is what
  causes our woes (though, indeed, the issue is not reliably
  reproducible, so we cannot be sure).

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1949089/+subscriptions




More information about the foundations-bugs mailing list