[Bug 1949089] Re: systemd randomly fails to activate mount units in Ubuntu Core 18
Michael Vogt
1949089 at bugs.launchpad.net
Mon Nov 8 11:00:30 UTC 2021
Thanks Lukas for providing this revert of
f0831ed2a03fcef582660be1c3b1a9f3e267e656. Using
https://people.ubuntu.com/~slyon/uc18/core18_20211105_amd64.snap I can
no longer reproduce the isue.
To make it easier to reproduce what I did:
"""
$ git status
On branch tests-use-core18-from-gce
Your branch is up to date with 'sergiocazzolato/tests-use-core18-from-gce'.
...
$ git diff
diff --git a/tests/lib/prepare.sh b/tests/lib/prepare.sh
index e6b984c4d0..5c58a55b2f 100755
--- a/tests/lib/prepare.sh
+++ b/tests/lib/prepare.sh
@@ -973,7 +973,7 @@ EOF
fi
if os.query is-core18; then
- curl -s -o core18.snap https://storage.googleapis.com/snapd-spread-tests/snaps/core18_20211102_amd64.snap
+ curl -s -o core18.snap https://people.ubuntu.com/~slyon/uc18/core18_20211105_amd64.snap
EXTRA_FUNDAMENTAL="$EXTRA_FUNDAMENTAL --snap $PWD/core18.snap"
fi
$ $GOPATH/bin/spread -repeat 100 google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy
...
2021-11-08 11:39:41 Preparing google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081028-580805)...
2021-11-08 11:39:53 Executing google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081028-580805) (1/1)...
2021-11-08 11:40:49 Restoring google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081028-580805)...
2021-11-08 11:41:00 Preparing google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081028-580805)...
...
"""
To validate my findings I'm also running the same test against core18 in stable and it fails very quickly:
"""
$ git status
HEAD detached at upstream/master
$ git describe
2.53.1-480-g2c39794030
$ $GOPATH/bin/spread -repeat 100 google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy
....
2021-11-08 11:58:46 Error executing google:ubuntu-core-18-64:tests/main/services-disabled-kept-happy (nov081048-716592) :
-----
+ . /home/gopath/src/github.com/snapcore/snapd/tests/lib/disabled-svcs.sh
++ SVC_MISSING_ERR_MSG='state.json is missing last-active-disabled-services in it:'
+ echo 'CASE 1'
CASE 1
+ echo 'Install the snap'
Install the snap
+ snap install --dangerous disabled-svcs-kept_1.0_all.snap
disabled-svcs-kept 1.0 installed
+ echo 'Check that state.json doesn'\''t contain last-active-disabled-services'
Check that state.json doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'CASE 2'
CASE 2
+ echo 'Disable a service in the snap'
Disable a service in the snap
+ snap stop --disable disabled-svcs-kept.svc
Stopped.
+ echo 'Check that it was actually disabled'
Check that it was actually disabled
+ retry -n 10 --wait 1 sh -c 'snap services disabled-svcs-kept | MATCH "disabled-svcs-kept\\.svc\\s+disabled\\s+inactive"'
+ echo 'Check that state.json still doesn'\''t contain last-active-disabled-services'
Check that state.json still doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'Disable the whole snap'
Disable the whole snap
+ snap disable disabled-svcs-kept
disabled-svcs-kept disabled
+ echo 'Check that state.json DOES contain last-active-disabled-services'
Check that state.json DOES contain last-active-disabled-services
+ check_state_json_yes_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' '!=' null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'Enable the whole snap'
Enable the whole snap
+ snap enable disabled-svcs-kept
disabled-svcs-kept enabled
+ echo 'Check that the service is still disabled'
Check that the service is still disabled
+ MATCH 'disabled-svcs-kept\.svc\s+disabled\s+inactive'
+ snap services disabled-svcs-kept
+ echo 'Check that state.json still doesn'\''t contain last-active-disabled-services'
Check that state.json still doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'CASE 3'
CASE 3
+ echo 'Refresh the snap'
Refresh the snap
+ snap install --dangerous disabled-svcs-kept_1.0_all.snap
disabled-svcs-kept 1.0 installed
+ echo 'Check that the service is still disabled'
Check that the service is still disabled
+ MATCH 'disabled-svcs-kept\.svc\s+disabled\s+inactive'
+ snap services disabled-svcs-kept
+ echo 'Check that state.json still doesn'\''t contain last-active-disabled-services'
Check that state.json still doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'CASE 4'
CASE 4
+ echo 'Revert the snap'
Revert the snap
+ snap revert disabled-svcs-kept --revision=x1
disabled-svcs-kept reverted to 1.0
+ echo 'Check that the service is still disabled'
Check that the service is still disabled
+ MATCH 'disabled-svcs-kept\.svc\s+disabled\s+inactive'
+ snap services disabled-svcs-kept
+ echo 'Check that state.json still doesn'\''t contain last-active-disabled-services'
Check that state.json still doesn't contain last-active-disabled-services
+ check_state_json_no_disabled_svcs
+ /home/gopath/src/github.com/snapcore/snapd/tests/lib/tools/snapd-state check-state '.data.snaps."disabled-svcs-kept" | ."last-active-disabled-services"?' = null 'state.json has invalid last-active-disabled-services in it:'
+ echo 'Refresh back to the new revision to unmark it as blacklisted'
Refresh back to the new revision to unmark it as blacklisted
+ snap refresh disabled-svcs-kept --revision=x2
disabled-svcs-kept 1.0 refreshed
+ echo 'CASE 5'
CASE 5
+ echo 'Refresh to a new revision with a snap service renamed'
Refresh to a new revision with a snap service renamed
+ snap install --dangerous disabled-svcs-kept_2.0_all.snap
error: cannot perform the following tasks:
- Setup snap "disabled-svcs-kept" (unset) security profiles (cannot find installed snap "disabled-svcs-kept" at revision x3: missing file /snap/disabled-svcs-kept/x3/meta/snap.yaml)
-----
...
Nov 08 10:58:45 nov081048-716592 snapd[2817]: taskrunner.go:439: DEBUG: Running task 202 on Do: Make current revision for snap "disabled-svcs-kept" unavailable
Nov 08 10:58:45 nov081048-716592 systemd[1]: Reloading.
Nov 08 10:58:45 nov081048-716592 systemd[1]: Unmounting Mount unit for disabled-svcs-kept, revision x3...
Nov 08 10:58:45 nov081048-716592 systemd-udevd[645]: Network interface NamePolicy= disabled on kernel command line, ignoring.
Nov 08 10:58:45 nov081048-716592 systemd[1]: Unmounted Mount unit for disabled-svcs-kept, revision x3.
Nov 08 10:58:45 nov081048-716592 systemd[1]: Reloading.
Nov 08 10:58:45 nov081048-716592 snapd[2817]: taskrunner.go:439: DEBUG: Running task 203 on Do: Copy snap "disabled-svcs-kept" data
Nov 08 10:58:45 nov081048-716592 snapd[2817]: snapmgr.go:303: cannot read snap info of snap "disabled-svcs-kept" at revision x3: cannot find installed snap "disabled-svcs-kept" at revision x3: missing file /snap/disabled-svcs-kept/x3/meta/snap.yaml
Nov 08 10:58:45 nov081048-716592 snapd[2817]: taskrunner.go:439: DEBUG: Running task 204 on Do: Setup snap "disabled-svcs-kept" (unset) security profiles
Nov 08 10:58:45 nov081048-716592 snapd[2817]: task.go:337: DEBUG: 2021-11-08T10:58:45Z ERROR cannot find installed snap "disabled-svcs-kept" at revision x3: missing file /snap/disabled-svcs-kept/x3/meta/snap.yaml
Nov 08 10:58:45 nov081048-716592 snapd[2817]: taskrunner.go:271: [change 18 "Setup snap \"disabled-svcs-kept\" (unset) security profiles" task] failed: cannot find installed snap "disabled-svcs-kept" at revision x3: missing file /snap/disabled-svcs-kept/x3/meta/snap.yaml
"""
--
You received this bug notification because you are a member of Ubuntu
Foundations Bugs, which is subscribed to systemd in Ubuntu.
https://bugs.launchpad.net/bugs/1949089
Title:
systemd randomly fails to activate mount units in Ubuntu Core 18
Status in systemd package in Ubuntu:
New
Status in systemd source package in Bionic:
New
Status in systemd source package in Jammy:
New
Bug description:
Since a month or so, we've been seeing random failures in our snapd
spread tests where systemd could not start the mount unit associated
with a snap because of a failed dependency.
The issue is described in the comments to PR
https://github.com/snapcore/snapd/pull/10935, but I'll summarize it
here.
When starting a snap, snapd creates a mount unit to mount the snap's
squashfs (the template is
https://github.com/snapcore/snapd/blob/release/2.53/systemd/systemd.go#L1186-L1205).
The snapd asks systemd to reload the configuration, and starts the
mount unit.
The failure we've observed is that sometimes systemd decides to stop
our mount unit (search for "Unmounting Mount unit for test-snapd-svc-
flip-flop" in the attached log), and then tries to reactivate it
again, and at that point it fails.
When I asked for help, Lukas pointed out that the latest update
contains a patch that is related to reload handling and mount units:
http://launchpadlibrarian.net/555420796/systemd_237-3ubuntu10.51_237-3ubuntu10.52.diff.gz
(the patch itself is better visible at
https://github.com/systemd/systemd/commit/f0831ed2a03fcef582660be1c3b1a9f3e267e656).
When looking at the systemd git log, though, I noticed another patch
that was applied shortly after this one, which also seems related but
was not backported:
https://github.com/systemd/systemd/commit/04eb582acc203eab0bc5c2cc5e13986f16e09df0
Since the stopping of our mount unit happens immediately after a
systemd reload, it actually seems very likely that the inclusion of
f0831ed2a03fcef582660be1c3b1a9f3e267e656 in the systemd update is what
causes our woes (though, indeed, the issue is not reliably
reproducible, so we cannot be sure).
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/systemd/+bug/1949089/+subscriptions
More information about the foundations-bugs
mailing list