[Bug 1590799] Re: nfs-kernel-server does not start because of dependency failure
Rafael David Tinoco
rafael.tinoco at canonical.com
Tue Feb 7 00:56:27 UTC 2017
Summarizing
From systemd.special(7) we have:
sockets.target:
A special target unit that sets up all socket units (see systemd.socket(5) for details) that shall be active after boot. Services that can be socket-activated shall add Wants= dependencies to this unit for their socket unit during installation. This is best configured via a WantedBy=sockets.target in the socket unit's "[Install]" section.
Let's see what starts rpcbind.socket:
[Install]
WantedBy=sockets.target
Unit file will be linked as "wanted" by "sockets.target".
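A rough sketch of what that linking amounts to, assuming the usual "systemctl enable" behaviour of creating a symlink under /etc/systemd/system/<target>.wants/. The helper name wants_links and the /lib/systemd/system unit path are assumptions made for illustration, not something taken from the package:

```python
import configparser
from pathlib import PurePosixPath

# The [Install] section quoted above, as it appears in rpcbind.socket.
UNIT_TEXT = """\
[Install]
WantedBy=sockets.target
"""


def wants_links(unit_name, unit_text,
                unit_dir="/lib/systemd/system",   # assumed install location
                etc_dir="/etc/systemd/system"):   # assumed enablement location
    """Return (link, target) pairs that enabling the unit would be expected to create."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep the case of systemd key names
    parser.read_string(unit_text)
    wanted_by = parser.get("Install", "WantedBy", fallback="").split()
    return [
        (str(PurePosixPath(etc_dir) / f"{target}.wants" / unit_name),
         str(PurePosixPath(unit_dir) / unit_name))
        for target in wanted_by
    ]


links = wants_links("rpcbind.socket", UNIT_TEXT)
for link, target in links:
    print(f"{link} -> {target}")
```

The point is only that WantedBy= is an install-time instruction: it records a Wants= relationship on sockets.target and starts nothing by itself.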
$ systemctl list-dependencies sockets.target
sockets.target
● ├─acpid.socket
● ├─apport-forward.socket
● ├─dbus.socket
● ├─dm-event.socket
● ├─rpcbind.socket
... <many others>
$ systemctl --reverse list-dependencies sockets.target
sockets.target
● └─basic.target
● └─multi-user.target
● └─graphical.target
basic.target starts sockets.target.
sockets.target starts rpcbind.socket.
rpcbind.target doesn't start rpcbind.service.
Let's see what starts rpcbind.service:
[Install]
Also=rpcbind.socket
Also= only means rpcbind.{socket,service} are enabled/disabled together; it does not start either of them.
$ systemctl list-dependencies rpcbind.service
rpcbind.service
● ├─rpcbind.socket
● ├─system.slice
● ├─remote-fs-pre.target
● └─rpcbind.target
└─...
rpcbind.service depends on rpcbind.target (not with my patch) AND
rpcbind.socket (started by basic.target through sockets.target). Let's
look at rpcbind.target (it's a special target made by a generator; its
unit file doesn't contain anything):
rpcbind.target:
The portmapper/rpcbind pulls in this target and orders itself before it, to indicate its availability. systemd automatically adds dependencies of type After= for this target unit to all SysV init script service units with an LSB header referring to the "$portmap" facility.
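As a rough illustration of that rule (this is not systemd's generator code), the sketch below maps the "$portmap" facility in an LSB header to an After= line. The init script header is invented for the example, the function name after_targets is hypothetical, and reading the facility from Required-Start/Should-Start is an assumption:

```python
import re

# Only the mapping quoted above; other facilities are deliberately not modelled.
FACILITY_TARGETS = {"$portmap": "rpcbind.target"}

# Hypothetical init script header, invented for this example.
LSB_HEADER = """\
### BEGIN INIT INFO
# Provides:          example-rpc-daemon
# Required-Start:    $portmap $remote_fs
# Default-Start:     2 3 4 5
### END INIT INFO
"""


def after_targets(header):
    """Return the targets implied by known $facility names in the start dependencies."""
    targets = []
    for line in header.splitlines():
        match = re.match(r"#\s*(Required-Start|Should-Start):\s*(.*)", line)
        if not match:
            continue
        for word in match.group(2).split():
            target = FACILITY_TARGETS.get(word)
            if target and target not in targets:
                targets.append(target)
    return targets


print("After=" + " ".join(after_targets(LSB_HEADER)))
```

Note that the result is an ordering dependency only: After= says when to start relative to rpcbind.target, not that anything gets started.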
So, systemd puts an After=rpcbind.target in all SysV scripts with LSB
header containing "$portmap" as facility (insserv style):
$ systemctl status rpcbind.target
● rpcbind.target - RPC Port Mapper
Loaded: loaded (/etc/insserv.conf.d/rpcbind; static; vendor preset: enabled)
Drop-In: /run/systemd/generator/rpcbind.target.d
└─50-hard-dependency-rpcbind-$portmap.conf
/etc/insserv.conf.d/rpcbind contains "rpcbind".
/run/systemd/generator/rpcbind.service.d/50-rpcbind-\$portmap.conf contains:
[Unit]
Wants=rpcbind.target
Before=rpcbind.target
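Note the direction of these directives: they live in rpcbind.service and point at rpcbind.target, not the other way around. A toy model of "who pulls in whom", built only from edges quoted in this message and assuming rpcbind.target itself carries no Wants=/Requires= on rpcbind.service (this is not how systemd is implemented, and start_set is a made-up helper):

```python
from collections import deque

# Wants=/Requires= edges: unit -> units it pulls in when it is started.
PULLS_IN = {
    "basic.target": ["sockets.target"],
    "sockets.target": ["rpcbind.socket"],
    "rpcbind.service": ["rpcbind.socket", "rpcbind.target"],
    "nfs-server.service": ["rpcbind.target", "nfs-mountd.service"],
}


def start_set(unit, pulls_in):
    """Units queued when `unit` is started: the transitive closure of the edges above."""
    seen = {unit}
    queue = deque([unit])
    while queue:
        current = queue.popleft()
        for dep in pulls_in.get(current, []):
            if dep not in seen:
                seen.add(dep)
                queue.append(dep)
    return seen


# What boot (basic.target) plus the NFS server would queue in this model.
boot = start_set("basic.target", PULLS_IN) | start_set("nfs-server.service", PULLS_IN)
print(sorted(boot))
print("rpcbind.service queued:", "rpcbind.service" in boot)
```

In this model rpcbind.socket and rpcbind.target are both reached, but no edge leads to rpcbind.service, so the service is never queued at boot.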
So rpcbind.target doesn't start "rpcbind.service", and any
"rpcbind.target" usage seems broken to me. rpcbind was never started by
nfs-{kernel-server,mountd}.service (since they were relying on
rpcbind.target). rpcbind (the portmapper) was being activated by a
connection to the rpc socket.
But NOT the NETWORK socket, the LOCAL socket:
[Socket]
ListenStream=/run/rpcbind.sock
You can test this by stopping the "rpcbind.service" and trying:
$ rpcinfo -T tcp 127.0.0.1
rpcinfo: can't contact rpcbind: RPC: Remote system error - Connection refused
$ NETPATH=/run/rpcbind.sock rpcinfo
program version netid address service owner
100000 4 tcp6 ::.0.111 portmapper superuser
100000 3 tcp6 ::.0.111 portmapper superuser
100000 4 udp6 ::.0.111 portmapper superuser
100000 3 udp6 ::.0.111 portmapper superuser
100000 4 tcp 0.0.0.0.0.111 portmapper superuser
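The same kind of probe can be sketched without rpcinfo: a plain connect() to a local stream socket. With socket activation systemd holds the listening socket, so a successful connect is what would trigger the service. The helper name poke_unix_socket is hypothetical, and the demo uses a throwaway listener standing in for /run/rpcbind.sock so that it runs on a machine without rpcbind:

```python
import os
import socket
import tempfile


def poke_unix_socket(path, timeout=1.0):
    """Try to connect to a local stream socket; return True if the connect succeeded."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.connect(path)
        except OSError:
            return False
        return True


# Demo against a temporary listener (a stand-in for the real socket unit).
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.sock")
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as listener:
        listener.bind(demo_path)
        listener.listen(1)
        reachable = poke_unix_socket(demo_path)
    missing = poke_unix_socket(os.path.join(tmp, "missing.sock"))

print(reachable, missing)
```

On a real system the path would be /run/rpcbind.sock, as in the [Socket] section above.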
So accessing "/run/rpcbind.sock" is what "started" rpcbind, and the
race condition seen was born there, for sure. We have to make sure
rpcbind is started BEFORE any service depending on it. Let's analyze
next whether my change (based only on backporting the upstream fix)
accomplishes that.
--
https://bugs.launchpad.net/bugs/1590799
Title:
nfs-kernel-server does not start because of dependency failure
Status in nfs-utils package in Ubuntu:
In Progress
Status in nfs-utils source package in Xenial:
In Progress
Bug description:
Immediately after boot:
root@feynmann:~# systemctl status nfs-kernel-server
● nfs-server.service - NFS server and services
Loaded: loaded (/lib/systemd/system/nfs-server.service; enabled; vendor preset: enabled)
Active: inactive (dead)
Jun 09 14:35:47 feynmann systemd[1]: Dependency failed for NFS server and services.
Jun 09 14:35:47 feynmann systemd[1]: nfs-server.service: Job nfs-server.service/start failed
root@feynmann:~# systemctl status nfs-mountd.service
● nfs-mountd.service - NFS Mount Daemon
Loaded: loaded (/lib/systemd/system/nfs-mountd.service; static; vendor preset: enabled)
Active: failed (Result: exit-code) since Thu 2016-06-09 14:35:47 BST; 7min ago
Process: 1321 ExecStart=/usr/sbin/rpc.mountd $RPCMOUNTDARGS (code=exited, status=1/FAILURE)
Jun 09 14:35:47 feynmann systemd[1]: Starting NFS Mount Daemon...
Jun 09 14:35:47 feynmann rpc.mountd[1321]: mountd: could not create listeners
Jun 09 14:35:47 feynmann systemd[1]: nfs-mountd.service: Control process exited, code=exited
Jun 09 14:35:47 feynmann systemd[1]: Failed to start NFS Mount Daemon.
Jun 09 14:35:47 feynmann systemd[1]: nfs-mountd.service: Unit entered failed state.
Jun 09 14:35:47 feynmann systemd[1]: nfs-mountd.service: Failed with result 'exit-code'.
root@feynmann:~# systemctl list-dependencies nfs-kernel-server
nfs-kernel-server.service
● ├─auth-rpcgss-module.service
● ├─nfs-config.service
● ├─nfs-idmapd.service
● ├─nfs-mountd.service
● ├─proc-fs-nfsd.mount
● ├─rpc-svcgssd.service
● ├─system.slice
● ├─network.target
● └─rpcbind.target
● └─rpcbind.service
root@feynmann:~# systemctl list-dependencies nfs-mountd.service
nfs-mountd.service
● ├─nfs-config.service
● ├─nfs-server.service
● ├─proc-fs-nfsd.mount
● └─system.slice
root@feynmann:~#
root@feynmann:~# lsb_release -rd
Description: Ubuntu 16.04 LTS
Release: 16.04
root@feynmann:~# apt-cache policy nfs-kernel-server
nfs-kernel-server:
Installed: 1:1.2.8-9ubuntu12
Candidate: 1:1.2.8-9ubuntu12
Version table:
*** 1:1.2.8-9ubuntu12 500
500 http://gb.archive.ubuntu.com/ubuntu xenial/main amd64 Packages
100 /var/lib/dpkg/status
Additional comments:
1. There seems to be a circular dependency between nfs-mountd and nfs-kernel-server
2. I can get it partly working by changing the After= and Requires= lines in the /lib/systemd/system/nfs-{mountd|server}.service files. I have managed to get nfs-kernel-server to start but not nfs-mountd.
3. /usr/lib/systemd/scripts/nfs-utils_env.sh references /etc/sysconfig/nfs, which is the CentOS/Red Hat location of this file. Also, /etc/default/nfs does not exist. (Possibly unrelated to this bug.)
4. A file "/lib/systemd/system/-.slice" exists. This file prevents execution of 'ls *' or 'grep xxx *' commands in that directory. I am unsure whether this is intended by the systemd developers, but it is unfriendly when investigating this bug.
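The suspected loop in comment 1 can be checked mechanically. Below is a minimal sketch that runs a depth-first search over the two list-dependencies listings above, treating nfs-kernel-server.service as another name for nfs-server.service (an assumption based on the status output). The function name find_cycle is made up for the example:

```python
# Edges copied from the two `systemctl list-dependencies` listings above.
DEPENDS_ON = {
    "nfs-server.service": [
        "auth-rpcgss-module.service", "nfs-config.service", "nfs-idmapd.service",
        "nfs-mountd.service", "proc-fs-nfsd.mount", "rpc-svcgssd.service",
        "system.slice", "network.target", "rpcbind.target",
    ],
    "nfs-mountd.service": [
        "nfs-config.service", "nfs-server.service", "proc-fs-nfsd.mount",
        "system.slice",
    ],
}


def find_cycle(graph):
    """Return one dependency cycle as a list of unit names, or None if there is none."""
    unvisited, in_progress, done = 0, 1, 2
    state = {}
    stack = []

    def visit(node):
        state[node] = in_progress
        stack.append(node)
        for dep in graph.get(node, []):
            dep_state = state.get(dep, unvisited)
            if dep_state == in_progress:
                # Back edge: the cycle is the stack from `dep` onwards.
                return stack[stack.index(dep):] + [dep]
            if dep_state == unvisited:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[node] = done
        return None

    for start in graph:
        if state.get(start, unvisited) == unvisited:
            found = visit(start)
            if found:
                return found
    return None


cycle = find_cycle(DEPENDS_ON)
print(" -> ".join(cycle))
```

A loop in requirement dependencies like this one is not necessarily fatal by itself; it is the ordering (After=/Before=) that must stay loop-free, so this only shows where to look.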
Attempted solution:
1. Edit /lib/systemd/system/nfs-server.service (original lines are
commented out):
[Unit]
Description=NFS server and services
DefaultDependencies=no
Requires=network.target proc-fs-nfsd.mount rpcbind.target
# Requires=nfs-mountd.service
Wants=nfs-idmapd.service
After=local-fs.target
#After=network.target proc-fs-nfsd.mount rpcbind.target nfs-mountd.service
After=network.target proc-fs-nfsd.mount rpcbind.target
After=nfs-idmapd.service rpc-statd.service
#Before=rpc-statd-notify.service
Before=nfs-mountd.service rpc-statd-notify.service
...
followed by a systemctl daemon-reload and a reboot.
This results in nfs-kernel-server starting correctly, but nfs-mountd
does not start. However, starting nfs-mountd manually after reboot is
successful:
root@feynmann:~# systemctl status nfs-kernel-server.service
● nfs-server.service - NFS server and services
Loaded: loaded (/lib/systemd/system/nfs-server.service; enabled; vendor preset: enabled)
Active: active (exited) since Thu 2016-06-09 15:07:23 BST; 1min 25s ago
Process: 1391 ExecStart=/usr/sbin/rpc.nfsd $RPCNFSDARGS (code=exited, status=0/SUCCESS)
Process: 1384 ExecStartPre=/usr/sbin/exportfs -r (code=exited, status=0/SUCCESS)
Main PID: 1391 (code=exited, status=0/SUCCESS)
Tasks: 0
Memory: 0B
CPU: 0
CGroup: /system.slice/nfs-server.service
Jun 09 15:07:23 feynmann systemd[1]: Starting NFS server and services...
Jun 09 15:07:23 feynmann systemd[1]: Started NFS server and services.
root@feynmann:~# systemctl status nfs-mountd.service
● nfs-mountd.service - NFS Mount Daemon
Loaded: loaded (/lib/systemd/system/nfs-mountd.service; static; vendor preset: enabled)
Active: inactive (dead)
root@feynmann:~# systemctl start nfs-mountd.service
root@feynmann:~# systemctl status nfs-mountd.service
● nfs-mountd.service - NFS Mount Daemon
Loaded: loaded (/lib/systemd/system/nfs-mountd.service; static; vendor preset: enabled)
Active: active (running) since Thu 2016-06-09 15:09:02 BST; 3s ago
Process: 2044 ExecStart=/usr/sbin/rpc.mountd $RPCMOUNTDARGS (code=exited, status=0/SUCCESS)
Main PID: 2046 (rpc.mountd)
Tasks: 1
Memory: 904.0K
CPU: 12ms
CGroup: /system.slice/nfs-mountd.service
└─2046 /usr/sbin/rpc.mountd --manage-gids
Jun 09 15:09:02 feynmann systemd[1]: Starting NFS Mount Daemon...
Jun 09 15:09:02 feynmann rpc.mountd[2046]: Version 1.2.8 starting
Jun 09 15:09:02 feynmann systemd[1]: Started NFS Mount Daemon.
Enabling nfs-mountd.service (systemctl enable nfs-mountd.service) has
no effect in this case.
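One possible reading of this result, offered as a guess rather than a diagnosis: in the edited unit, nfs-mountd.service is only named in Before=, which orders units but does not pull them in, and a unit reported as "static" has no [Install] section for "systemctl enable" to act on. The sketch below parses the edited [Unit] section quoted above; the helper name collect is made up for the example:

```python
# The edited [Unit] section from the attempted solution above.
EDITED_UNIT = """\
[Unit]
Description=NFS server and services
DefaultDependencies=no
Requires=network.target proc-fs-nfsd.mount rpcbind.target
# Requires=nfs-mountd.service
Wants=nfs-idmapd.service
After=local-fs.target
#After=network.target proc-fs-nfsd.mount rpcbind.target nfs-mountd.service
After=network.target proc-fs-nfsd.mount rpcbind.target
After=nfs-idmapd.service rpc-statd.service
#Before=rpc-statd-notify.service
Before=nfs-mountd.service rpc-statd-notify.service
"""


def collect(unit_text):
    """Gather space-separated values per key, merging repeated keys and skipping comments."""
    values = {}
    for raw in unit_text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")) or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values.setdefault(key.strip(), []).extend(value.split())
    return values


deps = collect(EDITED_UNIT)
pulled_in = set(deps.get("Requires", [])) | set(deps.get("Wants", []))
ordered_only = (set(deps.get("After", [])) | set(deps.get("Before", []))) - pulled_in
print(sorted(ordered_only))
```

nfs-mountd.service ends up in the ordered-only set, which would be consistent with it staying inactive until started by hand.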
ProblemType: Bug
DistroRelease: Ubuntu 16.04
Package: nfs-kernel-server 1:1.2.8-9ubuntu12 [modified: lib/systemd/system/nfs-server.service]
ProcVersionSignature: Ubuntu 4.4.0-22.40-generic 4.4.8
Uname: Linux 4.4.0-22-generic x86_64
ApportVersion: 2.20.1-0ubuntu2.1
Architecture: amd64
Date: Thu Jun 9 14:38:58 2016
InstallationDate: Installed on 2016-06-08 (1 days ago)
InstallationMedia: Ubuntu-Server 16.04 LTS "Xenial Xerus" - Release amd64 (20160420.3)
ProcEnviron:
SHELL=/bin/bash
TERM=linux
PATH=(custom, no user)
LANG=en_GB.UTF-8
LANGUAGE=en_GB:en
SourcePackage: nfs-utils
UpgradeStatus: No upgrade log present (probably fresh install)