Shutdown sequence and Network filesystems

Thierry Carrez thierry.carrez at ubuntu.com
Mon Mar 16 14:14:34 GMT 2009


Hello everyone,

We have various long-standing issues in the shutdown/reboot sequence of
events, when network filesystems are involved. Several bugs have been
filed (bug 113095, bug 211631) but they point to the same issue. Since
it affects several packages, I would like to discuss the possible
solutions here.

To cut those long threads short, the problem is that on
NetworkManager-enabled systems, we kill networking at S20sendsigs,
before S31umountnfs.sh. This leads to various timeout errors in
umountnfs.sh and potentially to data loss on the network filesystems.

There are three ways of handling your network:

1- Without using NetworkManager
Then everything works fine :)

2- With NetworkManager, with the connection in "Available for all users" mode
S20sendsigs kills NetworkManager, dhclient and wpa_supplicant, resulting
in loss of the network connection. S31umountnfs.sh then fails with various
timeouts/errors.

3- With NetworkManager, with the connection in default mode (per-user)
When the GNOME session is terminated, the network connection is killed... by
design. So umountnfs.sh fails as well.

Should we support option (3), mounting network filesystems over a network
connection in per-user mode? It seems bound to fail by design. At
this point I would say it probably makes sense to fix it for option (2) only...

How could we do this? From my analysis it appears we need to protect
the following processes from the evil S20sendsigs:
- NetworkManager
- nm-system-settings (otherwise there is another timeout at shutdown)
- wpa_supplicant (if the connection is wireless)
- dbus (otherwise it will kill wpa_supplicant when killed)
- dhclient (for DHCP connections)

Once umountnfs.sh has run, we should then really kill the remaining processes.
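
As a rough sketch of how such protection could work, assuming the protected
daemons each record their PID in a file under an omit directory: the helper
below turns those pidfiles into "-o PID" arguments, the option killall5
accepts to skip a process. The function name build_omit_args and the
directory passed to it are hypothetical, chosen only for illustration.

```shell
#!/bin/sh
# Hypothetical helper: build "-o PID" arguments from a directory of pidfiles.
# The directory layout (one or more PIDs per file) is an assumption.
build_omit_args() {
    omit_dir=$1
    args=""
    for pidfile in "$omit_dir"/*; do
        # Skip the unexpanded glob (empty directory) and non-regular files.
        [ -f "$pidfile" ] || continue
        # The "|| [ -n ... ]" part keeps a last line without a trailing newline.
        while read -r pid || [ -n "$pid" ]; do
            case "$pid" in
                ''|*[!0-9]*) continue ;;   # ignore anything that is not a plain PID
            esac
            args="${args:+$args }-o $pid"
        done < "$pidfile"
    done
    printf '%s\n' "$args"
}
```

A sendsigs-like script could then call something along the lines of
killall5 -15 $(build_omit_args "$OMIT_DIR"), where OMIT_DIR is whatever
directory the packages agree on.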

There are two ways of doing it. Either we make the sysv-initscripts aware of
the issue (teaching S20sendsigs to avoid network stuff and adding an
S35reallysendsigs that doesn't avoid anything), or we do it in the
initscripts of each package, by adding sendsigs.omit.d items and pushing
S35 scripts in rc0/rc6 to really kill them. The second option is slightly
more complex, since it touches multiple packages and in most cases
(wpa_supplicant, dhclient, nm-system-settings) the processes are not
started or stopped by an init script.
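
To illustrate the intended ordering, here is a small sketch that assumes the
rc0/rc6 scripts are run in plain lexical order of their SNN prefix. The
rc_order function is hypothetical, and S35reallysendsigs is only the name
proposed above, not an existing script.

```shell
#!/bin/sh
# Print script names in the order an rc-style runner would execute them,
# assuming a plain lexical sort on the SNN prefix.
rc_order() {
    printf '%s\n' "$@" | LC_ALL=C sort
}

# The proposed S35 script lands after the network filesystems are unmounted.
rc_order S35reallysendsigs S31umountnfs.sh S20sendsigs
```

With this ordering, the first sendsigs pass spares the network processes,
umountnfs.sh runs while the network is still up, and the S35 pass cleans up
whatever is left.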

Please let me know how you think we should proceed to fix this.

-- 
Thierry Carrez
Ubuntu server team


