[Oneiric-Topic] Server Boot

Clint Byrum clint at ubuntu.com
Wed Mar 30 17:55:45 UTC 2011


On Wed, 2011-03-30 at 07:52 -0500, Serge E. Hallyn wrote:
> Quoting Scott Kitterman (ubuntu at kitterman.com):
> > There was a lot of discussion around improving the server boot experience 
> > before the UDS-M.  A number of people expressed interest in seeing more useful 
> > diagnostic information during boot.  Others expressed concerns with boot 
> > reliability on the more complex hardware typically found in servers.
> > 
> > How are we doing on this?  Personally, I can't remember the last time I 
> > rebooted a server and it wasn't via SSH and the hardware I use is the sort 
> > there were problems with.  Are these still issues for the Ubuntu Server 
> > community?
> > 
> > Scott K
> 
> I think right now these issues are overshadowed by the fact that a
> great deal of server software is not yet upstartified.  I think that
> needs to be addressed for O.

I wonder if we need to address all of them.

There are hundreds of daemons that will always work perfectly fine
in /etc/init.d as sysvinit scripts.

$ apt-file search /etc/init.d | wc -l
1179

If I narrow it down to main, that drops to 220, and 50 or so of those
are already symlinks to upstart-job. So realistically, I'd say there are
150-170 left to convert in main, and probably about 1000 in universe.
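
For what it's worth, a rough way to check an installed system (the
numbers will differ from the archive-wide apt-file count above) is to
look for the upstart-job symlinks directly. A sketch:

$ # already converted: symlinks pointing at /lib/init/upstart-job
$ find /etc/init.d -maxdepth 1 -lname '*upstart-job' | wc -l
$ # still plain sysvinit scripts
$ find /etc/init.d -maxdepth 1 -type f | wc -l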

Rather than focus on upstartifying everything, the focus should probably
be on getting the key infrastructure pieces working well in upstart
(Kerberos, SSH, LDAP, NFS, etc.), and then on improving the sysvinit
compatibility layer so that Ubuntu continues to shine when something
uses a sysvinit job.

I think one issue with server boot is that it's been left to the event
model without many fences. James Hunt's visualization tool shows arrows
going *everywhere*:

http://upstart.at/2011/03/25/visualisation-of-jobs-and-events-in-ubuntu-natty/
http://upstart.at/wp-content/uploads/2011/03/initctl2dot.png

But if you look at it, things get *much* more orderly around the
runlevel event.

This is a fence. We can reasonably say that the system, upon emitting
the runlevel 2 event, has crossed into a zone where it is ready for
network services to start.

The problem is, it's not true. rc-sysinit emits this event as soon as
lo is up:

start on filesystem and net-device-up IFACE=lo

Some services handle this quite well, some do not. Right now, the only
other fences are flawed:

start on net-device-up IFACE!=lo

This means at least one real interface is configured. It will never
fire on a machine that has no network. It does have the benefit that it
is emitted every time a network appears, so for laptops bouncing from no
network to wifi and back, this is a great event to use to make sure
something is up whenever there is a real network. On a server, though,
it just means that one of the possibly many interfaces is up, so it
probably shouldn't be used.

Or

start on started networking

This means 'ifup -a' has returned, so all static, auto interfaces are
configured. It also means we're missing DHCP interfaces.

We should change rc-sysinit to start on started networking. This carries
one problem with it: if a static network interface needs a sysvinit
service in order to finish coming up, it will deadlock the boot. So we
would have to review all scripts in /etc/network/if-pre-up.d
and /etc/network/if-up.d and make sure they don't rely on sysvinit
services. Likewise, we'd have to get this done quickly so users can
review any custom scripts they have before the next LTS. As a secondary
measure, running these scripts should time out so the boot can continue
if this deadlock is encountered.
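
As a sketch of what that looks like (not a tested patch), the start
condition in /etc/init/rc-sysinit.conf would change along these lines;
keeping 'filesystem' in the condition is my assumption about how the
existing stanza would be adapted:

# /etc/init/rc-sysinit.conf
# current:  start on filesystem and net-device-up IFACE=lo
# proposed: wait until 'ifup -a' (the networking job) has finished
start on filesystem and started networking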

This condition, waiting for 'ifup -a' to finish, was in fact the
behaviour up until the all-upstart boot was done. The way the deadlock
was avoided then was that services expected to be needed before
networking was available would specify a low sequence number in
runlevel S. These services are quite few, and can be easily identified
and converted to upstart jobs that start at the right time. If I look
at /etc/rcS.d on a hardy system with netbase installed, I see very
little between loopback and networking:

lrwxrwxrwx 1 root root  18 Mar 30 10:45 S08loopback -> ../init.d/loopback
lrwxrwxrwx 1 root root  20 Nov 30 17:46 S11hwclock.sh -> ../init.d/hwclock.sh
lrwxrwxrwx 1 root root  26 Nov 30 17:46 S11mountdevsubfs.sh -> ../init.d/mountdevsubfs.sh
lrwxrwxrwx 1 root root  16 Nov 30 17:46 S17procps -> ../init.d/procps
lrwxrwxrwx 1 root root  22 Nov 30 17:46 S20checkroot.sh -> ../init.d/checkroot.sh
lrwxrwxrwx 1 root root  17 Nov 30 17:46 S22mtab.sh -> ../init.d/mtab.sh
lrwxrwxrwx 1 root root  20 Nov 30 17:46 S30checkfs.sh -> ../init.d/checkfs.sh
lrwxrwxrwx 1 root root  21 Nov 30 17:46 S35mountall.sh -> ../init.d/mountall.sh
lrwxrwxrwx 1 root root  31 Nov 30 17:46 S36mountall-bootclean.sh -> ../init.d/mountall-bootclean.sh
lrwxrwxrwx 1 root root  26 Nov 30 17:46 S37mountoverflowtmp -> ../init.d/mountoverflowtmp
lrwxrwxrwx 1 root root  20 Mar 30 10:46 S40networking -> ../init.d/networking

In fact, IMO, none of these would qualify for this condition; they are
likely just in this order for other reasons.
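
For anything that genuinely did need to run before 'ifup -a', the
upstart equivalent is a small task job. A hypothetical sketch (the job
name and exec path are made up):

# hypothetical /etc/init/early-net-prereq.conf
description "setup that must finish before 'ifup -a' runs"
# a task started on 'starting networking' holds the networking job
# at 'starting' until the task has completed
start on starting networking
task
exec /usr/local/sbin/early-net-prereq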




