Upstart plans

Clint Byrum clint at ubuntu.com
Tue Aug 16 16:43:07 UTC 2011


Excerpts from James Hunt's message of Thu Aug 04 12:01:47 -0700 2011:
> 1.2.3 Failsafe Mode
> ~~~~~~~~~~~~~~~~~~~~
> 
>       In addition to this, Chromium OS has a novel "failsafe" method for ensuring
>       that certain key services are guaranteed to start even if the login
>       screen fails to display for some reason. Examples of such jobs that are
>       started in failsafe mode are VT consoles and the ssh server. The
>       failsafe facility is implemented as 2 jobs:
> 
>       - failsafe.conf
>       - failsafe-delay.conf
> 
>       The failsafe-delay job specifies "start on started boot-services" such
>       that it starts early on in the boot. This job simply sleeps for 30
>       seconds (which is larger than the worst-case overall boot time). Once
>       failsafe-delay stops, even if the main boot fails, failsafe starts since
>       it specifies, "start on starting system-services or stopped
>       failsafe-delay".
> 
>       Jobs that absolutely must start even if the boot fails for some reason
>       can then specify "start on started failsafe" and be assured of starting,
>       in the worst case scenario after a 30 second delay.
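
For reference, the two jobs described above could be sketched roughly as
follows. This is reconstructed from the description only, not copied from
the Chromium OS sources, so the exact stanzas there may differ:

```
# failsafe-delay.conf (sketch, assumed layout)
# Starts early in the boot and simply sleeps for 30 seconds.
start on started boot-services
exec sleep 30

# failsafe.conf (sketch, assumed layout)
# Starts when the normal boot reaches system-services, or when the
# 30 second delay job has finished, whichever happens first.
start on starting system-services or stopped failsafe-delay
```

A job that must come up regardless would then use "start on started
failsafe", as the quoted text says.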


FYI, I stole this magnificent idea for Oneiric.

It has been in since just after Alpha 3. We have a job, 'failsafe', which in
retrospect might have been better named 'failsafe-runlevel'. It goes:

# failsafe

description "Failsafe Boot Delay"
author "Clint Byrum <clint at ubuntu.com>"

start on filesystem and net-device-up IFACE=lo
stop on runlevel

pre-start exec sleep 30


Its start condition mimics the old rc-sysinit, which was based on two events
virtually guaranteed to happen. rc-sysinit itself has been changed to:

start on (filesystem and static-network-up) or started failsafe

static-network-up is an event that is emitted when all 'auto' interfaces
in /etc/network/interfaces are "up".

For non-servers, this means very little; most of them only have lo in
/etc/network/interfaces. For a server, it means we may delay entering
runlevel 2 by up to 30 seconds waiting for network interfaces to be
detected by the kernel and brought up. This gives a reasonable chance
that most multi-interface servers with services configured to start in
runlevel 2 bound to specific interfaces will have a successful boot.
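
As a rough illustration of the kind of service this helps, here is a
hypothetical job (the job name, daemon path, --listen option and address
are all made up for the example, not something shipped in Oneiric):

```
# my-daemon.conf (hypothetical example job)
description "Hypothetical daemon bound to one specific address"

# Started on entering a normal runlevel, i.e. after rc-sysinit has run,
# which now waits for static-network-up or the 30 second failsafe.
start on runlevel [2345]
stop on runlevel [!2345]

respawn
exec /usr/sbin/my-daemon --listen 192.0.2.10
```

With the old rc-sysinit condition, a job like this could be started before
the interface carrying that address was up. With the new one it is only
started once the 'auto' interfaces are up, or after the 30 second failsafe
delay at the latest.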


