Design Changes

Tue Sep 12 08:54:49 BST 2006

Hi,

Scott James Remnant:
> Here's a summary of what we've been thinking about so far, note that
> none of this is a decision yet, so if you have any preferences or better
> ideas, please weigh in!
> 
Gladly. ;-)

> Naming
> ------
> 
The names split rather obviously into three parts:
1- what happens
2- what is affected
3- anything else you need to know

"What happens" (1) should be the first word; it should be clearly
specified, for each of those words, what the first argument means and
what any other arguments or parameters might do.
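
To make the three-part split concrete, here is a minimal Python sketch.
The names Event and parse_event are made up for illustration, and the
rule that the subject may be missing (as in a bare "shutdown") is my
assumption, not something upstart defines.

```python
from typing import NamedTuple, Optional, Tuple


class Event(NamedTuple):
    action: str              # (1) what happens, e.g. "job-started"
    subject: Optional[str]   # (2) what is affected, e.g. "gdm"
    extra: Tuple[str, ...]   # (3) anything else you need to know


def parse_event(line: str) -> Event:
    """Split a whitespace-separated event line into its three parts."""
    words = line.split()
    if not words:
        raise ValueError("empty event")
    # Assumption: an event such as "shutdown" carries no subject at all.
    subject = words[1] if len(words) > 1 else None
    return Event(words[0], subject, tuple(words[2:]))


if __name__ == "__main__":
    print(parse_event("job-started gdm"))
    print(parse_event("app-xorg-login :1"))
```

Only the first word needs a written specification; everything after it
is interpreted according to that specification.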

We don't need a namespace for (1), other than reserving some prefix for
named subsystems. Those don't need a hierarchy, because uniqueness is
already enforced by the system's package manager (dpkg/rpm).

Thus, we might declare that all events introduced by X11 should start
with "app-xorg-". Example:

	app-xorg-login :1

which other handlers then can use for doing things that really belong
in /etc/X11/Xsession.d/, so this was a bad example -- but you see what I
mean. ;-)

> 	on gdm/started
>     becomes
> 	on gdm started
> 
	on job-started gdm

"gdm" is kept unique by virtue of the fact that you can only install one
package named "gdm" in the system (this is specified in the description
of "job-started"). If a package needs more than one job, it can use
slashes to distinguish between them, or add arguments.

> 	on hda1 added
	on disk-added hda1

> 	on hda1 mounted
That event should probably use a "path" namespace:
	on path-mounted /usr/local

because a job waiting for this usually doesn't care which device you
mount there.

For (un)mounting, I'd also specify separate pre- and post-events.
	on path-mounting /usr/local
	on path-mounted /usr/local
	on path-unmounting /usr/local
	on path-unmounted /usr/local

The third one is particularly important for taking down a service that
depends on that path (and possibly prevents it from unmounting in the
first place).

> Matching could even use fnmatch?
> 
That'd be good.
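
A rough sketch of what fnmatch-style matching could look like, using
Python's fnmatch module as a stand-in for the C function. The helper
event_matches is hypothetical, and two choices in it are assumptions of
mine: patterns are compared word by word, and a pattern with fewer
words than the event still matches (trailing arguments are ignored).

```python
from fnmatch import fnmatchcase


def event_matches(pattern: str, event: str) -> bool:
    """Compare pattern and event word by word using shell-style globs."""
    pat_words = pattern.split()
    ev_words = event.split()
    # Assumption: the event may carry extra trailing arguments, but it
    # must supply at least every word the pattern asks for.
    if len(pat_words) > len(ev_words):
        return False
    return all(fnmatchcase(e, p) for p, e in zip(pat_words, ev_words))


if __name__ == "__main__":
    print(event_matches("path-mounted /usr/*", "path-mounted /usr/local"))
    print(event_matches("disk-added hd*", "disk-added sda1"))
```

Note that "*" here also matches "/", so "/usr/*" covers "/usr/local"
and anything below it; whether that is wanted for paths would need to
be decided.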

> 	on event-failed shutdown

That makes sense.

> Having ssh and gettys bounce back if shutdown fails would be ... nice.
> 
> Obviously these events shouldn't themselves generate success or fails,
> or we'd have a storm on our hands.
> 
Heh.

> [...]
> These would be additional events issued not by the state changes, but by
> the child reaper, and may look like:
> 
> 	on apache start-failed
> 	on apache failed
> 
... except that these should be reversed. Maybe

	on job-startup-failed apache
	on job-failed apache

> One useful side-effect of this is that it means the "respawning" state
> can go entirely, and all in-upstart handling of respawns.  Apache itself
> can just give:
> 
> 	start on apache failed
	start on job-failed apache

I like this idea, esp. the difference between "job-startup-failed"
(something's wrong and respawning doesn't make much sense) and
"job-failed" (it died after starting up successfully, so respawning
probably works).
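
The distinction can be sketched in a few lines of Python. The function
failure_event is hypothetical; that the child reaper knows whether the
job ever reached its running state, and that a zero exit status emits
no failure event, are both assumptions.

```python
from typing import Optional


def failure_event(job: str, reached_running: bool,
                  exit_status: int) -> Optional[str]:
    """Pick the event a child reaper might emit when a job's process exits."""
    if exit_status == 0:
        # Assumption: a clean exit is not a failure of either kind.
        return None
    if reached_running:
        # Died after starting up successfully: respawning probably works.
        return f"job-failed {job}"
    # Never came up: respawning doesn't make much sense.
    return f"job-startup-failed {job}"


if __name__ == "__main__":
    print(failure_event("apache", reached_running=True, exit_status=1))
    print(failure_event("apache", reached_running=False, exit_status=1))
```

A job that says "start on job-failed apache" is then respawned only in
the first case.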

> Waiting for Events
> ------------------
> 
Mhh.

> 	stop on shutdown
> 	stop after apache stopped
> 	stop after tomcat stopped
> 
Maybe this idea of how waiting could work would ... well, work. ;-)

We keep track of the last event that happened for a particular type and
first argument. Thus, if Apache signals the fact that it's coming up with

	job-starting apache some-arguments
and
	job-running apache

events, upstart remembers the latter (keyed by "job" and "apache").

Thus, if I want to wait for apache to not be running, I could write,
in my mysql script,

	stop after job-starting apache
	stop after job-running apache

then upstart would check this list for any entry which matches these
events, and defer shutting down until none are present.
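
Here is a small Python sketch of that list. The class PastEvents and
its method names are invented for illustration; taking the text before
the first hyphen as the key ("job" in "job-running") is a
simplification, and "job-stopped" is an assumed event name.

```python
from typing import Dict, Iterable, Tuple


class PastEvents:
    """Remember the last event seen per (namespace, first argument)."""

    def __init__(self) -> None:
        self._last: Dict[Tuple[str, str], str] = {}

    @staticmethod
    def _key(name: str, first_arg: str) -> Tuple[str, str]:
        # Simplification: the namespace is whatever precedes the first "-".
        return (name.split("-", 1)[0], first_arg)

    def record(self, event: str) -> None:
        words = event.split()
        if len(words) < 2:
            return  # nothing to key on without a first argument
        self._last[self._key(words[0], words[1])] = words[0]

    def blocks(self, waits: Iterable[str]) -> bool:
        """True while any 'stop after <event> <arg>' clause still matches."""
        for clause in waits:
            name, first_arg = clause.split()[:2]
            if self._last.get(self._key(name, first_arg)) == name:
                return True
        return False


if __name__ == "__main__":
    past = PastEvents()
    past.record("job-starting apache some-arguments")
    past.record("job-running apache")
    waits = ["job-starting apache", "job-running apache"]
    print(past.blocks(waits))   # apache is up, so stopping is deferred
    past.record("job-stopped apache")
    print(past.blocks(waits))   # entry replaced, nothing left to wait for
```

Because each new event for the same key replaces the old one, the table
stays small no matter how many events go by.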

This doesn't scale well, of course; modifying the mysql startup
script whenever somebody uses it makes no sense whatsoever.

But suppose tomcat-or-whatever triggers an "app-mysql-user tomcat" event
when it starts, and an "app-mysql-finished tomcat" event when it stops
or dies. Then the mysql daemon can simply state

	stop on shutdown
	stop after app-mysql-user *

and, voila, mysql is not killed until every "official" user has
terminated. In fact, you could add a

	stop on job-running mysql
	stop after app-mysql-user *

section to its event.d script -- which would make mysql stop itself
automatically, as soon as all of its users have terminated. (Yes, this
is evil, as you now need an "initctl trigger app-mysql-user root" to
start the daemon manually. Assuming that you want it to stay up for more
than half a second. ;-)
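
The bookkeeping behind "stop after app-mysql-user *" could be as simple
as a set of current users. This Python sketch is only an illustration:
UserTracker and stop_deferred are made-up names, and treating "-user"
and "-finished" as add and remove is my reading of the scheme above.

```python
from fnmatch import fnmatchcase
from typing import Set


class UserTracker:
    """Track who has announced itself as a user of one service."""

    def __init__(self, prefix: str = "app-mysql") -> None:
        self.prefix = prefix
        self.users: Set[str] = set()

    def handle(self, event: str) -> None:
        words = event.split()
        if len(words) < 2:
            return
        name, who = words[0], words[1]
        if name == f"{self.prefix}-user":
            self.users.add(who)
        elif name == f"{self.prefix}-finished":
            self.users.discard(who)

    def stop_deferred(self, pattern: str = "*") -> bool:
        """True while at least one user matching the pattern remains."""
        return any(fnmatchcase(u, pattern) for u in self.users)


if __name__ == "__main__":
    mysql = UserTracker()
    mysql.handle("app-mysql-user tomcat")
    print(mysql.stop_deferred())    # tomcat still needs the database
    mysql.handle("app-mysql-finished tomcat")
    print(mysql.stop_deferred())    # no users left, mysql may stop
```

With no users registered, nothing defers the stop, which is exactly the
half-a-second problem mentioned above.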

> Meta-Events
> -----------
> 
> Another less resource-hungry possibility is to have a job and daemon
> that sits and waits until the required point has occurred, emits the
> event, and then terminates.
> 
That might run into race conditions... <Thinking> the aforementioned
"past-event" list would be very helpful here.

-- 
Matthias Urlichs   |   {M:U} IT Design @ m-u-it.de   |  smurf at smurf.noris.de