Upstart 0.5: Can We Stop For Ice Cream?

Scott James Remnant scott at netsplit.com
Sat Mar 8 01:38:51 GMT 2008


This is a second update to the Upstart 0.5 Roadmap sent to this mailing
list five months ago, which you can find in the archives here:
https://lists.ubuntu.com/archives/upstart-devel/2007-October/000468.html

You can find the first update in the archives here:
https://lists.ubuntu.com/archives/upstart-devel/2008-January/000573.html


Progress
--------

Much of the work since the last update has been on the interaction
between events and jobs, and what I've come to term the atomicity of
jobs.

One of the most immediately obvious changes is the loss of arguments to
events.  The simple reason for this is that with event expressions,
there's no logical way to pass all of the arguments of all of the events
to the job; so you'd have to duplicate the information anyway, ending up
with something like:

    interface-up eth0 00:11:D8:98:1B:37
        IFACE=eth0
        HWADDR=00:11:D8:98:1B:37
        TYPE=1

Obviously this is a bit silly.  The change means that events now only
have the environment variables ("parameters") part:

    interface-up
        IFACE=eth0
        HWADDR=00:11:D8:98:1B:37
        TYPE=1

The order they are specified in is preserved, so to match them you can
either do it by name:

    start on interface-up IFACE=eth*

or by position:

    start on interface-up eth*

As long as positional matches come first, you can use both:

    start on interface-up eth* TYPE=1

I figure that the documentation for events will indicate which ones you
can rely on being in order, and that they'll be the primary ones for the
event.


This change means that the environment of an event expression is now
predictable, and can be extracted at the point the expression becomes
true.  This environment is combined with that present in the job
configuration (which may take variables from init's own environment) and
stored in the new job instance when it is started.

If the start expression becomes TRUE while the instance is stopping,
the new environment does not immediately replace the old one; instead
the old environment remains, since the post-stop script may need it.
Once the job has finished stopping, and restarts, the new environment
is used.

ie. given the definition:

    start on foo
    stop on bar

    pre-start exec echo pre-start $FOO
    post-start exec echo post-start $FOO
    exec echo main $FOO && sleep inf
    pre-stop exec echo pre-stop $FOO
    post-stop exec echo post-stop $FOO

You would expect to see the following:

    $ initctl emit foo FOO=hello
    pre-start hello
    post-start hello
    main hello
    $ initctl emit bar ; initctl emit foo FOO=goodbye
    pre-stop hello
    post-stop hello
    pre-start goodbye
    post-start goodbye
    main goodbye

This, I think, makes much more sense.

The list of events that started the job can now be found in the
$UPSTART_EVENTS variable, instead of as positional arguments, so that
they're consistently available.
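For example, a job could record which events started it.  A minimal
sketch (the job body and log path are hypothetical):

    start on foo or bar
    script
        # $UPSTART_EVENTS holds the list of events that started this job
        echo "started by: $UPSTART_EVENTS" >> /tmp/started.log
    end script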


The same holds true for the events that stop the job, except that the
environment from these is not generally useful for the job since it's
often just a match for what started it.  That being said, since
pre-stop is only run for natural stops and can cancel the stop without
ill effect, it makes sense that this script should receive the stop
event environment.

Thus it does so, overriding that from the start events where different.
The list of events that stopped the job can be found in the
$UPSTART_STOP_EVENTS variable.
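A pre-stop script could use this, for instance, to record why the job
is being stopped.  A hypothetical sketch (the log path is illustrative):

    pre-stop script
        # $UPSTART_STOP_EVENTS holds the list of events stopping this job
        echo "stopping due to: $UPSTART_STOP_EVENTS" >> /tmp/stopped.log
    end script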


One of the other changes that this introduced was removing the need for
the job to keep a reference to the event longer than it needed to block,
since it has the environment.  This was originally a fix for the
"respawn loses environment" problem, but that's irrelevant anyway now.

In order to reset the start and stop operators immediately after
matching (so that they must be completely matched again), as well as to
copy the environment out, we build a blocking list of events.

At the same time, the periods for which events (and by inference, start
and stop commands) are blocked were rationalised.

"start job" will now block until the job is running (or stopped again,
for tasks), or until the command is somehow interrupted.  If a process
fails, or a stop event occurs, or another admin runs "stop job", the
start command will exit immediately.  Likewise for the "stop" command.

Previously the commands would block until the job was at rest again;
this seemed overkill and was causing problems for event sequencing.
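As a sketch of the new semantics, assuming a long-running job named
"myjob" (the name is hypothetical, comments mine):

    $ start myjob    # returns as soon as myjob is running
    $ stop myjob     # returns as soon as myjob has stopped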


So what does this all buy us?  Jobs now have a table of environment
variables given to them by the events that started them.  We'll extend
the start/stop commands to be able to do this as well.

We can use this environment to expand variable references in some
special job stanzas.  The first and most obvious one where this is
useful is "stop on":

    start on tty-added
    stop on tty-removed $TTY

The value of the $TTY variable is taken from the job's environment, and
thus from the start events.  Where a variable isn't found, it can never
match; so assuming you keep the names unique:

    start on tty-added or cua-added
    stop on tty-removed $TTY or cua-removed $CUA

the above cannot pair a tty-added event with a cua-removed event.

The expansion is somewhat shell-like, though we should stress that it
is only intended to be a limited subset that may not be truly
compatible.  (Compare the expansion in script/exec, which is actually
done by passing the string unmodified to a shell and letting it worry
about it.)

Current forms we support:

    $VAR		simple reference
    ${VAR}		reference where there might be confusion
    ${VAR:-foo}		foo used if $VAR unset or NULL
    ${VAR:+foo}		foo used unless $VAR is unset or NULL
    ${VAR-foo}		foo used if $VAR unset
    ${VAR+foo}		foo used unless $VAR is unset

I'd like to support the #, ##, % and %% forms too, but I haven't
figured those out yet.
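For reference, the forms above are modelled on POSIX shell parameter
expansion, whose behaviour you can check in any shell.  This is a plain
shell demonstration, not Upstart job syntax:

```shell
# Demonstrating the expansion forms with a POSIX shell;
# Upstart's expansion is a limited subset modelled on these.
VAR=hello
echo "${VAR:-foo}"    # prints "hello": VAR is set and non-null
unset VAR
echo "${VAR:-foo}"    # prints "foo": VAR is unset
VAR=""
echo "${VAR-foo}"     # prints nothing: VAR is set (though null)
echo "${VAR:-foo}"    # prints "foo": the ":" forms treat null as unset
VAR=hello
echo "${VAR:+foo}"    # prints "foo": VAR is set and non-null
unset VAR
echo "${VAR+foo}"     # prints nothing: VAR is unset
```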


The other stanza where these are expanded is a new one, well, an
extension to an existing one.  Previously Upstart has supported
singleton jobs, where only one copy could be active at any one time,
and instance jobs (now "unlimited-instance"), where any number could be
running.

We now have a middle-ground; you can define a string by which instances
must be unique.  Only one instance of a given "name" may be active at
any one time.  The way to define these is by giving an argument to the
"instance" stanza:

    instance $TTY

Obviously it makes no sense not to include a variable expansion here,
since the effect would be the same as a singleton job.
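A classic use would be a single getty job covering every terminal,
using the tty-added/tty-removed events from the earlier example (the
getty command line is illustrative):

    instance $TTY
    start on tty-added
    stop on tty-removed $TTY
    respawn
    exec /sbin/getty 38400 $TTY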


Other stanzas where these will be expanded will be the planned file
dependencies and resources stanzas (see the roadmap).  They may be
expanded for other similar service activation stanzas as and when we
invent them.

Variables are explicitly *not* expanded in process stanzas such as
"umask", "nice", etc.  This is because events aren't sanitised, so you
could be at risk of a malicious user injecting bad resource limits, etc.
The right way to do this is in the script itself, and to check the value
first.

(The reason this doesn't apply to the service activation stanzas is that
the worst you can do is start a service that will immediately fail.)


A surprising and last-minute change has been to the behaviour of the
"respawn limit" stanza.  This now only limits Upstart's automatic
respawning of the job (ie. the "respawn" command itself).  Manual
restarts of the job are expressly not limited in this way, since the
proper way to stop an administrator restarting a job in a while loop is
to hit them.


Missing Pieces
--------------

In other words, "when's the release?"

There are two remaining pieces to land before I'm ready to release an
0.5.0 version.

The first is a change to the state machine; this is to support features
such as resources in the future.  A new "inactive" state will be
introduced, which will replace "waiting" as the default and final state
of the job.  "waiting" will become an intermediate state between
"inactive" and "starting".

Jobs may go from "stopped" into "waiting" directly if being restarted,
and may go from "waiting" into "dead" directly if being deleted.

Nothing will wait in the waiting state at first, but it means we have a
state where we can wait later on and leave without worrying about
countering event emissions with their opposites.


The second is to reintroduce the IPC layer with D-BUS; initially this
will likely be limited to the basic methods to get initctl working
again, with more methods such as job registration coming in later
releases.


Future
------

0.5.0 isn't intended to be a complete release by any stretch of the
imagination, but a first release of the work in trunk in order to widen
the testing that it can get and so we can discover what else we need to
do.

0.5.x releases will quite quickly see the addition of dependencies and
resources; they're just not targeted for 0.5.0 since they're new
features and I don't want to delay the release too long.


Another thing I want to work into a relatively early 0.5.x release is
the ability to disable jobs from automatically starting.  The two
favourite methods for doing this are "profiles" and "flags", which are
basically just different ways of doing the same thing.

A profile would be a first-class object, of which only one can be
active at a time.  A profile either hand-picks which jobs are enabled
while
active, or excludes jobs that are to be disabled.  If the "single-user"
profile were active, only the jobs it lists would be able to start.

Flags are not so much first-class objects but tags that can appear in
job definitions, either positively or negatively.  Likewise any number
of flags can either be "switched on" or "switched off" on the kernel
command line, collectively enabling or disabling jobs.  If "!networking"
were placed on the kernel command-line, no job with "if networking" in
its definition would be able to start automatically but jobs with
"unless networking" would be able to start.
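In job-definition terms, the flags approach might look like this
(hypothetical syntax, since neither design is settled; the service path
is illustrative):

    # only started automatically when the "networking" flag is on
    if networking
    start on startup
    exec /usr/sbin/some-network-service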

It's not clear which of these two is the right way to do it yet.


The current fork-following code is relatively simple; when a process
forks, it follows the fork and stops tracing the parent, expecting it
to go away.  This could be done in a much more heuristic way to provide
the
right behaviour for most daemons.

Upstart would trace the process, and follow forks as it does now; but it
wouldn't forget the previous one or change the pid.  Instead it would
record the new pid as an additional process for the job.  Should any
process terminate or call exec(), it will be struck from the list of
known processes and the next one in the list selected instead.

If we run out of processes, then we deem it to have died.  We may also
keep a timer so that after a sensible time (30s?) if we've kept the same
process, we forget about the others.


Other interesting suggestions for the "wait for" stanza are the ability
to wait for the listen() syscall, the creation of a file, or the
announcement of a D-BUS name.

Scott
-- 
Have you ever, ever felt like this?
Had strange things happen?  Are you going round the twist?