Upstart 0.5 Roadmap

Scott James Remnant scott at netsplit.com
Thu Oct 11 20:18:42 BST 2007


While I traditionally dislike roadmaps, mostly due to their inevitable
inaccuracy, I think that it's useful at this point, especially given the
recent period of inactivity, to define one for the next major Upstart
milestone: 0.5.

The main goal of this milestone is to define the structure and behaviour
of Upstart for its eventual 1.0 release.  It should be largely feature-
complete in terms of basic behaviour, allowing further development to
concentrate on extensions and improvements.

Quite by accident, the roadmap has been divided into different parts of
the code base; I realised when doing so that this was probably the right
order in which to make the changes as well.  Unsurprisingly, this starts
off quite detailed and becomes hand-wavy towards the end.


libnih
------

No, Upstart won't be switching over to glib anytime soon.

The main change for libnih is to increase the use of destructors to
avoid the problem of determining which *_free() function needs to be
called for a particular structure.

Each structure will have a *_new() function that allocates it, calls
an *_init() function to populate it and sets the destructor to an
*_destroy() function.  Separate *_free() functions like nih_list_free()
will be removed; all calls will simply be to nih_free().

The use of an *_init() function will allow structures to be embedded in
others rather than using pointers for everything (saving on memory
overhead); the containing structure will just call *_init() in its own
*_init() function and *_destroy() in its own *_destroy() function.


Upstart, the Service Manager
----------------------------

The most important piece of Upstart is being as good a service manager
as it can be, so this will be a particular focus for 0.5.  Service
management in this case is the configuration and management of the
individually defined services, including their life cycle and processes;
the service manager isn't concerned with how services are started and
stopped, but just what to do once they are.

Bzr trunk already has a fair amount of changes to the configuration
code, supporting the ability to reload the configuration and identify
which jobs have come or gone.  This will be further developed so that,
for each job name, there is a list of the available configurations with
that name, with the most appropriate one selected.

This allows conflicting configurations to exist and be handled sanely,
which is especially important since I'd like to support job creation by
external processes.  They'll be able to register themselves as a
configuration source and define jobs under it.  Upstart will
intelligently arbitrate between two jobs attempting to own the same
name, selecting one until it is deleted.

Continuing in this vein I'm planning to separate the definition of a
job, which is static, from its state data.  Jobs which permit multiple
instances will simply have state structures, rather than existing as
a new copy of the job.  The intent is that this will allow jobs to
have arbitrary instance limits (e.g. one for each tty) rather than just
one or infinite.  It also rationalises the code somewhat.

Upstart currently doesn't pay much attention to a process once it's been
forked, other than to wait() for it when it dies.  Certainly no effort
is made to check that the exec() call works, let alone any of the
earlier environment setup.  This is mostly fine, since the problem is
logged, but for various reasons we'd like to pay a little more
attention.  This will be done through a close-on-exec pipe to the child
process; on error, information will be written to it, so all the parent
has to do is poll for reading, and if it receives data, it knows there
was a problem.  This allows Upstart to better handle failures, for
example taking action if /bin/sh isn't present.
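
A minimal sketch of the technique (the helper name is mine, and the
real code would poll rather than block on the read):

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sketch of the close-on-exec pipe technique: returns 0 if the child's
 * exec() succeeded, otherwise the errno it failed with. */
int spawn_and_check (char * const argv[])
{
	int     fds[2], err;
	pid_t   pid;
	ssize_t len;

	if (pipe (fds) < 0)
		return errno;

	/* a successful exec() closes the write end, so the parent's
	 * read() simply returns 0 bytes */
	fcntl (fds[1], F_SETFD, FD_CLOEXEC);

	pid = fork ();
	if (pid == 0) {
		close (fds[0]);
		execvp (argv[0], argv);

		/* only reached if exec failed; report errno upstream */
		err = errno;
		write (fds[1], &err, sizeof err);
		_exit (255);
	}

	close (fds[1]);
	len = read (fds[0], &err, sizeof err);
	close (fds[0]);
	waitpid (pid, NULL, 0);

	return (len == (ssize_t)sizeof err) ? err : 0;
}
```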

An important missing ability is being able to disable a job from its
definition, without having to resort to deleting the file.  This will
be added; such jobs will still be visible, but will report an error if
an attempt is made to start them.  Transient disabling will also be
permitted through "dependencies"; these are lists of paths on the disk
(such as the one being exec'd) that must exist; if they do not, the job
reports an error on start.
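
To illustrate, such a job definition might read something like this
(the stanza names are hypothetical; the final syntax is undecided):

```
# this job may not be started; attempts will report an error
disabled

# transient disabling: the job errors on start unless these paths exist
depends /usr/sbin/mydaemon
depends /etc/mydaemon.conf

exec /usr/sbin/mydaemon
```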

Jobs will also support "resources" as a method of throttling jobs; a
resource is a string name and a floating point number, jobs define
how much of a particular resource they use while running and can only
run while the resource is greater than or equal to that number.  This
will typically be used for locking, or utilisation problems.  If a job
is started, but has insufficient resources, it will stay in the
start/waiting state until the resource goes above the necessary level.
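
The accounting itself is simple; a sketch with hypothetical structures
and function names:

```c
/* Sketch of the resource accounting described above; the structures
 * and function names are hypothetical. */

typedef struct {
	const char *name;       /* e.g. "ptys" */
	double      available;  /* how much of the resource remains */
} Resource;

/* A job may only leave start/waiting while enough of the resource
 * remains; on success the job's share is deducted. */
int resource_acquire (Resource *res, double amount)
{
	if (res->available < amount)
		return 0;       /* insufficient: stay in start/waiting */
	res->available -= amount;
	return 1;
}

/* Returned when the job stops, potentially releasing waiting jobs. */
void resource_release (Resource *res, double amount)
{
	res->available += amount;
}
```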

Since service management is largely concerned with UNIX processes, the
environment that they run in remains important.  As well as letting the
job definition define environment variables and their values, this will
be extended to allow the definition to specify variables to be taken
from init's own environment (typically PATH, TERM, etc.), so that these
no longer need to be hard-coded.  In addition, it does not seem
unreasonable for
environment to be specified when starting a job.

Continuing this thought, it becomes logical that the environment
variables for an instance are what makes that instance different from
others of the same job definition.  This may end up being the method by
which we define the uniqueness of instances, for example "instance TTY"
might mean that an instance is only spawned if the $TTY variable is
different from any others running.

This isn't fully decided yet, but it does seem to me that inventing some
other mechanism for doing this is folly since the method of passing
those values would just be environment variables anyway!
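
For example, with this scheme a getty job might look like the following
(again, hypothetical syntax):

```
# one instance per distinct value of $TTY; a second
# "start mygetty TTY=tty1" would be refused while the first runs
instance TTY
exec /sbin/getty 38400 $TTY
```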

Assuming we develop in this manner, it becomes very reasonable to expand
the definition of environment variables in other configuration stanzas
than just exec.

It's useful to pass more than just environment variables to a job when
starting it; it's also useful to be able to pass file descriptors as
well.  Some safe and secure mechanism will be found by which a job started
from the command-line can be told what its standard input/output/error
should be (normally the terminal from which it was called).
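
One plausible mechanism is the standard SCM_RIGHTS ancillary data on a
UNIX-domain socket; a sketch (the helper names are mine, not a settled
interface):

```c
#include <string.h>
#include <sys/socket.h>

/* Sketch of passing a file descriptor between processes using
 * SCM_RIGHTS ancillary data over a UNIX-domain socket -- one plausible
 * mechanism for handing a job its standard input/output/error. */

int send_fd (int sock, int fd)
{
	struct msghdr   msg;
	struct iovec    iov;
	struct cmsghdr *cmsg;
	char            dummy = '!';
	union {
		char            buf[CMSG_SPACE (sizeof (int))];
		struct cmsghdr  align;
	} u;

	memset (&msg, 0, sizeof msg);
	iov.iov_base = &dummy;          /* must carry at least one data byte */
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof u.buf;

	cmsg = CMSG_FIRSTHDR (&msg);
	cmsg->cmsg_level = SOL_SOCKET;
	cmsg->cmsg_type = SCM_RIGHTS;   /* kernel duplicates the fd for us */
	cmsg->cmsg_len = CMSG_LEN (sizeof (int));
	memcpy (CMSG_DATA (cmsg), &fd, sizeof (int));

	return (sendmsg (sock, &msg, 0) < 0) ? -1 : 0;
}

int recv_fd (int sock)
{
	struct msghdr   msg;
	struct iovec    iov;
	struct cmsghdr *cmsg;
	char            dummy;
	int             fd;
	union {
		char            buf[CMSG_SPACE (sizeof (int))];
		struct cmsghdr  align;
	} u;

	memset (&msg, 0, sizeof msg);
	iov.iov_base = &dummy;
	iov.iov_len = 1;
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;
	msg.msg_control = u.buf;
	msg.msg_controllen = sizeof u.buf;

	if (recvmsg (sock, &msg, 0) < 0)
		return -1;

	cmsg = CMSG_FIRSTHDR (&msg);
	if (! cmsg || cmsg->cmsg_type != SCM_RIGHTS)
		return -1;
	memcpy (&fd, CMSG_DATA (cmsg), sizeof (int));
	return fd;
}
```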

Finally the long-standing missing feature of being able to supervise
forking daemons will be finished; a few alternate methods of doing this
will be available, most likely finding the replacement process at
SIGCHLD-time (though we might need to do this twice!) and watching for
pid files.


Upstart, the System V init Emulator
-----------------------------------

Ironically, one of the features that distinguishes Upstart from other
init replacements is its ability to reasonably emulate sysvinit.  A few
improvements in this area are planned for 0.5.

The main one is going to be increased and correct use of utmp and wtmp;
telinit will handle setting the runlevel itself, and include the
RUNLEVEL and PREVLEVEL variables in the event -- rather than relying on
reading the runlevel stored in utmp to set them.

The shutdown command will write a proper shutdown record, and there will
be something to write a startup/reboot record on boot.  More usefully,
the compatibility reboot tool will check the runlevel, and use that to
determine whether or not to call shutdown as the original sysvinit does
(rather than relying on -f).

Finally init will maintain INIT_PROCESS and DEAD_PROCESS entries in utmp
for jobs that require it, through a "utmp id" stanza.  The typical user
of this will be the getty job, where such utmp entries are necessary for
correct behaviour.
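
As an illustration of the telinit side, a sketch of filling in a
RUN_LVL record using the sysvinit convention of encoding the new and
previous runlevels in ut_pid (actually writing it out with
pututxline() and appending to wtmp is omitted):

```c
#define _GNU_SOURCE             /* for RUN_LVL in <utmpx.h> on glibc */
#include <string.h>
#include <sys/time.h>
#include <utmpx.h>

/* Fill a utmpx runlevel record; runlevel and prevlevel are characters
 * such as '2' or 'N'.  The caller would write it with pututxline()
 * and append it to wtmp; that step is omitted here. */
void fill_runlevel_record (struct utmpx *utmp, int runlevel, int prevlevel)
{
	struct timeval tv;

	memset (utmp, 0, sizeof *utmp);
	utmp->ut_type = RUN_LVL;
	/* sysvinit convention: low byte is the new runlevel,
	 * next byte the previous one */
	utmp->ut_pid = runlevel + prevlevel * 256;
	strncpy (utmp->ut_line, "~", sizeof utmp->ut_line);
	strncpy (utmp->ut_id, "~~", sizeof utmp->ut_id);
	strncpy (utmp->ut_user, "runlevel", sizeof utmp->ut_user);

	gettimeofday (&tv, NULL);
	utmp->ut_tv.tv_sec = tv.tv_sec;
	utmp->ut_tv.tv_usec = tv.tv_usec;
}
```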


Upstart, the IPC Server
-----------------------

One minor, trivial change that it almost doesn't seem worth mentioning.
Upstart's own home-brew IPC will be dropped, and instead it will depend
on D-BUS.

IPC is very difficult to get right, and is the biggest focus for
security attacks on a daemon, so it's the most important part *to* get
right!

Maintaining Upstart's own IPC code has been a huge burden; a third of
Upstart's code is concerned with it.  Making even trivial changes, such
as adding a "re-exec" command (a long-standing bug), requires careful
changes and testing, with extensive consideration of backwards
compatibility issues.

Switching to an out-of-the-box IPC protocol just makes sense; D-BUS is
the currently fashionable one, and its object model fits Upstart quite
well.

From an external point of view, this will make very little difference
since initctl will still appear to behave the same way; the only change
is that libupstart goes away, which nobody else is using anyway.

The likely method for communication is that the current UNIX abstract
namespace socket will remain, and that the protocol for communication
over it will be peer-to-peer D-BUS.  It will then also be possible to
mark a job as the "message bus" in some way; once it is running, Upstart
will connect to it as other processes do (while also keeping its socket
open).  Almost all contributed software will talk to it through the
bus, rather than the peer-to-peer socket.

Connecting to the message bus allows us to find out when certain D-BUS
names are claimed, and thus this too can become a method by which
daemons are held out of the running state.  Consider a job marked with
"dbus org.freedesktop.Hal": Upstart wouldn't mark it as running just
because it has been forked, but would wait until the dbus name was
claimed.

Thus services no longer need to try to use D-BUS to be singletons; they
can rely on Upstart handling that for them and wait to claim their bus
name until they are ready -- knowing that no other copy of themselves
can be running, because the service manager won't allow it.


Upstart, the Service Activation Manager
---------------------------------------

The other side of the Upstart coin to simply managing services is that
it also manages their activation, automatically starting and stopping
them when certain events are received.  This has unsurprisingly proved
to be Upstart's most compelling and controversial feature.

To those on the side of controversy, it's worth noting that you don't
*have* to start services in this manner.  Take the case of D-BUS Service
Activation for example; it makes sense for D-BUS to utilise Upstart to
handle this, just as Upstart will utilise D-BUS for IPC.

There's no reason for D-BUS to invent spurious "events" to pass to
Upstart; instead it can simply ask Upstart (over D-BUS) to start a
service by name, if it prefers to continue to maintain the D-BUS Name to
Service Name mapping, or Upstart could even support starting services by
D-BUS Name -- since it will likely have this information anyway to be
able to defer the running state until the bus name is claimed.

Initially it seems to make sense to discard the notion of "events"
entirely, since they appear to be handled already by D-BUS Service
Activation (managed by Upstart) and D-BUS Signals.  Yet even in this
model you'd want to be able to start or stop services on D-BUS Signals,
which D-BUS doesn't currently provide (Service Activation only applies
if you are addressed by name).

This doesn't quite complete the picture either, though; there are
still interesting cases where events can be considered methods instead
of signals.  Most notably the compatibility or near-compat events like
startup, runlevel, etc.

Even if signals were enough, there would need to be some way to pass the
data of the signal to the process being started -- since it would be
running too late to get on the bus and catch the signal before it was
lost.

Unfortunately many signals don't contain enough information or context
anyway, HAL is a notable culprit for this.  For example, a job is likely
to want to have an instance of itself running for each device of a
certain capability.  Unfortunately HAL's signals only include the
object path, so it's necessary to perform some communication first to
convert the DeviceAdded and DeviceRemoved signals into useful events
that can be matched and used to start/stop services which will want
to know what they are supposed to be handling.

And this doesn't even discuss events from non-D-BUS sources, such as
inotify or even temporal events (cron).

So we still appear to need the ability to define an abstract "event",
with the interesting distinguishing feature that rather than watching
an abstract flow, events are defined in advance and may actually require
some kind of code to run to find out more information.  This fits in
with one of the original plans for Upstart, where you would have
processes that performed particular jobs such as listening for signals
from HAL and converting them into useful events for jobs.

Not to mention that the one key reason to still think in terms of events
is translating them into environment variables for the jobs.

Since events still appear to need to stay around, so do states (defined
in terms of events); again the very earliest musings about Upstart
included the distinction between "edge" and "level" events -- and no
matter how hard we try, they just won't go away.

Jobs shouldn't need to care about tracking DeviceAdded signals, checking
for the camera capability, etc.; neither should they have to wordily
repeat that they're started on one event and stopped on another...  It
should be enough that they can just say "while a camera is plugged in"
(and then use the inherent environment to define whether one copy is
run for all cameras, or each camera).

This is what makes Upstart more than just a dumb service manager, by
taking some effort to automatically start and stop services as required,
it can keep the number running to the minimum needed -- thus conserving
resources and improving performance of even the most hefty workstation.


Scott
-- 
Have you ever, ever felt like this?
Had strange things happen?  Are you going round the twist?