[RFC] Edge-triggered upstart

Sat Aug 15 08:47:46 BST 2009

I've been working on a patch that changes the way events are handled in 
Upstart (this inspired the N-ary event operators).

Currently events have 3 states: pending, handling, and finished. 
Processing an event works like this:

1) Move event from pending->handling, kick off any jobs that respond to 
the event. "Block" the event for each job we start.

2) Upstart goes and does other things while the processes spawn.

3) As each job enters started, "unblock" the event.

4) When all jobs have unblocked the event, move from handling -> finished.

5) Tell whatever sent the event that it has been handled (if a job 
emitted it, let that job continue; if D-Bus sent it, send the method 
reply). Free the event.
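
For illustration only, the lifecycle above can be sketched roughly as 
follows. This is not Upstart's actual code (Upstart is written in C); 
the names State, Event, handle, unblock and on_finished are all 
hypothetical, and the case of an event that starts no jobs is handled 
by an assumption of mine.

```python
from enum import Enum


class State(Enum):
    PENDING = 1
    HANDLING = 2
    FINISHED = 3


class Event:
    """Hypothetical model of the current multi-state event."""

    def __init__(self, name, on_finished):
        self.name = name
        self.state = State.PENDING
        self.blockers = 0                # jobs still holding the event
        self.on_finished = on_finished   # stands in for "tell the sender"

    def handle(self, jobs):
        # Step 1: pending -> handling, one block per job we start.
        self.state = State.HANDLING
        for _job in jobs:
            self.blockers += 1
        if self.blockers == 0:
            # Assumption: with nothing to wait for, finish at once.
            self._finish()

    def unblock(self):
        # Step 3: called as each job enters "started".
        self.blockers -= 1
        if self.blockers == 0:
            self._finish()

    def _finish(self):
        # Steps 4 and 5: handling -> finished, then notify the sender.
        self.state = State.FINISHED
        self.on_finished(self)
```

The point to notice is that the Event object has to stay alive, and 
carry state, for as long as any job it started is still coming up.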

This creates some ugliness. Events retain a long multi-state lifecycle, 
which is rather contrary to the nature of events. And there are some 
interesting bugs that are a bit difficult to address in this model. For 
example, a job which is "start on starting a and starting b" prevents a 
from starting until b has started.

The objective of edge-triggered upstart is to do a few things:

1) Make events non-blocking and stateless. An event is received, 
handled, and freed all in one continuous motion.

2) Handle more occurrences in Upstart as events. For example, a process 
forking generates a formal event. This requires more than one type of 
event (most of which are not visible outside of the code itself) to 
avoid cluttering the namespace or exposing the wrong things to the 
user.

3) Handle blocking as further event responses. An event doesn't start a 
service, then wait around for it to get finished. It starts a service, 
then tells Upstart to wait for another event signalling its readiness 
and go on its way when it occurs. The event can then be forgotten 
immediately.

4) Serialize the queue. We always read the leftmost and only the 
leftmost event in the queue and do not read the next one until it is 
handled.
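
As a rough sketch of point 4, assuming a simple FIFO where handlers may 
append further events but never process them directly (the event names 
and the handle/run functions are made up for the example):

```python
from collections import deque

queue = deque()    # pending events, leftmost is next
handled = []       # order in which events were actually handled


def handle(event):
    # Hypothetical handler: handling "startup" emits two more events.
    handled.append(event)
    if event == "startup":
        queue.append("starting a")
        queue.append("starting b")


def run():
    # Only the leftmost event is ever read, and the next one is not
    # touched until handle() has returned for the current one.
    while queue:
        event = queue.popleft()
        handle(event)
```

Events emitted while another event is being handled simply wait their 
turn at the right-hand end of the queue.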

To do this we use two new primitives:

Trigger: A trigger is simply a pairing of an event operator (one 
containing only OR operations) and a callback. The trigger exists in a 
global table, and when any of the events matched by the operator occurs, 
it "fires" and runs the callback.
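
A minimal sketch of a trigger, assuming an OR-only operator can be 
modelled as a plain set of event names. The names Trigger, 
trigger_table and dispatch are hypothetical, and whether a trigger 
stays in the table after firing is left open here (this sketch keeps 
it).

```python
class Trigger:
    """Pairs an OR-only event operator with a callback."""

    def __init__(self, event_names, callback):
        # Assumption: "a or b or c" is represented as {"a", "b", "c"}.
        self.event_names = frozenset(event_names)
        self.callback = callback

    def matches(self, event_name):
        return event_name in self.event_names


trigger_table = []   # the global table of triggers


def dispatch(event_name):
    # Iterate over a copy so callbacks may add or remove triggers.
    for trigger in list(trigger_table):
        if trigger.matches(event_name):
            trigger.callback(event_name)
```

Because the operator contains only ORs, matching needs no stored state: 
a single event either fires the trigger or it does not.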

Counter: A counter is a pairing of a numerical value and a callback. The 
value can be either incremented or decremented. When the value reaches 
zero, the counter runs the callback.
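
A counter could look something like this. The class and method names 
are hypothetical, and starting the value at zero by default is my 
assumption.

```python
class Counter:
    """Pairs a numerical value with a callback run on reaching zero."""

    def __init__(self, callback, value=0):
        self.value = value
        self.callback = callback

    def increment(self):
        self.value += 1
        self._check()

    def decrement(self):
        self.value -= 1
        self._check()

    def _check(self):
        # The callback runs whenever a change lands the value on zero.
        if self.value == 0:
            self.callback()
```

Note that a freshly created counter sitting at zero does not run its 
callback; only a change that arrives at zero does.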

With this our workflow changes:

1) We associate a new counter object with the event.

2) We run the event through the trigger table. Several triggers react 
and fire callbacks which:
      A) Begin to start a job
      B) Increment the counter
      C) Set up a new trigger to decrement the counter when the job 
fires its "started" event.

3) As each job comes online it sends an event. That event in turn fires 
a trigger which causes the counter to be decremented.

4) When the counter reaches zero, its callback sends whatever response 
is needed to the sender of the original event.
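
Putting the pieces together, the steps above might be sketched as 
below. Everything here is illustrative: start_on, add_one_shot, emit 
and the "started <job>" event names are hypothetical. The extra "guard" 
increment in emit is also my own assumption, added so that an event 
which starts no jobs still gets its reply.

```python
class Counter:
    def __init__(self, callback):
        self.value = 0
        self.callback = callback

    def increment(self):
        self.value += 1

    def decrement(self):
        self.value -= 1
        if self.value == 0:
            self.callback()


triggers = []   # global trigger table: (set of event names, callback)
log = []        # records what happened, for the example only


def add_one_shot(names, action):
    # A trigger that removes itself from the table when it fires.
    def fire(event_name, counter):
        triggers.remove(entry)
        action()
    entry = (frozenset(names), fire)
    triggers.append(entry)


def start_on(names, job):
    # Step 2: the callback a job's trigger runs when it fires.
    def fire(event_name, counter):
        log.append(f"starting {job}")                           # A
        counter.increment()                                     # B
        add_one_shot({f"started {job}"}, counter.decrement)     # C
    triggers.append((frozenset(names), fire))


def emit(event_name, reply=lambda: None):
    counter = Counter(reply)     # step 1: a counter for this event
    counter.increment()          # guard (assumption, see above)
    for names, callback in list(triggers):
        if event_name in names:
            callback(event_name, counter)
    counter.decrement()          # drop the guard; the event is done with
```

With two jobs registered via start_on({"startup"}, ...), emitting 
"startup" returns immediately, and the reply callback only runs once 
both "started" events have been emitted. The event itself is never 
kept around; only the counter and the one-shot triggers outlive it.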

An added advantage is that event operators can now be greatly 
simplified. They no longer need to contain state. When a new job class 
registers, it simply transforms its event operators into a series of 
triggers and counters.

--CJD