Proposed 1.0 semantics specification
cdahlin at redhat.com
Mon Jun 15 15:10:15 BST 2009
On 06/15/2009 08:52 AM, Scott James Remnant wrote:
> On Sun, 2009-06-14 at 15:57 -0400, Casey Dahlin wrote:
>> This is my attempt to record and formalize Scott's description to me of
>> the way Upstart 1.0 should behave. It's not complete and needs some
>> review, but there's a critical mass there now and I think it's time to
>> draw attention to it.
> Thanks Casey, this is a great start to discussion! It's also a pretty
> good way for me to read things I've said, and figure out where I've gone
> wrong <g>
> My notes/thoughts/ramblings below:
> Events are also generated during Job state changes, they carry the
> same environment as the state. This is probably where most events come
> from.
> Semantically they look pretty similar:
> "while network-device eth0 up ADDRESS=xx:xx:xx:*"
> "on network-device eth0 up ADDRESS=xx:xx:xx:*"
> The first matches the state, the second matches the event.
> You call them Classes, which is what they're called in the 0.5 code ;)
> Sometimes when I write them down, I call them Prototypes.
> I'm vaguely moving towards just calling them Configs, since they are
> the object that directly represents the /etc/init/*.conf files.
> I wouldn't describe the config as having dependencies and start_on;
> instead a better way to look at it might be that a config defines the
> job creation criteria, and the start condition for a job.
> A class/config is exactly intended to be a template configuration for
> a service.
> And yes, it's fully intended that a config doesn't actually need to
> have any executable or daemon attached to it - and the resulting jobs
> represent states of the system.
I went with Class/Instance since I figured a good number of people have done some OOP and are familiar with that relationship. However, it gets weird in that this spec describes both of them as Objects in the OO sense, which is a bit disorienting (unless you're a Ruby person :). The phraseology might be worth grinding on, as we've named these things 4 or 5 times now and it's gotten confusing.
> Configs are never started or stopped, the only exposed method is to
> create jobs from them.
> The D-Bus interface for that might look like:
> job =
> conf = Manager<com.ubuntu.Upstart.Manager>.GetConfigByName("apache")
> job = conf<com.ubuntu.Upstart.Config>.CreateJob()
Ah. As I've defined it, there's no way to manually create jobs. However, Upstart ensures that there's always a spare Waiting job of any conceivable type waiting around to be started.
Also, the class-level 'start' method is just a shortcut for "find an instance that's not busy and start it." It's there on the grounds that if a user says "start apache" it doesn't make sense to turn around and say "which one?"
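To make the shortcut concrete, here is a rough sketch in Python of what I mean by the class-level 'start'. The names (JobClass, Instance, the "waiting"/"running" strings) are made up for illustration and are not taken from the Upstart code; it only shows the "pick any idle instance" behaviour, not real state transitions.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    # Hypothetical stand-in for a job instance; only tracks a coarse state.
    name: str
    state: str = "waiting"  # "waiting" or "running"

    def start(self):
        self.state = "running"


@dataclass
class JobClass:
    # Hypothetical stand-in for a class/config and the instances made from it.
    name: str
    instances: list = field(default_factory=list)

    def start(self):
        """Start the first instance that is not busy and return it."""
        for inst in self.instances:
            if inst.state == "waiting":
                inst.start()
                return inst
        raise LookupError(f"no waiting instance of {self.name}")
```

Under the spec as I wrote it the LookupError branch should never trigger, since a spare Waiting instance is always supposed to exist; it is only there so the sketch fails loudly rather than silently.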
> Commands like "initctl list" show the list of current *jobs*, not
> configs. "start" and "stop" take *jobs*, not configs.
> The act of loading a config into Upstart will automatically create jobs
> from it, as will state changes in the job creation criteria.
> Consider the infamous "D-Bus interface to apache" example:
> while apache and dbus-daemon
> When this is created, a job will be created for each possible pairing of
> apache and dbus-daemon.
> If we assume that these are both freshly created, there will be one job
> each, so there is only one possible pairing. Thus we have a single
> apachedbd job available that will show up in the lists and can be
> (The single apache and dbus-daemon exist because of the exact same
> method, at some point there will be a job that has no "while" clause -
> they always get a job anyway)
I don't think any of this disagrees with the spec as I wrote it. The Re-population section should describe automatic job creation exactly as you talk about it here.
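As a sanity check that we mean the same thing by "one job per possible pairing", the rule can be sketched as a Cartesian product over the jobs that exist for each dependency. This is Python with invented names (populate, the dict layout, the "apache#1" style ids), not anything from the spec or the code, and it ignores de-duplication against jobs that already exist.

```python
from itertools import product


def populate(dependencies):
    """Return one job description per possible combination of dependency jobs.

    `dependencies` maps each config named in the while clause to the list of
    jobs that currently exist for it (a hypothetical representation).
    """
    names = sorted(dependencies)
    pools = [dependencies[name] for name in names]
    # product() over zero pools yields a single empty tuple, so a config
    # with no while clause still gets exactly one job.
    return [dict(zip(names, combo)) for combo in product(*pools)]
```

With one apache and one dbus-daemon this gives a single apachedbd; adding a second dbus-daemon gives two, which matches the example quoted above.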
> If we now start the apache job, then start the dbus-daemon job, the
> while condition is satisfied so the apachedbd job is started too
> (because there is no "on" and the job is not in manual mode).
> If we now create a *second* dbus-daemon job from the original config
> (assumedly with a different bus address), we'll now have two possible
> pairings of apache/dbus-daemon -- therefore a SECOND apachedbd job will
> be created to match it.
> When we start this *second* dbus-daemon job, our *second* apachedbd job
> will be started along with it.
> Both apachedbd will show up in job listings, and both can be started and
> stopped manually.
> Thus it is jobs (what you call instances) that have the Start and Stop
> method, not the Config.
> A few points:
> while specifies the job creation criteria and the ability to start
> on specifies the start criteria
> a config without while always has a job created
Implicit in the Re-population section as I put it (actually we could use some more text as to what an empty dependencies list means).
> a job without while may always be started
As I've specified it, a job doesn't exist unless it can be started.
> a job without on is always started when it can be
I think I mentioned this explicitly.
> Restart isn't "stop then start", it's actually an immediate atomic
> toggle of the state. The job state is changed to stop, the process
> begun and then immediately changed back to start.
This raises a lot of questions. What happens while waiting for the actual service to come up? Do we stall the queue and block on this operation? Can other jobs be operated on during this time?
> The Queue:
> Right now, the queue is for events only; not methods and it's only
> really there to avoid excessive recursiveness. Commands are not queued,
> they are acted on immediately.
> Not sure whether this is right or not, of course <g>
This gets into that opening clause on the document. Upstart should /behave/ this way. It doesn't necessarily have to /be/ this way. I suspect that right now, Upstart essentially acts as though the queue were done in this manner, even though it isn't.
I also think Upstart would be simpler if we did more as actions out of a queue, but whether it would be enough of a gain to warrant the code changes is left as an exercise to the reader :)
> Garbage Collection:
> I don't think you've got this right, instances are not nominally
> destroyed - since they need to exist so they can be started on.
> Certainly its while clauses being in waiting is not a reason to be
> destroyed, just a reason for that instance to remain in waiting itself.
The logic here is:
* If you can't start an instance it shouldn't exist
* If an instance that could be started doesn't exist, it should be created.
So we destroy:
* Instances that don't have their dependencies satisfied (It's OK, since as soon as those dependencies are satisfied again we'll re-create it)
* Instances that are exactly like other instances (if we want to start two of an instance, we start one and then another will be created for us to start).
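The two destroy rules above can be sketched roughly like this in Python. Everything here is an assumption made for illustration: the Instance tuple, the "waiting"/"running" strings, and in particular the choice to only collect waiting instances (what should happen to a running instance whose dependencies go away is a separate question this sketch does not answer).

```python
from typing import FrozenSet, NamedTuple


class Instance(NamedTuple):
    # Hypothetical model: the jobs this instance depends on, and its state.
    deps: FrozenSet[str]
    state: str  # "waiting" or "running"


def collect(instances, running_jobs):
    """Return the instances that survive garbage collection."""
    kept = []
    spare_seen = set()
    for inst in instances:
        if inst.state == "waiting":
            # Rule 1: a waiting instance that cannot be started is destroyed.
            if not inst.deps <= running_jobs:
                continue
            # Rule 2: keep only one spare per identical set of dependencies.
            if inst.deps in spare_seen:
                continue
            spare_seen.add(inst.deps)
        kept.append(inst)
    return kept
```

Re-population would then be responsible for creating a fresh spare whenever the surviving one gets started.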