start/stop hook guarantees
william.reade at canonical.com
Tue Dec 6 15:50:28 UTC 2011
On Sat, 2011-12-03 at 15:01 -0200, Gustavo Niemeyer wrote:
> This all makes sense to me, William. Thanks for the write up and the heads up.
Sadly, it's started to make much less sense to me as I delve deeper into
the state restoration work. As I understand it, the intent is to allow
us to smoothly transition back into a "started" state, and that doesn't
sound like a bad goal in itself. However, consider the states the unit
workflow can be in:
We don't want to explicitly run the start hook here; the service hasn't
even been installed, and the normal process of starting the unit agent
will lead us through "installed" to "started" regardless.
As above; normal startup will transition us to "started" anyway.
The chances of "start" working correctly are minimal; and, if it doesn't
work, what should we do anyway? Switch to "start_error", and obscure the
real cause of the failure?
I guess it can't hurt, in the case of a charm that doesn't use upstart
or otherwise monitor itself.
May as well retry, I suppose (but I'm not sure what justification we
have for believing the result to be any different, or why this case is
special enough to overrule our preference for requiring explicit user
action to resolve error states).
Whether it works or not, a transition to "started" or "start_error" is
going to be profoundly misleading.
Definitely a Bad Thing; we'll be breaking the guarantee that the
upgrade-charm hook will be the *first* one called after the charm
has been upgraded.
Based on IRC discussion today, "stopped" should mean "the unit has gone
away and is never coming back", and so if by some freak occurrence
we *do* restart a machine, and the unit agent comes up "stopped", we
definitely don't want to start it again.
As above; we can't do anything meaningful from this state, and starting
from this state is actively wrong.
...so. Assuming we still want to enable the weakly-written charms
discussed previously, I think it makes much more sense to offer a *much*
more limited guarantee; that, on the first run after reboot, the "start"
hook will be called again if the unit is in a "started" state.
The "start" hook may of course be called as a result of the unit
starting off in None or "installed", but that'd happen anyway, so it
doesn't need explicit mention.
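
To make the proposed guarantee concrete, here is a rough sketch of the decision as I'd imagine it. The function name and the `first_run_after_reboot` flag are made up for illustration and are not existing juju code; the state strings are the ones used in this thread.

```python
from typing import Optional


def should_rerun_start_hook(workflow_state: Optional[str],
                            first_run_after_reboot: bool) -> bool:
    """Decide whether to explicitly re-run the "start" hook.

    Hypothetical helper: only a unit that was already "started" before
    the reboot gets its start hook run again. Every other state (None,
    "installed", error states, "stopped") is left to the normal workflow
    or to explicit user action.
    """
    return first_run_after_reboot and workflow_state == "started"


# A unit that was running before the reboot gets "start" again.
print(should_rerun_start_hook("started", True))
# A "stopped" unit is never started again.
print(should_rerun_start_hook("stopped", True))
# None and "installed" reach "start" via the normal startup path instead.
print(should_rerun_start_hook(None, True))
```

The point of keeping it to a single condition is that none of the other states need special-casing at all.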
Does this make sense?