start/stop hook guarantees
William Reade
william.reade at canonical.com
Tue Dec 6 15:50:28 UTC 2011
On Sat, 2011-12-03 at 15:01 -0200, Gustavo Niemeyer wrote:
> This all makes sense to me, William. Thanks for the write up and the heads up.
Sadly, it's started to make much less sense to me as I delve deeper into
the state restoration work. As I understand it, the intent is to allow
us to smoothly transition back into a "started" state, and that doesn't
sound like a bad goal in itself. However, consider the states the unit
workflow can be in:
* None
We don't want to explicitly run the start hook here; the service hasn't
even been installed, and the normal process of starting the unit agent
will lead us through "installed" to "started" regardless.
* installed
As above; normal startup will transition us to "started" anyway.
* install_error
The chances of "start" working correctly are minimal, and if it doesn't
work, what should we do? Switch to "start_error", and obscure the real
cause of the failure?
* started
I guess it can't hurt, in the case of a charm that doesn't use upstart
or otherwise monitor itself.
* start_error
May as well retry, I suppose (but I'm not sure what justification we
have for believing the result to be any different, or why this case is
special enough to overrule our preference for requiring explicit user
action to resolve error states).
* configure_error
Whether it works or not, a transition to "started" or "start_error" is
going to be profoundly misleading.
* charm_upgrade_error
Definitely a Bad Thing; we'll be breaking the guarantee that the
upgrade-charm hook will be the *first* one called after the charm
upgrade operation.
* stopped
Based on IRC discussion today, "stopped" should mean "the unit has gone
away and is never coming back" [0], and so if by some freak occurrence
we *do* restart a machine, and the unit agent comes up "stopped", we
definitely don't want to start it again.
* stop_error
As above; we can't do anything meaningful from this state, and starting
from this state is actively wrong.
...so. Assuming we still want to enable the weakly-written charms
discussed previously, I think it makes much more sense to offer a far
more limited guarantee: that, on the first run after reboot, the "start"
hook will be called again if the unit is in the "started" state.
The "start" hook may of course be called as a result of the unit
starting off in None or "installed", but that'd happen anyway, so it
doesn't need explicit mention.
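
As a rough illustration of that limited guarantee, here is a minimal
Python sketch. The function names and the first_run_after_reboot flag
are hypothetical, invented for this example; they are not the unit
agent's actual API. Only the state names come from the list above.

```python
# Hypothetical sketch of the proposed rule; none of these names are
# taken from the real unit agent code.

# The unit workflow states enumerated above (None = no state recorded).
WORKFLOW_STATES = (
    None,
    "installed",
    "install_error",
    "started",
    "start_error",
    "configure_error",
    "charm_upgrade_error",
    "stopped",
    "stop_error",
)


def should_rerun_start_hook(state, first_run_after_reboot):
    """Return True only when the limited guarantee applies.

    The "start" hook is explicitly re-run only on the first run after a
    reboot, and only if the unit was already in the "started" state.
    """
    return bool(first_run_after_reboot) and state == "started"


def states_rerunning_start():
    """List the states for which a reboot triggers an explicit re-run."""
    return [s for s in WORKFLOW_STATES if should_rerun_start_hook(s, True)]
```

Under this sketch, None and "installed" are deliberately not special
cases: normal startup already takes them through to "started", so the
explicit check has nothing to add there, and every error state is left
alone for the user to resolve.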
Does this make sense?
Cheers
William