[Maas-devel] RFC: "Serialising" power actions

Gavin Panella gavin.panella at canonical.com
Fri Sep 19 08:16:20 UTC 2014


On 19 September 2014 01:33, Julian Edwards wrote:
> On Thursday 18 Sep 2014 11:39:53 Gavin Panella wrote:
>> It doesn't matter. A power command is now more than fire-and-forget:
>> it seeks to leave the node in the requested state. If it fails to do
>> so, it marks the node as broken.
>
> Sorry, I've no idea how that relates to cancellation. Are you saying
> that if someone cancels a power op it should mark the node as broken?

Suppose a power-on command is cancelled (by a power-off command) before
it has confirmed that the node is up, e.g. it has sent an IPMI command
but hasn't checked for indications of life. MAAS doesn't know what state
the node is in.

However, the power commands, including power-off, nowadays seek to leave
the node in a known state, or, if they fail, mark the node as broken. In
other words, power-off will confirm that the node is indeed off before
declaring the job done.
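
As a rough sketch of that contract (not the actual MAAS driver code): the
callables send_power_off, query_power_state and mark_broken are hypothetical
stand-ins for whatever a given power driver provides, and the retry count and
wait are arbitrary assumptions.

```python
import time


def power_off_and_confirm(send_power_off, query_power_state, mark_broken,
                          attempts=3, wait=5.0, sleep=time.sleep):
    """Issue power-off, then confirm the node is off or mark it broken.

    All three callables are hypothetical placeholders, not MAAS APIs.
    """
    for _ in range(attempts):
        send_power_off()
        # Give the BMC a moment before asking what state the node is in.
        sleep(wait)
        if query_power_state() == "off":
            return True
    # The node never reached the requested state: record it as broken.
    mark_broken()
    return False
```

The point is only that the command returns once the state is known, one way
or the other, rather than as soon as the signal has been sent.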

There's a risk there that the node is actually girding up its loins to
turn on, but taking its time. A power-off signal sent to the BMC may be
ignored (think AMT), but when MAAS's power driver checks a few seconds
later it may see that the node is off, and declare the job done, just
before the node finally sputters into life.

Now, I don't know if that's actually possible: it may be that BMCs will
return something akin to "girding loins" when the node is off but about
to start, or they may even return "on", but it's the kind of thing where
we're going to have to refine the power drivers as we go. Here, for
example, the power-off command might check again that the node is still
off after 30 seconds before declaring that it's really off.
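
A minimal sketch of that second check, assuming a hypothetical
query_power_state callable that returns "on" or "off" (real BMCs may well
report other states, which this ignores):

```python
import time


def confirm_still_off(query_power_state, settle=30.0, sleep=time.sleep):
    """Return True only if the node reads "off" now and again after `settle` seconds.

    query_power_state is a hypothetical placeholder, not a MAAS API.
    """
    if query_power_state() != "off":
        return False
    # Wait out the window in which a slow power-on might still take effect.
    sleep(settle)
    return query_power_state() == "off"
```

A False result here would mean the node came up after all, at which point the
driver could retry the power-off or mark the node as broken.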



