[Maas-devel] RFC: "Serialising" power actions
Gavin Panella
gavin.panella at canonical.com
Thu Sep 18 10:39:53 UTC 2014
On 18 September 2014 00:16, Julian Edwards wrote:
> On Wednesday 17 Sep 2014 09:38:52 Gavin Panella wrote:
>> That's a good point. However, a reboot can be modelled as a single
>> power "task". If a power-off is issued subsequently, then I think it
>> should be allowed to override the reboot, and transition the machine
>> to a powered-off state. If a power-on is issued, we can say that it's
>> not permitted to override the in-progress power task. In other words,
>> where the final state of an in-progress power task is the same as a
>> subsequent power task, the subsequent power task should be discarded.
>
> I agree, but your case above doesn't meet these requirements. (ie
> final state of a reboot is "power on", so you can't allow a power-off
> to override it)
- A power-off command should be able to override a reboot, because the
  end-state of a power-off differs from that of a reboot.

- A power-on command should be discarded when there's a reboot in
  progress, because they share the same end-state.

Something I missed is that power-on implies booting too. If there's a
power-off command in progress and a power-on comes in, we _should_
wait for the power-off to finish before powering the node back on, to
ensure that we go through a proper boot cycle.

The same is not true in reverse: if the requested end-state is off, the
command doesn't need to wait for an in-progress power-on or reboot.
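
To make the rules above concrete, here is a minimal sketch in Python. The
names (Action, Decision, END_STATE, decide) are hypothetical and are not
taken from the MAAS code base. It also assumes that a reboot arriving
during a power-off waits just as a power-on does, which this thread does
not state explicitly.

```python
from enum import Enum


class Action(Enum):
    ON = "on"
    OFF = "off"
    REBOOT = "reboot"


class Decision(Enum):
    DISCARD = "discard"    # drop the new command
    OVERRIDE = "override"  # stop the in-progress task, run the new one now
    WAIT = "wait"          # let the in-progress task finish, then run


# End-state each action seeks to leave the node in.
END_STATE = {Action.ON: "on", Action.OFF: "off", Action.REBOOT: "on"}


def decide(in_progress, requested):
    """Decide what to do with `requested` while `in_progress` is running."""
    if END_STATE[in_progress] == END_STATE[requested]:
        # Same end-state: the new command adds nothing.
        return Decision.DISCARD
    if END_STATE[requested] == "off":
        # Powering off does not need to wait for a power-on or a reboot.
        return Decision.OVERRIDE
    # The requested end-state is on while a power-off is in progress:
    # wait, so that the node goes through a proper boot cycle.
    return Decision.WAIT
```

For example, decide(Action.REBOOT, Action.OFF) gives Decision.OVERRIDE,
while decide(Action.OFF, Action.ON) gives Decision.WAIT.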
>
>> Cancellation is possible. For example, an IPMI power-on command looks
>> something like the following:
>>
>> send an IPMI command to power on the node
>> wait a bit
>> is it up? Yes -> we're done
>> send an IPMI command to power on the node
>> wait a bit
>> is it up? Yes -> we're done
>> send an IPMI command to power on the node
>> wait a bit
>> is it up? Yes -> we're done; no -> that's an error
>>
>> There are many opportunities to stop in the process above.
>
> You can stop it but you will have no idea what state the node will
> end up in. Did the operation work but you cancelled it before the final
> ack? Or did it get cancelled early enough to actually stop it?
It doesn't matter. A power command is now more than fire-and-forget: it
seeks to leave the node in the requested state. If it fails to do so, it
marks the node as broken.
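
A rough sketch of such a power-on task, with a cancellation check at each
step, might look like the following. Everything here is an assumption
made for illustration: send_power_on, is_up and mark_broken are
hypothetical callables standing in for the real IPMI and node-status
code, and the attempt count and wait time are arbitrary.

```python
import threading


def power_on(send_power_on, is_up, mark_broken, cancelled,
             attempts=3, wait=5.0):
    """Try to power a node on; return "on", "cancelled" or "broken".

    `cancelled` is a threading.Event that a superseding power task
    (a power-off, say) sets to stop this one.
    """
    for _ in range(attempts):
        if cancelled.is_set():
            return "cancelled"
        send_power_on()
        # "Wait a bit", but wake up early if we are cancelled meanwhile.
        # Event.wait() returns True when the event has been set.
        if cancelled.wait(wait):
            return "cancelled"
        if is_up():
            return "on"
    # The node never reached the requested state.
    mark_broken()
    return "broken"
```

When the task returns "cancelled" the node's power state is unknown, but
the superseding command then drives the node to its own requested state,
or marks it as broken if it cannot.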
>
> Cancelling is inherently dangerous because you've no idea if it worked
> or not.
>
> J