[Maas-devel] RFC: "Serialising" power actions

Gavin Panella gavin.panella at canonical.com
Tue Sep 16 10:42:48 UTC 2014


On 15 September 2014 22:34, Graham Binns wrote:
...
> 1: The current power action blocks all others until it has completed. Other
> power actions will be queued and executed in turn.
> - or -
> 2: Each power action supersedes any action that is currently executing — the
> existing action is cancelled and then the new action is run.
> - or -
> 3: We track the current ("now") and "next" actions for the node, but drop
> every action that comes in once those two slots are full.

I think #1 is wrong; apart from stress-testing I can't think of a
situation where I'd want every panic-and-frustration-induced click of
the power buttons in the UI to be recorded and acted upon. I just want
it to do the last one.

#3 is like #2, but you have to wait for the currently executing command
to finish. Boring!

I think #2 is the right starting point. In addition:

- If a power-on command is sent to a cluster, and the cluster is already
  attempting to turn the node on, the command should be silently merged
  with the existing command.

- Likewise for power-off.

- It may be interesting to include a discriminator from a monotonically
  increasing sequence with each power command. A power-off command that
  is received by a cluster with a discriminator lower than a running
  power-on command would be rejected.

  In practice I doubt this will make much difference, but it's worth
  mentioning even if only to reject the idea.
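
To make those rules concrete, here is a rough sketch of the decision
the cluster would make when a command arrives. It is plain Python, not
MAAS code; PowerCommand, decide and the "seq" field are names made up
for illustration, and letting a same-direction command merge regardless
of its discriminator is my assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PowerCommand:
    action: str  # "on" or "off"
    seq: int     # hypothetical discriminator from a monotonic sequence


def decide(running: Optional[PowerCommand], incoming: PowerCommand) -> str:
    """Return "start", "merge", "reject" or "supersede" for `incoming`."""
    if running is None:
        # Nothing in flight for this node: just run the command.
        return "start"
    if incoming.action == running.action:
        # Same direction as the running command: silently merge.
        return "merge"
    if incoming.seq < running.seq:
        # Opposite direction, but older than what is already running.
        return "reject"
    # Opposite direction and newer: cancel the running one (option #2).
    return "supersede"
```

So a power-off with discriminator 6 arriving while a power-on with
discriminator 7 is running would be rejected, while one with
discriminator 8 would supersede it.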

>
> At first glance the second option is simpler — just cancel whatever's
> there and then do our thing. But I think that it's actually a bit
> deceptive. Consider:
>
>  - How do we "cancel" an action?

Cancelling means keeping a reference to a cancel function in a shared
location in the cluster. Shared state can make people feel dirty, but
Twisted's single-thread model makes it pretty benign.

Doing this would also allow the region to ask the cluster things like
"are you changing a node's power state?".

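Roughly what I have in mind for that shared location, as a sketch only.
It is plain Python with no Twisted in it; the registry and the function
names are hypothetical. In Twisted the cancel function would plausibly
be the cancel method of the Deferred for the running action, but that
is an assumption, not a description of existing code.

```python
from typing import Callable, Dict, Optional, Tuple

# Hypothetical shared registry: node id -> (action, cancel function).
# Safe without locks only because everything runs in a single thread.
_running: Dict[str, Tuple[str, Callable[[], None]]] = {}


def register(node_id: str, action: str, cancel: Callable[[], None]) -> None:
    """Record a newly started action, cancelling the one it supersedes."""
    previous = _running.pop(node_id, None)
    if previous is not None:
        previous[1]()  # call the stored cancel function
    _running[node_id] = (action, cancel)


def finished(node_id: str, cancel: Callable[[], None]) -> None:
    """Forget a completed action, unless it has already been superseded."""
    current = _running.get(node_id)
    # Compare with == rather than "is": bound methods are re-created on
    # every attribute access, so identity checks on them would fail.
    if current is not None and current[1] == cancel:
        del _running[node_id]


def current_action(node_id: str) -> Optional[str]:
    """Answer "are you changing this node's power state?"."""
    entry = _running.get(node_id)
    return None if entry is None else entry[0]
```

The region's question then becomes a dictionary lookup on the cluster.
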
>  - How do we ensure that we're not going to end up in an inconsistent state
> if the node is already responding to action #1 when we cancel it?

The power control code has been improved to be less fire-and-forget and
more fire-and-keep-firing-until-it-is-in-the-desired-state. Cancelling a
power-on and starting a power-off should Just Work.
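
For anyone who has not looked at that code, the idea is roughly the
loop below. This is an illustrative sketch, not the actual MAAS power
code: drive_power_state, its callables and max_attempts are all
hypothetical, and the waiting between attempts is left out.

```python
from typing import Callable


def drive_power_state(
    desired: str,
    send_command: Callable[[str], None],
    query_state: Callable[[], str],
    is_cancelled: Callable[[], bool],
    max_attempts: int = 5,
) -> bool:
    """Re-issue `desired` until the node reports it, or we are cancelled.

    Returns True if the node reached the desired state.
    """
    for _ in range(max_attempts):
        if is_cancelled():
            # A newer command has superseded this one: stop firing.
            return False
        if query_state() == desired:
            return True
        # Not there yet: fire again. A real loop would wait here first.
        send_command(desired)
    # Out of attempts: report whatever state the node ended up in.
    return query_state() == desired
```

Because every attempt starts by checking the node's actual state, a
power-off that begins after a cancelled power-on does not need to know
how far the power-on got.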



