[Maas-devel] RFC: "Serialising" power actions

Ryan Harper ryan.harper at canonical.com
Mon Sep 15 22:37:09 UTC 2014


* Graham Binns <graham.binns at canonical.com> [2014-09-15 16:34]:
> Hi all,
> 
> I'm handling the work to "serialise" power actions -- at least, I'm getting
> started on it right now. I've spent some time looking at the problem and I
> wanted to bounce ideas off you all -- preferably whilst I sleep :)
> 
> So, the problem:
> 
> When a power action is issued to a node (power on, power off, etc.), more
> than one can be in play for that node at once. We don't keep track of them
> once they've been fired, except for receiving a notification when they've
> succeeded or failed.
> 
> This means that it's possible to issue two conflicting commands (e.g. power
> on followed by power off) in quick succession, which can then leave the
> node in an odd state: it's theoretically possible that the node would stay
> powered on when MAAS expects it to be off, say if for some reason the power
> off command got executed first -- this is even more likely with AMT BMCs,
> since there's a degree of did-I-cast-the-runes-right to get a command to
> work on those, at least when the moon is waning and the wind is from the
> east.

In OIL, we have the following situation:

Many juju clients (all with separate MAAS creds) run in parallel,
asking a single MAAS for machines from the same pool (--constraints
tags=hw-ok).

We encountered two issues:

  - When we run juju destroy-environment -e ${env}, this releases the
  machines and marks them Ready while queuing a power command (who knows
  if it succeeds or fails, as MAAS doesn't appear to track that).

  - As soon as enough machines are available (marked Ready) for another
  run of OIL, we start up another deployment.  The nodes that have been
  marked Ready (though they may not have actually powered down) get
  allocated, but may contain previous deployment state (nonce already
  used) or "randomly" power off in the middle of a deployment (the
  power command finally got run).

I assume (or hope) the serialization work is related to modelling the
full life-cycle of a machine, since I think the life-cycle model shapes
the serialization solution.

> 
> There are, so far as I can tell, two strategies for handling this problem
> properly. Both of them require keeping track of the current power action
> for a node, and both assume that only one action can run at once:
> 
> 1: The current power action blocks all others until it has completed. Other
> power actions will be queued and executed in turn.
> - or -
> 2: Each power action supersedes any action that is currently executing --
> the existing action is cancelled and then the new action is run.
> - or -
> 3: We track the current ("now") and "next" actions for the node, but drop
> every action that comes in once those two slots are full.
> 
> At first glance the second option is simpler -- just cancel whatever's there
> and then do our thing. But I think that it's actually a bit deceptive.
> Consider:
> 
>  - How do we "cancel" an action?
>  - How do we ensure that we're not going to end up in an inconsistent state
> if the node is already responding to action #1 when we cancel it?
> 
> The first option isn't without its problems either -- having a queue of
> actions seems kind of awkward, and could lead to flip-flopping of a node's
> power state. But *not* having a queue could still lead to situations where
> several actions get issued in quick succession.
> 
> The third option seems to offer a happy medium. We can track the current
> and next power actions for a node and then ignore anything else that comes
> in whilst both of those slots are full. Each action must succeed or
> fail before the next one can be executed. This means we won't get
> potentially ridiculous amounts of flip-flopping, and we can build this
> pretty easily. We'd have to have some kind of UI feedback for "hey, it
> looks like you're repeatedly powering this node on and off; I'm going to
> ignore you for a while," but that doesn't seem all that onerous.
> 
> So as it stands I'm leaning towards option #3. Questions, thoughts
> and comments are welcome.

(1) sounds like what I would expect w.r.t. handling machine lifecycle,
but with modification.  I think the least surprising model is how the
power button works today on an x86 PC.  Once you've initiated a state
transition (off -> on, on -> off), no other input can modify that
action until the state has been achieved.  A second press to shut down
the machine won't turn it back on.  Likewise, many power-on commands
issued while we're in the middle of powering on won't change the state.

Thus I dislike the idea of queuing/tracking power-state commands.

I do think that *blocking* w.r.t. the official node state in MAAS is
required for proper function.  If I've asked MAAS to turn on a machine,
it should keep working on making that transition happen no matter how
many other incoming commands suggest otherwise, until it's reached some
limit of time or attempts, or hit some other failure (AMT fries your
NUC, ipmitool resets your creds...) that prevents power state
transitions.

I can see the desire to have some sort of cancel, but I think that it
complicates things.  In the physical world there is the 'hold the
power button for two seconds' option, which would suggest that maybe
there ought to be a
pull-the-power-plug-on-this-node-right-now-and-I-really-mean-it sort of
option; but I don't think that it's necessary for resolving the
node-lifecycle/power-serialization issues.

I think the above idea is fairly straightforward with some new states
for the nodes (Powering On|Powering Off).  No command queuing is
needed, as all incoming power requests can be dropped/denied when the
state is Powering On|Powering Off, or when the command is issued by
someone who has not acquired the node.  The bulk of the work would be
in power command reliability and graceful error handling, bubbling
failures up to the user; for example, automatically marking a node as
Broken if the power commands fail or time out past some threshold.
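
A sketch of that gating logic, using the proposed state names (the node
class, threshold, and function names are illustrative only):

    TRANSITIONING = {"Powering On", "Powering Off"}
    FAILURE_THRESHOLD = 3  # made-up limit before we give up

    class Node:
        def __init__(self, state="Ready", owner=None):
            self.state = state
            self.owner = owner
            self.power_failures = 0

    def accept_power_request(node, user):
        """Deny requests mid-transition or from non-owners; no queue."""
        if node.state in TRANSITIONING:
            return False  # transition in flight: drop/deny
        if node.owner != user:
            return False  # node not acquired by this user: deny
        return True

    def record_power_failure(node):
        """Bubble repeated failures up by marking the node Broken."""
        node.power_failures += 1
        if node.power_failures >= FAILURE_THRESHOLD:
            node.state = "Broken"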

Something that meets the above criteria would be exactly what we need in
OIL.

-- 
Ryan Harper
Canonical, Ltd.