[Maas-devel] RFC: "Serialising" power actions
Mark Shuttleworth
mark at ubuntu.com
Tue Sep 23 10:00:39 UTC 2014
On 17/09/14 01:09, Julian Edwards wrote:
> I have severe reservations with the approach discussed, which boil down to:
>
> * superseding power actions is undesirable
> * you cannot rely on cancellation of an outstanding operation (in what state
> would it leave the machine?)
> * Storing state in the pserv without a means to recover it is a recipe for
> disaster
Agreed.
> Here's my counter proposal again, which I think is a lot simpler:
>
> 1 Already implemented: pserv is dumb and just issues power commands as
> requested, with a callback to the region for failure and success.
The main process needs to be able to:
* know that something has timed out
* stop it from continuing
* verify that the BMC is not "stuck" in some way
* only then declare the previous effort a failure, and either retry or
give up with an error
I worry, if pserv is a "dumb fire and forget" mechanism, that this will
be hard. We will once again be throwing a job over the fence and hoping
we get a callback. Stay in control.
> 2 We do not allow concurrent power operations while an outstanding one is in
> progress (i.e. wait for the callback), although you could detect a request that
> is the same as the outstanding one and respond without an error.
Agreed, BUT you can't just wait for callback, you HAVE to be able to
decide time's up and resume control without having a rogue process out
there still trying to do something. Control means you STOP whatever was
being attempted, you reassert and verify your ability to talk to the
BMC, then you either retry or give up on that effort.
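
To make that loop concrete, here is a minimal sketch in Python, not a description of how MAAS actually does it. The names run_power_action, action, check_bmc and PowerActionFailed are hypothetical; action and check_bmc are assumed to be zero-argument coroutine functions supplied by the caller.

```python
import asyncio


class PowerActionFailed(Exception):
    """Raised when a power action is abandoned (hypothetical error type)."""


async def run_power_action(action, check_bmc, timeout=30.0, attempts=3):
    """Run `action` under a deadline; on timeout stop it, check the BMC,
    then either retry or give up with an error."""
    for _ in range(attempts):
        try:
            # wait_for() cancels the pending attempt when the deadline
            # passes, so no rogue task keeps running after we move on.
            return await asyncio.wait_for(action(), timeout)
        except asyncio.TimeoutError:
            # Time is up and the attempt has been cancelled. Verify that
            # we can still talk to the BMC before deciding what to do.
            if not await check_bmc():
                raise PowerActionFailed("BMC unresponsive after timeout")
    raise PowerActionFailed(f"gave up after {attempts} attempts")
```

Cancelling a coroutine only stops the code that is waiting; if the attempt wraps an external command, that process would also have to be terminated in the action's own cancellation handling, which this sketch does not show.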
> 3 We add a new column to Node to indicate the desired power state (if it's
> different from the current one it indicates an outstanding operation). This
> has the bonus of being something you can display in the UI.
Implementation is your domain.
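
For illustration only, points 2 and 3 of the proposal could be sketched as below. The Node class here is a hypothetical stand-in, not the real MAAS model, and the field and function names are assumptions.

```python
from dataclasses import dataclass


class PowerActionInProgress(Exception):
    """Raised when a conflicting request arrives mid-operation (hypothetical)."""


@dataclass
class Node:
    # Stand-in for the real model; both field names are assumptions.
    power_state: str = "off"
    desired_power_state: str = "off"

    @property
    def operation_outstanding(self) -> bool:
        # A difference between desired and current state marks an
        # outstanding operation, as suggested in point 3.
        return self.power_state != self.desired_power_state


def request_power_state(node: Node, wanted: str) -> bool:
    """Return True if a new power operation should be issued."""
    if node.operation_outstanding:
        if wanted == node.desired_power_state:
            # Same as the outstanding request: respond without an error.
            return False
        raise PowerActionInProgress(
            f"already changing power state to {node.desired_power_state!r}"
        )
    if wanted == node.power_state:
        return False  # already in the requested state
    node.desired_power_state = wanted
    return True


def power_action_succeeded(node: Node) -> None:
    """Success callback: the current state now matches the desired one."""
    node.power_state = node.desired_power_state
```

The operation_outstanding property is also the value a UI could display, which is the bonus mentioned in point 3.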
> 4 If the pserv (or its link) goes down, when it comes back up we need to
> either re-issue the outstanding power requests or request the current state
> and correct it as necessary. This is potentially work that can be deferred
> for now, but it cannot be left out altogether.
This smells bad to me - we are NOT IN CONTROL if pserv can go away and
we have to guess at what it is doing behind our back.
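
For reference, the second option in point 4 (request the current state and correct it as necessary) amounts to a reconciliation pass. A rough sketch, with reconcile and query_power_state as hypothetical names and the region's record of desired states assumed to be a plain mapping:

```python
def reconcile(desired, query_power_state):
    """Return {system_id: wanted_state} for every node whose actual power
    state differs from the state recorded as desired.

    `desired` maps system_id -> desired state; `query_power_state` is a
    caller-supplied function that asks the BMC for the actual state.
    """
    corrections = {}
    for system_id, wanted in desired.items():
        actual = query_power_state(system_id)
        if actual != wanted:
            # This node drifted while pserv (or its link) was down.
            corrections[system_id] = wanted
    return corrections
```

This only computes what needs fixing; issuing the corrections would still have to go through the same timeout and verification path as any other power action.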
Mark