[Maas-devel] RFC: "Serialising" power actions
Gavin Panella
gavin.panella at canonical.com
Wed Sep 17 08:58:47 UTC 2014
On 17 September 2014 01:09, Julian Edwards <julian.edwards at canonical.com> wrote:
...
> I have severe reservations with the approach discussed, which boil
> down to:
>
> * superseding power actions is undesirable
Superseding may be undesirable because it means more work for us, but I
don't think it's undesirable in general.
> * you cannot rely on cancellation of an outstanding operation (in
> what state would it leave the machine?)
Only cancel an in-progress task when there's something to supersede it,
and when the final desired state of the superseder differs from that of
the in-progress task.
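
A minimal sketch of that rule, in Python. The function name and the
plain "on"/"off" strings are hypothetical, chosen only for illustration;
nothing here is existing pserv code.

```python
from typing import Optional


def should_cancel(in_progress: Optional[str], requested: str) -> bool:
    """Decide whether a new request should cancel the in-progress task.

    `in_progress` is the desired final state of the task currently
    running ("on" or "off"), or None if nothing is running.
    `requested` is the desired final state of the would-be superseder.
    Both are hypothetical representations.
    """
    if in_progress is None:
        # Nothing is running, so there is nothing to cancel.
        return False
    # Cancel only when the superseder wants a different end state.
    return in_progress != requested
```

A request for the same end state as the running task leaves that task
alone, so it is never interrupted for no reason.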
> * Storing state in the pserv without a means to recover it is a
> recipe for disaster
I guess you mean that a crash or restart in pserv would mean that
in-progress power commands wouldn't be resumed. That's true, but it's
not a disaster. It means that for nodes in all states but DEPLOYED we
need to wait for the periodic power monitor to notice and reissue a
command (see later; it doesn't do this yet). For DEPLOYED nodes, sure,
the command will currently be lost, but these nodes are, one assumes,
under active management, and something outside of MAAS will notice, be
that a human, Juju, or something else.
>
> Here's my counter proposal again, which I think is a lot simpler:
>
> 1 Already implemented: pserv is dumb and just issues power commands
> as requested, with a callback to the region for failure and success.
>
> 2 We do not allow concurrent power operations while an outstanding
> one is in progress (ie wait for the callback), although you could
> detect a request that is the same as the outstanding one and respond
> without an error.
>
> 3 We add a new column to Node to indicate the desired power state (if
> it's different from the current one it indicates an outstanding
> operation). This has the bonus of being something you can display in
> the UI.
We can infer the desired power state from node statuses:
NEW = off
COMMISSIONING = on
FAILED_COMMISSIONING = off
MISSING = ? (unused status, afaik)
READY = off
RESERVED = off
ALLOCATED = off
RETIRED = ? (unused status, afaik)
BROKEN = off
DEPLOYING = on
DEPLOYED = not our business
FAILED_DEPLOYMENT = off
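
The same table as a lookup, as a rough Python sketch. The status names
are plain strings here and PowerState, DESIRED_POWER_STATE and
desired_power_state are hypothetical names, not MAAS's own. None covers
the two "?" rows and DEPLOYED, where we should not enforce anything.

```python
from enum import Enum
from typing import Dict, Optional


class PowerState(Enum):
    ON = "on"
    OFF = "off"


# None means "do not enforce": either unknown ("?") or not our business.
DESIRED_POWER_STATE: Dict[str, Optional[PowerState]] = {
    "NEW": PowerState.OFF,
    "COMMISSIONING": PowerState.ON,
    "FAILED_COMMISSIONING": PowerState.OFF,
    "MISSING": None,  # unused status, afaik
    "READY": PowerState.OFF,
    "RESERVED": PowerState.OFF,
    "ALLOCATED": PowerState.OFF,
    "RETIRED": None,  # unused status, afaik
    "BROKEN": PowerState.OFF,
    "DEPLOYING": PowerState.ON,
    "DEPLOYED": None,  # not our business
    "FAILED_DEPLOYMENT": PowerState.OFF,
}


def desired_power_state(status: str) -> Optional[PowerState]:
    """Return the power state implied by a status, or None.

    Raises KeyError for a status that is not in the table, so a newly
    added status cannot silently go unenforced.
    """
    return DESIRED_POWER_STATE[status]
```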
>
> 4 If the pserv (or its link) goes down, when it comes back up we need
> to either re-issue the outstanding power requests or request the
> current state and correct it as necessary. This is potentially work
> that can be deferred for now, but it cannot be left out altogether.
We can infer the state to correct to from the table above; the periodic
power monitoring job can be enhanced to enforce it.
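
Roughly what that enforcement pass could look like, in Python. This is a
sketch only: the monitor doesn't do this yet, and commands_to_reissue,
the (system_id, status, observed) tuples and the desired_for lookup are
all hypothetical. The lookup stands in for the table above and returns
None where we should leave the node alone.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# (system_id, status, observed power state), all plain strings here.
Node = Tuple[str, str, str]


def commands_to_reissue(
    nodes: Iterable[Node],
    desired_for: Callable[[str], Optional[str]],
) -> List[Tuple[str, str]]:
    """Return (system_id, desired state) for each node needing a command.

    `desired_for` maps a status to "on", "off", or None. None means the
    status implies nothing we should enforce (e.g. DEPLOYED).
    """
    commands = []
    for system_id, status, observed in nodes:
        desired = desired_for(status)
        if desired is None:
            # Not ours to enforce; skip the node.
            continue
        if observed != desired:
            # Observed state has drifted from what the status implies.
            commands.append((system_id, desired))
    return commands
```

Because the pass is driven purely by status and observed state, it needs
no record of which commands were in flight when pserv went down.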
>
> So in terms of work to do, it's quite easy and quick.
>
> J