Unit assignment to "unused" machines
william.reade at canonical.com
Mon Dec 26 13:28:54 UTC 2011
On Sun, 2011-12-25 at 19:27 -0800, Adam Gandelman wrote:
> It seems a bit unfortunate that the only teardown event hook that exists
> is 'stop'. I've found myself wishing for an additional and optional
> 'destroy' hook which does what it can to make sure the destroyed service
> and traces of it are removed from the system as best it can (service
> $foo stop; dpkg -P $foo, rm -rf this, rm -rf that, etc.). The stop hook
> could, in theory, be used to do some of this cleanup but AFAIK it may
> also be called at other times during the unit's lifecycle to temporarily
> stop the service. I've got some ideas brewing about very interesting
> things I could do with a 'destroy-unit' hook that would help speed up
> some of the integration testing we plan on using Juju to drive, but
> that's for another thread...
Yeah, "stop" definitely shouldn't be cleaning up anything that might
need to be recovered; my understanding is that a "stop" should be safe
to follow with a "start" without data loss. (We don't really handle this
case properly at the moment, though, and even that'll be going away
if/when the robust-unit-agent pipeline makes it through review.)
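To illustrate the distinction between "stop" and a destroy-style hook, here is a minimal Python sketch. It is not Juju code: UnitSketch, its methods and demo() are hypothetical names, and the unit's state is modelled as a plain directory, on the assumption that "stop" must leave data in place while "destroy" removes it.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Tuple


class UnitSketch:
    """Hypothetical model of a unit's on-disk state; not a Juju API."""

    def __init__(self, root: Path) -> None:
        self.data_dir = root / "data"
        self.data_dir.mkdir(parents=True, exist_ok=True)
        self.running = False

    def start(self) -> None:
        self.running = True

    def stop(self) -> None:
        # Stop only halts the service; data stays, so a later start is safe.
        self.running = False

    def destroy(self) -> None:
        # Destroy stops the service, then removes the data it left behind.
        self.stop()
        shutil.rmtree(self.data_dir, ignore_errors=True)


def demo() -> Tuple[bool, bool]:
    """Return (data survived stop/start, data gone after destroy)."""
    with tempfile.TemporaryDirectory() as tmp:
        unit = UnitSketch(Path(tmp))
        unit.start()
        (unit.data_dir / "state.txt").write_text("important")
        unit.stop()
        unit.start()
        survived_restart = (unit.data_dir / "state.txt").exists()
        unit.destroy()
        gone_after_destroy = not unit.data_dir.exists()
        return survived_restart, gone_after_destroy
```

Keeping the two operations separate is what lets "stop" be called at any point in the lifecycle without risking data.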
> IMHO as a user, there's only so much Juju itself can guarantee. There
> are some guarantees that must be made by charms + charmers themselves.
> If I (user/charmer) were given the optional tear-down hook, and the
> ability to toggle machine reuse per env. (disabled by default?), the
> blame for the failed recycling of a machine rests on my shoulders, not
> Juju's. That is user error. I'd be happy with that responsibility, as
> it would let me experiment, break it often and come up with "really cool
> stuff" in the end. Being too strict and trying too hard to eliminate
> *every* potential for user error really limits this, for me anyway...
Juju (with containerization) would be perfectly capable of guaranteeing
that a destroyed unit is really gone; it would also make machine
reuse perfectly safe.
Aside: we might not be able to eliminate *every* potential for user
error, indeed, but I think we should have rock-solid justifications for
every instance in which we knowingly enable it: in the general case,
after all, we'll have users deploying charms with relatively little
knowledge about their contents. A screwed-up deployment will be blamed
on *juju*, not on the charm... and we absolutely cannot afford to
acquire a reputation for unpredictable flakiness.
That said, a "unit-destroy" hook would enable the sort of things you're
talking about: and in fact the ability to reuse *without* containers
would *depend* on sane unit-destroy hooks. Huh, that's what we're doing
OK, at the moment, we ideally want (A):
2) fast provisioning on orchestra.
...but we're not going to get that. So, perhaps we can make do with (B):
1) Reuse temporarily disabled by default;
2) "destroy-unit" hooks, whose shoddiness/nonexistence won't matter
unless people explicitly enable reuse;
3) fast provisioning on orchestra.
...but even that feels like a lot to achieve by 12.04. So, IMO, the
smallest set of safe features is (C):
1) Disabled reuse;
2) fast provisioning on orchestra.
The common thread is, ofc, "fast provisioning on orchestra"; does anyone
have any idea how we can achieve this any time soon?
Please understand that I don't want to block the potential for "really
cool stuff", but the only scenario that actually allows for it is B (or,
potentially, A+B with *optional* containerization); and, IMO, the only
plausible short-term path is C.
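The opt-in reuse arrangement described above (reuse disabled by default, with destroy-unit hooks mattering only once reuse is explicitly enabled) could be sketched roughly as follows. Everything here is hypothetical: EnvSketch, CharmSketch and may_reuse_machine are illustrative names, not Juju settings or APIs. The sketch also assumes that reuse without containers is refused when the previous charm has no destroy-unit hook, and that containerization on its own makes reuse safe.

```python
from dataclasses import dataclass


@dataclass
class EnvSketch:
    # Hypothetical per-environment toggle; reuse stays off unless enabled.
    machine_reuse: bool = False


@dataclass
class CharmSketch:
    name: str
    # True if the charm ships a (hypothetical) destroy-unit hook.
    has_destroy_hook: bool = False


def may_reuse_machine(env: EnvSketch, previous: CharmSketch,
                      containerized: bool = False) -> bool:
    """Decide whether a machine that ran `previous` may host a new unit."""
    if containerized:
        # Assumption: destroying the container removes every trace of the unit.
        return True
    # Assumption: without containers, reuse needs both the explicit opt-in
    # and a destroy-unit hook on the charm that last used the machine.
    return env.machine_reuse and previous.has_destroy_hook
```

With the defaults, may_reuse_machine(EnvSketch(), CharmSketch("foo")) returns False, which corresponds to the "disabled reuse" case.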
 The tricky part would be destroying a container *without* destroying
the contained service...
 Assuming sufficiently advanced containerization, anyway. Regardless,
as I said, I *don't* want to *eliminate* machine reuse (and I consider
making it off-by-default to be near-enough the same, from the
perspective of the "average" user); I just want to make sure it's safe.
 Don't really like that name. Better suggestions?