machine placement spec

William Reade william.reade at canonical.com
Thu Nov 10 16:48:23 UTC 2011


On Thu, 2011-11-10 at 13:18 -0200, Gustavo Niemeyer wrote:
> I'd not even mention that option. It was knowingly born as a hack that
> would have to die, rather than as a planned option.

Heh, wasn't explicit about the "yay, we can kill it!" subtext :).

> We've agreed to keep unit-level constraints out for the moment at the
> end of the sprint, so it'd be nice to leave them out of the spec as
> well, so we can avoid getting into details on that. No matter what
> happens, we have to get service-level constraints right, and the
> introduction of unit-level constraints down the road, if it turns out
> to be important, should be made without compromising good behavior of
> service-level constraints.

Ah, I'd missed that, thanks. I guess we can still do anything we need by
tweaking service config before add-unit.

> That's not the right place. Environment settings should be in the
> environment configuration, inside ZooKeeper. We've already gone too
> far on the environments.yaml hacks, and are already getting problems
> out of it. This file is supposed to offer the identity of the
> environment so that communication with it can take place, but nothing
> else.

Yay! Seemed safest to assume juju's current state, where possible, but
I'm only too happy to see environments.yaml tidied up.

> > * additionally, service and environment values should be subsequently
> > editable with `juju set`, but I understand there's some work to do
> > before it'll work with environments.
> 
> "juju set" changes configuration of services. Not sure if that's the
> right place for that.

I had the impression we were in favour of expanding "set" and "get" to
cover as many cases as possible, to minimise verb count. I'll think some
more about it.

> > When specified on the command line, each individual constraint is
> > signalled with `--constraint` or `-c` followed by a `key=value` pair.
> 
> What about multiple constraints? Spaces?

The idea was:

  juju deploy foo -c ram=512M -c "orchestra-classes=blib blob blub"

Bad idea, or just badly expressed?
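
To make the intent concrete, here is a minimal sketch of how repeated
`-c` flags could be collected with Python's argparse. The function name
`parse_constraints` and the parser layout are hypothetical, not juju's
actual code. The assumption is that each flag carries exactly one
`key=value` pair, and that spaces inside a quoted value stay part of
that value.

```python
import argparse


def parse_constraints(argv):
    """Collect repeated -c/--constraint flags into a dict (illustrative only)."""
    parser = argparse.ArgumentParser(prog="juju deploy")
    parser.add_argument("charm")
    # action="append" gathers every occurrence of the flag into one list.
    parser.add_argument("-c", "--constraint", action="append", default=[],
                        metavar="KEY=VALUE")
    args = parser.parse_args(argv)

    constraints = {}
    for item in args.constraint:
        # Split on the first "=" only, so values may themselves contain "=".
        key, sep, value = item.partition("=")
        if not sep or not key:
            parser.error("constraint must look like key=value: %r" % item)
        constraints[key] = value
    return constraints
```

With the example above, the shell strips the quotes and hands argparse
the single argument `orchestra-classes=blib blob blub`, so the three
class names end up together in one value.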

> >  * `ram`: The minimum desired memory for the machine, defaulting to
> > 512MB; any floating point number >= 0.0 and suffixed with M, G or T is
> > valid.
> 
> I don't think we need floats here.

I don't think we *need* to understand floats, but I think I'd rather
type "2.5T" than "2560G". I think the slightly greater ease for users
outweighs the minimal -- if any -- additional effort in parsing.
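
As a rough illustration of how little parsing is involved, here is a
sketch that accepts a non-negative decimal number followed by M, G or T
and normalises it to megabytes. The name `parse_size_mb` is made up for
this example, and the multiples are assumed to be binary (1G = 1024M),
which is what makes "2.5T" and "2560G" equivalent.

```python
import re

# Assumed binary multiples, expressed in megabytes.
_UNITS_IN_MB = {"M": 1, "G": 1024, "T": 1024 * 1024}
# A non-negative integer or decimal number followed by one unit suffix.
_SIZE_RE = re.compile(r"(\d+(?:\.\d+)?)([MGT])")


def parse_size_mb(text):
    """Return the size in megabytes as a float, or raise ValueError."""
    match = _SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError("invalid size: %r" % text)
    number, unit = match.groups()
    return float(number) * _UNITS_IN_MB[unit]
```

Negative values and missing suffixes are rejected by the pattern itself,
so no separate range check is needed.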

> >  * `storage`: The minimum desired persistent disk space for the
> > machine, defaulting to 4GB. Valid inputs as for `ram`.
> 
> I suggest keeping this out of the current interaction. This is a
> nebulous area with interactions with the storage-specific features and
> that won't affect the final outcome of the key design in this
> document.

Fair enough, I'd prefer not to hamstring an important future feature.

> >    * `ec2-zone`: availability zone within the region (which is an
> > environment-level setting, and will remain as such for now). Defaults to
> > "a"; valid values depend on the EC2 region setting.
> 
> Why is it an environment-level setting?

The region is an environment-level setting because it already is.
Assuming rearrangement of environment config, I imagine it won't be, so
this can change. ec2-zone was never intended to be environment-level.

> >    * `orchestra-name`: to allow admins to specify a specific instance
> > by name, if required. Unset by default; the valid values are the set of
> > instance names exposed by cobbler.
> 
> Can we please keep that out? There's nothing that this can achieve
> that is not doable more generically by orchestra-classes.

by orchestra-classes *plus* asking the admins to add a bunch of
single-machine classes named after their systems. I understand that we
may prefer that admins not think about individual machines; but they [0]
do, and they will continue to do so; and so, IMO, it would be better to
gracefully accept this (cheap, easy, consistent) requirement than to
deny it and force them to do extra work and abuse another constraint to
get the same effect.

> >  Provider constraints are only valid when used with the appropriate
> > provider, and cause errors when specified with a different provider.
> 
> That's not what we agreed to, I believe. We said we'd ignore them when
> used with the wrong provider, so that scripts etc. won't break.

...er, good point. Thanks.
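
So the agreed behaviour would look something like the sketch below:
constraints that belong to a different provider are dropped silently
rather than raising an error. The function name, the prefix list and
the idea of recognising provider constraints by a `<provider>-` prefix
are all assumptions for illustration, based only on the names used in
this thread (`ec2-zone`, `orchestra-classes`).

```python
# Hypothetical list of provider prefixes, taken from the names in this thread.
PROVIDER_PREFIXES = ("ec2-", "orchestra-")


def applicable_constraints(constraints, provider):
    """Keep generic constraints and those for `provider`; ignore the rest."""
    own_prefix = provider + "-"
    kept = {}
    for key, value in constraints.items():
        is_provider_specific = key.startswith(PROVIDER_PREFIXES)
        if is_provider_specific and not key.startswith(own_prefix):
            # Belongs to another provider: ignore it so scripts keep working.
            continue
        kept[key] = value
    return kept
```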

> > * Override constraints: for determining machine placement in terms of
> > existing juju components:
> 
> The term "override" didn't ring a bell for me in this context. None of
> this word's definitions seems related to the meaning of the options
> below.

It was originally because they ignore all parent constraints, but yes:
since I decided there was no reason not to let them use additional
constraints at the same level it doesn't fit so well. I'll try to come
up with a better name.

> >  * `place-in=<machine-id>`: On a separate container in the machine with
> > juju id `machine-id`. Only valid if the machine exists (in juju state),
> > and is not already holding a unit of the requested service. If
> 
> Can we please keep that out as well for now, or rather put it into a
> future ideas section without detailing its semantics?  This is related
> to multi-unit machines, which are not supported right now, and is worth
> some conversation of its own, as the number of exceptions you list in
> its description clearly indicates.

True. I'm reluctant, because I think a lot of people consider it to be
an important feature, but I'm even more reluctant to put unbound units
together without some sort of isolation.

> >  * `place-with=<service-name>`: On a separate container in *any*
> 
> Same thing. Mentioning this is worthwhile for the list and for our
> history, but I'd keep this in a future ideas section without detailed
> semantics. We haven't agreed on proper semantics after much
> discussion, and we don't have to agree on them right now since we
> won't be implementing it just yet and it won't affect the outcome of
> the rest of the feature, I believe.

Yep, we can completely lose all the specified override constraints
without affecting anything else. I'm a little concerned that this kills
a use case that comes up quite frequently, but... yeah, unit
isolation :( [1].

> > * We need orchestra to expose `cpu`, `ram` and `storage`; and ideally,
> > in case we end up with megamachine orchestra deployments, an API which
> 
> mega-machine orchestra deployments can easily be based on classes for
> the moment. I'd drop the reference to a special API being a dependency
> at this point.

My understanding is that the orchestra team would prefer to expose a
distinct API rather than add this to cobbler, but I defer to someone who
knows for sure how they'd like to do it. Anyone? :)

Special API may not be a dependency, but the actual information
definitely is.

> 
> > * We need to extend `juju set` to allow for (1) environment changes,
> > which could be a moderately large change, and (2) service changes that
> 
> It's not clear to me that 'juju set' is the proper place for that. The
> command has a completely different shape and outcome, and deserves its
> own options and help text.

So, something like "juju set-constraints", which can affect either
environment or service?

> > * An additional "generic" `gpu` constraint, defaulting to `0`, allowing
> > us to generically specify a cg1.4xlarge, and giving us the possibility
> > of extending orchestra to expose this as well. Not sure how we'd measure
> > GPU power.
> 
> Also feels like a nebulous area. It isn't just about having a GPU, but
> which GPU it is, etc.

Hence the "not sure". Probably indeed too nebulous to deserve inclusion
even as a "future" item.

> > * Additional provider constraints, including (surely non-exhaustive;
> > please contribute ideas):
> >
> >    * `ec2-image-id`: image ID. Will need to be used with care; could
> 
> This would be a significant mistake, IMO. Encouraging usage of custom
> AMIs for charms will degenerate the charm's content and undermine the
> overall design in ways we didn't really think through yet.

Cool, I'd be happy to lose the capability myself. Do we know if anyone's
currently depending on it?

> > * Max constraints: allow generic constraints to also take the value
> > `max`, meaning "the best available". (If you specify `cpu=max` and
> > `storage=max`, the constraints cannot be satisfied unless the available
> > machine with the (equal) greatest amount of storage also has the (equal)
> > most processing power.)
> 
> Feels dubious. Can't imagine good scenarios where an admin would care
> to use the maximum available without knowing what it is, and even
> 
I think this one came from the server team meal; the suggestion was
something like "I'd like to just ask for the machine with most
processing power for my nova-compute unit". Would a sysadmin give an
opinion here please?

> harder to imagine he'd be willing to do nothing if the machine that has
> 1.5GB has less CPU than the one with 1GB.

That's only going to be the case if he specifies "cpu=max" *and*
"ram=max"; in which case he's quite explicitly saying that he wants a
machine with at least as much of both RAM and CPU as any other he has
access to. It's no different to failing because you specified, say,
"ram=38T": juju can't give you what you asked for, so it fails.

So: I'd expect people to generally use one "max" at once, but I thought
it was important to specify how multiple maxes would have to interact.
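
A small sketch of that interaction, assuming machines are plain dicts
of numeric attributes; `pick_machine` is a hypothetical name, not part
of juju. A machine qualifies only if it is (equal) best on every
constraint set to `max`, and if none does, the request fails.

```python
def pick_machine(machines, max_keys):
    """Return a machine that is (equal) best on every key in max_keys, or None."""
    if not machines:
        return None
    # The best available value for each "max" constraint, across all machines.
    best = {key: max(m[key] for m in machines) for key in max_keys}
    for machine in machines:
        if all(machine[key] == best[key] for key in max_keys):
            return machine
    # No single machine tops every requested dimension: cannot be satisfied.
    return None
```

With one machine at 1.5GB and less CPU, and another at 1GB and more
CPU, asking for `cpu=max` alone or `ram=max` alone each finds a machine;
asking for both finds none.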

> As a general guideline, we should try to keep our focus on relevant
> use cases at this stage.

Everything in this document is either something we discussed in the
design sessions, or that at least one person at UDS mentioned
wanting/needing. We don't have to include them, but we should still
consider them.

> > of a unit of `service-name` will lead to the addition or removal of the
> > corresponding unit of the requested service. If `service-name` is
> > destroyed, the requested service will not be, but the only running units
> > will be any that were deployed separately from the scale-with request.
> 
> That said, let's not detail its semantics either to avoid having to
> debate about them right now. I'm not sure it makes sense to destroy
> service B's units on an explicit removal of service A in its entirety,
> for instance. Sorting out these details can be done in a future
> conversation.

Sounds sensible.

> > * Roles: named groups of constraints (better name than "roles"?). Useful
> > for OAOO-ness and reduced typing when scripting or running from the
> > command line; also useful for recording intended machine characteristics
> > when not otherwise translatable (for example, when serialising a
> > deployment as a stack, it would be thoughtful to include a
> > `fast-network` role to hold the `orchestra-classes=rack-c` constraint
> 
> Even though I suggested that in the sprint, I'd keep that out of the
> spec entirely. There are important shortcomings in that feature, like
> the fact roles would be a flat unorganized namespace, that multiple
> charms could potentially conflict without being aware of, and with
> strange interactions in the case of multi-layered stacks. It's also
> not clear if there's much benefit in comparison to the role concept we
> already have in place through the existence of services as a model.
> This feels like a big gray area to me, that could easily feel like a
> good idea, and be a big mistake.

Hmm, I'd thought that roles would have no place in charms (we're keeping
constraints out of charms, right?) and would have to be namespaced by
stack regardless. But, indeed, it's a significant feature of dubious
value; we can dust it off and look again if we need to in the future.

> Thanks a lot for the comprehensive spec, William. This is very helpful.

My pleasure :).

Cheers
William


[0] Anecdotally, some of them do; speculatively, enough of them to
matter do.

[1] I do wonder whether the costs of non-isolated units may not in fact
be outweighed by the benefits of machine sharing, despite the extra
risk: the feature itself wouldn't need to change much if we implemented
it without isolation, and the people who really want multi-service
machines may consider it an acceptable cost.

Except, hmm. Security implications, I guess?



