preliminary machine placement discussion

Kapil Thangavelu kapil.thangavelu at
Tue Nov 8 17:00:12 UTC 2011

Hi William,

Thanks for capturing the discussion from the UDS sprint, and kicking off the 
thread. Additional comments inline.

Excerpts from William Reade's message of Tue Nov 08 10:51:56 -0500 2011:
> Generic constraints
> ===================
> To start off uncontroversially, I think there is general agreement that
> the set of machine properties exposed by *all* providers is pretty
> small: we can be reasonably sure that we can know about a provider
> machine's CPU, memory, and storage, and that's it [0].

What "CPU" means is a pretty rich topic in itself. CPUs differ widely
in architecture (SPARC, ARM, x86) and even by capability within the same
product family name; e.g. a Penryn Xeon is a vastly different beast, in terms
of both performance and power characteristics, than a Sandy Bridge Xeon. I think
portability in this respect will require some sort of normalization against a
known quantity; in that regard the EC2 compute unit is probably a reasonable
standard. As for calculating such a value, I could see it as a binary that
does some computation across a few runs and spits out a number, along with
gathering CPU info.
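A normalization pass along those lines could be sketched as a tiny benchmark that times a fixed workload over a few runs and scales the best run against a reference duration. This is only an illustration of the idea; the `reference_seconds` value stands in for a calibration constant that would have to be measured once on a baseline machine:

```python
import time


def compute_units(reference_seconds=10.0, n=2_000_000):
    """Estimate a rough, ECU-style compute score.

    Times a fixed integer workload a few times and divides a reference
    duration (an assumed calibration constant from a baseline machine)
    by the fastest observed run, so faster machines score higher.
    """
    best = float("inf")
    for _ in range(3):  # a few runs; keep the fastest to reduce noise
        start = time.perf_counter()
        total = 0
        for i in range(n):
            total += i * i
        best = min(best, time.perf_counter() - start)
    return reference_seconds / best


print(round(compute_units(), 2))
```

The same binary could also dump /proc/cpuinfo-style details alongside the score, per the suggestion above.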

Storage is also a rich topic: a single 1TB disk is a different quantity than
two 300GB disks plus a 256GB SSD. We could probably simplify it to some
available filesystem size. Ignoring that for a moment, the meaning of storage in
a cloud environment is also different in that it touches on volume management to
fulfill an allocation; i.e. storage is not necessarily an inherent quality but an
allocated one. For now, though, a simple GB count sounds reasonable.

> Given this, it's pretty easy to imagine a really minimal vocabulary for
> describing machines; something like the following (note, actual syntax
> may vary):
>   juju deploy nova-compute --constraint cores=128,ram=16G
> Notice that the deploy command doesn't place any constraints on the
> machine's available storage; an unset "storage" constraint, for example,
> will be taken to mean "at least 0 bytes of storage".
> However, there's already a bit of a problem here, and it's actually
> quite a significant one: what exactly does the constraint apply to? The
> service, or just the unit?
> If that's only a unit-level setting, I think our users will quite
> reasonably come to hate us: the last thing they want is to specify the
> same constraints every time they add-unit, but if they ever forget
> they'll end up deploying nova-compute to a bunch of m1.smalls.
> Therefore, I think, it *must* be a service-level setting, despite the
> additional hassle we'll encounter.
> However, users *will* sometimes want to deploy additional units of a
> service to machines different to the first unit, so machine constraints
> cannot be settable *only* at the service level. However, we immediately
> have a potential ambiguity of expression when a user comes to do the
> following:
>   juju add-unit nova-compute --constraint cores=64
> Our choices are to shadow the service-level setting (so that there's no
> RAM constraint on the new machine), or to combine them in a "helpful"
> way, such that the "ram=16G" constraint is kept while the "cores=128"
> constraint is overwritten.
> I do not know what the right answer is here, but I know it's an
> important distinction, and we should do our best to pick the least
> potentially annoying option; especially since this feature will almost
> certainly end up allowing for constraints at the environment level [1].
> However, I'd like to quietly acknowledge this problem and move on for
> now. We'll need to solve it regardless, and the problem of *how* we
> specify is subordinate to the really meaty problem of *what* we specify.

For deploy constraints, I think they apply generally to the service as 
deployment defaults. Unit-level constraints automatically flow from the 
service-level constraints, but can be overridden on a per-unit basis. Possibly 
the service-level deploy settings are an exposed mechanism for manipulating the 
defaults; that's unclear, though, since changing them has no effect on the 
existing units.
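One way to picture that "unit constraints flow from service defaults" behaviour is a straightforward parse-and-merge, where any key left unset at the unit level falls back to the service value. The `cores=128,ram=16G` syntax follows the hypothetical form used in this thread, not any released juju CLI:

```python
UNITS = {"K": 2**10, "M": 2**20, "G": 2**30, "T": 2**40}


def parse_constraints(spec):
    """Parse 'cores=128,ram=16G' into a dict, expanding size suffixes."""
    result = {}
    for pair in spec.split(","):
        key, _, value = pair.partition("=")
        if value and value[-1].upper() in UNITS:
            result[key] = int(value[:-1]) * UNITS[value[-1].upper()]
        else:
            result[key] = int(value)
    return result


def effective_constraints(service, unit):
    """Unit constraints inherit service defaults; explicit unit keys win."""
    merged = dict(service)
    merged.update(unit)
    return merged


service = parse_constraints("cores=128,ram=16G")
unit = parse_constraints("cores=64")
print(effective_constraints(service, unit))
```

Under this reading, `add-unit --constraint cores=64` keeps the service's `ram=16G` while overriding only `cores`, which is the "combine" option from the quoted dilemma rather than the "shadow" one.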

> Provider-specific constraints
> =============================
> The sad fact is that the generic constraints defined above are not
> sufficient to solve the placement problem to everyone's satisfaction.
> Scenarios that cannot be captured include:

It's not sad, it's awesome ;-). Environment-specific features are 
differentiators; sadness only arises around portability. But this is a 
deploy-time constraint, which allows taking advantage of the particular 
features of an environment, and that's great.

> * I want haproxy in rack c7 of my datacentre, because that's got a
> really fast connection out.
> * I'm on a budget, and I need to deploy on m1.smalls rather than
> m1.mediums, and this constraint actually has *nothing* to do with the
> actual machine resources that would be ideal for my task.
> * I want swift to run on some machine which has 8TB of storage *in some
> specific RAID configuration*.
> * I'm just playing, and I'm happy with a proof of concept deployed on
> t1.micros, even if they do just stop working quite often. 
> * I want to deploy this mongodb unit in EC2 availability zone B.
> I think that all these constraints can be expressed with a single
> mechanism: a provider-derived machine "class".
> * On EC2, the available classes are published anyway, so it's not hard
> to translate a "zone-b" or "m1.large" class into the appropriate
> request.
> * On Orchestra, the mgmt-classes field can hold arbitrary information of
> this nature (which would be defined by the sysadmin ahead of time; juju
> just accesses what's available). We can easily specify machines by
> mgmt-class (or classes) [2].
> * On OpenStack... um, I *hope* there's some way to query for this sort
> of thing, but I don't actually know how to do it, or what's available.
> Informed opinion, anyone?
> Now, classes also intersect uncomfortably with the "how" problem above
> -- there are groups of classes which are mutually exclusive, and others
> that aren't [3], and I don't think there's any way for us to tell the
> difference at juju level [4]. Again, we need thought and care to figure
> out how to combine requirements expressed at different levels, but that
> isn't what I'm fundamentally concerned with here.

"Class" is a rather generic term, arising I think from an Orchestra 
implementation detail, but the usage here would promote multiple 
inheritance... oh, wrong paradigm ;-)

I think of these as provider-specific constraint vocabularies, i.e. location: 
foo, machine-type: m1.large, etc. We can extend this to Orchestra with some 
syntax for management classes auto-created from inventory.

Where possible we can establish mappings from these to generic resource 
constraints (i.e. an m1.large has x CPU, y memory, z disk) to aid portability 
when capturing them.
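Such a mapping could be as simple as a per-provider table keyed by machine class, with the original class name kept attached for round-tripping. The m1.large and m1.small figures below follow Amazon's published specs of the time and are included purely for illustration; a real provider module would source them from its own catalogue:

```python
# Hypothetical per-provider mapping from provider-specific machine
# classes to the generic constraint vocabulary (cores, ram, disk).
EC2_TYPES = {
    "m1.small": {"cores": 1, "ram-gb": 1.7, "disk-gb": 160},
    "m1.large": {"cores": 2, "ram-gb": 7.5, "disk-gb": 850},
}


def to_generic(provider_types, machine_type):
    """Translate a provider class into generic constraints, keeping
    the provider-specific class name attached for portability."""
    generic = dict(provider_types[machine_type])
    generic["provider-class"] = machine_type
    return generic


print(to_generic(EC2_TYPES, "m1.large"))
```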

> Capturing constraints
> =====================
> When we come to implement stacks, the choices we made here and now will
> rather, er, constrain what we're able to do, and it's important to me
> that we don't accidentally damage the utility of our future stacks
> implementation.
> Clearly, we could restrict ourselves to simple machine parameters only,
> but I feel these are tailor-made for screwing things up: this is
> because, IMO, people's hardware choices are inevitably strongly
> influenced by what they have available. When I pick an m1.medium I'm
> *not* picking it for one simple parameter only; I'm picking it because
> it's the best *of the available options*, and there's no guarantee that
> a machine tailor-made for my use case would precisely match -- or even
> equal -- an m1.medium's parameters.
> That is to say: it's reasonable for us to translate from concrete
> requirements into provider-specific resources, but not the other way
> round; put alternatively, if we attempt to determine what people want
> merely by inspecting what they happen to have already, we're unlikely to
> get it right.

Interesting. It's important to note that capturing deployment resource 
constraints isn't necessarily even meaningful within the same environment: the 
right constraints depend on the intended usage of the service (i.e. deploy-time 
considerations), even ignoring resource availability. That is, we need to make 
it easy to redeploy a given service set, but also to modify that deployment 
with respect to its constraints.

I still think there is value in doing the reverse mapping from provider-specific 
to generic constraints where possible, while keeping the provider-specific 
vocabulary attached to it (at least by default for cloud providers).

> So... what can we do? While I hate to introduce another new concept, I
> think it's justified here: we want to be able to group constraints as
> "roles". This has two notable advantages:
> * We can simplify command lines -- considering all the other possible
> options we already handle, it'll be quite convenient to be able to do
> things like:
>   juju set-role compute --constraint cores=128,ram=64G
>   juju deploy nova-compute --role compute
>   juju set-role compute-spike --constraint cores=32
>   juju add-unit nova-compute --role compute-spike
> ...or even:
>   juju set-role compute --constraint cores=128,ram=64G
>   juju set-role compute-spike --constraint cores=32
>   juju deploy nova-compute --role compute
>   ...
>   juju set nova-compute --role compute-spike
>   juju add-unit nova-compute
>   juju add-unit nova-compute
>   ...
>   juju add-unit nova-compute
> * More importantly, it gives us a mechanism for capturing the *intent*
> of a set of constraint: so, even if we can't turn "rack-c7" into a
> provider-independent constraint, we *can* encode the fact that we'd
> prefer to deploy haproxy to a well-connected machine by specifying (say)
> a "fat-pipe-proxy" role.

Hmm... a purely semantic intent against an unstructured vocabulary?

> When we come to implement stacks, I think this gives us the best of both
> worlds: a role is both a place to store what provider-independent
> preferences we can, *and* a hook off which we can hang additional
> provider-specific information; again, for a first cut at plausible
> syntax, consider:
>   juju set-role nova-cluster:compute --constraint cores=128,ram=16G
>   juju set-role nova-cluster:compute-spike --constraint cores=32
>   juju deploy nova-cluster
> So... does any of this make sense to anybody? I don't think the "role"
> mechanism is an appropriate part of the basic placement story, but I
> think it's a natural extension that will become useful when we implement
> stacks, and I'd prefer to ensure we don't accidentally implement
> something that will make things harder for us when the time comes.
> From my perspective, the really important questions are "what do you
> need to express?", and "can you do so with the vocabulary above,
> excluding roles, which we definitely won't have time for?".

I don't really see the value of doing additional role management. It seems like 
you're just defining service-level constraints again with a different management 
syntax, so the value over just using the constraints seems a bit dubious. The 
posited reasons for the additional management layer are only around an 
abstraction for import/export. Yet the import usage itself means managing the 
values for this additional resource mapping, contextualized to the target 
environment, prior to using the stack; that requires either pre-defining a 
mapping of roles for import, or just accepting/defining the roles as captured 
in the stack.
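The equivalence being argued here is easy to see in miniature: a role would just be a named constraint set looked up before the usual merge, i.e. one level of indirection with no new expressive power. The names and structure below are illustrative only, not any proposed juju API:

```python
# Roles as a plain name -> constraints table. Resolving a role is a
# dict lookup followed by the same override merge that service-level
# constraints would use anyway, which is the redundancy noted above.
ROLES = {
    "compute": {"cores": 128, "ram-gb": 64},
    "compute-spike": {"cores": 32},
}


def resolve(role_name, overrides=None):
    """Expand a role name into constraints, applying any overrides."""
    constraints = dict(ROLES[role_name])
    constraints.update(overrides or {})
    return constraints


print(resolve("compute-spike", {"ram-gb": 8}))
```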

Really, any of the deploy-time resource constraints around an exported stack are 
subject to manipulation when choosing to use that stack.

The concept of roles doesn't seem to address the provider-specific vocabulary 
question: the additional semantics captured in an unstructured vocabulary label 
require human inspection and interpretation, as well as management of the 
value. I'd like to just be able to deploy a third-party stack.

There is potentially some value here in abstract constraint definitions for 
various smart stack-scaling logic, but I don't see that as something that's 
really needed for an initial constraint/placement, or even stack, implementation.

Thanks again for kicking off the discussion.



> Cheers
> William
> ----------------------------------
> [0] My understanding is that we can depend on getting this information
> out of orchestra somehow, even if the mechanism isn't yet defined.
> [1] And one day at the stack level, too, which intersects uncomfortably
> with the simple preference ranking of [environment < service < unit].
> [2] In fact, we could fulfil a *big* sysadmin-perspective requirement
> here very easily with special class syntax understood by orchestra: if,
> in addition to the mgmt-classes, we make available a set of
> pseudo-classes like "name:node01", "name:node02", etc [5], the orchestra
> sysadmin has freedom to specify specific machines when he wants to.
> And, based on my conversations, sysadmins really *really* want to be
> able to specify individual machines; IMO we ignore this requirement at
> our peril.
> [3] For example, specifying "m1.small" conflicts with "m1.large", but
> not with "zone-b".
> [4] OK, we could easily encode knowledge about image type vs
> availability zone into the EC2 provider, but we have *no* such
> guarantees for an orchestra provider.
> [5] Hm: we could, I suppose, expose an awful lot of cobbler state by
> this mechanism, if we were so inclined. Can anyone chime in with whether
> this could be useful to them, or whether just exposing names is all they
> can imagine needing?
