constraints - observations and questions (bringing to list)

William Reade william.reade at canonical.com
Wed Feb 13 17:27:33 UTC 2013


On Wed, 2013-02-13 at 11:40 +1300, Tim Penhey wrote:
> If I grab some info from the current documentation, it shows that the
> python version had the following constraint options:
>  * cpu
>  * mem
>  * arch
>  * instance-type
>  * ec2-zone
>  * maas-name
>  * orchestra-classes

Also maas-tags, fwiw, but it's essentially the same as
orchestra-classes :).

> After some more reading, and poking about it seems that cpu isn't cpu at
> all, but instead refers to the ECU (EC2 Compute Units).
> 
> >> I'm still unclear around the expected inheritance of constraints as
> >> mentioned in the constraints threads where the constraints specified
> >> with the bootstrap command were then applied to future deploy commands.
> > 
> > Sorry, I was almost certainly assuming too much context there.
> > 
> > Constraints can be specified both at an environment level, and at a
> > service level; when you specify constraints at bootstrap time, you're
> > specifying what you want the bootstrap node to run on and *also* the
> > environment constraints. I'd be open to arguments that this is crack,
> > and that bootstrap constraints should not be used as starting
> > environment constraints, but I'm not sure.
> 
> It is my current understanding that right now we instantiate a machine
> for each service + 1 for the agent.  If I'm at the state where I want
> larger instances for some of my charms, isn't it a waste to have a big
> machine for the agent?
> 
> I'm not sure that having the bootstrap/agent machine be big by
> default makes a huge amount of sense.

Making it configurable, though, is (I think) necessary; AFAICT it's a
choice between making the constraints we start the initial machine with
persist as future defaults, and just dropping them on the floor. I
*think* the former is saner -- if they don't apply there's no difference
(just set env constraints before deploying anything else), and if they
do then the user has one fewer hoop to jump through.

> Do we allow the user to specify environment constraints in the
> environments.yaml file?  Instead of necessarily setting the constraints

Not at the moment. I'm not necessarily opposed, but I'm also not sure
it's an especially high-value feature, because environments.yaml is
itself of questionable value once you've bootstrapped.

> for the deploy commands on the command line, does it make sense to have
> the constraints specified in a file?  Whether that is the environments
> file or another one I don't much care, but I can envision the situation
> where you might want something like:
> 
>  * bootstrap machine is smallish
>  * 3 load balanced medium machines with service X
>  * 1 large machine for db Y
> 
> Having the ability to have the deployment constraints in a file gives
> you a simpler way to manage the defaults constraints for particular
> services.

I'm not sure of the semantics intended here. "Any service with name X
should have constraints Y"? I'd prefer to stick to setting constraints
on actual services.

> Let me see if I've got the above right.
> 
> The first line:
> 
>    $ juju set-constraints instance-type=m1.medium
> 
> specifies an environment level constraint saying that deployments I want
> on this environment should use this m1.medium, and then
> 
>    $ juju set-constraints -s s mem=16G
> 
> says for service "s" in that environment, make sure we have 16G of
> memory.  Is this right?

Yeah -- except that if service "s" doesn't exist I anticipate that juju
will politely barf, just as it would if you tried to otherwise configure
a service that didn't exist.
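
To make the two-level scheme concrete, here's a minimal Python sketch
(purely illustrative -- juju itself is written in Go, and this is not
its actual code) of the precedence being described: service-level
constraints override environment-level ones, key by key.

```python
# Illustrative sketch (not juju's implementation) of constraint
# resolution: service-level settings override environment-level
# defaults, key by key.

def effective_constraints(env, service=None):
    """Merge environment and per-service constraint dicts."""
    merged = dict(env)            # environment-level defaults
    merged.update(service or {})  # service-level values win
    return merged

env = {"instance-type": "m1.medium"}
svc = {"mem": "16G"}
print(effective_constraints(env, svc))
# the env-level instance-type and the service-level mem both apply
```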

> > Originally, the following was considered to be crazy:
> > 
> >   $ juju set-constraints mem=2G instance-type=m1.small
> > 
> > ...but I think it's actually meaningful if you consider mem/cores to be
> > a cross-cloud fallback that should not override an explicit request for
> > a recognised instance type.
> 
> Here I think we can get a little smarter.  Again I'm going to ramble a
> bit, so tell me if I've got obvious misunderstandings.
> 
> We have back-end environment providers like AWS, OpenStack, LXC etc.
> Cloud providers like AWS and OpenStack (and soon Azure ...) have defined
> instance types (I'm taking a stab in the dark that OpenStack actually
> has a defined instance type - or is it worse than this in that each
> OpenStack provider like HP Cloud, Rackspace etc set their own individual
> names?)  Can we, with some rationality, convert the named instance types
> for the particular providers into some set of known meaningful constraints?

Yeah, every provider gets to make up their own names. That's the problem
with any generic instance-type constraint. (OpenStack calls them
"flavors", fwiw.)

Depending on what the various providers even expose, we may or may not
be able to sanely express an instance-type as a set of cpu/cores/mem
constraints. The confusion over "cpu" originally arose because OpenStack
doesn't expose anything like that; conversely, everything I'm aware of
does expose something like "cores", and that measure appears to have
been popularly used.

So. In general, we should be able to take a pile of instance-type
information from $SOMEWHERE and determine which match (at least some of)
the "generic" constraints; I'm less sanguine about the prospect of
defining a set of ideal instance types and turning those into
provider-specific ones with any degree of sanity.
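
The first half of that -- filtering a pile of instance-type information
against generic constraints -- can be sketched as below; the table
entries are made-up illustrative numbers, not authoritative provider
data.

```python
# Illustrative sketch of matching a provider's instance-type table
# against generic constraints; the names and figures are invented for
# the example, not real EC2 data.

TYPES = {
    "m1.small":  {"cores": 1, "mem_gb": 1.7},
    "m1.medium": {"cores": 1, "mem_gb": 3.75},
    "m1.large":  {"cores": 2, "mem_gb": 7.5},
}

def matching_types(min_cores=0, min_mem_gb=0.0):
    """Return the instance types satisfying the generic minimums."""
    return sorted(
        name for name, spec in TYPES.items()
        if spec["cores"] >= min_cores and spec["mem_gb"] >= min_mem_gb
    )

print(matching_types(min_mem_gb=2.0))
# m1.small falls below the 2G floor and is filtered out
```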

> Actually this leads me to another tangent.  The more I read around this,
> the more both CPU (in the traditional sense) and Cores are both
> crackful. I think that we should have some measure of computational
> power (like ECU) that we can then have different instance machines from
> different providers provide a calculated (albeit roughly) value that
> juju then uses in deployment instructions.

We can get a "cores" number just about everywhere, but I don't know
where we're meant to get a "cpu" ("jcu") number from in the general
case. This is why I feel they're separate...

> This is exactly what I just mentioned above, and that we have both come
> to it independently makes it more likely a useful measure.  Especially
> if there is a public, simply found, translation mechanism that we use to
> say:
> 
>   power 1 ~= X Hz, 1 core
>   power 5 ~= Y Hz, 4 core
>   power 10 ~= Z Hz, 8 core
>   etc.
> 
> As much fun as it may be to have power defined by "small, medium, large,
> OMG huge", I think a numeric value is better, and more understandable.

+1 to numbers... but I don't think that we can generally depend on
having Hz or 2007-xeons or whatever-other-arbitrary-measure numbers
available.

> Let me come back to the instance-type concept.
> 
> Instance types make sense only for deployment into a particular
> environment, but AFAICS the constraints only make sense at deploy time,
> and when I'm deploying, I know what type of provider I am deploying to,
> and what's more, I may well have particular instance types in mind for
> those services for that provider.  To me it makes sense to allow the
> user to be explicit in their instance-type requests.

+100 -- I do not think it's acceptable to take that language off the
table. But it's not quite so simple: the goal is that a script written
for provider X should have the best possible chance of Just Working on
provider Y.

> Which brings us back to the above crackful constraint request:
> 
>   $ juju set-constraints mem=2G instance-type=m1.small
> 
> If we are able at the provider level to translate the "m1.small"
> instance request to { mem=1.7, power=1 } then we should be able to tell
> the user that the constraint fails as 1.7 < 2.

The intent there was covered in more detail in the call/sync/proposal
thread a few days ago; in short, it's to honour *valid* instance-type
requests and fall back to cpu/cores/mem in the cases where the name's
not recognised.

The above is intended to describe the 3rd or 5th case: either someone
originally wrote a script specifying mem, and an ec2 user extended it to
fit an instance-type that works well for them; or an ec2 user originally
wrote it, and they or someone else added a roughly-matching generic
constraint that'll be used on alternative clouds.
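
The fallback semantics described above could be sketched roughly like
this in Python (again, only an illustration of the proposed behaviour,
not juju code; the set of known types is invented for the example):

```python
# Illustrative sketch of the proposed fallback: a recognised
# instance-type wins outright; an unrecognised one is set aside and the
# generic cpu/cores/mem constraints are used instead.

KNOWN_TYPES = {"m1.small", "m1.medium", "m1.large"}  # invented example

def choose(constraints):
    """Pick which side of the constraint request to honour."""
    itype = constraints.get("instance-type")
    if itype in KNOWN_TYPES:
        return ("instance-type", itype)
    generic = {k: v for k, v in constraints.items()
               if k in ("cpu", "cores", "mem")}
    return ("generic", generic)

print(choose({"mem": "2G", "instance-type": "m1.small"}))  # type wins
print(choose({"mem": "2G", "instance-type": "flavor-x"}))  # fallback
```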

> I'd also expect the provider to be able to tell me if I ask for an
> instance type that doesn't exist.  Is this possible?  Are we hard-coding
> values, or do we have a way somewhere to get the underlying provider to
> tell us what their instance types are?

This is an interesting question. In general, I don't think we can expect
a user to have an environments.yaml file available, or even necessarily
to have the direct provider access that may be necessary to determine
instance-type validity at the command line.

There's a blueprint that roughly covers this idea [0] that's not being
worked on at the moment; I have a lurking belief that we can use a
cut-down variant of the same concept, purely in code, to allow an
Environ to expose to the rest of juju the various resources it has
available in a useful fashion... but I'm explicitly avoiding that rabbit
hole for the moment ;).

For the moment, ec2 at least is hard-coded (I don't think the numbers
are published in a reliably machine-readable format anywhere); but the
openstack API does expose *some* useful information that could IMO be
plausibly be munged into some sort of useful ResourceMap type (as could
the EC2 stuff, if we went this way).
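
As a rough sketch of what that munging might look like -- with field
names invented for illustration, not taken from the actual OpenStack
API -- a ResourceMap could just normalise whatever the provider exposes
into a uniform generic-constraint shape:

```python
# Illustrative sketch of the "ResourceMap" idea: normalise raw provider
# flavor records into a uniform {name: generic-constraints} mapping.
# The input field names are invented, not the real OpenStack API.

def to_resource_map(flavors):
    """Munge raw flavor records into generic constraint terms."""
    return {
        f["name"]: {"cores": f.get("vcpus", 0),
                    "mem_gb": f.get("ram_mb", 0) / 1024.0}
        for f in flavors
    }

raw = [{"name": "standard.small", "vcpus": 1, "ram_mb": 2048}]
print(to_resource_map(raw))
```

A hard-coded EC2 table could feed the same function, which is the point:
the rest of juju would only ever see the normalised shape.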

> Obviously a lot of these constraints make no sense at all for a local
> deployment, but that is another topic altogether.

My belief is that we must always be prepared to accept all possible
constraints; but to warn about and ignore any that are invalid for the
current provider [1]. I don't think anything has changed since the
original implementation to justify changing this strategy?

Cheers
William


[0]
https://blueprints.launchpad.net/ubuntu/+spec/servercloud-r-juju-resource-map

[1] I think the distinction between "invalid" (warn and ignore) and
"valid but unsatisfiable" (error) is legitimate and useful. Varying
mileages?




More information about the Juju-dev mailing list