overlapping constraints (ec2-instance-type with arch/cpu/mem)
William Reade
william.reade at canonical.com
Fri Dec 16 16:13:42 UTC 2011
All the constraints we've specced are currently independent... *except*
for ec2-instance-type, which implies settings for arch, cpu, and mem [0].
-----
The spec said that "the tightest constraints win", but that's wrong
because:
    env:     [mem=16G]
    service: [ec2-instance-type=m1.small]
...would get you an m2.xlarge (which definitely doesn't match intent).
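
To make the failure concrete, here is a minimal Python sketch of one way to read "the tightest constraints win": take the largest value seen for each key across both levels. That reading, the `IMPLIED` table, and the `tightest` function are assumptions made for illustration, not code from juju; cpu is in ECUs and mem in GB.

```python
# Hypothetical lookup of what an instance type implies (figures from this mail).
IMPLIED = {"m1.small": {"cpu": 1, "mem": 1.7}}


def tightest(env, service):
    """Assumed reading of 'tightest wins': keep the largest value per key."""
    merged = {}
    for level in (env, service):
        level = dict(level)
        name = level.pop("ec2-instance-type", None)
        if name is not None:
            # Treat the instance type as the cpu/mem minimums it implies.
            level.update(IMPLIED[name])
        for key, value in level.items():
            merged[key] = max(merged.get(key, 0), value)
    return merged


result = tightest({"mem": 16}, {"ec2-instance-type": "m1.small"})
```

Here `result` still demands mem=16 even though the service asked for an m1.small, so the smallest matching machine is far larger than intended.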
-----
It has also been suggested that the most specific constraint should win,
but IMO that's wrong too because:
    env:     [ec2-instance-type=m1.small]
    service: [mem=16G]
...would get you an m1.small (which again differs noticeably from what
the user wanted).
-----
I feel, at the moment, that the least worst solution is to implicitly
convert ec2-instance-type constraints into cpu/mem constraints at each
level, so (for the purposes of constraint overriding, at least):
    [ec2-instance-type=m1.large] == [cpu=4 mem=7.5G]
...and the fact that m1.large implies arch=x64 should be ignored.
Thus:
    env:     [ec2-instance-type=m1.small]
          == [cpu=1 mem=1.7G]
    service: [mem=16G]
...evaluates to [cpu=1 mem=16G], which turns out to be an m2.xlarge;
while:
    env:     [mem=16G]
    service: [ec2-instance-type=m1.small]
          == [cpu=1 mem=1.7G]
...evaluates to [cpu=1 mem=1.7G], which is exactly what we want.
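
A rough Python sketch of this conversion-then-override rule, for illustration only. The `INSTANCE_TYPES` table and the `expand`, `merge`, and `pick` names are hypothetical, not existing juju code; the m2.xlarge figures (6.5 ECUs, 17.1G) are filled in from EC2's published specs rather than from this mail, and "smallest type that fits" is an assumed selection policy.

```python
# Hypothetical table: (name, cpu in ECUs, mem in GB), ordered smallest first.
INSTANCE_TYPES = [
    ("m1.small", 1, 1.7),
    ("m1.large", 4, 7.5),
    ("m2.xlarge", 6.5, 17.1),
]


def expand(constraints):
    """Replace ec2-instance-type with the cpu/mem it implies; arch is ignored."""
    result = dict(constraints)
    name = result.pop("ec2-instance-type", None)
    if name is not None:
        for type_name, cpu, mem in INSTANCE_TYPES:
            if type_name == name:
                result.update(cpu=cpu, mem=mem)
                break
        else:
            raise ValueError("unknown instance type: %s" % name)
    return result


def merge(env, service):
    """Service-level settings override env-level ones, key by key."""
    merged = expand(env)
    merged.update(expand(service))
    return merged


def pick(constraints):
    """Return the first (smallest) type meeting the cpu and mem minimums."""
    for name, cpu, mem in INSTANCE_TYPES:
        if cpu >= constraints.get("cpu", 0) and mem >= constraints.get("mem", 0):
            return name
    return None


case_a = merge({"ec2-instance-type": "m1.small"}, {"mem": 16})
case_b = merge({"mem": 16}, {"ec2-instance-type": "m1.small"})
```

With this table, `pick(case_a)` selects m2.xlarge and `pick(case_b)` selects m1.small, matching the two examples above.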
-----
Note that if arch weren't ignored:
    env:     [ec2-instance-type=m1.small]
          == [arch=386 cpu=1 mem=1.7G]
    service: [mem=16G]
...evaluates to [arch=386 cpu=1 mem=16G], which is impossible to
satisfy.
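
The dead end can be shown with a small Python sketch. The two-row table and the `candidates` function are hypothetical; the m2.xlarge row (x64, 6.5 ECUs, 17.1G) is taken from EC2's published specs rather than from this mail.

```python
# Hypothetical table: (name, arch, cpu in ECUs, mem in GB).
INSTANCE_TYPES = [
    ("m1.small", "386", 1, 1.7),
    ("m2.xlarge", "x64", 6.5, 17.1),
]


def candidates(constraints):
    """List the types that satisfy arch (if given) plus the cpu/mem minimums."""
    arch = constraints.get("arch")
    return [
        name
        for name, type_arch, cpu, mem in INSTANCE_TYPES
        if (arch is None or type_arch == arch)
        and cpu >= constraints.get("cpu", 0)
        and mem >= constraints.get("mem", 0)
    ]


with_arch = candidates({"arch": "386", "cpu": 1, "mem": 16})
without_arch = candidates({"cpu": 1, "mem": 16})
```

Keeping the inherited arch leaves `with_arch` empty, while dropping it leaves m2.xlarge available in `without_arch`.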
-----
Please also note that t1.micro is rather hard to describe, because of
the bursty cpus; I'd be inclined to say something like
[mem=613M cpu=0.01], but I'd be happy to be corrected.
-----
Finally, I think that these are all actually edge cases: when a user
specifies [ec2-instance-type=m1.large], they're *thinking* in terms of
ec2-instance-type anyway, and they're *much* more likely to override
with [ec2-instance-type=m1.xlarge] than with [cpu=8].
Still, the consequences of accidentally firing up 50 cc2.8xlarges
instead of 50 t1.micros are more than somewhat serious, especially if
you leave them running for a couple of days (or weeks...), and I think
we should take the risk into account when choosing a solution.
Thoughts?
William
[0] orchestra-name and orchestra-classes overlap to a certain extent,
but the worst outcome from screwing those up is "no available machines",
which is easy to fix, while the worst outcome from an ec2-instance-type
screwup is a delayed "OMGIHAVENOMONEY", which I'd prefer not to
contribute to inflicting on anyone.