Sprint Feedback

Tom Haddon tom.haddon at canonical.com
Thu Jun 2 12:02:11 UTC 2011


Dear Ensemble Team,

I've just come back from a Sprint for a subset of the sysadmins at
Canonical who are responsible for deploying and managing some of the
services that Canonical runs (Landscape, Launchpad, Ubuntu One, etc.).
During the Sprint we focused on implementing Puppet for as many
services as we could. However, we're also very interested in Ensemble,
although we know that it's not quite ready for production usage yet. I
thought it might be useful to give you an idea of what things we like
and don't like about Puppet, and what things (from our investigations so
far) we like and don't like about Ensemble.

So first of all, a few comments about our general needs to give you some
context:
- We need to be able to install and configure new instances of existing
services, either to scale services up or to replace services that
are running on older hardware.
- We need to be able to deploy new versions of code to these services on
a frequent basis in a consistent way.
- We need to be able to relatively gracefully recover from deployments
of code that cause regressions.
- We need to be able to deploy distinct services in a consistent way. In
other words, deployments of Landscape from our perspective should look
as similar as possible to deployments of Launchpad.
- We need to be able to monitor and debug applications that have been
deployed over the lifetime of the application.
- We need to be able to easily understand the state of the servers that
our services run on.

== Things we like about Puppet ==

- Declarative state. This makes it easier to manage services over the
longer term, because you can be assured that systems are configured the
way you've told them to be configured.
- No-op mode allows you to test what changes would be applied by a given
update to Puppet.
- Can run with different environments - this allows you to try things
out on some servers before applying to all servers.

== Things we don't like about Puppet ==

- Hard to do deployments from within Puppet (it's a configuration
management tool, not a deployment tool - currently we plan to keep using
our own deployment tools).
- Hard to get Puppet to clean up after itself if we alter the
configuration we want.
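
As a rough illustration of the declarative and no-op points above, here
is a toy Python model. It is not how Puppet is implemented: the names
`plan` and `apply_state` and the dictionary keys are made up for the
example, and a "server" is just a dictionary.

```python
def plan(desired, actual):
    """Return the changes needed to make `actual` match `desired`.

    Only keys named in `desired` are considered (an assumption of this
    sketch); anything else already present in `actual` is ignored.
    """
    changes = []
    for key, value in sorted(desired.items()):
        if key not in actual:
            changes.append(("add", key, value))
        elif actual[key] != value:
            changes.append(("change", key, value))
    return changes


def apply_state(desired, actual, noop=False):
    """Apply the planned changes, or only report them when noop=True."""
    changes = plan(desired, actual)
    if not noop:
        for _action, key, value in changes:
            actual[key] = value
    return changes


if __name__ == "__main__":
    server = {"ssh_port": 2222, "telnet": "installed"}
    wanted = {"ntp": "installed", "ssh_port": 22}
    print("would change:", apply_state(wanted, server, noop=True))
    print("changed:", apply_state(wanted, server))
    print("second run:", apply_state(wanted, server))
    print("server now:", server)
```

In this sketch a second run reports no changes, which is the assurance
we like about declarative state. The "telnet" entry is never removed,
because it is simply no longer mentioned, which is one way to picture
the clean-up problem.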

== Things we like about Ensemble ==

- Clean syntax and very simple to deploy services.
- Powerful concepts that hold the promise of allowing easy scalability
of services.

== Things we don't like about Ensemble ==

- Ensemble seems to currently require a cloud infrastructure (EC2/S3
specifically) to run. Are there plans in the future to allow Ensemble to
run on bare metal? Our usage of EC2 has been limited for a number of
reasons, including cost and performance. If the plan was to only ever
have Ensemble work on EC2, that'd make it hard to adopt it for our
services.
- Doesn't seem to be a way to maintain state across the servers that
Ensemble is managing.
- Can't preview changes before they happen to determine if they will do
what you want them to do. Can't test out new versions of different
formulas with different "environments".

== Some other comments based on the example formulas ==

- The "utility instance" seems to be a single point of failure. If this
goes down do we lose access to everything?
- Once you've hooked items together, it's confusing to me that the
"mysql" service is saying its relation is "db: wordpress" - wordpress
isn't a DB, so shouldn't this be saying "app: wordpress" or "db for:
wordpress"?
- When you add-unit to the wordpress instance, I don't see how this
actually provides any scalability. Presumably you'd need to be using
round robin DNS, or have a load balancer in front of all these
instances, or something like that?
- Can you use your own AMI? Different instance sizes?
- How do you apply security updates to running instances, etc.?
- Shouldn't the formulas include author info in the yaml? I'd be loath
to create my own formulas based on those someone else has provided
unless I know who I can go to if I have problems with the formula. Also,
is there any promise of version compatibility, or is it possible that if
you create formulas that import other formulas, your own formula will
stop working when those other formulas change?
- Can it use elastic IPs (DNS and for interacting with "static"
services)? Can it interact with services that are not part of Ensemble
(i.e. DB servers that are in a DC rather than in EC2, or servers that
you don't want to run with Ensemble for some other reason)?
- What security is there if one server in an Ensemble cluster is
compromised? How much information is shared between the instances via
ZooKeeper, and what's to prevent one server from querying all
information on other servers?
- What is the Ensemble approach to firewalls? Is it expected that this
is a formula issue?
- It's not entirely clear to me if you could use Ensemble to replace our
current deployment scripts. They are used to push out incremental code
updates to specific services. They work by copying code into a directory
whose name includes a unique identifying string (usually the bzr revision
of the code in question), bringing the service down, checking it's down,
switching the symlink that we expect to point at the active code so that
it points to the directory we've just pushed to, restarting the service,
and then checking the service is up. This can be done in parallel or
serially, or a combination of both (groups of servers serially, each
group in parallel). We can also fairly trivially add custom hooks to do
things like "set read-only mode" for a given service.
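
To make the deployment steps in the last point concrete, here is a
minimal Python sketch of that flow. It is not our actual scripts: the
`Service` class, the `releases` and `current` names, and the function
names are all hypothetical, and everything runs against local
directories rather than remote servers.

```python
import os
import shutil
from concurrent.futures import ThreadPoolExecutor


class Service:
    """Hypothetical stand-in for one service on one server."""

    def __init__(self, root):
        self.root = root      # directory holding releases/ and current
        self.running = True

    def stop(self):
        self.running = False

    def start(self):
        self.running = True

    def is_up(self):
        return self.running


def push_code(source_dir, releases_dir, revision):
    """Copy the code into a directory named after its revision."""
    target = os.path.join(releases_dir, str(revision))
    if not os.path.isdir(target):
        shutil.copytree(source_dir, target)
    return target


def switch_symlink(link_path, target):
    """Point link_path at target by renaming a new link over the old one."""
    tmp_link = link_path + ".new"
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(target, tmp_link)
    os.replace(tmp_link, link_path)


def deploy_one(service, source_dir, revision, pre_hooks=()):
    """Push, stop, check down, switch symlink, start, check up."""
    target = push_code(source_dir, os.path.join(service.root, "releases"), revision)
    for hook in pre_hooks:    # e.g. a "set read-only mode" hook
        hook(service)
    service.stop()
    if service.is_up():
        raise RuntimeError("service did not stop: %s" % service.root)
    switch_symlink(os.path.join(service.root, "current"), target)
    service.start()
    if not service.is_up():
        raise RuntimeError("service did not start: %s" % service.root)
    return target


def deploy_groups(groups, source_dir, revision, pre_hooks=()):
    """Deploy groups one after another; servers within a group in parallel."""
    for group in groups:
        with ThreadPoolExecutor(max_workers=max(1, len(group))) as pool:
            # list() forces completion and re-raises any failure, so a
            # broken group stops the rollout before the next group starts.
            list(pool.map(
                lambda s: deploy_one(s, source_dir, revision, pre_hooks),
                group))
```

Under these assumptions, a call such as
`deploy_groups([[a], [b, c]], "/path/to/code", "r123")` would update `a`
first and then `b` and `c` together. Recovering from a bad deployment
would amount to calling `switch_symlink` again with the previous
revision's directory and restarting the service.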

== What's next? ==

Our plans from here are to continue testing Ensemble so that we can try
to realistically get an idea of what works for us and what doesn't over
the long term. Initially this involves testing how it deals with a bunch
of error states, but then we'd also like to begin writing some formulas
(I guess participating in https://launchpad.net/principia would be the
best thing here).

I think the overall takeaway as far as we can see is that Ensemble seems
suited for deploying services, but not necessarily managing services. Is
the idea that you would need to deploy your own management layer through
Ensemble, or outside of Ensemble, or is the idea that in the future
Ensemble will be able to manage services for you?

Thanks for reading!

Tom




