[DRAFT] Co-located services

Benjamin Saller bcsaller at gmail.com
Fri Jul 29 18:18:30 UTC 2011

On Fri, Jul 29, 2011 at 8:03 AM, Gustavo Niemeyer
<gustavo.niemeyer at canonical.com> wrote:
> Thanks for putting this in place Ben.
> Some comments follow.
>> Services such as logging, monitoring and storage often require some
>> access to the runtime of the service they apply to. Under the current
> We haven't debated storage in this context in the sprint, and it's
> not clear the model of formula collocation fits well in it.  As we
> covered recently here in the list, it feels like a broader problem that
> we'll want to eventually cover internally in Ensemble as a first-class
> feature.
> My suggestion is that we keep this out of the spec for the moment.

I'm fine with that, but it makes sense to me that this abstraction
would be a portable way to address this. In a sense I see it like
adding an aspect in AOP to a class where it's possible that the aspect
is available on many classes. In the case of something like storage I
can imagine a single storage service with a single config made
available to many services through this mechanism.

>> Co-located service/formula: A service designed for and deployed to the
>> running container of another service.
>> Parent service: The running service in whose service units we will be
>> executing units of the co-located service.
> Is it necessarily a parent/child relationship, or would it map better
> to a peer style where both services are simply co-located with each
> other?

The reason I went with the notion of a primary containing service has
to do with the idea that

- co-located services know they can co-locate; the containing service
isn't always aware of this
- the relationship (because of the above) is generally asymmetric
- a peer model implies equals, which isn't the intention. The lifecycle
of the primary service's units is not bound to the co-located
service's lifecycle (picture logging/status services for example), but
the reverse is true
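
To make the asymmetry concrete, here is a rough Python sketch. The class
and method names are made up for illustration and are not part of
Ensemble; the only thing it tries to show is that stopping the primary
unit takes the co-located units with it, while stopping a co-located
unit leaves the primary alone.

```python
class Unit:
    """Hypothetical stand-in for a service unit; not an Ensemble class."""

    def __init__(self, name):
        self.name = name
        self.running = True

    def stop(self):
        self.running = False


class PrimaryUnit(Unit):
    """A unit whose container hosts zero or more co-located units."""

    def __init__(self, name):
        super().__init__(name)
        self.colocated = []

    def attach(self, unit):
        # The co-located unit knows its container; the primary only
        # keeps the list so the container can be torn down as a whole.
        unit.container = self
        self.colocated.append(unit)

    def stop(self):
        # Stopping the primary ends the lifecycle of everything inside it.
        for unit in self.colocated:
            unit.stop()
        super().stop()


class ColocatedUnit(Unit):
    """A unit deployed into another unit's container."""

    container = None
    # Inherits Unit.stop(): stopping it does not touch the primary.
```

For example, after `wp.attach(log)`, calling `log.stop()` leaves
`wp.running` true, whereas `wp.stop()` also stops `log`.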

>> ensemble deploy <formula> [[--with <colo-formula> [<service_name>]] ...]
>> Each `with` stanza takes the name of a formula and an optional service
>> name under which to deploy it as. The --with argument can be used
> Optional parameters on arguments generally don't work very well in
> practice. E.g.:
>    ensemble deploy --with a b c
> Is b a service name, or a formula name?

In this case it might be the syntax that is causing the issue. There is a
single optional param per stanza, much like we use today with optional
service names:

ensemble deploy wordpress myblog --with rsyslogd mylog --with munin-client mymunin

where any of the my* names could be omitted, in which case the formula
name is used.
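
A minimal sketch of how that stanza rule could be parsed, in Python.
The function name `parse_deploy_args` is hypothetical and this is not
how the ensemble client is actually implemented; it only assumes that
each stanza is `<formula> [<service_name>]` and that `--with` starts a
new stanza.

```python
def parse_deploy_args(argv):
    """Split deploy arguments into a primary stanza and --with stanzas.

    Returns ((formula, service_name), [(formula, service_name), ...]).
    The service name defaults to the formula name when omitted.
    Hypothetical helper for illustration only.
    """
    def take_stanza(tokens, what):
        # A stanza is exactly one formula plus at most one service name.
        if not tokens or len(tokens) > 2:
            raise ValueError(
                "%s expects <formula> [<service_name>], got %r" % (what, tokens))
        formula = tokens[0]
        name = tokens[1] if len(tokens) == 2 else formula
        return formula, name

    # Every --with token closes the current stanza and opens a new one.
    groups = [[]]
    for token in argv:
        if token == "--with":
            groups.append([])
        else:
            groups[-1].append(token)

    primary = take_stanza(groups[0], "deploy")
    colocated = [take_stanza(group, "--with") for group in groups[1:]]
    return primary, colocated
```

With this rule, `--with a b c` is rejected rather than guessed at,
because a stanza can hold at most two names.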

> I also wonder a bit if we need special syntax for co-location.  It'd
> be nice if we could find a way to have it just as a normal relation in
> general, and it does feel like the problem being solved above is a
> general issue: how to deploy services while establishing a relation.
> This is the same kind of problem we'll face when we start to enforce
> required relations, I believe.  If we solve that latter problem, we'll
> solve co-located deployment syntax too.

I agree it's possible. I also think this syntax makes it very clear
what the user is intending to do. I feel like it's a very reasonable CLI.
I would need to see a counter proposal to evaluate it.

>> <y> and <z> will have their install and start hooks fired before <x>.
>> This would ease services like storage into being properly setup within
> What if we have multiple services co-located in the same container?
> IMO we should restrict the difference of co-located relations to
> containment, without imposing additional rules, and should try to
> make both sides of the relation even (no parent-child relationship).
> It makes them a lot closer to normal relations, and much easier
> to understand and use (and develop! ;-).

In this case <y> and <z> are both co-located in service <x>, as
you ask in the question. The additional rules, I thought, made it behave
more like the system we have today with normal relationships while
internally acknowledging the small differences.
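
As a rough illustration of the ordering rule from the draft, here is a
small Python sketch. It assumes one particular reading of the rule:
each co-located service runs both its install and start hooks before
the primary runs either of its own. The function name `hook_order` is
made up for the example.

```python
def hook_order(primary, colocated, hooks=("install", "start")):
    """Return (service, hook) pairs in the order they would fire.

    Assumption: every co-located service completes all listed hooks
    before the primary service fires any of them.
    """
    order = []
    for service in list(colocated) + [primary]:
        for hook in hooks:
            order.append((service, hook))
    return order
```

So for a primary `x` with `y` and `z` co-located, the install and start
hooks of `y` and `z` all come before the install hook of `x`.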

>> While the co-located services have relationships with the parent
>> service (and the parent service may supply them with general
>> information) it is generally considered a usage error for the parent
>> service to ever query the co-located service's relation-data.
> Why?  How's that different from a normal relation?

A design goal was that the primary service wouldn't have to know that
a logging/monitoring service was co-located within its container and
thus didn't have to track the state of all formula development for
admins to take advantage of these new co-located aspects. This is
different from normal relationships in that the interface pairing is
known (and understood hopefully) by both sides of the relationship.

>> The current model of reporting services should under-go some
>> modifications. While evolving status is an evolved topic these options
> Given the above points, it feels like the status might be pretty much
> unchanged.

It could be. I worry that seeing n wordpress services with n logging
and n monitoring (and n storage?) services co-located within them will
pollute what the admin is trying to grok with status.
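
One possible way to keep status readable, sketched in Python, would be
to fold co-located units under the primary unit whose container they
run in. This is only an illustration of the idea; the function name
`fold_status`, the input shape, and the unit names are assumptions, not
the current status output.

```python
def fold_status(units):
    """Group co-located units under the primary unit that contains them.

    `units` is an iterable of (unit_name, container_unit_name_or_None);
    a container of None marks a primary unit. Hypothetical helper.
    """
    folded = {}
    for name, container in units:
        if container is None:
            # Primary units always appear, even with nothing co-located.
            folded.setdefault(name, [])
        else:
            folded.setdefault(container, []).append(name)
    return folded
```

The admin would then see one entry per wordpress unit, with the logging
and monitoring units listed beneath it rather than alongside it.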

>> The socket nested in the formula directory should allow each unit
>> agent in the container the ability to communicate with the proper
> Good point.
>> Formula's YAML metadata file will grow new constructs. The first is a
>> `co-located` top level flag which can be 'allow', 'always', 'never'
> This section is diverging significantly from what we agreed in the sprint.
> IIRC, the idea was simply to allow defining a relation as co-located. E.g.:
> requires:
>    syslog:
>        interface: syslog-sync
>        colocated: true
> This would mean the services connected through this relation would
> co-locate.  In addition to that, we'd introduce as a separate step the
> concept of promiscuous relations.  This doesn't have to be done in a
> first incarnation of the spec/feature, though.

This avoids the design goal of the primary service not having to know
much (if anything) about what can possibly co-locate with it. The
model I'm suggesting allows for primary services to provide interfaces
that the co-locating services can 'consume', which is the notion I
think you mean to hit on with 'promiscuous'. However, it was designed
to capture that if the primary service didn't supply additional
relation data (and has no hooks to do so, as we generally expect in a
volatile environment of heavy reuse) the co-located service could
still make the relationship and function with only its basic platform

> Do we need anything else?
>> The second risk is that this system is abused in the wild to escape
>> possible limitations of the current system with regard to machine
>> reuse. This could inadvertently allow for a new class of bad practice
> Well put.  We need to enable multiple service units per machine before
> colocation.
> Thanks for starting the brainstorm on this Ben.
