[DRAFT] Co-located services
clint at ubuntu.com
Sun Jul 31 21:17:29 UTC 2011
Ben, thanks for the write-up. This is really close and I'm quite excited
to have these capabilities in Ensemble.
Excerpts from Benjamin Saller's message of Fri Jul 29 03:39:38 -0700 2011:
> [ This is an effort to collect the output of a number of discussions
> regarding services that are co-located within a single container. If
> you see something that isn't what we talked about or disagree with
> something please respond. We intend to use this as the basis for a
> specification moving forward. ]
> Ensemble Co-located Services
> Services execute in one or more service units. Service Units provide
> both a conceptual and a practical containment of a service's runtime.
> Until the implementation of this specification service units represent
> the smallest execution unit of a service at the concrete level and
> map. With the changes in this specification the container of a service
> unit will be able to nest additional unit agents in cases where it
> makes sense.
> Services such as logging, monitoring and storage often require some
> access to the runtime of the service they apply to. Under the current
> modeling of services it's possible to relate things like monitoring to
> services by expressing in the monitored service that it supports a
> relationship with the monitoring service. This has possible issues:
> monitoring solutions might require direct access to log and state
> files as well as machine/container specific stats for load and IO. In
> addition this requires that every formula author that wishes to
> support logging provide additional interfaces in their formula
> specifications. It is desirable to allow for services to have complete
> access to the runtime state of another service with little to no
> change on the part of the formula author who may never know services
> are being co-located with their service.
> The following changes are designed to address both these issues and
> allow a class of formula that can nest within an existing container
> while still taking advantage of the existing relationship machinery.
+1, and for the record, I think storage *does* have a place here. Even
if we model it later inside Ensemble, this may satisfy some of the
more common use cases. I could see a formula called 'ebsraid' which
takes required config params of a list of volume ids, and raid type,
and will have a promiscuous relation of interface "storage-dir" which
will mount the EBS raid either at /var/lib or where the other side of
the relation says to mount it.
> Co-located service/formula: A service designed for and deployed to the
> running container of another service.
> Parent service: The running service in whose service units we will be
> executing units of the co-located service.
I agree with Gustavo that parent/child doesn't fit the discussions we
had before. I see them all as peers; it's just that these formulas happen
to have promiscuous relations which are useful when co-located. If
I destroy the "myblog" service which also had the "rsyslog" service
co-located with it, that's OK, rsyslog should still be there. Maybe an
optional --destroy-colocated or something like that to be clear that I
want the containers wiped out. It starts to feel like apt, where we would
maybe want to say "these additional co-located services will be destroyed".
> Ensemble's deploy command will take a few options designed to easily
> express how services co-locate.
> ensemble deploy <formula> [[--with <colo-formula> [<service_name>]] ...]
Hmm, and now I wonder if parent/child semantics *do* have some value. What
if the --with's were named service-name+formula? As in
ensemble deploy mysql wiki-db --with rsyslog --with munin-node
This would prevent anyone ever being able to co-locate the same formula
twice. I can think of some use cases for that (like two munin-node
instances with different service settings). But maybe that's the kind of
thing that is an acceptable limitation since those formulas would have
to be crafted very carefully to be able to be co-located in such a way.
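
To make the naming idea concrete, here is a rough Python sketch of how the
proposed syntax might be parsed. It is only an illustration under my own
assumptions: plan_deploy, build_parser and the "<service>-<formula>" default
name are hypothetical and are not Ensemble code or agreed behaviour.

```python
import argparse


def build_parser():
    # Hypothetical parser mirroring the proposed syntax; not Ensemble's CLI code.
    parser = argparse.ArgumentParser(prog="ensemble deploy")
    parser.add_argument("formula")
    parser.add_argument("service_name", nargs="?")
    # Each --with takes a formula and, optionally, a service name.
    parser.add_argument("--with", dest="colocated", action="append",
                        nargs="+", metavar="FORMULA")
    return parser


def plan_deploy(argv):
    """Return a {service_name: formula} mapping for one deploy command."""
    args = build_parser().parse_args(argv)
    primary = args.service_name or args.formula
    names = {primary: args.formula}
    for stanza in args.colocated or []:
        if len(stanza) > 2:
            raise ValueError("--with takes a formula and at most one service name")
        formula = stanza[0]
        # Assumed default: primary service name plus formula name.
        name = stanza[1] if len(stanza) == 2 else f"{primary}-{formula}"
        if name in names:
            raise ValueError(f"service name {name!r} used twice")
        names[name] = formula
    return names
```

With derived names, a second "--with munin-node" on the same deploy is
rejected unless each one is given an explicit service name, which is the
limitation described above.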
> Each `with` stanza takes the name of a formula and an optional service
> name under which to deploy it. The --with argument can be used
> multiple times in a single deploy. All possible relation and
> co-location restrictions will be validated before any services are
> ensemble deploy <formula> <service_name> [--in <service_name>]
> This format indicates we are deploying a new formula co-located to an
> existing service.
+1 I like this. It might also be the way to get multiple units of the
same formula co-located, *if* you have written your formula in a way
where that is possible.
> When services support co-location (see metadata section) the following
> pattern is used. In the event of
> deploy <x> --with <y> --with <z>
> <y> and <z> will have their install and start hooks fired before <x>.
> This would ease services like storage into being properly setup within
> a container before the parent service is initialized. In the event of
> deploy <y> --in <x>
> <x> must already be deployed and thus its install hook has already
> fired. In this event <y> will have its install and start hooks
> triggered before a relationship is established with the parent service
> unit. This essentially mimics the behavior of traditional relations.
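
The ordering quoted above can be written down as a small Python sketch. This
is my reading of the draft, not an implementation: the function names are
made up, the "relate" entry is a placeholder event rather than a real hook
name, and running each co-located service's install and start back to back
(rather than interleaved) is an assumption the draft does not settle.

```python
def hooks_for_deploy_with(parent, colocated):
    """Order for `deploy <x> --with <y> --with <z>`: co-located hooks first."""
    order = []
    for service in colocated:
        # Assumption: each co-located service runs install then start in turn.
        order.append((service, "install"))
        order.append((service, "start"))
    order.append((parent, "install"))
    order.append((parent, "start"))
    return order


def hooks_for_deploy_in(parent, newcomer):
    """Order for `deploy <y> --in <x>`: <x> is already installed."""
    return [
        (newcomer, "install"),
        (newcomer, "start"),
        # Placeholder for "a relationship is established with the parent".
        (newcomer, "relate:" + parent),
    ]
```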
> While the co-located services have relationships with the parent
> service (and the parent service may supply them with general
> information) it is generally considered a usage error for the parent
> service to ever query the co-located service's relation-data.
Hrm. I think that's fine. Relations are a 2-way communication channel, and
so one of these "promiscuous" relationships might include something where
the co-located formula provides hints for other services on how to make
use of it. For instance, collectd might set the path to a socket where
things that speak collectd-ese can write data for collection. Therefore
it wouldn't be an error to relation-get that and even expect it as part
of the 'collectd' interface.
> Service shutdown and unit removal follow a similar pattern. If the
> unit of the parent container is removed it will first transition to a
> stopped state (when possible) and then the co-located services will
> transition to stopped as well. The significant changes here are that
> the co-located service will undergo the same state transition to
> stopped as the parent and the container will be kept in place until
> such time as the co-located services have finished their hook
> Implications for Status
> The current model of reporting services should undergo some
> modifications. While evolving status is an involved topic these options
> should be evaluated for the short term to address this. Currently I
> suggest we show the services and mark them as co-located with a
> reference to the parent service but omit co-located units from the
> default output. In the future status will need to be much more query
I think an example of what status will look like is necessary for slow
folks like myself to be able to grok this change suggestion.
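
As a starting point for discussion, here is one possible reading of the
short-term suggestion as a Python sketch. Everything in it is hypothetical:
the dictionary layout, the "colocated_with" key and the output format are
invented for illustration and do not reflect real or planned status output.

```python
def render_status(services, show_colocated_units=False):
    """List services, marking co-located ones and hiding their units by default."""
    lines = []
    for name in sorted(services):
        info = services[name]
        parent = info.get("colocated_with")
        if parent is None:
            lines.append(name)
        else:
            # Mark the service with a reference to its parent service.
            lines.append(f"{name} (co-located with {parent})")
            if not show_colocated_units:
                continue  # units of co-located services are omitted by default
        for unit in info["units"]:
            lines.append(f"  {unit}")
    return "\n".join(lines)
```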
> Implications for hooks
> The socket nested in the formula directory should allow each unit
> agent in the container the ability to communicate with the proper
> agent. Some additional assurance might be needed to prevent malicious
> code from talking to the wrong agent.
> Metadata Changes
> A formula's YAML metadata file will grow new constructs. The first is a
> `co-located` top level flag which can be 'allow', 'always', or 'never'
> (defaulting to 'never' when omitted). 'allow' indicates that a formula is
> allowed to co-locate with another service (but might be useful through
> a normal relationship), 'always' means the service can only deploy in
> a co-located fashion and 'never' indicates that it will only support
> traditional single unit deployments.
Disagree on the default. Why restrict them when 99% of the time things
will work fine co-located? In fact the best clue for whether something
can or cannot be co-located with another formula is whether or not they
both provide the same interface.
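
To show what is at stake with the default, here is a small Python sketch of
how the proposed flag might gate a deploy. It assumes the metadata has
already been loaded into a dict; may_deploy and colocation_flag are
hypothetical names, and the default is a parameter precisely because it is
still under discussion.

```python
VALID_FLAGS = ("allow", "always", "never")


def colocation_flag(metadata, default="never"):
    """Read the `co-located` flag, falling back to the given default."""
    flag = metadata.get("co-located", default)
    if flag not in VALID_FLAGS:
        raise ValueError(f"invalid co-located value: {flag!r}")
    return flag


def may_deploy(metadata, colocated, default="never"):
    """Return True if the formula may be deployed in the requested mode."""
    flag = colocation_flag(metadata, default)
    if colocated:
        # 'allow' and 'always' both permit nesting in another container.
        return flag in ("allow", "always")
    # 'always' formulas can only be deployed co-located.
    return flag in ("allow", "never")
```

With the draft's default, a formula that says nothing cannot be co-located;
passing default="allow" models the alternative argued for above.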
> The other important change is the addition of the 'consumes' interface
> type. Previously we had 'provides' and 'requires', and these are
> mandates; 'consumes' indicates that the formula will create a
> relationship with the parent formula if it's present but can otherwise
> attempt to function without it. This allows some parent formula to
> provide additional relationship information to their co-located
> Many of the advantages of containment are given away, we allow for the
> introduction of services that can now have conflicting packages,
> dependencies, make incompatible file system modifications, and so on.
> In many cases the parent formula doesn't know that it will have
> co-located services nested in its container and can do nothing in
> response to changes made within its environment. In practice this
> should rarely be a serious issue *if* this system is used as intended.
> Orthogonal services and container level aspects used for generic
> infrastructure modification should be acceptable.
IMO this is mitigated completely by enabling a positive practice. What
will happen without this capability is users will start forking formulas
and adding their custom infrastructure bits to them, making Ensemble
formulas work just like existing config management solutions.
> The second risk is that this system is abused in the wild to escape
> possible limitations of the current system with regard to machine
> reuse. This could inadvertently allow for a new class of bad practice
> to become status quo. It's felt that through documentation and the
> forthcoming changes to the system to support LXC we can navigate this
I'm not too worried about this. If the benefits of containerization
are well known and desirable enough, people will make their stuff use