Relation ordering: Does it matter?

Clint Byrum clint at ubuntu.com
Tue Feb 7 08:15:03 UTC 2012


Excerpts from Gustavo Niemeyer's message of Mon Feb 06 19:02:54 -0800 2012:
> Hi Clint,
> 
> > Upon establishing the database relation, there is nothing to signal to
> > juju that it should try again, the keystone<->compute relationship is
> > broken until an administrator uses 'juju resolved ...'.
> 
> This is the key problem we have to fix, and it's been on our TODO
> since inception: commands such as relation-get and relation-set must
> work out of band, outside of any hooks. We want that for a number of
> reasons, but one of the problems it will solve is the one you're
> reporting. Once the provider is ready to serve, it can simply set a
> relation option to awake the consumer.
> 

In that scenario, keystone would go through this interaction:

keystone<->nova: joined+changed
  keystone has no db, sets nothing
nova<->keystone: joined+changed
  nothing happens

keystone<->mysql: joined+changed
  keystone configures database, starts keystone
    for server in relation-list --name=nova
      runs the nova joined/changed hooks with JUJU_REMOTE_UNIT=server
      (as long as the joined/changed hooks are explicit about the
      relation name, this should work fairly easily)
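
As a rough sketch of that sequence (a toy model, not juju's API: the
Keystone class, its method names, and the "ready" setting are all made
up for illustration), the mysql hook re-runs the nova hook for every
unit it has already seen:

```python
# Toy model of the hook sequence above. Nothing here is juju's real API;
# the class, method names, and the "ready" key are illustrative only.
class Keystone:
    def __init__(self):
        self.db_ready = False
        self.nova_units = []   # remote units seen on the nova relation
        self.settings = {}     # what keystone has published, per nova unit

    def nova_changed(self, remote_unit):
        # Stands in for the nova joined/changed hook with
        # JUJU_REMOTE_UNIT=remote_unit.
        if remote_unit not in self.nova_units:
            self.nova_units.append(remote_unit)
        if not self.db_ready:
            return  # keystone has no db, sets nothing
        self.settings[remote_unit] = {"ready": True}

    def mysql_changed(self):
        # Keystone configures its database and starts, then replays the
        # nova hook for every unit that joined earlier.
        self.db_ready = True
        for unit in list(self.nova_units):
            self.nova_changed(unit)


keystone = Keystone()
keystone.nova_changed("nova/0")        # nova arrives first: nothing is set
before = dict(keystone.settings)
keystone.mysql_changed()               # database arrives: nova is replayed
after = dict(keystone.settings)
```

Here "before" is empty and "after" holds the settings for nova/0, which
is the wake-up the consumer was missing.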

This seems like it would work and solve the cited problem.  I think this
would make charms a little harder to write than ordering, but perhaps
that's the price for complex relationships. Another thought is to make
this simpler with a helper, something like 'relation-defer', to cause
the hook to be retried after the next successful relation establishment.
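
To make the 'relation-defer' idea a bit more concrete, here is a
minimal sketch of the retry behaviour, assuming a hook can signal
"defer me" by returning False. HookRunner and its methods are
hypothetical names; nothing like this exists in juju today.

```python
# Hypothetical model of 'relation-defer': a hook that cannot finish yet
# is parked and retried after the next relation that succeeds.
class HookRunner:
    def __init__(self):
        self.deferred = []   # (name, hook) pairs waiting for a retry
        self.log = []

    def run(self, name, hook):
        """Run one hook; a hook returns False to ask for deferral."""
        if hook():
            self.log.append(name + ": ok")
            return True
        self.log.append(name + ": deferred")
        self.deferred.append((name, hook))
        return False

    def relation_established(self, name, hook):
        """Run a relation's hook; on success, retry the deferred hooks."""
        if not self.run(name, hook):
            return
        pending, self.deferred = self.deferred, []
        for deferred_name, deferred_hook in pending:
            self.run(deferred_name, deferred_hook)


state = {"db": False}

def nova_hook():
    return state["db"]       # cannot proceed until the database exists

def mysql_hook():
    state["db"] = True
    return True

runner = HookRunner()
runner.relation_established("nova", nova_hook)
runner.relation_established("mysql", mysql_hook)
```

After both calls the log reads "nova: deferred", "mysql: ok",
"nova: ok", and nothing is left in the deferred queue.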

What follows is probably tl;dr .. basically, it would be harder,
but I think still doable, to do ordering. See below for my
suggestions. Ultimately, I think removing the context requirement from
relation-* is the simpler solution.

> > We could just enforce that 'provides' relationships won't do anything
> > until a 'requires' relation exists. However, this will only narrow the
> > race window. Between creating the relationship, and it actually being
> > configured, the agents might still try to join the two together.
> >
> > I think we need a way to flag a relationship as being in a steady
> > state. This seems quite doable given the information we have.
> 
> Introducing ordering and heuristics about stability of relations is a
> rabbit hole. Just to point a few issues that come to mind as I write:

Indeed, I hope we can tumble down a little bit, as not having to do the
whole dance in charms would be quite welcome... when I say that a package
Depends on something, I can expect that it is there and working when
the postinst runs. This makes postinst maintainer scripts much easier
to write... though I do understand that doing things on one system is a
lot more straightforward than coordinating them across many.

> 
> - Relations are optional, so they may become steady simply because the
>    admin hasn't finished the add-relation commands yet.
> 

I'm still very confused why we call them 'requires' if they are optional.
I have always assumed that eventually we'd start enforcing these a
bit more, and making use of a "consumes" section, or "optional: true"
attribute.

Anyway, a relationship that does not exist would not seem to me to be
"steady", and therefore, would not be considered ready.  So relating
anything to that service's provided interfaces would result in that
relationship waiting for the user to 'add-relation', perhaps in a new
state something like 'wait-for-requires'.

> - Another consequence of optional relations is that services generally
>    will work without the requirements alive, in many cases. We shouldn't
>    make them unavailable just because a given relation isn't around.
> 

Mediawiki serves a wiki w/o memcached, this is true. However, its
sessions are stored on disk without memcached, and so, an add-unit
produces a broken 'website' relation to haproxy, since it is no longer
stateless. If we had this ordering in place, the http relation would
never be established unless the charm author marked memcached optional
or the admin established the required relationship to memcached.

So I think it would make sense to have some way to say "this relation
cannot be satisfied without that required relation". Perhaps we can
be more direct and make it:

provides:
  website:
    interface: http
    depends: [ memcached, mysql ]
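
A sketch of how such a 'depends' list might be evaluated, assuming the
metadata has already been parsed into a dict. The function name and the
"ready" state are made up here; 'wait-for-requires' is the state name
floated earlier in this mail.

```python
# Hypothetical parsed metadata mirroring the 'depends' proposal above.
metadata = {
    "provides": {
        "website": {"interface": "http", "depends": ["memcached", "mysql"]},
    },
}

def provided_relation_state(metadata, name, established):
    """Return (state, missing) for a provided relation.

    'established' is the set of required relation names that the admin
    has already added. A relation without 'depends' is always ready.
    """
    depends = metadata["provides"][name].get("depends", [])
    missing = [dep for dep in depends if dep not in established]
    if missing:
        return ("wait-for-requires", missing)
    return ("ready", [])


partial = provided_relation_state(metadata, "website", {"mysql"})
complete = provided_relation_state(metadata, "website", {"mysql", "memcached"})
```

With only mysql related, the website relation stays in
'wait-for-requires' and reports memcached as missing; once both exist
it becomes ready.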

> - Machines stop/die and get restarted. No matter what order we
>    establish relations, the software has to know how to reestablish
>    itself to a live instance on the other side.
> 

If the database is dead, and I start trying to relate things to keystone,
I'd expect the hooks to error, as keystone could not satisfy the needs. An
admin would expect to have to use resolved --retry to fix problems caused
by a dead machine.

Upon recovery, right now with juju, there'd be no event to actually inform
keystone that the database is back and you can run relation-set again. So
those errors are actually critical to completing configuration steps,
as the admin will need to resolve them. Even if juju thinks it knows the
state of the two sides, it may be wrong, and these errors are just going
to happen.

My thinking is purely that we can eliminate complexity from charms by
doing some work for the user inside juju.

> - It's easy to find different use cases that require a different
>    notion of ordering requirements.
> 
> - Loops may easily happen depending on the topology
> 

Indeed, circular dependencies are the devil... however, we have the
dependency graph available to us, since the user has built it for us with
add-relation. So we can pinpoint the very moment at which the circle is
completed, and note this to the user.. "Circular Dependency detected."

Perhaps the state would be changed to something like 'circular-depends'
and resolved would be used to kick these off in the necessary order.
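
Detecting the moment the circle closes could look roughly like this.
It is only a sketch: the graph layout (service name mapped to the set
of services it depends on) and the function names are assumptions, and
here the offending relation is flagged rather than recorded.

```python
def closes_cycle(graph, src, dst):
    """True if adding src -> dst would complete a cycle.

    That is the case exactly when dst can already reach src.
    """
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node in seen:
            continue
        seen.add(node)
        stack.extend(graph.get(node, ()))
    return False

def add_relation(graph, src, dst):
    """Record that src depends on dst, or flag a circular dependency."""
    if closes_cycle(graph, src, dst):
        return "circular-depends"
    graph.setdefault(src, set()).add(dst)
    return "ok"


graph = {}
first = add_relation(graph, "nova", "keystone")
second = add_relation(graph, "keystone", "mysql")
third = add_relation(graph, "mysql", "nova")   # this one closes the circle
```

The first two relations are accepted; the third is reported as
'circular-depends' at the point where the user adds it.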


