Feature Request: show running relations in 'juju status'

Wed Nov 19 12:59:21 UTC 2014

On Tue, Nov 18, 2014 at 9:37 AM, Stuart Bishop
<stuart.bishop at canonical.com> wrote:
> Ok. If there is a goal state, and I am able to wait until the goal
> state is the actual state, then my needs (and amulet and juju-deployer
> needs) will be met. It does seem a rather lengthy and long winded way
> of getting there though. The question I have always needed juju to
> answer is 'are there any hooks running or are there any hooks queued
> to run?'. I've always assumed that juju must already know this (or it
> would be unable to function), but refuses to communicate this single
> bit of information in any way.

Juju as a system actually doesn't know this. Unit idleness is known
only by the unit agents themselves, and only implicitly at that -- if
we're blocking in a particular select clause then we're (probably!)
idle, and that's it. I agree that exposing idleness would be good, and
I'm doing some of the preliminary work necessary right now, but it's
not my current focus: it's just a happy side-effect of what needs to
be done for leader election.
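
To make the "implicit" part concrete, here is a minimal Go sketch of an
agent loop that publishes idleness explicitly instead of leaving it
implied by the blocking select. The agent type, its fields and the
hook name are all hypothetical; this is not Juju's actual unit agent
code, only an illustration of the shape of the problem.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// agent is a hypothetical stand-in for a unit agent, not Juju's real type.
type agent struct {
	hooks chan string       // hooks waiting to run
	idle  atomic.Bool       // idleness published explicitly for observers
	exec  func(hook string) // runs a single hook
}

// run executes hooks until done is closed.
func (a *agent) run(done <-chan struct{}) {
	for {
		// Nothing is executing here, so publish idleness before blocking.
		// Without this Store, idleness would only be implied by the
		// goroutine sitting in the select below.
		a.idle.Store(true)
		select {
		case h := <-a.hooks:
			a.idle.Store(false)
			a.exec(h)
		case <-done:
			return
		}
	}
}

func main() {
	started := make(chan struct{})
	release := make(chan struct{})
	a := &agent{
		hooks: make(chan string),
		exec: func(string) {
			started <- struct{}{} // tell the observer the hook is running
			<-release             // pretend to work until released
		},
	}
	done := make(chan struct{})
	finished := make(chan struct{})
	go func() { a.run(done); close(finished) }()

	a.hooks <- "config-changed"
	<-started
	fmt.Println("idle while hook runs:", a.idle.Load()) // false

	close(release)
	close(done)
	<-finished
	fmt.Println("idle after hook:", a.idle.Load()) // true
}
```

Even with the explicit flag, an observer can read "idle" in the short
gap between a hook being handed over and the flag being cleared, which
is why the idleness is only "probably" true.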

The impact of exposing goal/active state (and hooks that trigger on
changes to same) is rather different: it's internal to a unit, and is
essentially an alternative to joined/departed hooks. (There's nothing
stopping you having both, but not much point to it either.)

> That would work too. If all units are in idle state, then the system
> has reached a steady state and my question answered.

Sort of. It's steady for now, but will not necessarily still be steady
by the time you've reacted to it -- even if you're the only
administrator, imagine a cron job that uses juju-run and triggers a
wave of relation traffic across the system.
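
A rough Go sketch of what a client-side wait would look like, and why
its answer is only a snapshot. The poll function is a hypothetical
status source mapping unit names to idleness, and waitSteady, need and
maxPolls are illustrative names, not an existing juju API.

```go
package main

import "fmt"

// waitSteady polls until every unit has reported idle on `need`
// consecutive polls, giving up after maxPolls. A real poller would
// sleep between polls; that is left out to keep the sketch short.
func waitSteady(poll func() map[string]bool, need, maxPolls int) bool {
	streak := 0
	for i := 0; i < maxPolls; i++ {
		if allIdle(poll()) {
			streak++
			if streak >= need {
				return true
			}
		} else {
			streak = 0 // something woke up; start counting again
		}
	}
	return false
}

// allIdle reports whether every unit in the snapshot is idle.
func allIdle(units map[string]bool) bool {
	for _, idle := range units {
		if !idle {
			return false
		}
	}
	return true
}

func main() {
	// Scripted snapshots: the system looks idle, something (say a cron
	// job using juju-run) wakes b/0, then things settle again.
	snapshots := []map[string]bool{
		{"a/0": true, "b/0": true},
		{"a/0": true, "b/0": false},
		{"a/0": true, "b/0": true},
		{"a/0": true, "b/0": true},
	}
	i := 0
	poll := func() map[string]bool {
		s := snapshots[i%len(snapshots)]
		i++
		return s
	}
	fmt.Println(waitSteady(poll, 2, 10)) // true
}
```

Requiring several consecutive idle observations narrows the window but
does not close it: a true result only says the system was steady at
the last poll, not that it still is when the caller acts on it.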

> I'm not entirely sure how useful this feature is, given the inherent
> race conditions with serialized hooks. Right now, you need to write
> charms that gracefully cope with dependent services that have gone
> down without notice. With this feature, you will need to write charms
> that gracefully cope with dependent services that have gone down and
> the notification hasn't reached you yet. Or if the outage is for
> non-juju reasons, like a network partition. The window of time waiting
> for hooks to bubble through could easily be minutes when you have a
> simple chain of services (eg. postgresql -> pgbouncer -> django ->
> haproxy -> apache seems common enough).

Yeah, you never get away from having to cope gracefully with
unexpected failures. But there is still value there -- when one of
your remotes takes itself voluntarily out of rotation, you can know
not to send it traffic until it tells you it's ready again.

> Your example with storage is particularly interesting, as I was just
> dealing with this yesterday in my rewrite of the Cassandra charm. The
> existing mechanism in the charm is broken. If you add a new unit to
> the service, it runs its install and configure hooks and is READY. It
> then joins the peer relation, and is still READY. The peer units start
> spewing data at it, as the replication ring is rebalanced.  We now
> have a race. Will the storage hooks fire in time? The new unit is unaware
> that storage is due to be attached, and does not know that, unless the
> storage is attached and the data migrated from local disk soon, the
> local disk will fill and the unit will fall over. To solve this with
> the current storage-broker subordinate, I could require the operator
> to set a 'wait_for_block_storage' boolean in the service
> configuration before deploy. But requiring people to read and follow
> the documentation is an error prone solution :-( I'm wondering if I
> should simply not bother fixing this race, and trust that the block
> storage broker hooks will be invoked and completed before local disk
> is filled. I understand that work is underway to replace the block
> storage broker so it won't be an issue long term, or your goal state
> would be useful here if a unit can ask questions like 'is storage
> going to be attached' or 'will peers be joining me'.

So, in my mind, the goal/active stuff is relevant to all relations,
not just peers. So, on the one hand: yes, assuming the relation with
storage-broker exists at the time the unit starts up, it should be
aware that there will be storage.

But on the other... dynamically adding a storage-broker relation would
still be hard; and even when storage is in the juju model, handling
dynamic storage changes is going to take a bit of effort, because you
still have to migrate your data.

Cheers
William