grouping state collections

Fri Sep 5 15:50:20 UTC 2014

This is actually a relatively minor proposal (or at least request for
clarification), so keep that in mind as you suffer through my
verbosity.  I may not have all the details right but I'm close enough
that my thoughts here should still be valid.  Enjoy!

-eric

tl;dr Let's group the list of collections in state/state.go by how
they relate to state ("state", "normal state operation", "other
state-related").

Conceptual Context
===============

My understanding is that "state" means the state a user wants an
environment's machines, services, and units to be in.  How juju makes
that happen is what I'll call "normal state operation".  There are
other components of juju that are related to "state", but are neither
a part of it nor of "normal state operation".  I expect that there are
yet other components of juju which are not related to state at all.
The goal of this message is to draw a more concrete distinction
between these four types of juju components.

Collections
===============

juju's state DB (currently using mongo) has a bunch of collections
(i.e. tables) where we store most state-related persistent data. We
define the names of all the collections in state's DB in
state/state.go.  Effectively this list of collections represents
(hopefully) our complete usage of state's DB in juju.

The collections in the list are almost completely lumped together
without much distinction between how they relate to state.  My
understanding is that we have collections for the 3 state-related
types of juju components I explained above.

Proposal
===============

Since the list of collections gives us a brief summary of how we are
using the DB, we should structure the list to be more informative.  It
would be enough to just group the collections by how they relate to
state.  I'd recommend grouping by "state", "normal state operation",
and "other state-related".  From what I can tell we don't have any
"not state-related" collections.

For example, for backups we are storing metadata in state's DB.
Backups is one of those components that is related to state but is
neither a part of "state" nor of "normal state operation" (apparently
the same is true of at least parts of the tools functionality).  We
are making use of state's DB, but state itself doesn't actually use
the backup-related collection (state.State does not have any methods
that reference that collection).  When I added the collection for
backups metadata, in state/state.go I kept the definition separate
from the rest of the list of collections.  My intention was to
indicate its conceptual distinction from the rest of the collections.

It would be beneficial to similarly group the rest of the collections
by the type of juju component (i.e. how they relate to state).  Doing
so gives a nice summary of how we are using the DB, right at the top
of the main state-related file.  This would help newcomers make sense
of state-related concepts, and of how we make use of the DB, sooner.
It would also help the project as a whole by drawing attention to, and
serving as a reminder of, the conceptual boundaries relative to state.
Having it in a succinct form (the list of collections) means you can
take it in at a glance.
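
As a rough illustration of the grouping proposed above, the top of
state/state.go might look something like the following sketch.  The
collection names, the constant names, the `group` type, and the
`collectionsByGroup` map are all placeholders assumed for the example.
They are not the actual definitions in state/state.go, and which group
any real collection belongs in is still open for discussion.

```go
package main

import "fmt"

// group names one of the three state-related categories from the
// proposal.  This type is hypothetical, introduced for illustration.
type group string

const (
	groupState     group = "state"
	groupOperation group = "normal state operation"
	groupOther     group = "other state-related"
)

// "state": what the user wants the environment to look like.
// (Placeholder names, not the real list.)
const (
	machinesC = "machines"
	servicesC = "services"
	unitsC    = "units"
)

// "normal state operation": how juju makes that state happen.
// (Placeholder names, not the real list.)
const (
	txnsC     = "txns"
	cleanupsC = "cleanups"
)

// "other state-related": stored in state's DB but not used by state
// itself, e.g. backups metadata.  (Placeholder name.)
const (
	backupsMetaC = "backupsmetadata"
)

// groupOrder fixes the order in which the groups are listed.
var groupOrder = []group{groupState, groupOperation, groupOther}

// collectionsByGroup records which collections fall under each group.
var collectionsByGroup = map[group][]string{
	groupState:     {machinesC, servicesC, unitsC},
	groupOperation: {txnsC, cleanupsC},
	groupOther:     {backupsMetaC},
}

func main() {
	// Print the summary of DB usage, one group per line.
	for _, g := range groupOrder {
		fmt.Printf("%s: %v\n", g, collectionsByGroup[g])
	}
}
```

The commented const blocks alone would be enough to satisfy the
proposal.  The map is only there to show that the grouping could also
be made available to code, if that ever turned out to be useful.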

Food for Thought
===============

The same separation between juju components by how they relate to
state could be applied to the juju code base as a whole.  It may be
more of a cross-cutting concern sort of thing so moving everything
into one of four top-level packages may not make sense (it probably
wouldn't be worth it anyway).  However, for the most part the
structure of the juju code base does not clearly communicate how the
various parts relate to state.  Given that state is juju's driving
concept and permeates most of the code base, making the relationship
with state a little more obvious would go a long way.  This would be a
*much* heavier lift than just shuffling lines around at the top of
state/state.go and it is probably too late to be worth doing in one
fell swoop.  However, it may be worth coming up with a vision for how
we *would* do it, tweaking any low-hanging fruit to conform, doing a
little bit at a time, and making sure new stuff conforms.

The same idea and rationale apply to the distinction between
public-facing code and internal-to-juju code.