[rfc] mixin classes for Branch and similar per-format objects

Tue Nov 18 03:24:15 GMT 2008

We had a fairly discussion about the use of mixins and multiple
inheritance in Python at the recent Brisbane meetup, which I will echo and
summarize here.

First a caution: MMI in general can add complexity, and there are some
issues particular to Python where it has to be handled carefully, such as
passing parameters to super().__init__.

The particular case where it came up is in the Branch formats, where we
have several particular branch formats: you could call them 'concrete' in
that they correspond to a particular format found on disk.

[0] At the moment they inherit from each other in a very 'pragmatic' way
based on how they want to share code, but this is a bit confusing (since
the order of inheritance is I think 4 -> 5 -> 7 -> 6), not "true" in terms
of the OO is-a relationship (a Branch6 is not a Branch7), and because of
the preceding point it tends to generate some bugs or inflexibility.
It also means that our most current code depends on the oldest format,
when in fact we'd like to be able to remove it either to another file or
possibly eventually to a plugin.

It would be better to have the concrete classes as final classes (ie
not themselved subclassed) that combine the relevant behaviour that makes
up those classes.  For instance they have a type of lock, a behaviour for
storing their revision history, support for binding to a master branch,
support for tags.  The question is, how do we compose together these
behaviours.

[1] One approach which is approximated by the current code is to
successively subclass to build up to the currently supported format.  We
could get here from [0] by splitting the concrete formats off from a
'backbone' of classes that add the features or formats we've gone through
over time.  There are a couple of problems with this though, because we
don't just monotonically add new features over time: sometimes we change
the way they're implemented, and the features don't necessarily have
dependencies on each into a linear hierarchy.  For instance supporting a
fallback repository is barely or not at all related to supporting tags in
the branch.  So this would still tend to have the problem of freezing into
the class hierarchy the formats we supported in the past.

[2] Alternatively, we can split out the composable behaviour into public
attributes of the object, which is a bit like what's done by say
Branch.tags, or the VersionedFiles on Repository.  Obviously this can be a
powerful technique in separating out the different aspects, and allowing
them for example to be arbitrarily combined at runtime or tested
separately.  It has some drawbacks.  It makes the way the object is split
up public, which can complicate the interface to it: some queries a user
might reasonably expect to do against the Repository must instead be done
against one of it's attributes, which are arguably a Repository
implementation detail.  If we want to change the way the object is
composed in future we must change its public interface.

In some of these cases the composed object needs a pointer back to the
larger objects so that it can e.g. lock it, and this may be a clue that
they should really be the same thing.

[3] We can modify that to put the composable behaviour into private
attributes, and then make the public object forward calls to them.  This
keeps most of the benefits for testing, modularity and composability, but
keeps the public interface simple and stable, at the cost of writing
forwarding methods that expand the codebase, arguably make it less clear
where something actually happens, clutter the tags file and stack traces.
It also seems like methods that merely forward from one place to another
are not very pythonic.

[4] We can compose the concrete classes by mixing together multiple
superclasses that each add one desirable behaviour.  So they end up
directly having in their namespace the implementation of each particular
method, but methods that deal with independent aspects of the class can be
implemented separately.  They cannot so easily be tested separately, but
perhaps it's not realistic or important to test these aspects separately
from their instantiation in a particular concrete class.  This should
allow new mixins to be added in future and old ones to be discarded.

They should be 'pure' mixins, ie inheriting from object (or from another
class of the same aspect) not from Branch.  This may avoid some problems
with diamond inheritance.

In this approach the concrete classes would hold only information
genuinely unique to that concrete class (ie their on-disk format string)
and then define a composition of the mixin classes.

In some cases we want the mixin to provide a new method that's not
defined, or defined as NotImplementedError on the base class.  In others
if there really is some guaranteed-to-be constant code on the base class
it may be better for them to define template methods.

It's good to have consistency across similar class structures in the
library.  The above is not consistent with any pattern we use to date but
it does seem like a promising way to make them more consistent with each
other.

So [4], building these classes from mixins, looks promising to me.  Here
are some things we should watch or consider:

  * Will it make it harder or easier to provide new implementations in
    plugins?

  * We shouldn't define something as just a mixin if it can have a
    meaningful existence on its own, as for instance a pack or lockdir
    can.  However, we could write mixins that hook them up to a larger
    object.

  * We need to think carefully about cases where the different aspects
    need to coordinate with each other during e.g. initialization or
    locking.  Will calling through super objects or standard protocols on
    the mixed object be enough?

-- 
Martin