rich roots conversion

Wed Apr 15 17:02:15 BST 2009

Robert Collins writes:
 > On Wed, 2009-04-15 at 22:28 +1000, Ian Clatworthy wrote:

 > > Among other reasons, an advantage of late calculation is that we
 > > can change the algorithm over time as our needs change. And this
 > > is *non-trivial* logic already as jam explained to me in
 > > Brisbane.

 > > If we didn't calculate it as part of every commit, how much
 > > faster would commit of a merge be, say? Do we *really* need
 > > per-file-graphs now given other advances we've made like the
 > > CHK-based format?
 > 
 > I can't answer this without serious benchmarking :).
 > 
 > > IIRC, poolie said that git didn't store per-file-graph metadata but
 > > hg did. Can anyone confirm that?
 > 
 > That's correct.

I'm mystified.  What is this "graph metadata" that Bazaar fusses over
like a starlet fusses over her makeup on Oscar night?

In git and Mercurial, there is *one* graph, and that is the commit
DAG.  In git, everything's simple.  A commit is a single object, it
refers to its parents and some commit-specific metadata (including a
tree), and that's that.  No "graph metadata."  In Mercurial, AIUI it's
a little bit harder (but not much) because the actual file ancestry
data is stored per file.  And because the merge algorithm is file-
oriented, rather than project-oriented, you get the benefit of a more
recent common ancestor for many files, at the cost of having to
compute the common ancestor for every file that differs rather than
once and for all for the whole project.  However, AFAICS git could
implement the Mercurial file-based merge.  Instead of tracing back in
the graph until the branches have a common ancestor, you just trace
back and test for equality of the file hash.  So keeping graph
information per-file is just an optimization, saving two SHA1
dereferences per revision in common ancestor searches.
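
To make the "trace back and test for equality of the file hash" idea
concrete, here is a minimal sketch in Python.  It is not git's or
Mercurial's actual code: the Commit class, the breadth-first walk, and
the names project_base and file_base are all made up for illustration,
and a tree is modelled as a plain dict from path to content hash.

```python
from collections import deque


class Commit:
    """Hypothetical commit object: parents plus a tree of path -> content hash."""

    def __init__(self, name, parents=(), tree=None):
        self.name = name
        self.parents = tuple(parents)
        self.tree = dict(tree or {})


def ancestors(commit):
    """Yield the commit and all its ancestors breadth-first, each exactly once."""
    seen = set()
    queue = deque([commit])
    while queue:
        c = queue.popleft()
        if c.name in seen:
            continue
        seen.add(c.name)
        yield c
        queue.extend(c.parents)


def project_base(a, b):
    """Whole-project base: first ancestor of b that is also an ancestor of a."""
    a_names = {c.name for c in ancestors(a)}
    for c in ancestors(b):
        if c.name in a_names:
            return c
    return None


def file_base(a, b, path):
    """Per-file base: first ancestor of b whose hash for `path` occurs in a's history."""
    a_hashes = {c.tree.get(path) for c in ancestors(a)}
    a_hashes.discard(None)  # commits that lack the file contribute nothing
    for c in ancestors(b):
        h = c.tree.get(path)
        if h is not None and h in a_hashes:
            return c
    return None


# Two branches off "root"; both made the same change to f, only b1 touched g.
root = Commit("root", tree={"f": "h1", "g": "g1"})
a1 = Commit("a1", [root], {"f": "h2", "g": "g1"})
b1 = Commit("b1", [root], {"f": "h2", "g": "g2"})
```

With these three commits, project_base(a1, b1) is root, while
file_base(a1, b1, "f") is b1 itself, because the content of f already
matches on both sides.  That is the "more recent common ancestor" a
per-file search can find, at the cost of one search per differing file.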

Either way, the DAG is *read*, not computed.  So what is it that
Bazaar needs to compute?  Well, OK.  Unlike git and Mercurial, which
arbitrarily label one parent the "mother" and then are heavily
biased toward matrilineal operation, Bazaar treats "my" branch
specially in logs and so on.  So I guess you need to keep track of a
branch id or ids for each revision.  Git doesn't bother, Mercurial
does.  But in Mercurial the branch ID on a revision never changes.  I
don't see why any of this information would change, no matter what you
merge into your branch in the future.
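
Here is roughly what I have in mind, as a small Python sketch.  The
Revision class, its branch field, and log_branch are hypothetical names
for illustration only; this is neither Mercurial's nor Bazaar's data
model.  The point is only that a branch label recorded at commit time
can be read back off the DAG, and later merges leave it alone.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Revision:
    """Hypothetical immutable revision: id, parents, and a branch label fixed at commit."""

    rev_id: str
    parents: tuple = ()
    branch: str = "default"


def log_branch(head, branch):
    """Read (not compute) the ids of revisions labelled `branch` reachable from head."""
    seen, stack, out = set(), [head], []
    while stack:
        r = stack.pop()
        if r.rev_id in seen:
            continue
        seen.add(r.rev_id)
        if r.branch == branch:
            out.append(r.rev_id)
        stack.extend(r.parents)
    return sorted(out)


r1 = Revision("r1", (), "mine")
r2 = Revision("r2", (r1,), "mine")
o1 = Revision("o1", (r1,), "other")

before = log_branch(r2, "mine")

# Merging the other branch creates one new revision; nothing old is rewritten.
m = Revision("m", (r2, o1), "mine")
after = log_branch(m, "mine")
```

After the merge, "after" is just "before" plus the new merge revision;
the labels on r1, r2 and o1 are exactly what they were.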

What else does a Bazaar revision need to inherit from its parent(s) or
child(ren), and why does that change from one invocation of bzr log to
the next one?

Steve