Some questions on internals

Mon Feb 22 07:12:56 GMT 2010

Frits Jalvingh wrote:
[...]
> Andrew, with tracking versions only, can you give me an idea on how a
> merge at file level is calculated? Say you have (wildly) diverged
> branches and a newer one has added heaps of code. Now you add one line
> in the "old" version at line 100, then merge it upward - where it
> appears at line 987 because lots of other commits changed code before it
> in the new version.

By default, by a 3-way merge of the versions.  bzr examines the
histories to find the most recent common ancestor of the two revisions,
which is designated the “BASE” version; the two revisions being merged
are designated THIS (the tree you are merging into) and OTHER (the
revision being merged in).  bzr then calculates the changes from BASE to
THIS and from BASE to OTHER, and combines them.  A conflict occurs when
both sets of changes touch the same region.  In your example the added
line is a change on one side only, so it is applied next to the same
surrounding BASE lines, wherever those lines now sit in the other
version.  (See also
<http://en.wikipedia.org/wiki/Merge_%28revision_control%29#Three-way_merge>.)

It's a little more complicated in the case of criss-cross merge
histories and the LCA merge algorithm, but I hope you get the idea.  The
code for bzr's 3-way merge is in bzrlib/merge3.py.
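
To make the idea concrete, here is a minimal sketch of a line-based
3-way merge.  It is not the code from bzrlib/merge3.py: it uses difflib
from the Python standard library, and the names sync_regions and
merge3_lines are made up for this illustration.  It finds the runs of
BASE lines that survive unchanged in both THIS and OTHER, then for each
gap between those runs takes whichever side changed, flagging a conflict
when both sides changed differently.

```python
from difflib import SequenceMatcher


def sync_regions(base, this, other):
    """Return runs of BASE lines that are unchanged in both THIS and OTHER.

    Each region is (b0, b1, t0, t1, o0, o1): base[b0:b1] equals
    this[t0:t1] and other[o0:o1].  A zero-length sentinel ends the list.
    """
    tm = SequenceMatcher(None, base, this, autojunk=False).get_matching_blocks()
    om = SequenceMatcher(None, base, other, autojunk=False).get_matching_blocks()
    regions = []
    it = io = 0
    while it < len(tm) and io < len(om):
        tbase, tpos, tlen = tm[it]
        obase, opos, olen = om[io]
        # Intersect the two matching blocks in BASE coordinates.
        lo = max(tbase, obase)
        hi = min(tbase + tlen, obase + olen)
        if lo < hi:
            regions.append((lo, hi,
                            tpos + lo - tbase, tpos + hi - tbase,
                            opos + lo - obase, opos + hi - obase))
        # Advance whichever block ends first in BASE.
        if tbase + tlen < obase + olen:
            it += 1
        else:
            io += 1
    regions.append((len(base), len(base), len(this), len(this),
                    len(other), len(other)))
    return regions


def merge3_lines(base, this, other):
    """Merge two descendants of BASE; return (merged_lines, conflict_count)."""
    merged, conflicts = [], 0
    zb = zt = zo = 0
    for b0, b1, t0, t1, o0, o1 in sync_regions(base, this, other):
        base_gap, this_gap, other_gap = base[zb:b0], this[zt:t0], other[zo:o0]
        if this_gap == other_gap or other_gap == base_gap:
            merged += this_gap       # same change on both sides, or only THIS changed
        elif this_gap == base_gap:
            merged += other_gap      # only OTHER changed
        else:
            conflicts += 1           # both changed the same region differently
            merged += (["<<<<<<< THIS"] + this_gap + ["======="]
                       + other_gap + [">>>>>>> OTHER"])
        merged += base[b0:b1]        # lines untouched by either side
        zb, zt, zo = b1, t1, o1
    return merged, conflicts


# The situation from the question, in miniature: OTHER has added lots of
# code before the spot where THIS adds a single line.
base = ["a", "b", "c"]
this = ["a", "b", "fix", "c"]
other = ["new1", "new2", "a", "b", "c", "new3"]
merged, conflicts = merge3_lines(base, this, other)
```

Here merged comes out as new1, new2, a, b, fix, c, new3 with no
conflicts: the added line follows "b" in the result even though "b" has
moved down in OTHER.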

But knowing how 'bzr merge' works doesn't really tell you how
annotations work.  After all, you can edit the files however you like
between running 'bzr merge' and 'bzr commit'.  And when you do commit
the new revision, bzr doesn't intrinsically try to record “this line came
from this version” (because it really doesn't know), it just records
“the new file contents are XYZ”.

Annotations are basically done by diffing one version of a file against
the previous version and seeing what changed, and doing this repeatedly
for every version of the file.  bzr uses bzrlib/patiencediff.py to
generate these diffs.  bzr does line-based diffing (rather than e.g.
word- or byte-based) for this.
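
As a rough sketch of that repeated diffing (again not bzr's code): the
function below walks a file's versions from oldest to newest, diffs each
one against its predecessor, and credits every line that did not match
to the version that introduced it.  It assumes a strictly linear history
and uses difflib as a stand-in for bzrlib/patiencediff.py; the name
annotate_lines and the revision ids are hypothetical.

```python
from difflib import SequenceMatcher


def annotate_lines(history):
    """Attribute each line of the newest version to a revision.

    history is a list of (rev_id, lines) pairs, oldest first.
    Returns a list of (rev_id, line) pairs for the newest version.
    """
    annotated = []    # (rev_id, line) pairs for the previous version
    prev_lines = []
    for rev_id, lines in history:
        # Assume every line is new in this revision...
        current = [(rev_id, line) for line in lines]
        # ...then carry over the attribution of lines the diff says
        # are unchanged from the previous version.
        matcher = SequenceMatcher(None, prev_lines, lines, autojunk=False)
        for i, j, n in matcher.get_matching_blocks():
            for k in range(n):
                current[j + k] = annotated[i + k]
        annotated, prev_lines = current, lines
    return annotated


history = [
    ("r1", ["a", "b"]),
    ("r2", ["a", "x", "b"]),
    ("r3", ["z", "a", "x", "b", "c"]),
]
result = annotate_lines(history)
```

With this history, "a" and "b" are credited to r1, "x" to r2, and "z"
and "c" to r3.  Because the diff is line-based, any edit within a line
makes the whole line count as new in that version.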

-Andrew.