Pushing after merge considered harmful

Eli Zaretskii eliz at gnu.org
Sun Feb 28 18:42:56 GMT 2010


> Date: Tue, 02 Feb 2010 22:07:14 +0200
> From: Eli Zaretskii <eliz at gnu.org>
> Cc: bazaar at lists.canonical.com
> 
> > From: Neil Martinsen-Burrell <nmb at wartburg.edu>
> > Date: Wed, 27 Jan 2010 21:43:19 +0000 (UTC)
> > 
> > > Anyway, here's a proposal: if the Bazaar docs team explains the effect
> > > of the most important commands in terms of a history DAG, I will sit
> > > down and write a version of it that will be understandable by
> > > non-programmers.  Deal?
> > 
> > Gauntlet picked up.  See
> > http://bazaar.launchpad.net/~nmb/bzr/dag-docs/annotate/head%3A/doc/en/dag-guide/index.txt
> > which will be most readable by doing
> > 
> > $ bzr branch lp:~nmb/bzr/dag-docs
> > $ cd dag-docs/doc/en
> > $ make html
> > 
> > and looking at _build/html/dag-guide/index.html
> 
> Just a short notice: I have not forgotten, nor have I crawled back under
> the rock I came from.  I just haven't had time to work on this yet.
> But I will.

I finally had my "rainy day", so here's the first cut.

I am deliberately posting it here as a separate document, not as diffs
to the original one.  That's because the changes, especially at the
beginning, are extensive and add quite a few paragraphs about stuff
that was not in the original.  I felt that people who are afraid of
DAGs might not fully understand why a structure such as a DAG is needed
in a DVCS.  So I started with a short explanation of a fundamental
difference between a traditional VCS and a DVCS.

In addition to the existing images, the new text needs one more: a
linear sequence of revisions in a traditional VCS.  I don't have tools
handy to produce such an image, but I reckon it won't be hard to add.

That's it!  Let me know if you like the results:

--------------------------------------------------------------------
==========================
The History and the Bazaar
==========================

.. warning::

    This is advanced documentation of the Bazaar version control system and
    its underlying conceptual model.  It is not intended as a first
    introduction to the system.  However, it can provide a deeper level of
    understanding of the structures that underlie it and thus it may be
    helpful once you are familiar with using Bazaar for version control.

Bazaar's History Representation
===============================

Bazaar is a `distributed version control system` (DVCS) and, like all
such systems, it makes it very easy to fork development off the
`official` series of revisions, known as the `mainline`, and later
combine the forked-off revisions with the mainline.  We call the
forked-off series of revisions a `branch` and the process of combining
a branch with the mainline a `merge`.

Traditional, centralized version control systems, such as CVS or SVN,
do not have integral support for branching and merging.  They consider
the history of a package as a linear sequence of revisions:

   [insert here an image of a linear history]

In this model, a new revision can only be committed if it is based on
the latest revision in the linear history recorded in the repository.
If the developer started from a revision that is no longer the latest by
the time she wants to commit, she needs to manually synchronize her code
with the latest revision in the repository, by applying all the
changes made between the revision from which she started and the
current latest revision.  Only then can she commit without destroying
the work of others.  This manual labor is what makes distributed
development hard with traditional VCSs.

A DVCS makes all this much easier by tracking branching and merging
between revisions in the repository.  This is central to the support
of a distributed development model built into every DVCS, whereby a
number of separate teams independently develop features of a package.

To track branching and merging, a DVCS needs to be built around a
proper representation of the relations between revisions in the
repository.  In contrast with traditional VCSs, a DVCS can no longer
represent the history as a line; it needs a non-linear representation
that can express branching and merging of revisions.

This non-linear representation of revision history can be illustrated
as a set of revisions and interconnections between these revisions.
The interconnections have a direction: they point from each revision
to one or more of its parents.  A `parent` is the revision from which
a `child` revision was produced.  If the child revision was produced
by editing some of the files in the working tree, that child revision
will have a single parent -- the revision which was edited.  If the
child revision was produced by merging two divergent paths of
development, it will have two parents, one from each of the two
branches that were merged.

.. image:: dag_basics.png
    :scale: 50

(Note: Those of you who have some background in computer science will
recognize that this structure is a `directed acyclic graph` (DAG), in
which revisions serve as `nodes` and interconnections are `edges`.
Directed acyclic graphs provide a natural representation of historical
relationships between revisions in a DVCS.)
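
As a minimal sketch of this structure, the history can be modelled in
Python as a mapping from each revision to its parents.  The revision
names and the ``parents`` mapping are made up for illustration; this is
not how Bazaar stores history internally.

```python
# Hypothetical history: each revision maps to the list of its parents.
# The names "A".."E" are illustrative only, not real Bazaar revision ids.
parents = {
    "A": [],          # first revision, no parent
    "B": ["A"],       # ordinary commit on top of A
    "C": ["B"],       # one line of development continues from B
    "D": ["B"],       # another line of development forks off B
    "E": ["C", "D"],  # merge revision: two parents
}


def ancestors(revision, parents):
    """Return every revision reachable by following parent edges."""
    seen = set()
    stack = list(parents[revision])
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


print(sorted(ancestors("E", parents)))  # ['A', 'B', 'C', 'D']
print(sorted(ancestors("D", parents)))  # ['A', 'B']
```

Because every edge points from a child to an older revision, following
parents always terminates: that is the "acyclic" part of the name.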

Note that, unlike the linear history in the traditional VCSs, the DVCS
history no longer has a simply defined order: there's no `latest`
revision anymore, strictly speaking.

Each path back through the history is a different line of development.
Uniquely among DVCSs, Bazaar gives a special role to the `mainline` of
development, defined as the path that follows the lefthand (first) parent
of every revision.  This will have implications later on as we consider
branching and merging operations.  It also introduces an additional
complication to the history representation itself, since the parents of
a revision now form an *ordered* collection, not just a set.
Additionally, Bazaar keeps track of the current revision using a `head
pointer` that identifies one revision as the `head` of the history.
(This head revision, also known as the `tip`, and the development
mainline are a partial compensation in Bazaar for the loss of the strict
order present in traditional VCSs.)
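
The mainline can be sketched in a few lines of Python.  This is only an
illustration under assumed names (``parents``, ``head``, ``mainline``);
the parents of each revision are kept in a list whose first element is
the lefthand parent.

```python
# Hypothetical history; parents are an ordered list, lefthand parent first.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "D": ["B"],
    "E": ["C", "D"],  # lefthand parent C, additional parent D
}
head = "E"  # the head pointer (tip) of this branch


def mainline(head, parents):
    """Follow lefthand parents from the head back to the first revision."""
    line = []
    current = head
    while current is not None:
        line.append(current)
        revision_parents = parents[current]
        current = revision_parents[0] if revision_parents else None
    line.reverse()  # oldest revision first
    return line


print(mainline(head, parents))  # ['A', 'B', 'C', 'E']
```

Revision D is part of the history of E, but it is not on the mainline,
because it is reached only through a non-lefthand parent.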

Bazaar assigns each revision a `revision number` or `revno`, which is
a simple integer or a sequence of integers separated by periods.
(Note that a revno is distinct from the revision's unique identifier,
called a `revision id`.)  While the set of revisions in the history of
a package is generally unordered, each line of development (branch)
does have an order.  To convey something about the ordering of
revisions in each branch, Bazaar assigns the numbers in ascending
order to successive revisions on each branch.

The process of assigning revision numbers is as follows.  First, follow the
mainline back to the first revision and assign numbers 1, 2, 3, and so on
along the mainline.  For revisions that are not on the mainline, follow their
lefthand parents back until they reach the mainline.  If this happens at
revision `n`, then the revno of the first revision on the branch is n.1.1.
For later revisions on the branch, increment the last number: n.1.2, n.1.3,
etc.  For other branches starting from the same revision, increment the
penultimate number: n.2.1, n.3.1, etc.  For branches from other branches,
Bazaar appends two more numbers: n.1.2.1.1, etc.
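
The numbering rules above can be sketched as follows.  This is a
simplified illustration, not Bazaar's actual algorithm: the function
name ``assign_revnos`` and the sample history are hypothetical, branches
from other branches are not handled, and the order in which branches
from the same revision receive their counter (here: the order in which
the mainline merges them) is an assumption of the sketch.

```python
# Hypothetical history; lefthand parent first in each list.
parents = {
    "A": [],
    "B": ["A"],
    "C": ["B"],
    "X": ["B"],       # first revision of a branch forked from B
    "Y": ["X"],       # second revision on that branch
    "D": ["C", "Y"],  # mainline merges the first branch
    "Z": ["B"],       # a second branch forked from B
    "E": ["D", "Z"],  # mainline merges the second branch
}


def assign_revnos(head, parents):
    """Assign dotted revnos; assumes no branches from other branches."""
    # Step 1: number the mainline 1, 2, 3, ...
    mainline = []
    current = head
    while current is not None:
        mainline.append(current)
        current = parents[current][0] if parents[current] else None
    mainline.reverse()
    revnos = {rev: (i,) for i, rev in enumerate(mainline, start=1)}

    branch_count = {}  # mainline revno n -> number of branches seen so far

    # Step 2: visit merged-in tips in mainline order, oldest first.
    for rev in mainline:
        for tip in parents[rev][1:]:
            # Walk lefthand parents back to an already numbered revision.
            # (Assumes the walk always reaches one.)
            chain = []
            current = tip
            while current not in revnos:
                chain.append(current)
                current = parents[current][0]
            if not chain:
                continue  # tip was numbered earlier, nothing to do
            chain.reverse()  # oldest unnumbered revision first
            base = revnos[current]
            if len(base) == 1:
                # Diverged from mainline revision n: start branch n.k.1
                n = base[0]
                branch_count[n] = branch_count.get(n, 0) + 1
                prefix, start = (n, branch_count[n]), 1
            else:
                # Continues a branch that was partly merged before
                prefix, start = base[:2], base[2] + 1
            for offset, branch_rev in enumerate(chain):
                revnos[branch_rev] = prefix + (start + offset,)

    return {rev: ".".join(map(str, no)) for rev, no in revnos.items()}


revnos = assign_revnos("E", parents)
print(revnos["D"], revnos["X"], revnos["Y"], revnos["Z"])
# 4 2.1.1 2.1.2 2.2.1
```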

Operations
==========

We will try to explain here with words and pictures the effects of each of
Bazaar's basic operations on the revision history representation.  In the
pictures, revision numbers are given with the prefix `r` and the head
of the history is labeled with HEAD.

Commit
------

A commit is the operation that creates a new revision from the current state
of the working tree (or some portion thereof) and connects it to the existing
revision history.  In the most common case, a new revision is
created whose parent is the current head.  The head pointer is then
updated so that the new revision is the head.  After a merge (see
`Merge`_), the new revision can have more than one parent.

.. image:: commit.png
    :scale: 50
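
The common case of a commit can be sketched like this.  The ``History``
class is a toy model invented for this illustration; it is not part of
Bazaar, and it ignores file contents entirely.

```python
class History:
    """Toy model of one branch: a parent map plus a head pointer."""

    def __init__(self):
        self.parents = {}  # revision -> ordered list of parents
        self.head = None   # no revisions yet

    def commit(self, revision):
        """Record a new revision whose parent is the current head."""
        self.parents[revision] = [self.head] if self.head is not None else []
        self.head = revision  # the new revision becomes the head


history = History()
history.commit("r1")
history.commit("r2")
history.commit("r3")
print(history.head)           # r3
print(history.parents["r3"])  # ['r2']
print(history.parents["r1"])  # []
```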

Branch
------

In terms of the history representation, a branch is just another
opportunity to make a line of development.  It provides an additional
head pointer for keeping track of that line of development.  We can
show this diagrammatically by labeling a different revision BRANCH
rather than HEAD, meaning the head of that branch.  Branching is also
where Bazaar's dotted revision numbers come into play.

.. image:: branch.png
    :scale: 50

Bazaar's concepts of revision numbers and HEAD are relative to the current
branch.  So, although the new revision above was labeled BRANCH, if we were
working with that branch, the same revision would be labeled HEAD.
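
To illustrate that numbering is relative to the branch, here is a small
sketch with two head pointers over the same set of revisions.  The
names ``heads`` and ``mainline_revnos`` and the sample revisions are
assumptions made for this example only.

```python
# Hypothetical history shared by two branches; lefthand parent first.
parents = {
    "a": [],
    "b": ["a"],
    "c": ["b"],  # committed on the trunk
    "d": ["b"],  # committed on the feature branch
}
heads = {"trunk": "c", "feature": "d"}  # one head pointer per branch


def mainline_revnos(head, parents):
    """Number the revisions found by following lefthand parents from head."""
    line = []
    current = head
    while current is not None:
        line.append(current)
        current = parents[current][0] if parents[current] else None
    line.reverse()  # oldest revision first
    return {revision: number for number, revision in enumerate(line, start=1)}


for name, head in heads.items():
    print(name, mainline_revnos(head, parents))
# trunk {'a': 1, 'b': 2, 'c': 3}
# feature {'a': 1, 'b': 2, 'd': 3}
```

Seen from its own branch, revision ``d`` is simply revision 3, just as
``c`` is revision 3 when seen from the trunk.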

Merge
-----

Merge is an operation that applies to the current working tree the changes
made in another revision (possibly with some conflicts).  It also records
the fact that the other revision should be linked as an additional parent
the next time a revision is committed.

.. image:: merge.png
    :scale: 50
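
The two-step nature of a merge (first record the extra parent, then
commit) can be sketched as follows.  ``ToyTree`` is a hypothetical class
written for this illustration; it does not model file contents or
conflicts, and it is not Bazaar's API.

```python
class ToyTree:
    """Toy working tree: a history, a head, and pending merge parents."""

    def __init__(self, parents, head):
        self.parents = dict(parents)  # revision -> ordered list of parents
        self.head = head
        self.pending_merges = []      # extra parents for the next commit

    def merge(self, other_revision):
        """Remember the other revision; no new revision is created yet."""
        self.pending_merges.append(other_revision)

    def commit(self, revision):
        """Create a revision: head first, then any pending merge parents."""
        self.parents[revision] = [self.head] + self.pending_merges
        self.head = revision
        self.pending_merges = []


tree = ToyTree({"A": [], "T": ["A"], "F": ["A"]}, head="T")
tree.merge("F")
print(tree.head)          # T  (merge alone does not move the head)
tree.commit("M")
print(tree.parents["M"])  # ['T', 'F']
```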

Because of Bazaar's special treatment of the lefthand parent, the merge
operation is not symmetric.  The revision that was the HEAD of the current
branch before the merge is the revision that will become the lefthand parent,
while the revision that was merged *into* the current branch will become an
additional parent.  This means that the following histories are not equivalent

.. image:: merge-asymmetry.png
    :scale: 50

because the first one merges the trunk into the branch (note which parent is
on the left) to create the new revision, while the second merges the branch
into the trunk to create the new revision.  Since revision numbers are
determined using lefthand ancestors, the revision numbers in these two
histories are not the same.
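
The asymmetry can be made concrete with a short sketch.  The helper
names ``merge_and_commit`` and ``mainline`` and the revisions are
hypothetical; the only point is that swapping the parent order changes
which revisions lie on the mainline.

```python
def mainline(head, parents):
    """Follow lefthand parents from head back to the first revision."""
    line = []
    current = head
    while current is not None:
        line.append(current)
        current = parents[current][0] if parents[current] else None
    line.reverse()  # oldest revision first
    return line


def merge_and_commit(parents, current_head, other_head, new_revision):
    """Return a copy of the history with a merge revision added.

    The head of the current branch becomes the lefthand parent.
    """
    result = dict(parents)
    result[new_revision] = [current_head, other_head]
    return result


# T is the tip of the trunk, F the tip of a feature branch; both fork from B.
base = {"A": [], "B": ["A"], "T": ["B"], "F": ["B"]}

into_trunk = merge_and_commit(base, "T", "F", "M")   # run on the trunk
into_branch = merge_and_commit(base, "F", "T", "M")  # run on the branch

print(mainline("M", into_trunk))   # ['A', 'B', 'T', 'M']
print(mainline("M", into_branch))  # ['A', 'B', 'F', 'M']
```

Both histories contain exactly the same revisions, but in the first one
F is off the mainline and in the second one T is.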

Push
----

Pull
----

Uncommit
--------


