[rfc] bzr-colo into core

Thu Mar 24 11:00:34 UTC 2011

John Arbash Meinel <john at arbash-meinel.com> writes:

> ...
>
>> The situation you describe there is pretty much what bzr-colo gives
>> you.  The main issue that comes out of this thread seems to be that
>> people like having multiple working trees all addressing branches
>> within the same repository: either creating and deleting those trees
>> as needed, or switching a small number of trees among several
>> branches.
>> 
>> I guess in hg you would do this by having a repository for each
>> interesting tree, and pushing/pulling between them?
>
> Actually, there are some very fundamental differences. A revision only
> ever belongs to a single named-branch. Ever. You can't pull the last
> revision in "default" to be the tip of "issue123".

Yes, changesets are on one branch only. They can become part of the
history of another branch when their branch is merged into it. So if
"issue123" is ready to be merged back to the mainline of development,
then you do

  $ hg update default
  $ hg merge issue123

> (so you don't really have 'mirror branches' as we call them. You just
> get a local named branch that matches the remote named branch.)

Named branches are not really local or remote, they just "are" :) What I
mean is that they are part of the changesets themselves, so you get a
named branch when you pull in changesets bearing that name.

I often think of them as "changeset labels" instead since that better
conveys how they work.
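
A toy Python model of that idea, not Mercurial's actual data
structures: each changeset records its own branch name, so the set of
branches in a repository is simply whatever names its changesets carry.
The Changeset class and the branches() and pull() helpers are made up
for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Changeset:
    node: str        # changeset identifier
    parents: tuple   # identifiers of the parent changesets
    branch: str      # branch name stored in the changeset itself


def branches(repo):
    """Branch names present in a repo (a dict mapping node -> Changeset)."""
    return {cs.branch for cs in repo.values()}


def pull(local, remote):
    """Copy changesets missing from local; their labels travel with them."""
    for node, cs in remote.items():
        local.setdefault(node, cs)


local = {"A": Changeset("A", (), "default")}
remote = {
    "A": Changeset("A", (), "default"),
    "B": Changeset("B", ("A",), "issue123"),
}
pull(local, remote)
print(sorted(branches(local)))  # ['default', 'issue123']
```

Nothing named "issue123" was created locally; the name simply arrived
with changeset B.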

> If you pull someone who has a named branch that matches your own named
> branch, you just get a "+1 heads" warning. Telling you that your local
> branch has multiple heads.

Yes -- named branches are supposed to be used for long-term branches
where the names stay stable over a long time. The names are global and
immutable, which is why you want to choose them with a view to the long
term.

> For example:
>
> upstream
> default: [A] --- [B]
>
> my repo
> default: [A] --- [B] --- [C]
>
> your repo
> default: [A] --- [B] --- [D]
>
> After I 'pull' your repo
> default: [A] --- [B] --- [C]
>                     \--- [D] #still on "default"
>
> That is, I believe, how 'hg merge' knows what revisions to merge. By
> default it merges all of the extra heads in the named branch you are
> currently working on.

Right, 'hg merge' will pick the other head on the current branch. It
aborts if there is more than one other head, and you then have to pick
one explicitly.
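
That rule can be sketched in Python roughly as follows. This is a
simplified model under stated assumptions, not Mercurial's
implementation: a repository is a dict mapping a node to (parents,
branch), the working directory is assumed to sit on a head, and
branch_heads() and pick_merge_target() are hypothetical names.

```python
def branch_heads(repo, branch):
    """Nodes on `branch` that have no child on the same branch."""
    on_branch = {n for n, (_, b) in repo.items() if b == branch}
    has_child = {p for n in on_branch for p in repo[n][0]}
    return on_branch - has_child


def pick_merge_target(repo, current):
    """Return the single other head on the current node's branch."""
    branch = repo[current][1]
    others = branch_heads(repo, branch) - {current}
    if len(others) != 1:
        raise ValueError(f"{len(others)} other heads on {branch!r}; pick one")
    return others.pop()


# John's example after the pull: C and D both descend from B on "default".
repo = {
    "A": ((), "default"),
    "B": (("A",), "default"),
    "C": (("B",), "default"),
    "D": (("B",), "default"),
}
print(pick_merge_target(repo, "C"))  # D
```

With a third head on "default" the function raises instead of guessing,
which mirrors the abort described above.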

> There are other oddities that stem from this. To *create* a branch,
> you have to commit (since otherwise the tip of the branch would be a
> revision on another branch).

Correct, the branch is not created before you make the first commit on
it. That is indeed a little puzzling. You can make an empty commit,
though, so it's not that bad:

  $ hg branch X
  $ hg commit -m 'Create branch X'

This always works, even if there are no modified files.

> To determine what the current branch tips are, I believe Mercurial
> actually walks the DAG. I *think* it stores the [A] revision in its
> branch file, which never changes. All children of that revision which
> are heads (not in another branch) are considered to be the tips of
> that name.

Mercurial maintains a branchcache where the heads of each branch are
stored.
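
The general idea of such a cache can be sketched in Python. This does
not reproduce the real branchcache format or update logic; it only
assumes a mapping from branch name to the set of current heads, updated
as each changeset is added. add_changeset() and branch_of are
hypothetical names.

```python
def add_changeset(cache, branch_of, node, parents, branch):
    """Update a {branch: set of head nodes} cache for one new changeset."""
    heads = cache.setdefault(branch, set())
    # A parent on the same branch stops being a head of that branch.
    for p in parents:
        if branch_of.get(p) == branch:
            heads.discard(p)
    heads.add(node)
    branch_of[node] = branch


cache, branch_of = {}, {}
history = [
    ("A", (), "default"),
    ("B", ("A",), "default"),
    ("C", ("B",), "default"),
    ("D", ("B",), "default"),
    ("X", ("C",), "issue123"),
]
for node, parents, branch in history:
    add_changeset(cache, branch_of, node, parents, branch)

print({b: sorted(h) for b, h in cache.items()})
# {'default': ['C', 'D'], 'issue123': ['X']}
```

Each update touches only the new changeset and its parents, so looking
up the heads of a branch never requires walking the whole DAG.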

> I don't really know how this scales when you have an Emacs branch with
> 100k revisions on "default". As such, they might do something
> different to track more-recent revisions. They might even update the
> branch file on every commit, and just sometimes store multiple values
> there for a given branch name.

It scales quite well because of the cache -- the OpenOffice repository
has 270k revisions, all on the "default" branch.

> That is just what I know of off-hand, for some very real differences
> between what Mercurial considers a "named branch" and what bazaar
> thinks of as a "Branch".
>
> John
> =:->

-- 
Martin Geisler

aragost Trifork
Professional Mercurial support
http://aragost.com/en/services/mercurial/blog/