VCS comparison table

Andreas Ericsson ae at op5.se
Tue Oct 17 08:50:54 BST 2006


Aaron Bentley wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
> 
> Linus Torvalds wrote:
>> On Mon, 16 Oct 2006, Aaron Bentley wrote:
>>> Bazaar's namespace is "simple" because all branches can be named by a
>>> URL, and all revisions can be named by a URL + a number.
> 
>> I pretty much _guarantee_ that a "number" is not a valid way to uniquely 
>> name a revision in a distributed environment, though. I bet the "number" 
>> really only names a revision in one _single_ repository, right?
> 
> Right.  That's why I said all revisions can be named by a URL + a
> number, because it's the combination of the URL + a number that is
> unique.  (In bzr, each branch has a URL.)
> 

The revision number will differ between repos though, so random
contributor A, who hasn't published his repo and has to send patches,
can't record his exact problem revision anywhere. That makes it hard
for random contributor B, who runs into a similar problem on a
different project some time later, to find the offending code. I prefer
the git way, but I'm a git user and probably biased.

That said, it shouldn't be impossible to add fixed, user-friendly 
bazaar-like revision numbers to git. We just have to mirror the
<committish>[^~]<number> syntax so that it also accepts
<committish>+<number>, counting forwards instead of backwards.

This would work marvelously with serial development but breaks horribly 
with merges unless the first (or last) commit on each new branch gets 
given a tag or some such.
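
To make that concrete, here is a rough sketch on a toy commit graph. It
is only an illustration under my own assumptions: the "+" operator is
the hypothetical syntax suggested above, not something git accepts, and
the function names (back, forward, build_children) are made up for the
example.

```python
# Toy model of a commit graph: each commit maps to its list of parents.
# "forward" models the hypothetical <committish>+<number> syntax; it is
# not git's implementation and "+" is not existing git syntax.

def build_children(parents):
    """Invert a {commit: [parent, ...]} map into {commit: [child, ...]}."""
    children = {commit: [] for commit in parents}
    for commit, plist in parents.items():
        for p in plist:
            children[p].append(commit)
    return children


def back(parents, commit, n):
    """Like <commit>~<n>: follow the first parent n times."""
    for _ in range(n):
        if not parents[commit]:
            raise ValueError(f"{commit} has no parent")
        commit = parents[commit][0]
    return commit


def forward(parents, commit, n):
    """Hypothetical <commit>+<n>: follow the only child n times."""
    children = build_children(parents)
    for _ in range(n):
        kids = children[commit]
        if len(kids) != 1:
            # Zero children: we ran off the tip. Several children:
            # history forked here and "+1" has no single answer.
            raise ValueError(f"{commit} has {len(kids)} children")
        commit = kids[0]
    return commit


# Serial development: a <- b <- c
linear = {"a": [], "b": ["a"], "c": ["b"]}

# Two branches fork from a and are merged in m.
forked = {"a": [], "b": ["a"], "c": ["a"], "m": ["b", "c"]}
```

On the linear history, forward(linear, "a", 2) resolves to "c". On the
forked one, forward(forked, "a", 1) raises, because "a" has two
children. Going backwards never has that problem, since every commit
knows its own parents.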

Either way, I'm fairly certain both bazaar and git need to distribute 
information to the user who has to find the revision (which URL and 
which number vs. which SHA). I also imagine that the bazaar users, just 
like the git users, are sufficiently apt copy-paste people to never 
actually read the prerequisite information.

> 
>> If you give 
>> the SHA1 name, it's well-defined even between different repositories, and 
>> you can tell somebody that "revision XYZ is when the problem started", and 
>> they'll know _exactly_ which revision it is, even if they don't have your 
>> particular repository.
> 
> When two people have copies of the same revision, it's usually because
> they are each pulling from a common branch, and so the revision in that
> branch can be named.  Bazaar does use unique ids internally, but it's
> extremely rare that the user needs to use them.
> 

Well, if two people have the same revision in git, you *know* they have 
pulled from each other, because ALL objects are immutable. The point of 
"naming" the revision is moot, because it's something all SCMs can do.


>> Now _that_ is true simplicity. It does automatically mean that the names 
>> are a bit longer, but in this case, "longer" really _does_ mean "simpler".
>>
>> If you want a short, human-readable name, you _tag_ it. It takes all of a 
>> hundredth of a second to to or so.
> 
> But tags have local meaning only, unless someone has access to your
> repository, right?
> 

I imagine the bazaar names with URL+number only have local meaning 
unless someone has access to your repository too. One of the great 
benefits of git is that each revision is *always exactly the same* no 
matter in which repository it appears. This includes file content, 
filesystem layout and, last but also most important, history.
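
A small sketch of why that holds: a git object's name is the SHA-1 of a
short header plus the object's content, and a commit's content lists
its tree and its parents. The helper names below (git_object_id,
commit_id) and the author string are made up for illustration, and the
commit layout is simplified, so treat this as a sketch rather than
git's actual code.

```python
import hashlib


def git_object_id(kind, body):
    """SHA-1 over "<kind> <length>\\0" followed by the raw body bytes."""
    header = f"{kind} {len(body)}".encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


def commit_id(tree, parents, author, message):
    """Simplified commit object: tree, parents, author, committer, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]
    lines += [f"author {author}", f"committer {author}", "", message]
    return git_object_id("commit", "\n".join(lines).encode())


# Hypothetical identity and timestamp, used only for this example.
who = "A U Thor <author@example.com> 1161071454 +0100"

tree = git_object_id("tree", b"")  # an empty tree
root = commit_id(tree, [], who, "initial\n")

# The same change recorded on top of two different histories.
other_root = commit_id(tree, [], who, "a different start\n")
child = commit_id(tree, [root], who, "fix the bug\n")
child_elsewhere = commit_id(tree, [other_root], who, "fix the bug\n")
```

Anyone who builds the same objects gets the same names, whichever
repository they do it in. Because each parent id is part of the hashed
content, child and child_elsewhere get different names even though
tree, author and message are identical: the name covers the history as
well as the content.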


>>>> About "checkouts", i.e. working directories with repository elsewhere:
>>>> you can use GIT_DIR environmental variable or "git --git-dir" option,
>>>> or symlinks, and if Nguyen Thai Ngoc D proposal to have .gitdir/.git
>>>> "symref"-like file to point to repository passes, we can use that.
>>> It sounds like the .gitdir/.git proposal would give Git "checkouts", by
>>> our meaning of the term.
>> Well, in the git world, it's really just one shared repository that has 
>> separate branch-namespaces, and separate working trees (aka "checkouts"). 
>> So yes, it probably matches what bazaar would call a checkout.
> 
> The key thing about a checkout is that it's stored in a different
> location from its repository.  This provides a few benefits:
> 
> - you can publish a repository without publishing its working tree,
>   possibly using standard mirroring tools like rsync.
> 

Can't all SCMs do this?

> - you can have working trees on local systems while having the
>   repository on a remote system.  This makes it easy to work on one
>   logical branch from multiple locations, without getting out of sync.
> 

This I'm not so sure about. Anyone wanna fill in how shallow clones and 
all that jazz work?

> - you can use a checkout to maintain a local mirror of a read-only
>   branch (I do this with http://bazaar-vcs.com/bzr/bzr.dev).
> 

Check. Well, actually, you just clone it as usual but with the --bare 
argument and it won't write out the working tree files.

>> Almost nobody seems to actually use it that way in git - it's mostly more 
>> efficient to just have five different branches in the same working tree, 
>> and switch between them. When you switch between branches in git, git only 
>> rewrites the part of your working tree that actually changed, so switching 
>> is extremely efficient even with a large repo.
> 
> You can operate that way in bzr too, but I find it nicer to have one
> checkout for each active branch, plus a checkout of bzr.dev.  Our switch
> command also rewrites only the changed part of the working tree.
> 

Works in git as well, but each "checkout" (actually, locally referenced 
repository clone) gets a separate branch/tag namespace.

-- 
Andreas Ericsson                   andreas.ericsson at op5.se
OP5 AB                             www.op5.se
Tel: +46 8-230225                  Fax: +46 8-230231



