VCS comparison table

Linus Torvalds torvalds at osdl.org
Sun Oct 22 01:47:40 BST 2006



On Sat, 21 Oct 2006, Jeff Licquia wrote:
> 
> You know what?  It occurs to me that much of the problem with git
> branches vs. bzr branches might be solved when bzr gets proper tagging
> support.  Because, after all, aren't branches more like special tags in
> git?

Both branches _and_ tags in git are 100% the same thing: they're just 
shorthand for the commit name. That's _literally_ all they are. They are a 
symbolic name for a 160-bit SHA1 hash.

So yes, you can say that branches are like special tags, or that 
(unsigned) tags are like special branches. There's no real "technical" 
difference: in both cases, it's just an arbitrary name for the top commit.
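
As a minimal sketch of that point (the throwaway repository, the branch name "topic", the tag name "v0.1" and the committer identity are all made up for illustration), a branch and an unsigned tag created on the same commit resolve to the same SHA1:

```shell
set -e
repo=$(mktemp -d)            # throwaway repository, purely for illustration
cd "$repo"
git init -q
# The identity below is a placeholder so the commit works without global config.
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "first commit"

git branch topic             # a branch name for the current commit
git tag v0.1                 # an unsigned tag for the same commit

branch_sha=$(git rev-parse topic)
tag_sha=$(git rev-parse v0.1)
head_sha=$(git rev-parse HEAD)
echo "topic -> $branch_sha"
echo "v0.1  -> $tag_sha"
```

Both lines should print the same hash, since each name is only a symbolic name for that commit.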

However, there are some purely UI differences between tags and branches, 
which really don't affect any of the "name->SHA1" translation at all, but 
which affect how you can _use_ a tag-name vs a branch-name.

 - A branch is always a pointer to a _commit_ object.

   In contrast, a tag can point to anything. It can point to a tree (and 
   that means that you can do _diff_ between a tag and a branch, but such 
   a tree doesn't have any "history" associated with it - it's purely 
   about a certain "state", so you cannot say that it has a parent or 
   anything like that).

   A tag can also point to a single file object ("blob": pure file 
   content), which is something that the git.git repository uses to point 
   to the GPG public key that Junio uses to sign things, for example.

   But perhaps more commonly, a tag can also point to a special "tag" 
   object, which is just a form of indirection that can optionally contain 
   an explanation and a digitally signed verification. When I cut a kernel 
   release, for example, my tags don't point to the commit that is the 
   release commit, they point to a GPG-signed tag-object that in turn 
   points to the commit. 

   With those signed tags, people can verify (if they get my public key) 
   that a particular release was something I did. And due to the 
   cryptographic nature of the hash, trusting the tag object also means 
   that you can trust the commit it points to, and the whole history that 
   commit points to.

   So while from a _revision_lookup_ standpoint a "branch" and a "tag" do 
   100% the same thing, we put some limitations on branches: they always 
   have to point to a commit.

 - Thanks to the limitation on branches being commits, branches can be 
   "checked out", which is to say that you can make one the active working 
   tree state. You cannot "check out" a tag: you need to have a branch 
   that you check out and can do development on.  So a "tag" is considered 
   purely a stationary pointer: it cannot be committed to, and it cannot 
   participate directly in development.

   This literally has nothing to do with looking up the SHA1 name 
   associated with a tag or a branch, this is _purely_ an agreed-upon 
   convention (that is enforced by higher-level commands like "git 
   checkout"). So if you want to check out the state as of some tag, you 
   must always do it within the confines of some branch.

   So for example, you could do

	git checkout -b newbranch v2.6.18

   which uses a tag ("v2.6.18") to define where to start the branch, and 
   then creates a branch called "newbranch" and checks that out. That's 
   purely shorthand for

	git branch newbranch v2.6.18	# create 'newbranch', initialize 
					# it at v2.6.18

	git checkout newbranch		# make 'newbranch' our currently 
					# active branch

   but you are _not_ allowed to do

	git checkout v2.6.18

   because that would leave you with a situation where your "top-of-tree" 
   is a tag, and you couldn't do any development on it because you don't 
   have a branch to develop _on_.
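
The first point above, that a tag can name any kind of object while a branch always names a commit, can be sketched as follows. This is only an illustration: the repository, the file name and the tag names are hypothetical, and the tag object here is annotated but not GPG-signed, so that no key is needed to run it.

```shell
set -e
repo=$(mktemp -d)            # throwaway repository, purely for illustration
cd "$repo"
git init -q
echo "hello" > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "first commit"

git tag commit-tag HEAD              # tag naming a commit directly
git tag tree-tag   "HEAD^{tree}"     # tag naming a tree: a state, no history
git tag blob-tag   "HEAD:file.txt"   # tag naming a blob: pure file content
# Tag naming a "tag" object, which in turn points to the commit.
git -c user.name=Example -c user.email=example@example.com \
    tag -a -m "annotated, not signed" object-tag HEAD

for t in commit-tag tree-tag blob-tag object-tag; do
    echo "$t: $(git cat-file -t "$t")"
done

# Follow the indirection from the tag object back to the commit.
peeled=$(git rev-parse "object-tag^{commit}")
```

Each line of output pairs a tag name with the type of object it names. Only the last one goes through the extra level of indirection, and "object-tag^{commit}" follows it back to the commit.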

But all of these kinds of differences between tags and branches are really 
not "core technology" and are purely about having adopted a convention. It 
is literally about just having certain "usage rules" for specific 
"symbolic namespaces".

"branch" and "tag" are just the normal namespaces git gives you and always 
has. You can have others too (and you can define your own) and those names 
will automatically be used for lookup by all the basic git tools. Git 
won't _touch_ those names in any other way, but it means that you can 
create your own tools around git that have their own rules about how the 
names are managed, and you can still use them for lookup.

For example, you could have a "svn" namespace for a project imported from 
svn, and that namespace would contain the SVN revision names for the 
project, so that you could do

	git diff svn/56..

to see the difference between "svn revision 56" and your current HEAD, 
without necessarily polluting the "real" git tag namespace.
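
A minimal sketch of how such a namespace could be populated by hand, assuming a made-up mapping of "svn revision 56" to the first commit of a throwaway repository. The "refs/svn/" location and all names here are hypothetical, matching the made-up example rather than any real importer:

```shell
set -e
repo=$(mktemp -d)            # throwaway repository, purely for illustration
cd "$repo"
git init -q
echo "one" > file.txt
git add file.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -m "state as of svn revision 56"
old_sha=$(git rev-parse HEAD)

# Record the name under refs/svn/ instead of refs/tags/.
git update-ref refs/svn/56 "$old_sha"

echo "two" > file.txt
git -c user.name=Example -c user.email=example@example.com \
    commit -q -a -m "later work"

svn_sha=$(git rev-parse svn/56)      # looked up like any other name
tag_list=$(git tag -l)               # stays empty: no regular tag was made
git --no-pager diff --stat svn/56..  # "svn revision 56" versus current HEAD
```

The name is usable for lookup by the ordinary tools, while the list of regular tags stays empty.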

(Which can matter, since some commands take arguments like "--tags", which 
just collects all the regular tags - so you might not want to use normal 
tags to remember your SVN revision mapping, even if it might technically 
be fine).

(The above was a totally made-up example. I don't think any of the svn 
importers actually do anything like that: but we do use a few other 
"namespaces" internally: "git bisect" puts the bisection results in the 
"bisect" namespace, and the "remotes" namespace can be used to track 
remote heads as something _different_ than a local branch - so that you 
won't check such a "remote branch" out directly by mistake)
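
To illustrate the "remotes" namespace being kept apart from local branches, here is a small sketch. The ref "refs/remotes/origin/topic" is created by hand as a stand-in for what a fetch would normally record, and all names are assumptions for the example:

```shell
set -e
repo=$(mktemp -d)            # throwaway repository, purely for illustration
cd "$repo"
git init -q
git -c user.name=Example -c user.email=example@example.com \
    commit -q --allow-empty -m "first commit"
sha=$(git rev-parse HEAD)

# Hand-made ref standing in for a tracked remote head.
git update-ref refs/remotes/origin/topic "$sha"

local_branches=$(git for-each-ref --format='%(refname)' refs/heads/)
remote_branches=$(git for-each-ref --format='%(refname)' refs/remotes/)
resolved=$(git rev-parse origin/topic)   # still usable for lookup
echo "local:  $local_branches"
echo "remote: $remote_branches"
```

The remote head can be looked up by name, but it does not show up among the local branches.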

			Linus



