VCS comparison table

Linus Torvalds torvalds at
Mon Oct 23 20:18:06 BST 2006

On Mon, 23 Oct 2006, Jelmer Vernooij wrote:
> Bzr stores a checksum of the commit separately from the revision id in
> the metadata of a revision. The revision is not used by itself to check
> the integrity of a revision.

That wasn't what I was trying to aim at - the problem is that the bzr 
revision ID isn't "safe" in itself. Anybody can create a revision with the 
same names - and they may both have checksums that match their own 
revision, but you have no idea which one is "correct".

So you just have to trust the person that generates the name, to use a 
proper name generation algorithm. You have to _trust_ that your 64-bit 
random number really is random, for example. And that nobody is trying to 
mess with your repo.

This isn't a problem in normal behaviour, but it's a problem in an attack 
scenario: imagine somebody hacking the central server, and replacing the 
repository with something that had all the same commit names, but one of 
the revisions was changed to introduce a nasty backdoor problem. Change 
all the checksums to match too..

It would _look_ fine to somebody who fetches an update, and the maintainer 
might not ever even notice (because he wouldn't send the _old_ revision 
again, and _his_ tree would be fine, so he'd happily continue to send 
out new revisions on top of the bad one on the public site, never even 
realizing that people are fetching something that doesn't match what he is 
sending out).
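
A rough sketch of that attack, as a toy model only: this is not bzr's 
actual storage format, and the revision id and helper names below are 
made up for illustration. It just shows that when the id is chosen 
independently of the content, two different revisions can share an id and 
each still pass its own checksum test.

```python
import hashlib


def checksum(content: bytes) -> str:
    """Checksum stored alongside the revision (SHA-1 chosen arbitrarily here)."""
    return hashlib.sha1(content).hexdigest()


def make_revision(rev_id: str, content: bytes) -> dict:
    # The id is picked by whoever creates the revision; it does not
    # depend on the content in any way.
    return {"id": rev_id, "content": content, "checksum": checksum(content)}


def verify(rev: dict) -> bool:
    """Check a revision against its own stored checksum."""
    return checksum(rev["content"]) == rev["checksum"]


# Hypothetical revision id, purely for the example.
original = make_revision("alice-20061023-0001", b"safe code")
tampered = make_revision("alice-20061023-0001", b"safe code + backdoor")

# Both revisions are internally consistent and carry the same id, so a
# client that only checks id + checksum cannot tell which is "correct".
print(verify(original), verify(tampered), original["id"] == tampered["id"])
```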

In contrast, in git, if you replace something in a git repository, the 
name changes, and if I were to try to push an update on top of a broken 
repo like that, it simply wouldn't work - I couldn't fast-forward my own 
branch, because it's no longer a proper subset of what I'm trying to send.
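
To make the fast-forward argument concrete, here is a minimal sketch, 
assuming a drastically simplified commit format (real git commits also 
hash a tree, author, message and so on; the function names are 
hypothetical). Each name is a hash over the parent's name plus the 
content, so changing any old revision changes every name after it.

```python
import hashlib


def commit_name(parent: str, content: bytes) -> str:
    """Name a commit by hashing its parent's name together with its content."""
    return hashlib.sha1(parent.encode() + b"\0" + content).hexdigest()


def build_history(contents):
    """Return the list of commit names for a linear history, oldest first."""
    names, parent = [], ""
    for content in contents:
        parent = commit_name(parent, content)
        names.append(parent)
    return names


def can_fast_forward(remote_tip: str, local_history) -> bool:
    # A push is a fast-forward only if the remote tip is already part of
    # the history being pushed.
    return remote_tip in local_history


maintainer = build_history([b"v1", b"v2", b"v3"])
honest_server = build_history([b"v1", b"v2"])
hacked_server = build_history([b"v1", b"v2 + backdoor"])

print(can_fast_forward(honest_server[-1], maintainer))  # True
print(can_fast_forward(hacked_server[-1], maintainer))  # False
```

In this model the maintainer notices the tampering on the next push, 
because the server's tip is no longer an ancestor of what is being sent.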

So in git, you can _trust_ the names. They actually self-verify. You can't 
have maliciously made-up names that point to something else than what they 
actually name.
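
As an illustration of a self-verifying name: with git's SHA-1 object 
format, a blob's name is the SHA-1 of a short header ("blob", the size, a 
NUL byte) followed by the file contents. The helper name below is made up 
for this sketch; the result for an empty blob should match what 
`git hash-object` reports.

```python
import hashlib


def git_blob_name(data: bytes) -> str:
    """Compute the name git's SHA-1 object format gives a blob with this content."""
    header = b"blob %d\0" % len(data)
    return hashlib.sha1(header + data).hexdigest()


# The well-known name of the empty blob.
print(git_blob_name(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391

# Any change to the content produces a different name.
print(git_blob_name(b"safe code") != git_blob_name(b"safe code + backdoor"))  # True
```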

[ Also, as a result, and related to this same issue: the git protocol 
  actually never sends object names when sending the object itself. It 
  just sends the object data, and the _recipient_ generates the name from 
  that data.

  So you can't do the _other_ kind of spoofing, and make a repository that 
  _claims_ to have one name and the data would differ - because if you do 
  that, anybody who pulls from the spoofed repository will re-create 
  different names than you claimed, and won't even be able to pull such a 
  malicious repository. ]
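
A minimal sketch of that receiving side, assuming a toy transfer where the 
sender supplies only raw object data plus the name of the tip it claims to 
have. This is not the real git wire protocol or pack format, and 
`receive` and `object_name` are hypothetical helpers. The recipient 
derives every name itself, so a claimed name with no matching data simply 
never shows up.

```python
import hashlib


def object_name(data: bytes) -> str:
    """Name an object purely from its data (plain SHA-1 for this sketch)."""
    return hashlib.sha1(data).hexdigest()


def receive(payloads, advertised_tip: str) -> dict:
    """Store received objects under names computed locally, then check the tip."""
    # The sender never gets to say what an object is called.
    store = {object_name(data): data for data in payloads}
    if advertised_tip not in store:
        raise ValueError("advertised name does not match any received object")
    return store


good = b"good object"

# Honest sender: the advertised name matches the data that was sent.
store = receive([good], object_name(good))
print(object_name(good) in store)  # True

# Spoofing sender: claims the good name but ships different data.
try:
    receive([b"evil object"], object_name(good))
except ValueError as err:
    print("rejected:", err)
```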


More information about the bazaar mailing list