VCS comparison table

Nicolas Pitre nico at cam.org
Thu Oct 26 18:03:49 BST 2006


On Thu, 26 Oct 2006, David Lang wrote:

> On Thu, 26 Oct 2006, Andreas Ericsson wrote:
> 
> > > 
> > > There are _not_ scalability improvements.  There may be some slight
> > > performance improvements, but definitely not scalability.  If you have
> > > ever tried to use git to manage terabytes of data, you will see this
> > > becomes very clear.  And "rebasing with 3-way merge" is not something
> > > often used in industry anyway if you've followed the more common models
> > > for revision control within large companies with thousands of engineers.
> > > Typically they all work off mainline.
> > > 
> >
> > Actually, I don't see why git shouldn't be perfectly capable of handling a
> > repo containing several terabytes of data, provided you don't expect it to
> > turn up the full history for the project in a couple of seconds and you
> > don't actually *change* that amount of data in each revision. If you want a
> > vcs that handles that amount with any kind of speed, I think you'll find
> > rsync and raw rvs a suitable solution.
> 
> actually, there are some real problems in this area. the git pack format can't
> be larger than 4G, and I wouldn't be surprised if there were other issues with
> files larger than 4G (these all boil down to 32 bit limits). once these limits
> are dealt with then you will be right.

There is no such limit on the pack format.  A pack itself can be as 
large as you want.  The 4G limit is in the tool not the format.

The actual pack limits are as follows:

	- a pack can have infinite size

	- a pack cannot have more than 4294967295 objects

	- each non-delta object can be of infinite size

	- delta objects can be of infinite size themselves but...

	- current delta encoding can use base objects no larger than 4G
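
To make the first few limits concrete, here is a minimal Python sketch of the 
two pack fields involved, as I understand them from git's pack format 
documentation: the object count in the 12-byte pack header is a fixed 32-bit 
field, while each object's size is a variable-length number with no built-in 
ceiling.  The function names are hypothetical and this is not git's own code.

```python
import struct

def parse_pack_header(data: bytes):
    """Return (version, object_count) from the 12-byte pack header.

    Assumed layout: 4-byte signature "PACK", 4-byte version and
    4-byte object count, both big-endian.
    """
    signature, version, count = struct.unpack(">4sII", data[:12])
    if signature != b"PACK":
        raise ValueError("not a pack file")
    return version, count

def encode_object_header(obj_type: int, size: int) -> bytes:
    """Encode an object's type (3 bits) and size as a variable-length header.

    The first byte holds the type and the low 4 bits of the size; each
    following byte holds 7 more bits.  The high bit of a byte means
    "another byte follows", so the size can grow without a fixed limit.
    """
    current = (obj_type << 4) | (size & 0x0F)
    size >>= 4
    out = bytearray()
    while size:
        out.append(current | 0x80)  # set the continuation bit
        current = size & 0x7F
        size >>= 7
    out.append(current)
    return bytes(out)
```

A 5G object only costs a few more header bytes, whereas the object count 
cannot grow past what fits in its 4 bytes.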

The _code_ is currently limited to 4G though, especially on 32-bit 
architectures.  The delta issue could be resolved in a backward 
compatible way but it hasn't been formalized yet.
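
The delta limit comes from the "copy from base" instruction, which, as far as 
I know, stores its offset into the base object in at most 4 bytes.  The sketch 
below is a simplified, hypothetical encoder (not git's code) that restricts 
the copy size to 16 bits to stay short; it shows why no offset at or beyond 
4G can be expressed.

```python
def encode_copy(offset: int, size: int) -> bytes:
    """Encode a delta 'copy from base' instruction (simplified sketch).

    Assumed layout: a command byte with the high bit set, whose low 4 bits
    say which of the 4 offset bytes follow and whose next bits say which
    size bytes follow.  Zero bytes are omitted.
    """
    if not 0 <= offset < 2**32:
        raise ValueError("copy offset does not fit in 4 bytes")
    if not 0 < size < 0x10000:
        raise ValueError("this sketch only handles sizes below 64K")
    cmd = 0x80
    out = bytearray()
    for i in range(4):                      # offset bytes, low byte first
        byte = (offset >> (8 * i)) & 0xFF
        if byte:
            cmd |= 1 << i
            out.append(byte)
    for i in range(2):                      # size bytes, low byte first
        byte = (size >> (8 * i)) & 0xFF
        if byte:
            cmd |= 0x10 << i
            out.append(byte)
    return bytes([cmd]) + bytes(out)
```

Anything below 4G encodes fine; `encode_copy(2**32, 1)` is rejected because 
there is no fifth offset byte to put the value in.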

The pack index is actually limited to 32 bits, meaning it can cope with 
packs no larger than 4G.  But the pack index is a local matter and not 
part of the protocol, so it is not a big issue to define a new index 
format and automatically convert existing indexes at that point.
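
As an illustration of the index limit, here is a short sketch that assumes 
the current index layout stores each entry as a 4-byte big-endian pack offset 
followed by the 20-byte SHA-1.  The function name is hypothetical.

```python
import struct

def index_entry_v1(offset: int, sha1: bytes) -> bytes:
    """Build one index entry: 4-byte pack offset plus 20-byte SHA-1.

    struct.pack raises struct.error when the offset needs more than
    32 bits, which is the 4G limit on what the index can address.
    """
    if len(sha1) != 20:
        raise ValueError("SHA-1 must be exactly 20 bytes")
    return struct.pack(">I", offset) + sha1
```

An object starting at offset 2**32 or later in the pack simply cannot be 
recorded in such an entry, even though the pack itself is still valid.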


Nicolas



