VCS comparison table

David Lang dlang at digitalinsight.com
Thu Oct 26 18:04:34 BST 2006


On Thu, 26 Oct 2006, Nicolas Pitre wrote:

> On Thu, 26 Oct 2006, David Lang wrote:
>
>> On Thu, 26 Oct 2006, Andreas Ericsson wrote:
>>
>>>>
>>>> There are _not_ scalability improvements.  There may be some slight
>>>> performance improvements, but definitely not scalability.  If you have
>>>> ever tried to use git to manage terabytes of data, you will see this
>>>> becomes very clear.  And "rebasing with 3-way merge" is not something
>>>> often used in industry anyway if you've followed the more common models
>>>> for revision control within large companies with thousands of engineers.
>>>> Typically they all work off mainline.
>>>>
>>>
>>> Actually, I don't see why git shouldn't be perfectly capable of handling a
>>> repo containing several terabytes of data, provided you don't expect it to
>>> turn up the full history for the project in a couple of seconds and you
>>> don't actually *change* that amount of data in each revision. If you want a
>>> vcs that handles that amount with any kind of speed, I think you'll find
>>> rsync and raw rvs a suitable solution.
>>
>> Actually, there are some real problems in this area. The git pack format can't
>> be larger than 4G, and I wouldn't be surprised if there were other issues with
>> files larger than 4G (these all boil down to 32-bit limits). Once these limits
>> are dealt with, you will be right.
>
> There is no such limit on the pack format.  A pack itself can be as
> large as you want.  The 4G limit is in the tool, not the format.
>
> The actual pack limits are as follows:
>
> 	- a pack can have infinite size
>
> 	- a pack cannot have more than 4294967296 objects
>
> 	- each non-delta object can be of infinite size
>
> 	- delta objects can be of infinite size themselves but...
>
> 	- current delta encoding can use base objects no larger than 4G
>
> The _code_ is currently limited to 4G though, especially on 32-bit
> architectures.  The delta issue could be resolved in a backward
> compatible way but it hasn't been formalized yet.
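
To make that delta limit concrete, here is a rough sketch (illustrative code,
not git's actual source) of how a pack delta "copy" instruction is decoded as
I understand the format: the offset into the base object is assembled from at
most four bytes, so a copy can never reach past 4G into the base.

/*
 * Decode one "copy" instruction from a pack delta stream.  The opcode's low
 * seven bits say which offset/size bytes follow; at most four offset bytes
 * exist, which is exactly the 32-bit (4G) ceiling on base objects.
 */
#include <stddef.h>
#include <stdint.h>

struct copy_cmd {
	uint32_t offset;	/* start in the base object (max 4 bytes -> 32 bits) */
	uint32_t size;		/* bytes to copy (max 3 bytes; 0 means 0x10000) */
};

/* Returns bytes consumed, or 0 if the opcode is an "insert", not a "copy". */
static size_t decode_copy(const unsigned char *buf, struct copy_cmd *out)
{
	const unsigned char op = buf[0];
	size_t i = 1;
	uint32_t offset = 0, size = 0;

	if (!(op & 0x80))
		return 0;			/* MSB clear: literal insert */

	if (op & 0x01) offset  = buf[i++];
	if (op & 0x02) offset |= (uint32_t)buf[i++] << 8;
	if (op & 0x04) offset |= (uint32_t)buf[i++] << 16;
	if (op & 0x08) offset |= (uint32_t)buf[i++] << 24;	/* no fifth byte */

	if (op & 0x10) size  = buf[i++];
	if (op & 0x20) size |= (uint32_t)buf[i++] << 8;
	if (op & 0x40) size |= (uint32_t)buf[i++] << 16;
	if (!size) size = 0x10000;

	out->offset = offset;
	out->size = size;
	return i;
}

Presumably the backward-compatible fix would add more offset bytes or a new
instruction, but as you say that hasn't been formalized.
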
>
> The pack index is actually limited to 32-bit offsets, meaning it can cope
> with packs no larger than 4G.  But the pack index is a local matter and not
> part of the protocol, so it would not be a big issue to define a new index
> format and automatically convert existing indexes at that point.

The offset within a pack for the starting location of an object cannot be larger
than 4G.
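
For what it's worth, a rough sketch of where that comes from (my reading of the
current on-disk index layout, not git source): each index entry records the
object's start position in the pack as a single 4-byte field next to the
20-byte SHA-1, so the largest offset it can express is 2^32 - 1.

#include <stdint.h>
#include <stdio.h>

/* One entry of the current pack index, as I understand the layout. */
struct idx_entry {
	uint32_t offset;		/* 32-bit start offset into the .pack file */
	unsigned char sha1[20];		/* object name */
};

int main(void)
{
	/* The largest start offset a 4-byte field can hold: just under 4G. */
	printf("max recordable offset: %llu bytes\n",
	       (unsigned long long)UINT32_MAX);
	return 0;
}

So any object that would start past that point in a pack simply can't be
indexed until a wider index format exists.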

David Lang



