Some unscientific timing results (on the Python source tree)
Talden
talden at gmail.com
Sat Mar 29 00:53:05 GMT 2008
> | bzr - 262Mb
> | hg - 140Mb
> | Subversion checkout - 333Mb
>
> Something seems fishy here. Specifically, if the SVN checkout is 333Mb, that
> sounds like the size of your working tree is 333/2 = 160MB. (SVN creates 2
> copies of every file so it has a pristine copy to 'diff' against.)
>
> I don't see how hg could have your working tree in less space than the raw files
> on disk take.
>
> I'm also a little surprised that we are that much larger than hg, since usually
> our on-disk tests show packs as taking up less space. (hg has 1 or 2 files per
> versioned file, which usually causes a lot of 'wasted' space because of block
> sizes.)
>
> I'm wondering if you don't have a lot of stuff in .bzr/repository/obsolete_packs
> which will be cleaned up over time. (When we generate new packs we leave the old
> ones around a bit to make sure that you can recover even if the OS decides to
> process deletes before writes and crashes in the middle.)
I'd be interested in some space/count comparisons that reflect what I
expect is a more common developer work-flow - multiple working-trees.
I think interesting metrics are file/folder counts, content size and
actual disk space used (disk-space - content-size = slack-space).
Space is becoming less scarce but is still important in estimating IO.
A higher number of files/folders also produces measurable
deterioration in performance on many file-systems.
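
The metrics above could be gathered with something like the following Python sketch. It is only an illustration: the function name `tree_stats` and the `cluster_size` default of 4096 bytes are assumptions made here, and the allocated size is estimated by rounding each file up to whole clusters rather than asking the file-system, so directory overhead, compression, sparse files and very small files stored inside NTFS metadata are all ignored.

```python
import os


def tree_stats(root, cluster_size=4096):
    """Walk `root` and return counts, content size and estimated slack.

    Hypothetical helper: allocation is estimated by rounding each file up
    to a whole number of clusters, not read from the file-system.
    """
    files = folders = content = allocated = 0
    for dirpath, dirnames, filenames in os.walk(root):
        folders += len(dirnames)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symlinks so link targets are not counted
            size = os.path.getsize(path)
            files += 1
            content += size
            # Ceiling division: number of clusters needed for this file.
            allocated += -(-size // cluster_size) * cluster_size
    return {
        "files": files,
        "folders": folders,
        "content_bytes": content,
        "allocated_bytes": allocated,
        "slack_bytes": allocated - content,  # disk-space - content-size
    }
```

Running `tree_stats` over each working-tree (and over the repository directory separately) would give numbers that can be compared across tools without depending much on the machine.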
As an example, most developers in my team have more than one CVS
working-copy, typically 2-3 with some of us having 6-8 at a time.
These working-copies are tied to various different branches (sometimes
several working-copies on the same branch) and all of the branches
have common and somewhat recent ancestors.
All of these developers are using Windows (though the CVS server
itself is on BSD) with 4k NTFS filesystems.
The 'slack-space' of 45MB+ per working-copy is a non-trivial but also
non-critical quantity given our working-tree of ~20,000 files and
~3,500 folders is only 600MB of actual content. This would be
even worse of course for Subversion with its high file count in
working-copy overhead.
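
As a rough sanity check on that slack figure, here is a back-of-the-envelope estimate in Python. It assumes (this is a rule of thumb, not a measurement) that the last cluster of each file is on average half empty; the helper name `estimated_slack` is made up for this sketch.

```python
def estimated_slack(n_files, cluster_size=4096):
    """Estimate total slack in bytes.

    Assumption: each file wastes half a cluster on average.
    """
    return n_files * cluster_size // 2


# ~20,000 files on 4k clusters, as in the working-tree described above.
slack_bytes = estimated_slack(20_000)
print(slack_bytes / 1e6, "MB")
```

This comes out at about 41 MB, which is in the same range as the 45MB+ observed, so the figure looks plausible for a tree of this size.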
A comparison of this type might be worth providing on the Bazaar
website, as it's largely not machine-specific (apart from slack-space
of course) and would set some user expectations. I know that people
initially react quite negatively to the 2x content-size cost of
Subversion working-copies (though the benefits over CVS are many).
--
Talden