sample large weave
Martin Pool
martinpool at gmail.com
Wed Aug 17 21:20:44 BST 2005
I imported the changelog and Makefile.in from gcc into a bzrlib weave.
(Not a full tree, yet, just these files.) These are real examples of
files that are large and have long histories.
Both compress very well:
  % weave stats gcc-Makefile.in.weave
  versions                 276
  weave file           2979815 bytes
  total contents     109245151 bytes
  compression ratio      36.66x
  average size          395815 bytes
  relative size           7.53x

  % weave stats =(zcat gcc-changelog.weave.gz)
  versions                1112
  weave file            507721 bytes
  total contents     317646022 bytes
  compression ratio     625.63x
  average size          285652 bytes
  relative size           1.78x

  -r--rw-r-- 1  1837342 2005-08-17 18:51 ChangeLog,v
  -r--r--r-- 1  1077955 2005-08-17 20:00 Makefile.in
  -r--rw-r-- 1 18510053 2005-08-17 18:51 Makefile.in,v
  -rw-r--r-- 1   202088 2005-08-17 20:00 gcc-Makefile.in.weave.gz
  -rw-r--r-- 1   151812 2005-08-17 22:42 gcc-changelog.weave.gz

The gzipped version of the Makefile.in weave, storing 276 versions, is
actually less than half the size of the current Makefile.in text.
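
For anyone who wants to check the arithmetic, here is a small sketch
that recomputes the derived figures from the raw counts quoted above.
The formulas are inferred from the numbers themselves (they happen to
reproduce the printed values), not taken from the weave source, and
the function name weave_stats is made up for this example.

```python
def weave_stats(versions, weave_bytes, total_bytes):
    """Recompute derived figures from the three raw counts.

    Assumed formulas (inferred from the reported output):
      compression ratio = total contents / weave file
      average size      = total contents / versions (truncated)
      relative size     = weave file / average size
    """
    average = total_bytes // versions
    return {
        "compression ratio": total_bytes / weave_bytes,
        "average size": average,
        "relative size": weave_bytes / average,
    }


# Raw counts copied from the two `weave stats` runs.
makefile = weave_stats(276, 2979815, 109245151)
changelog = weave_stats(1112, 507721, 317646022)

# Sizes from the directory listing: gzipped weave vs. current text.
gz_fraction = 202088 / 1077955

for name, stats in (("Makefile.in", makefile), ("ChangeLog", changelog)):
    print(name, {key: round(value, 2) for key, value in stats.items()})
print("gzipped weave / current Makefile.in: %.2f" % gz_fraction)
```

By these sizes the gzipped weave is well under half of the current
Makefile.in text.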
The performance, I would say, is reasonably good. Adding a version
tends to bog down to nearly a second on the Makefile.in, which I
suspect is because the pure-Python difflib diff is too slow on large
files. That should be fixable.
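
A rough way to test that suspicion is to time difflib on its own, away
from the weave code. The sketch below assumes the add step does a
line-by-line diff with difflib.SequenceMatcher. That is a guess for
illustration, and the input is synthetic rather than the real
Makefile.in; time_line_diff is a made-up helper name.

```python
import difflib
import time


def time_line_diff(old_lines, new_lines):
    """Return (opcodes, seconds) for a pure-Python line diff."""
    start = time.perf_counter()
    # autojunk=False so frequently repeated lines are not ignored.
    matcher = difflib.SequenceMatcher(None, old_lines, new_lines,
                                      autojunk=False)
    opcodes = matcher.get_opcodes()
    return opcodes, time.perf_counter() - start


# Synthetic stand-in for a large file: 20000 distinct lines, and a new
# version that inserts three lines in the middle.
old = ["line %d\n" % i for i in range(20000)]
new = old[:10000] + ["added a\n", "added b\n", "added c\n"] + old[10000:]

opcodes, seconds = time_line_diff(old, new)
changed = [op for op in opcodes if op[0] != "equal"]
print(changed, "%.3f s" % seconds)
```

Running the same function over two real adjacent versions of the file
would show how much of the add time is spent in the diff itself.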
Extraction is quite fast: annotate of a 12038-line version of the
changelog takes 0.2 user seconds.
The files are in http://bazaar-ng.org/tmp/ in case anyone wants to
experiment with them.
--
Martin