Compressing weaved revisions?
James Blackwell
jblack at merconline.com
Sat Oct 8 14:38:38 BST 2005
On Thu, Oct 06, 2005 at 12:26:11PM -0500, John A Meinel wrote:
> James Blackwell wrote:
> >On Fri, Sep 30, 2005 at 10:04:41AM -0400, Aaron Bentley wrote:
> >
> >>For trees with, e.g. 500 revisions, the revision storage may actually be
> >>larger than tree storage.
> >
> >
> >I converted the Bazaar-NG tree to newformat. I got a ratio of about 7:1,
> >which is much better than the old 29:1, but not as good as Mercurial's
> >2.5:1.
> >
>
> Are you using --apparent, and are you considering just revision-store or
> all of .bzr?
Not at that time, no. Rob mentioned it to me this morning, and I ended
up with 5.1:1:
jblack at pluto:~/nf$ du -s --apparent .
10735 .
jblack at pluto:~/nf$ du -s --apparent .bzr
8644 .bzr
10735 / (10735 - 8644) = 5.13
I'm not sure why we end up with different apparent numbers (the
non-apparent ones could easily differ because of block size).
The Mercurial numbers were non-apparent, btw. I think they have fewer files
in their revision stores, so their ratio may actually be _higher_.
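For anyone who wants to reproduce the comparison without du, here is a
rough Python sketch of the same measurement. It is only an illustration:
the function names are made up for this example, it counts regular files
only (du also counts the directories themselves, so the totals will not
match exactly), and it assumes a POSIX system where st_blocks is reported
in 512-byte units.

```python
import os


def tree_sizes(root):
    """Return (apparent_bytes, allocated_bytes) for regular files under root.

    apparent  ~ what `du --apparent` sums (the file lengths)
    allocated ~ what plain `du` sums (blocks actually reserved on disk)
    """
    apparent = 0
    allocated = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            st = os.lstat(os.path.join(dirpath, name))
            apparent += st.st_size
            # Assumption: POSIX st_blocks in 512-byte units; 0 if unavailable.
            allocated += getattr(st, "st_blocks", 0) * 512
    return apparent, allocated


def storage_ratio(total, control):
    """Whole tree versus working files, i.e. total / (total - control)."""
    return total / (total - control)


if __name__ == "__main__":
    # The figures from the du output above: whole tree and .bzr.
    print(round(storage_ratio(10735, 8644), 2))  # 5.13
```

Many small files push the allocated figure well above the apparent one,
since each file occupies at least one filesystem block. That is why the
two kinds of ratio should not be compared against each other.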
> So .bzr vs working is 5:1 (3.2:1 apparent)
> .bzr/everything-else vs .bzr/revision-store is 1:1 (7:1 apparent)
>
> With my revstore2sql plugin, if I trim out all the inventory stuff, I
> can get a revisions.sqlite down to 561K, it gets compression mostly by
> not duplicating the revision_id everywhere (switching to just a number).
> It loses some by having indexes, but in theory that would make access
> very fast.
>
> I don't know if we would want to use an sqlite store for everything,
> since there isn't a way to remotely access it, or download only part of
> it. But it is small, and you could download it locally and then upload it.
>
> If I pack all of the inventory into the sqlite db, then the size goes up
> to 9.4M, but 8.1M of that is because I don't do any delta compression of
> inventories (so each revision has a complete list of inventory entries).
> (though this is better than the old 24M version).
> But, I can get any inventory out of it in about 0.05s, whereas on this
> machine branch.get_inventory() takes 1.2s. (That is with bzrlib already
> loaded)
>
> John
> =:->
>
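To make the "switching to just a number" idea concrete, here is a minimal
sketch using Python's sqlite3 module. This is not the revstore2sql schema,
which isn't shown in this thread; the table names, column names and helper
functions are all hypothetical. It only illustrates storing each
revision_id string once and referring to it by integer everywhere else.

```python
import sqlite3


def open_store(path=":memory:"):
    """Open a store using a hypothetical two-table schema."""
    conn = sqlite3.connect(path)
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS revision (
            num         INTEGER PRIMARY KEY,   -- small integer key
            revision_id TEXT UNIQUE NOT NULL   -- long string, stored once
        );
        CREATE TABLE IF NOT EXISTS ancestry (
            child  INTEGER NOT NULL REFERENCES revision(num),
            parent INTEGER NOT NULL REFERENCES revision(num),
            PRIMARY KEY (child, parent)
        );
    """)
    return conn


def intern_revision(conn, revision_id):
    """Return the integer for revision_id, inserting it if it is new."""
    row = conn.execute(
        "SELECT num FROM revision WHERE revision_id = ?", (revision_id,)
    ).fetchone()
    if row is not None:
        return row[0]
    cur = conn.execute(
        "INSERT INTO revision (revision_id) VALUES (?)", (revision_id,)
    )
    return cur.lastrowid


def add_parent(conn, child_id, parent_id):
    """Record a parent link using only the integer keys."""
    conn.execute(
        "INSERT OR IGNORE INTO ancestry (child, parent) VALUES (?, ?)",
        (intern_revision(conn, child_id), intern_revision(conn, parent_id)),
    )


def parents_of(conn, revision_id):
    """Map the integers back to revision_id strings for one revision."""
    rows = conn.execute(
        "SELECT p.revision_id FROM ancestry a "
        "JOIN revision c ON c.num = a.child "
        "JOIN revision p ON p.num = a.parent "
        "WHERE c.revision_id = ? ORDER BY p.num",
        (revision_id,),
    )
    return [r[0] for r in rows]
```

The UNIQUE constraint on revision_id creates an index, which is the kind
of size-for-speed trade mentioned above: the index costs space but makes
the string-to-number lookup quick.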