compressed weaves, and revision.weave

Michael Ellerman michael at ellerman.id.au
Tue Oct 25 14:43:44 BST 2005


On Tue, 25 Oct 2005 18:43, John A Meinel wrote:
> John A Meinel wrote:
> > Martin Pool wrote:
> >>On 25/10/05, John Arbash Meinel <john at arbash-meinel.com> wrote:
> >>
> >>I think compressing the storage is a good idea, but I'd like to switch
> >>to an append-only indexed weave-like format at some time in the
> >>future.  I have the start of some code for that here.
> >>
> >>  http://people.ubuntu.com/~mbp/bzr.mbp.knit/
> >>
> >>There is some tension between such a format and compression; I suppose
> >>we could just compress each appended record of the file independently.
> >> The ratio might not be as good but it would eliminate some text
> >>redundancy, and I suppose we can rely on the delta compression to get
> >>some more.
> >>
> >>I think I'd like compression to be optional; for local access the CPU
> >>cost may be more than people wish to pay.
> >
> > I believe I have implemented it, I was able to upgrade the bzr.dev tree.
> > But I didn't write any more tests.
>
> I just fixed the tree so that all tests pass again. This is now revno=1357.
> ...
> To give some of the benefits of this branch, here are the statistics.
>
> bzr.dev 2369 revisions
>     inventory.weave       1.8M    1843774
>     revision-store        1.0M    1064510
>     .bzr/                 8.0M    8346945
>
> bzr.dev upgraded uncompressed
>     inventory.weave       1.1M    1176233
>     revision.weave        1.1M    1132526
>     .bzr/                 7.4M    7748158
>
> bzr.dev upgraded compressed
>     inventory.weave.gz    361.3K  369922
>     revision.weave.gz     327.6K  335477
>     .bzr/                 2.1M    2189365
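
Martin's idea of compressing each appended record independently can be sketched roughly as below. This is only an illustration of the trade-off, not the knit or weave code: the record contents and the helper names (compress_whole, compress_per_record, read_record) are made up for the example, and zlib stands in for whatever compressor a real format would use.

```python
import zlib

def compress_whole(records):
    """Compress all records as a single stream (best ratio, not appendable)."""
    return zlib.compress(b"".join(records))

def compress_per_record(records):
    """Compress each record on its own, so new records can simply be appended."""
    return [zlib.compress(record) for record in records]

def read_record(blobs, index):
    """Decompress a single record without touching the others."""
    return zlib.decompress(blobs[index])

# Hypothetical, highly repetitive records standing in for weave entries.
records = [(b"revision %d: fix the weave format\n" % i) * 20 for i in range(50)]

whole = compress_whole(records)
per_record = compress_per_record(records)

# Appending only compresses the new record; existing blobs are untouched.
per_record.append(zlib.compress(b"revision 50: one more change\n"))

print("whole stream:", len(whole), "bytes")
print("per record:  ", sum(len(blob) for blob in per_record), "bytes")
```

Per-record compression cannot share redundancy across records, so the total comes out larger than the single stream, but any one record can be read or appended without rewriting the file.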

For my kernel tree, which has only 47 revisions:

257M	working tree

Mainline:
5.3M    .bzr/inventory.weave
380K    .bzr/revision-store/
283M    .bzr

Compressed weaves (compressed):
1.1M    .bzr/inventory.weave.gz
8.0K    .bzr/revision.weave.gz
117M    .bzr

Which is pretty impressive. Although there's not much history, it's pretty
nice to be storing 47 revisions in 45% of the size of the working tree.
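
As a rough check of those ratios, here is the arithmetic on the figures above. The inputs are the du-rounded sizes, so the results are approximate, and the variable names are just for this example.

```python
# Sizes in MB, as reported by du above (rounded, so ratios are approximate).
working_tree_mb = 257
mainline_bzr_mb = 283
compressed_bzr_mb = 117
mainline_inventory_mb = 5.3
compressed_inventory_mb = 1.1

ratio_vs_tree = compressed_bzr_mb / working_tree_mb
ratio_vs_mainline = compressed_bzr_mb / mainline_bzr_mb
inventory_ratio = compressed_inventory_mb / mainline_inventory_mb

print(f".bzr vs working tree:   {ratio_vs_tree:.1%}")      # 45.5%
print(f".bzr vs mainline .bzr:  {ratio_vs_mainline:.1%}")  # 41.3%
print(f"inventory vs mainline:  {inventory_ratio:.1%}")
```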

Having said that, it's a tad on the slow side:

concordia ~/src/work/ckexec$ cbzr --profile st
1588655 function calls (1535159 primitive calls) in 18.747 CPU seconds

vs mainline:

concordia ~/src/work/kexec$ bzr --profile st
1368395 function calls (1314899 primitive calls) in 11.668 CPU seconds
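
To put a number on "a tad on the slow side", the two profile summaries can be compared directly. This only restates the figures above; it says nothing about where the extra time goes, and the names are just for this example.

```python
# Figures copied from the two --profile runs above.
compressed = {"calls": 1588655, "primitive": 1535159, "seconds": 18.747}
mainline = {"calls": 1368395, "primitive": 1314899, "seconds": 11.668}

extra_calls = compressed["calls"] - mainline["calls"]
extra_primitive = compressed["primitive"] - mainline["primitive"]
extra_seconds = compressed["seconds"] - mainline["seconds"]
slowdown = compressed["seconds"] / mainline["seconds"]

print(f"extra function calls:  {extra_calls}")
print(f"extra primitive calls: {extra_primitive}")
print(f"extra CPU seconds:     {extra_seconds:.3f}")
print(f"slowdown factor:       {slowdown:.2f}x")
```

The extra call count and the extra primitive call count are the same, so the additional calls in the compressed run are all non-recursive.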

cheers

-- 
Michael Ellerman
IBM OzLabs

email: michael:ellerman.id.au
inmsg: mpe:jabber.org
wwweb: http://michael.ellerman.id.au
phone: +61 2 6212 1183 (tie line 70 21183)

We do not inherit the earth from our ancestors,
we borrow it from our children. - S.M.A.R.T Person