Early numbers on multi-parent diffs

John Arbash Meinel john at arbash-meinel.com
Wed Apr 11 15:34:33 BST 2007


Aaron Bentley wrote:
> I've written an implementation of multi-parent diffs, and early numbers
> do show significant space wins.
> 
> Here's my code:
> http://code.aaronbentley.com/bzr/mpknit/
> 
> I've written a utility, "mpknit", that emits all the deltas for all
> versions of a file.  It can force the number of parents to 1, for
> comparison, but the output format remains the same (except that each
> diff refers to only the first parent).  So this is more a comparison of
> possible space-savings, not comparison against the existing knit format.
> In particular, this format does not have snapshots, annotations or gzip
> compression.
> 
> I've attached an excerpt of its output.
> 
> File            single-parent multi-parent relative
> errors.py       474K          181K         0.38x
> builtins.py     1.5M          640K         0.42x
> NEWS            647K          294K         0.45x
> knit.py         229K          206K         0.89x
> iterablefile.py 9.3K          9.3K         1x
> 
> (errors.py's knit is 620K, but that's not a fair comparison, since it
> includes snapshots and annotations.)


The file I'm most interested in is inventory.knit. Care to try it
there? (I realize it will take the longest, but it might also show
the largest gain.)

John
=:->


