xdelta multi-parent hack ?

John Arbash Meinel john at arbash-meinel.com
Fri Jun 22 15:48:59 BST 2007


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Robert Collins wrote:
> I wonder, if doing xdelta on merged texts would get better compression -
> 
> e.g.
> cat parent1 parent2 combinedparents
> xdelta combinedparents childtext
> 
> In principle this would allow the output to select regions from parent2
> for use in childtext, but would obviously complicate the extraction
> routine somewhat - but perhaps acceptable.
> 
> -Rob

Yes and no... xdelta actually works by considering

cat source target

and then doing something like LZ'77 on the combination, but restricting
the output to start at the "target" boundary.

Which means that doing

cat source1 source2 target

Should theoretically do something nice.


However, because of memory limitations, it actually uses windowing to
compare regions of the texts. The default target window size is (1 <<
23) == 8MB, and the default source window size is (1 << 26) == 64MB.

With windows that big, it means that we can easily fit two source texts
into the source window.

Now, it complicates things a small amount. Because now our addresses
have to span a much greater range, as does the "string matching"
routine. So... I think it is definitely worth trying, I'm not sure what
the results would be.

John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFGe+FbJdeBCYSNAAMRAvDQAJwMXUszbinQ2HFDi7YN8SV4Nz6sHwCePVNO
vbxYriS9SydDGj7IfDok2Xs=
=0Rsg
-----END PGP SIGNATURE-----



More information about the bazaar mailing list