xdelta multi-parent hack ?
John Arbash Meinel
john at arbash-meinel.com
Fri Jun 22 15:48:59 BST 2007
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1
Robert Collins wrote:
> I wonder, if doing xdelta on merged texts would get better compression -
>
> e.g.
> cat parent1 parent2 combinedparents
> xdelta combinedparents childtext
>
> In principle this would allow the output to select regions from parent2
> for use in childtext, but would obviously complicate the extraction
> routine somewhat - but perhaps acceptable.
>
> -Rob
Yes and no... xdelta actually works by considering
cat source target
and then doing something like LZ'77 on the combination, but restricting
the output to start at the "target" boundary.
Which means that doing
cat source1 source2 target
Should theoretically do something nice.
However, because of memory limitations, it actually uses windowing to
compare regions of the texts. The default target window size is (1 <<
23) == 8MB, and the default source window size is (1 << 26) == 64MB.
With windows that big, it means that we can easily fit two source texts
into the source window.
Now, it complicates things a small amount. Because now our addresses
have to span a much greater range, as does the "string matching"
routine. So... I think it is definitely worth trying, I'm not sure what
the results would be.
John
=:->
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.3 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
iD8DBQFGe+FbJdeBCYSNAAMRAvDQAJwMXUszbinQ2HFDi7YN8SV4Nz6sHwCePVNO
vbxYriS9SydDGj7IfDok2Xs=
=0Rsg
-----END PGP SIGNATURE-----
More information about the bazaar
mailing list