Diff-debs: bsdiff (92% reduction) vs zsync (~70%)
John C. McCabe-Dansted
gmatht at gmail.com
Mon Jan 16 06:03:51 GMT 2006
I wrote a utility (attached) to compare the effectiveness of different tools
for creating Deb-diffs.
I quickly eliminated xdelta, on the basis that the bsdiff patches were always
smaller; sometimes as much as 5 times smaller. Bsdiff on average reduced the
amount that needed to be downloaded by 92%. Higher compression rates could be
achieved if we repacked the .gz files contained in data.tar.gz.
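
To make the percentages concrete: a minimal sketch, assuming "reduction" means the fraction of the full .deb download that is avoided by fetching the patch instead. The function name and the sample sizes are illustrative only, not measurements from the tests.

```python
def download_reduction(full_size: int, patch_size: int) -> float:
    """Return the percentage of the full download avoided by fetching a patch."""
    if full_size <= 0:
        raise ValueError("full_size must be positive")
    return 100.0 * (1.0 - patch_size / full_size)


# Hypothetical sizes in bytes, chosen only to illustrate the calculation.
full_deb = 1_000_000
bsdiff_patch = 80_000
print(f"{download_reduction(full_deb, bsdiff_patch):.1f}% less to download")
```

A patch that is larger than the .deb itself gives a negative result, which is a reasonable signal to skip the patch for that package.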
Interestingly, bsdiff was able to save 33% on the download of the i686 kernel
image when it was patched against the i386 image. However, this reduction is
probably not worth the effort.
Although zsync reduced the required bandwidth by only about 70% in my tests,
it stayed in the race because of its greater flexibility - it should be able
to use any existing .deb on the system, even ones regenerated from installed
files by dpkg-repack. However, if we use bz2 to compress instead of gz, zsync
is likely to become useless. (bz2 would also have the problem that any
processing of deb files would take much more CPU time.)
The extra space required by either of these methods should be minimal. The
size of a zsync file seems to be about 1% of the size of the original. With
bsdiff files, we should be able to limit the extra storage required on the
mirrors to under a gigabyte by only including:
a) patches against the files on the official CD-ROM(s).
b) updates that occurred in the last (e.g.) 10 days.
(a) should allow (e.g. dial-up) users to easily get up-to-date immediately
after install, while they still have the official CD-ROM in their computer.
(b) would help users keep up to date, and perhaps also help keep mirrors
up-to-date regardless of network congestion.
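
The retention rule in (a) and (b) could be sketched roughly as follows. This is only an illustration under assumptions: the `Patch` record, its field names, and `patches_to_keep` are hypothetical, and a real mirror would need its own way of knowing which versions shipped on the CD-ROM.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Patch:
    package: str
    old_version: str
    new_version: str
    created: date
    against_cdrom: bool  # True if old_version is the one on the official CD-ROM


def patches_to_keep(patches, today, max_age_days=10):
    """Keep patches against CD-ROM versions (a) or created recently (b)."""
    cutoff = today - timedelta(days=max_age_days)
    return [p for p in patches if p.against_cdrom or p.created >= cutoff]


# Hypothetical patch records, only to show the rule in action.
patches = [
    Patch("pkg-a", "1.0", "1.3", date(2005, 11, 1), True),    # kept by rule (a)
    Patch("pkg-b", "2.0", "2.1", date(2006, 1, 10), False),   # kept by rule (b)
    Patch("pkg-c", "0.5", "0.6", date(2005, 12, 1), False),   # dropped
]
for p in patches_to_keep(patches, today=date(2006, 1, 16)):
    print(p.package, p.old_version, "->", p.new_version)
```

The 10-day window is just the example figure from (b); it is a parameter so that mirrors could trade storage against coverage.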
Clearly, if an appropriate patch isn't found, we can still download the
whole .deb normally.
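
A rough sketch of that fallback, assuming the client can look up patches by (package, installed version, target version). The function, the lookup table, and the file names are all hypothetical; nothing here reflects an existing apt or mirror interface.

```python
def choose_download(package, installed_version, target_version, available_patches):
    """Return ("patch", name) if a matching patch exists, else ("full", name)."""
    key = (package, installed_version, target_version)
    if installed_version is not None and key in available_patches:
        return ("patch", available_patches[key])
    # No suitable patch: fall back to fetching the whole .deb as usual.
    return ("full", f"{package}_{target_version}.deb")


# Hypothetical index of patches published on a mirror.
available = {("foo", "1.0", "1.1"): "foo_1.0_1.1.bsdiff"}

print(choose_download("foo", "1.0", "1.1", available))  # a patch exists
print(choose_download("foo", "0.9", "1.1", available))  # no patch for 0.9
print(choose_download("foo", None, "1.1", available))   # not installed at all
```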
Some more info is available on my blog at:
http://www.livejournal.com/users/flyingreptile/101020.html
These results seem very promising to me. I am very busy at the moment, but if
no-one else steps up, I'll start work in a couple of months.
--
John C. McCabe-Dansted
Masters Student
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compare.sh.gz
Type: application/x-gzip
Size: 519 bytes
Desc: not available
Url : http://lists.ubuntu.com/archives/ubuntu-devel/attachments/20060116/93d46422/compare.sh.bin