Diff-debs: bsdiff (92% reduction) vs zsync (~70%)

John C. McCabe-Dansted gmatht at gmail.com
Mon Jan 16 06:03:51 GMT 2006


I wrote a utility (attached) to compare the effectiveness of different tools 
for creating deb-diffs.

I quickly eliminated xdelta, on the basis that the bsdiff patches were always 
smaller, sometimes as much as 5 times smaller. On average, bsdiff reduced the 
amount that needed to be downloaded by 92%. Higher compression rates could be 
achieved if we repack the .gz files contained in data.tar.gz.
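
For anyone wanting to reproduce the percentages, the arithmetic can be 
sketched as below. This is only an illustration: the function name and the 
byte counts are made up for the example, and it is not taken from the 
attached compare.sh.

```python
def bandwidth_reduction(full_size: int, patch_size: int) -> float:
    """Return the percentage of the download saved by fetching a patch
    of patch_size bytes instead of the full file of full_size bytes."""
    if full_size <= 0:
        raise ValueError("full_size must be positive")
    return 100.0 * (1.0 - patch_size / full_size)


# Hypothetical sizes in bytes, chosen only to illustrate the arithmetic.
example_full = 1_000_000
example_patch = 80_000
print(f"{bandwidth_reduction(example_full, example_patch):.0f}% saved")
# prints: 92% saved
```

Averaging this figure per package and computing it over the summed sizes 
can give different numbers, so it is worth stating which one is meant when 
comparing tools.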

Interestingly, bsdiff was able to save 33% on the download of the i686 kernel 
image when patched against the i386 image. However, this reduction is probably 
not worth the effort.

Although zsync reduced the required bandwidth by only about 70% in my tests, 
it stayed in the race because of its greater flexibility: it should be able 
to use any existing .deb on the system, even ones regenerated from installed 
files by dpkg-repack. If we use bz2 instead of gz for compression, however, 
zsync is likely to become useless. (bz2 would also have the problem that any 
processing of .deb files would take much more CPU time.)
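
To show why zsync can reuse whatever data is already on the system, here is 
a much simplified sketch of block matching. It is not zsync's actual 
algorithm (zsync uses a rolling checksum plus a strong checksum, and has 
special handling for gzip streams); the function name and the tiny block 
size are assumptions made for the example.

```python
import hashlib


def reusable_fraction(old: bytes, new: bytes, block_size: int = 4) -> float:
    """Return the fraction of new's aligned blocks found anywhere in old."""
    # Hash every block-sized window of the local (old) data, at any offset.
    # A real tool would use a cheap rolling checksum here instead.
    local = {
        hashlib.sha256(old[i:i + block_size]).digest()
        for i in range(max(len(old) - block_size + 1, 0))
    }
    # The new file is described by the hashes of its aligned blocks.
    blocks = [new[i:i + block_size] for i in range(0, len(new), block_size)]
    if not blocks:
        return 0.0
    hits = sum(1 for b in blocks if hashlib.sha256(b).digest() in local)
    return hits / len(blocks)


# Blocks are found even when they sit at a different offset in the old data.
print(reusable_fraction(b"ZAAAABBBB", b"AAAABBBB"))
# Only the blocks that do not match would need to be downloaded.
print(reusable_fraction(b"AAAABBBBCCCC", b"BBBBXXXXCCCC"))
```

Matching of this kind only pays off when unchanged input still produces 
identical bytes in the file being compared, which is the concern with 
switching the compression to bz2.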

The extra space required by either of these methods should be minimal. The 
size of a zsync file seems to be about 1% of the size of the original. With 
bsdiff files, we should be able to limit the extra storage required on the 
mirrors to under a gigabyte by including only:
 a) patches against the files on the official CD-ROM(s).
 b) updates that occurred in the last (e.g.) 10 days.

(a) should allow users (e.g. those on dial-up) to easily get up to date 
immediately after install, while they still have the official CD-ROM in their 
computer.

(b) would help users keep up to date, and perhaps also help keep mirrors 
up to date regardless of network congestion.

Clearly, if an appropriate patch isn't found, we can still download the 
whole .deb normally.
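
The retention rule in (a) and (b) and the fallback to a full download could 
be sketched roughly as follows. Everything here is hypothetical: the Patch 
record, its field names and the two helper functions are invented for 
illustration, and no existing mirror or apt tool works this way.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Iterable, Optional


@dataclass(frozen=True)
class Patch:
    package: str
    from_version: str
    to_version: str
    created: date
    against_cd: bool  # True if from_version is the one on the official CD-ROM


def keep_on_mirror(patch: Patch, today: date, window_days: int = 10) -> bool:
    """Rule (a): keep patches against the CD; rule (b): keep recent ones."""
    recent = (today - patch.created) <= timedelta(days=window_days)
    return patch.against_cd or recent


def pick_patch(package: str, installed: str, target: str,
               patches: Iterable[Patch]) -> Optional[Patch]:
    """Return a patch from installed to target, or None if there is none.

    None means the caller should download the whole .deb normally.
    """
    for p in patches:
        if (p.package, p.from_version, p.to_version) == (package, installed, target):
            return p
    return None
```

A client would call pick_patch first and fetch the full package only when it 
returns None; a mirror would run keep_on_mirror periodically to prune old 
patches.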

Some more info is available on my blog at:
	http://www.livejournal.com/users/flyingreptile/101020.html

These results seem very promising to me. I am very busy at the moment, but if 
no-one else steps up, I'll start work in a couple of months.

-- 
John C. McCabe-Dansted
Masters Student
-------------- next part --------------
A non-text attachment was scrubbed...
Name: compare.sh.gz
Type: application/x-gzip
Size: 519 bytes
Desc: not available
Url : http://lists.ubuntu.com/archives/ubuntu-devel/attachments/20060116/93d46422/compare.sh.bin

