Compressing packages with bzip2 instead of gzip?

John C. McCabe-Dansted gmatht at gmail.com
Wed Jan 18 15:08:28 GMT 2006


On Thursday 19 January 2006 02:41, Paul Sladen wrote:
> The big problem is that the server-side data also needs to be uncompressed.
> (Or faked-uncompressed with gzip --rsyncable, which 'zsyncmake' can
> generate itself).  The mirrors likely cannot take a ten-fold increase of
> having uncompressed .debs but can probably take the X% hit of rsyncable
> .debs.

Just to be clear, we do not have to generate any deb files using --rsyncable. We
only have to put up .zsync files; zsync should do the rest (although at
present zsync cannot recognise that the archive contains gzipped files).

> If bzip2 (a block encoder) is used, then identity points only exist every
> 900kB.
...snip...
> Conclusion, 10% extra data on the mirrors in exchange for 90% less
> bandwidth for the user (with an already-installed version).  Some
> brain-thinking required about signature handling and bzip2 packages.

I don't think it is possible to use zsync for .bz2 files. Between
koffice-libs_1%3a1.4.1-0ubuntu7.{1,2}_i386.deb (4.7MB) a 99% saving is achieved
with gzip, but with bzip2 no saving occurs. Even with "bzip2 -1" only 12.3%
of the bandwidth is saved.
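
To illustrate why bzip2 leaves so little to reuse, here is a minimal sketch in
Python. It is not how zsync itself works: zsync matches blocks at any offset
using rolling checksums, whereas this only compares aligned fixed-size blocks,
which is enough here because the edit does not change the length. The payload,
the 2048-byte block size and all function names are assumptions made up for
the illustration.

```python
import bz2
import random

BLOCK = 2048  # illustrative block size, not a zsync default


def blocks(data, size=BLOCK):
    """Split data into consecutive fixed-size blocks (the last may be short)."""
    return [data[i:i + size] for i in range(0, len(data), size)]


def reusable_fraction(old, new, size=BLOCK):
    """Fraction of the blocks of `new` that also occur as a block of `old`."""
    known = set(blocks(old, size))
    new_blocks = blocks(new, size)
    return sum(b in known for b in new_blocks) / len(new_blocks)


def sample_payload(lines=20000, seed=0):
    """Deterministic, compressible stand-in for an unpacked data.tar (made up)."""
    rng = random.Random(seed)
    words = [b"usr", b"lib", b"share", b"doc", b"koffice", b"bin", b"etc", b"var"]
    return b"\n".join(
        b"/".join(rng.choice(words) for _ in range(6)) for _ in range(lines)
    )


def flip_one_byte(data):
    """Return a copy of data with a single byte changed in the middle."""
    mid = len(data) // 2
    return data[:mid] + bytes([data[mid] ^ 0x01]) + data[mid + 1:]


old = sample_payload()
new = flip_one_byte(old)

# Same comparison twice: once on the raw bytes, once on the bzip2 output.
raw_reuse = reusable_fraction(old, new)
bz2_reuse = reusable_fraction(bz2.compress(old), bz2.compress(new))

print(f"raw stand-in payload: {raw_reuse:.1%} of blocks reusable")
print(f"after bz2.compress:   {bz2_reuse:.1%} of blocks reusable")
```

The payload is kept under 900kB so that it fits in a single bzip2 block at the
default level, which is the case where a one-byte change can disturb the whole
compressed block.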

Zsync files are only about 1%-2% of the size of the deb, but zsync also
typically requires ~30% of the new deb to be downloaded.
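
As a rough worked example of those figures (a sketch only: the function name
is hypothetical, and the 4.7MB size is borrowed purely as a convenient example
size, not as a measurement of what zsync fetched for that package):

```python
def zsync_download_mb(deb_mb, control_fraction, refetch_fraction=0.30):
    """Estimated transfer: the .zsync file plus the part of the new deb refetched."""
    return deb_mb * (control_fraction + refetch_fraction)


# .zsync file at 1% and at 2% of the deb, with ~30% of the deb refetched.
low = zsync_download_mb(4.7, 0.01)
high = zsync_download_mb(4.7, 0.02)
print(f"estimated download: {low:.2f} to {high:.2f} MB of a 4.7 MB deb")
```

So under these assumptions roughly a third of the package still crosses the
wire on a typical update.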

My solution would be to just use bsdiff patches against data.tar and
control.tar. My experimentation has led me to believe that a bsdiff patch is
typically 8% of the size of the whole deb. Hence putting up a bsdiff against
every file on the official i386 Ubuntu CD should not use more than 60MB.
Putting up (n->n+1) patches for ten days would also allow people to follow
the latest version with minimal bandwidth.
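
To make "against data.tar and control.tar" concrete, here is a minimal sketch
in Python of getting at the uncompressed tars. It assumes the usual layout of
a deb as an ar archive whose members include control.tar.gz and data.tar.gz;
the function names are hypothetical, and the bsdiff step itself is left out.

```python
import gzip

AR_MAGIC = b"!<arch>\n"
AR_HEADER_LEN = 60  # name(16) mtime(12) uid(6) gid(6) mode(8) size(10) end(2)


def ar_members(blob):
    """Yield (name, payload) for each member of an ar archive held in memory."""
    if blob[:len(AR_MAGIC)] != AR_MAGIC:
        raise ValueError("not an ar archive")
    pos = len(AR_MAGIC)
    while pos + AR_HEADER_LEN <= len(blob):
        header = blob[pos:pos + AR_HEADER_LEN]
        # Names are space padded; some ar variants also append a '/'.
        name = header[:16].decode("ascii").strip().rstrip("/")
        size = int(header[48:58].decode("ascii").strip())
        start = pos + AR_HEADER_LEN
        yield name, blob[start:start + size]
        # Member data is padded to an even length.
        pos = start + size + (size % 2)


def uncompressed_tars(deb_bytes):
    """Return the gunzipped control.tar and data.tar of a deb, keyed by name."""
    tars = {}
    for name, payload in ar_members(deb_bytes):
        if name in ("control.tar.gz", "data.tar.gz"):
            tars[name[:-len(".gz")]] = gzip.decompress(payload)
    return tars
```

Running this on the old and the new deb gives two pairs of byte strings, and
those are what would be diffed, rather than the compressed debs themselves.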

I know there is more than one official CD, but I suspect the total extra space 
required on the mirrors would be insignificant anyway.

Perhaps we could add this as a feature of apt-torrent, so that patches remain
available for as long as there are seeds for them?

-- 
John C. McCabe-Dansted
Masters Student


