Compressing packages with bzip2 instead of gzip?

Anderson Lizardo andersonlizardo at yahoo.com.br
Sat Jan 21 22:23:12 GMT 2006


Paul Sladen wrote:
> I think I'll just hack zsync to extend the Zmap table to support linear
> mappings, and possibly bzip2 mappings, as well.  It may be possible to
> 'trick' the existing decoder into doing a straight copy for the linear part
> by letting it try to compress the stream, fail to get the correct checksum
> and resort to downloading raw instead.  A deb would look like:
> 
>   linear mapping (ar header, control.tar.gz, data.tar header)
>   gzip mapping (data.tar)
>   linear mapping (ar tail)

I've been silently following this thread (mainly because I'm working on
other things :), and while playing with raw .deb handling, I wrote the
following script. It scans a .deb file and prints the byte ranges that
can be used to download _only_ a specific member of the archive (i.e.
data.tar.gz, control.tar.gz or debian-binary) using HTTP Range
requests. Here it is:

===cut===
#!/usr/bin/perl
# (C) 2006 Anderson Lizardo
# License: GNU GPL v2
# Print ranges from .deb files that can be used to download only a
# specific member from the archive (using HTTP ranges).

use strict;
use warnings;

my $name; # file name
my $skip; # data to skip
my $size; # file size, in bytes
my $offset; # file offset inside .deb archive

binmode(STDIN); # the archive is binary data

read(STDIN, $name, 8);

die "Not a .deb file\n" unless $name eq "!<arch>\n";

while (!eof(STDIN)) {
        read(STDIN, $name, 16);
        read(STDIN, $skip, 32); # skip mtime, uid, gid and mode fields
        read(STDIN, $size, 10);
        read(STDIN, $skip, 2); # skip header terminator
        $size =~ s/\s+$//;
        $offset = tell(STDIN);
        # pass $name as an argument so a '%' in it is not read as a format
        printf "%s %d-%d\n", $name, $offset, $offset + $size - 1;
        read(STDIN, $skip, $size + ($size % 2)); # skip contents and padding
}
===cut===

Sample output (for cpio_2.6-10_i386.deb):

$ ./print_deb_offsets.pl < cpio_2.6-10_i386.deb
debian-binary    68-71
control.tar.gz   132-1404
data.tar.gz      1466-93726
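
These offsets follow directly from the ar layout the script walks: an
8-byte magic string, a 60-byte header before each member (16 + 32 + 10 +
2 bytes, as read above), and one padding byte after any member with an
odd size. As a cross-check, here is a minimal Python sketch that
recomputes the ranges from the member sizes alone. The function name and
the size list are only illustrative; the sizes are the ones implied by
the sample output.

```python
AR_MAGIC_LEN = 8    # "!<arch>\n"
AR_HEADER_LEN = 60  # 16 (name) + 32 (skipped fields) + 10 (size) + 2 (terminator)

def member_ranges(sizes):
    """Return inclusive (first, last) byte offsets of each member's data."""
    ranges = []
    pos = AR_MAGIC_LEN
    for size in sizes:
        start = pos + AR_HEADER_LEN
        ranges.append((start, start + size - 1))
        # Member data is padded to an even length, like the script's $size % 2.
        pos = start + size + (size % 2)
    return ranges

# Sizes implied by the sample output: 71-68+1, 1404-132+1, 93726-1466+1.
print(member_ranges([4, 1273, 92261]))
# prints [(68, 71), (132, 1404), (1466, 93726)]
```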

In this example, to download only the control.tar.gz file from
cpio_2.6-10_i386.deb, one could use:

curl -o control.tar.gz -r132-1404 \
http://archive.ubuntu.com/ubuntu/pool/main/c/cpio/cpio_2.6-10_i386.deb
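
The same request can be made from a program. The following is a rough
Python sketch using only the standard library; the helper names
range_request and fetch_range are made up for illustration, and it
assumes the server honours Range requests (answering with status 206).

```python
import urllib.request

def range_request(url, first, last):
    """Build a GET request for the inclusive byte range first..last."""
    req = urllib.request.Request(url)
    req.add_header("Range", f"bytes={first}-{last}")
    return req

def fetch_range(url, first, last):
    """Download only bytes first..last of url and return them."""
    with urllib.request.urlopen(range_request(url, first, last)) as resp:
        # 206 Partial Content means the range was honoured; a plain 200
        # would mean the server sent the whole file instead.
        if resp.status != 206:
            raise RuntimeError(f"range not honoured (HTTP {resp.status})")
        return resp.read()

# Usage, equivalent to the curl call above (not executed here):
#   data = fetch_range(deb_url, 132, 1404)
#   open("control.tar.gz", "wb").write(data)
```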

So my idea was basically to extend zsync to sync byte ranges (e.g. with
a --range option, just like curl's), which could be used to "sync" a
.deb file this way:

1) Keep a <package>.ranges file for each .deb file in the repository,
containing the output of the script above.
2) Unpack a previously downloaded .deb package to get its 3 members
(debian-binary, control.tar.gz and data.tar.gz). Alternatively, use the
"dpkg-repack" idea from Paul Sladen to get enough "local data" to zsync
(modified data will be re-downloaded from the server anyway).
3) zsync each member separately using the new --range option.
4) Repack the .deb file and do additional integrity checks.
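
The repack step could look roughly like the Python sketch below, which
writes the members back into an ar container with the same layout the
script reads. The mtime, uid, gid and mode values are placeholders I
picked for illustration; getting a byte-identical .deb (so that the
archive's checksum matches) would presumably require reusing the
original header fields from the server.

```python
AR_MAGIC = b"!<arch>\n"

def ar_header(name, size, mtime=0, uid=0, gid=0, mode=0o100644):
    """Build one 60-byte ar member header (placeholder metadata values)."""
    if len(name) > 16:
        raise ValueError("member name does not fit the 16-byte name field")
    # Field widths: name 16, mtime 12, uid 6, gid 6, mode 8 (octal), size 10.
    fields = "%-16s%-12d%-6d%-6d%-8o%-10d" % (name, mtime, uid, gid, mode, size)
    return fields.encode("ascii") + b"`\n"

def build_ar(members):
    """Concatenate (name, data) pairs into an ar archive and return its bytes."""
    out = bytearray(AR_MAGIC)
    for name, data in members:
        out += ar_header(name, len(data))
        out += data
        if len(data) % 2:
            out += b"\n"  # pad odd-sized members to an even offset
    return bytes(out)

# Usage with dummy payloads standing in for the three synced members:
deb = build_ar([
    ("debian-binary", b"2.0\n"),
    ("control.tar.gz", b"c" * 1273),
    ("data.tar.gz", b"d" * 10),
])
print(len(deb))
```

With these dummy sizes the first two members land at the same offsets as
in the sample output (68 and 132), since only the sizes determine the
layout.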

Advantage: minimal modifications to zsync (I suppose), as it already has
code to support HTTP ranges.

Regards,
-- 
Anderson Lizardo