Yet another incremental package download proposal

Phillip Susi psusi at cfl.rr.com
Thu Apr 14 18:32:19 UTC 2011


On 2/10/2011 11:27 PM, Mehmet Kose wrote:
> data.tar will be compressed in 100 KB chunks. Gzip and xz both permit
> this and can extract the result without problems. For gzip, there is
> an avoidable small increase in archive size; for xz, about 10%. With
> 1 MB chunks, xz shows no big difference.

I'm not seeing where in your proposal you make use of this --rsyncable
option.
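
For reference, chunked gzip output only works because a concatenation
of complete gzip members is itself a valid gzip stream; --rsyncable
gets a similar effect by periodically resetting the compressor state.
A rough Python sketch of the chunking idea (the 100 KB figure is from
the proposal, everything else is illustrative):

    import gzip

    CHUNK = 100 * 1024  # 100 KB chunks, per the proposal

    def compress_chunked(data):
        # Each chunk becomes an independent gzip member, so an
        # unchanged chunk compresses to byte-identical output.
        members = [gzip.compress(data[i:i + CHUNK])
                   for i in range(0, len(data), CHUNK)]
        return b"".join(members)

    payload = bytes(500 * 1024)          # dummy 500 KB payload
    blob = compress_chunked(payload)
    # Multi-member gzip streams extract as one stream:
    assert gzip.decompress(blob) == payload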

> Apt starts downloading the package. If the package format is 2.1 and
> there is an older version of the .deb in local storage, it stops
> after getting control.tar.gz.

At this point you have initiated an HTTP GET, parsed the incoming data,
and almost certainly received additional data past the end of
control.tar.gz, some of which may or may not be data you ultimately
want, before aborting the connection.  This adds a good deal of latency
from the extra round trips.
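
If apt knew up front how many bytes it needed, an HTTP Range request
would avoid the abort entirely. A minimal sketch with the Python
stdlib, assuming the mirror honors Range (the URL and the 16 KB figure
are made up for illustration):

    import urllib.request

    url = "http://archive.example.com/pool/main/f/foo/foo_1.0_amd64.deb"
    req = urllib.request.Request(url, headers={"Range": "bytes=0-16383"})
    with urllib.request.urlopen(req) as resp:
        # 206 Partial Content means the server honored the range;
        # a plain 200 means it ignored it and is sending everything.
        assert resp.status == 206
        head = resp.read()

Even then, each separately requested region still costs a round trip,
which is the same latency problem in a different place.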

> It extracts the older version of the .deb it already has, checks
> md5sums and symlink names, determines the new and changed files, and
> calculates where those files are in the tarball. (Files are sorted
> alphabetically, there are no empty directories, and symlinks are
> always at the end.)
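
The offset lookup is the easy part, at least for the copy you already
have locally; Python's tarfile exposes each member's data offset
(offset_data). A sketch, with names of my own choosing:

    import tarfile

    def member_ranges(path):
        # Map member name -> (byte offset of file data, size) in a tar.
        ranges = {}
        with tarfile.open(path) as tar:
            for info in tar:
                if info.isfile():
                    ranges[info.name] = (info.offset_data, info.size)
        return ranges

    # Files flagged as changed by the md5sum comparison map directly
    # to byte ranges in the uncompressed data.tar.
    old = member_ranges("data.tar")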

If one byte in a 50 MB file changes, then you still end up downloading
the whole file.  If a few bytes in each of a dozen <10 KB files change,
then you need to be careful to notice when those files are contiguous
in the tarball and fetch the whole span with one request instead of a
dozen separate requests.
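
Coalescing nearby ranges is simple enough, as sketched below; the real
question is choosing the gap below which re-downloading the bytes in
between is cheaper than another round trip (the 4 KB threshold here is
an arbitrary stand-in):

    def coalesce(ranges, max_gap=4096):
        # Merge byte ranges separated by less than max_gap, so a dozen
        # nearby small files become one range request instead of twelve.
        merged = []
        for start, end in sorted(ranges):
            if merged and start - merged[-1][1] <= max_gap:
                merged[-1][1] = max(merged[-1][1], end)
            else:
                merged.append([start, end])
        return [tuple(r) for r in merged]

    assert coalesce([(0, 100), (150, 300), (50000, 50100)]) == \
           [(0, 300), (50000, 50100)]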

> If it sounds crazy, another option is simply using 'dar'.
> (http://dar.linux.free.fr/)

What does this have to do with anything?  dar is just an alternative to
tar.

The last time this was discussed, the best idea seemed to be zsync with
lookinside.
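
For anyone who wasn't around for that thread: zsync publishes per-block
checksums alongside the file, and the client slides a cheap rolling
checksum over its local copy to find blocks it already has, fetching
only the rest via range requests; as I understand it, the lookinside
variant applies the same matching to the uncompressed contents of
compressed files. A toy version of the rsync-style weak rolling
checksum it relies on (block size and names are illustrative):

    BLOCK = 2048  # illustrative block size

    def weak_sum(block):
        # rsync-style weak checksum: two 16-bit running sums.
        a = sum(block) & 0xFFFF
        b = sum((len(block) - i) * x for i, x in enumerate(block)) & 0xFFFF
        return (b << 16) | a

    def roll(old_sum, out_byte, in_byte, blocklen=BLOCK):
        # Slide the window one byte without rescanning the block.
        a = ((old_sum & 0xFFFF) - out_byte + in_byte) & 0xFFFF
        b = (((old_sum >> 16) & 0xFFFF) - blocklen * out_byte + a) & 0xFFFF
        return (b << 16) | a

    data = bytes(range(10)) * 300
    s = weak_sum(data[:BLOCK])
    assert roll(s, data[0], data[BLOCK]) == weak_sum(data[1:BLOCK + 1])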


