Binary diffs for deb files

Hervé Fache Herve at lucidia.net
Thu May 4 16:56:18 BST 2006


Actually, you're probably better off keeping full packages plus the diffs
for only the last [two] versions, so the system only benefits the [almost]
up-to-date guys, but things don't get out of control.

If you're two versions behind, the diff is likely to be large anyway.
That's more of a development scenario, I suppose: stable distributions
don't produce changes every day.

What? Who said 2.6.16.13? Just kidding ;-)

Hervé.

2006/5/3, James Hall <rio at x5g.com>:
> On Wed, 2006-05-03 at 21:15 +1200, john wrote:
> > > 2. How do we re-sign all these packages?
> >
> > I believe the consensus was that it would be best to maintain binary
> > equality.
>
> My current set of scripts will preserve checksums so a patched deb can
> share exactly the same signature as a full deb.
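
The "same signature" property can be checked mechanically: rebuild the deb
from the old package plus the diff, hash it, and compare against the digest
published for the full deb. A minimal sketch, assuming SHA-256 (the thread
doesn't say which checksum is meant) and using made-up byte strings in place
of real package files; the function names are hypothetical:

```python
import hashlib

def sha256_hex(data: bytes) -> str:
    """Hex digest of a package's raw bytes (SHA-256 is an assumption here)."""
    return hashlib.sha256(data).hexdigest()

def rebuilt_matches(rebuilt: bytes, published_digest: str) -> bool:
    """True only if the patched deb is byte-for-byte the published full deb."""
    return sha256_hex(rebuilt) == published_digest

# Stand-in bytes, not real package contents.
full_deb = b"stand-in bytes for the full package"
published = sha256_hex(full_deb)

print(rebuilt_matches(full_deb, published))          # identical bytes
print(rebuilt_matches(full_deb + b"\n", published))  # one extra byte
```

A single differing byte changes the digest, so the existing signature only
carries over if the rebuild is bit-identical.
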
>
> > My results suggested that zsync would reduce bandwidth to ~33% for
> > regular upgrades (not necessarily dist-upgrades) compared to 10%-5% for bsdiff.
>
> Bsdiff does very little for compressed debs. As I illustrated earlier,
> using bsdiff on _uncompressed_ debs reduces bandwidth by around 70% and
> even more for larger packages.
>
> Looking at other mailing lists and blogs, a main objection to
> implementing something like this seems to be server disk space. Ideally
> there would be a diff between each pair of consecutive package versions
> while the original packages themselves remain intact, but there are
> alternatives.
>
> Here are a few scenarios we could try (a script could maybe do some
> clever sums and figure out how best to use the space available):
>
> -------------------------------------------
>
> 1. As it is now, download 10MB for each update:
>
> Version 1             Version 2             Version 3
>   10MB                   10MB                  10MB   =  30MB
>
> 2. Diff between all with originals:
> (The best but most disk space consuming)
>
> Version 1  <--diff--> Version 2  <--diff--> Version 3
>   10MB        3MB       10MB        3MB        10MB   =  36MB
>
> 3. Diff between all with one original:
> (The worst for users, but saves space)
>
> Version 1  <--diff-->            <--diff-->
>   10MB        3MB                   3MB               =  16MB
>
> 4. Some originals, diffs between all
> (A mix of 2 and 3, this one shows 7 package versions instead of 3)
>
> F = Full package 10MB
> D = Difference   3MB
>
> 1   2   3   4   5   6   7
> F D   D   D F D   D   D F = 48MB (vs. 70MB for Scenario 1)
>
> -------------------------------------------
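
The totals above can be reproduced with a few lines. This is only a sketch
under the same simplifying assumptions as the scenarios (every full package
is 10MB, every diff between adjacent versions is 3MB); the function names
are made up for illustration:

```python
FULL_MB = 10  # size of one full package, as assumed in the scenarios
DIFF_MB = 3   # size of one diff between adjacent versions

def full_only_space(versions):
    """Mirror space in MB when every version is kept as a full package."""
    return versions * FULL_MB

def mirror_space(versions, full_at):
    """Mirror space in MB with a diff between every adjacent pair of
    versions and full packages kept only for the versions in `full_at`."""
    diffs = versions - 1
    return len(full_at) * FULL_MB + diffs * DIFF_MB

print(full_only_space(3))           # Scenario 1: 30
print(mirror_space(3, {1, 2, 3}))   # Scenario 2: 36
print(mirror_space(3, {1}))         # Scenario 3: 16
print(mirror_space(7, {1, 4, 7}))   # Scenario 4: 48
print(full_only_space(7))           # Scenario 1 with 7 versions: 70
```

Real packages won't have uniform sizes, so a real script would sum actual
file sizes instead of multiplying constants.
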
>
> Scenario 3 actually saves server space but could be inconvenient for
> users who don't update regularly. It could be improved by keeping a full
> package every few versions, so that it doesn't become substantially less
> efficient for people not updating frequently (as shown in Scenario 4).
>
> Looking at Scenario 4 more carefully
> ====================================
>
> Say the client already has version 3, and needs version 7:
>
> a. The client calculates how much downloading is needed just with diffs:
> Diff between 3 and 4: 3MB
> Diff between 4 and 5: 3MB
> Diff between 5 and 6: 3MB
> Diff between 6 and 7: 3MB
> Total: 12MB
>
> b. The client calculates again using the latest full package size +
> diffs if necessary:
> Full package 7: 10MB
> Total: 10MB
>
> After doing these two simple sums, it goes ahead and downloads whichever
> is smaller, which in this case would be b.
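
Steps a and b above amount to taking the smaller of two sums. A rough
sketch, again assuming uniform 10MB full packages and 3MB diffs, and
assuming full packages exist for versions 1, 4 and 7 as in Scenario 4;
`cheapest_download` is a hypothetical name, not part of any existing tool:

```python
FULL_MB = 10  # assumed size of a full package
DIFF_MB = 3   # assumed size of a diff between adjacent versions

def cheapest_download(installed, target, full_at):
    """Return (plan, megabytes) for upgrading `installed` to `target`.

    `full_at` holds the versions kept as full packages on the mirror and
    is assumed to contain at least one version <= target.
    """
    # a. Follow the chain of diffs from the installed version.
    diff_chain = (target - installed) * DIFF_MB
    # b. Fetch the latest full package at or before the target,
    #    plus any diffs still needed to reach the target.
    base = max(v for v in full_at if v <= target)
    full_plus_diffs = FULL_MB + (target - base) * DIFF_MB
    if diff_chain <= full_plus_diffs:
        return ("diffs", diff_chain)
    return ("full", full_plus_diffs)

print(cheapest_download(3, 7, {1, 4, 7}))  # ('full', 10)
print(cheapest_download(6, 7, {1, 4, 7}))  # ('diffs', 3)
```

With these numbers the diff chain stops paying off once the client is more
than three versions behind the nearest full package.
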
>
> ---------------------------------
>
> So there are other ways around the server space problem. The bigger the
> gap between full packages - the less efficient it is for people who
> don't update regularly. The smaller the gap - the more disk space
> required by the mirrors. 'Real-world' numbers need to be gathered to find
> a healthy compromise between the two. The goal of binary diffs is to make
> regular upgrades as painless as possible; security updates are TINY with
> bsdiff. The diff between Firefox 1.0.7 and 1.0.8 is over 20 times
> smaller than a normal update. The advantages are obvious, but we need to
> reduce the disadvantages if we're going to see this implemented in my
> lifetime ;).
>
> Kind Regards,
> James Hall
>
> --
> ubuntu-devel mailing list
> ubuntu-devel at lists.ubuntu.com
> https://lists.ubuntu.com/mailman/listinfo/ubuntu-devel
>


--
In a world without walls and fences, who needs Windows and Gates?


More information about the kubuntu-devel mailing list