Binary diffs for deb files
Herve at lucidia.net
Thu May 4 16:56:18 BST 2006
Actually, you're probably better off keeping full packages plus the diffs
for the last [two] versions, so the system only benefits the [almost]
up-to-date guys, but things don't get out of control.
If you're two versions behind, the diff is likely to be large anyway;
that's more of a development scenario, I suppose: stable distributions
don't produce changes every day.
What? Who said 184.108.40.206? Just kidding ;-)
2006/5/3, James Hall <rio at x5g.com>:
> On Wed, 2006-05-03 at 21:15 +1200, john wrote:
> > > 2. How do we re-sign all these packages?
> > I believe the consensus was that it would be best to maintain binary
> > equality.
> My current set of scripts will preserve checksums so a patched deb can
> share exactly the same signature as a full deb.
> > My results suggested that zsync would reduce bandwidth to ~33% for
> > regular upgrades (not necessarily dist-upgrades) compared to 10%-5% for bsdiff.
> bsdiff does very little for compressed debs. As I illustrated earlier,
> using bsdiff on _uncompressed_ debs reduces bandwidth by around 70%, and
> even more for larger packages.
> Looking at other mailing lists and blogs, a main argument against
> implementing something like this seems to be server disk space. Ideally
> there would be a diff between each package version, with the original
> packages themselves kept intact, but there are alternatives.
> Here are a few scenarios we could try: (A script could do some clever
> sums and figure out how to best use the space available maybe)
> 1. As it is now, download 10MB for each update:
>
>    Version 1      Version 2      Version 3
>      10MB           10MB           10MB      = 30MB
>
> 2. Diff between all with originals:
>    (The best but most disk space consuming)
>
>    Version 1 <--diff--> Version 2 <--diff--> Version 3
>      10MB       3MB       10MB       3MB       10MB    = 36MB
>
> 3. Diff between all with one original:
>    (Worst, saves space)
>
>    Version 1 <--diff--> <--diff-->
>      10MB       3MB        3MB     = 16MB
>
> 4. Some originals, diffs between all
>    (A mix of 2 and 3, this one shows 7 package versions instead of 3)
>
>    F = Full package 10MB
>    D = Difference 3MB
>
>    Version:  1     2     3     4     5     6     7
>    Full:     F                 F                 F
>    Diff:        D     D     D     D     D     D
>
>    = 48MB (vs. 70MB for Scenario 1)
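The totals above can be checked with a few lines of Python. This is only
a rough sketch: the function names are made up for illustration, and it
assumes every full package is 10MB and every diff is 3MB, as in the
examples, which real packages obviously won't be.

```python
# Hypothetical helper names; sizes are the illustrative figures from the
# scenarios above, not measurements.
FULL_MB = 10  # one full package
DIFF_MB = 3   # one diff between two adjacent versions


def space_all_full(versions):
    """Scenario 1: keep only full packages."""
    return versions * FULL_MB


def space_all_full_plus_diffs(versions):
    """Scenario 2: every full package plus a diff between neighbours."""
    return versions * FULL_MB + (versions - 1) * DIFF_MB


def space_one_full_plus_diffs(versions):
    """Scenario 3: one full package, diffs for everything after it."""
    return FULL_MB + (versions - 1) * DIFF_MB


def space_periodic_full(versions, gap):
    """Scenario 4: a full package for version 1 and every `gap` versions
    after it, plus a diff between all neighbours."""
    fulls = len(range(1, versions + 1, gap))
    return fulls * FULL_MB + (versions - 1) * DIFF_MB


if __name__ == "__main__":
    print(space_all_full(3))             # Scenario 1, 3 versions
    print(space_all_full_plus_diffs(3))  # Scenario 2, 3 versions
    print(space_one_full_plus_diffs(3))  # Scenario 3, 3 versions
    print(space_periodic_full(7, 3))     # Scenario 4, 7 versions
```

Changing `gap` in the last call is a quick way to see how mirror disk
space trades off against the distance between full packages.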
> Scenario 3 actually saves server space but could be inconvenient for
> users who don't update regularly. Scenario 3 could be improved by
> ensuring it doesn't become substantially less efficient for people not
> updating frequently (As shown in Scenario 4).
> Looking at Scenario 4 more carefully
> Say the client already has version 3, and needs version 7:
> a. The client calculates how much downloading is needed just with diffs:
> Diff between 3 and 4: 3MB
> Diff between 4 and 5: 3MB
> Diff between 5 and 6: 3MB
> Diff between 6 and 7: 3MB
> Total: 12MB
> b. The client calculates again using the latest full package size +
> diffs if necessary:
> Full package 7: 10MB
> Total: 10MB
> After doing these two simple sums, it goes ahead and downloads whichever
> is smaller, which in this case would be b.
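For what it's worth, the client-side comparison could look something
like the sketch below. It is not taken from James's scripts:
`cheapest_download` and its parameters are hypothetical, versions are
plain integers, and sizes are fixed at 10MB per full package and 3MB per
diff, as in the example.

```python
def cheapest_download(current, target, full_versions, full_mb=10, diff_mb=3):
    """Return (megabytes, plan) for upgrading from `current` to `target`.

    Hypothetical sketch: `full_versions` is the set of version numbers
    for which the mirror keeps a full package.
    """
    # Option a: walk the chain of diffs from current to target.
    diffs_only = (target - current) * diff_mb
    best = (diffs_only, "diffs")

    # Option b: fetch the newest full package newer than what we have,
    # then apply any diffs still needed to reach the target.
    candidates = [v for v in full_versions if current < v <= target]
    if candidates:
        base = max(candidates)
        via_full = full_mb + (target - base) * diff_mb
        if via_full < diffs_only:
            best = (via_full, "full")
    return best


if __name__ == "__main__":
    # Client has version 3 and wants 7; full packages exist for 1, 4, 7.
    print(cheapest_download(3, 7, {1, 4, 7}))
    # Client is only one version behind.
    print(cheapest_download(6, 7, {1, 4, 7}))
```

With real packages the sizes would have to come from the archive index
rather than constants, but the comparison itself stays the same.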
> So there are other ways around the server space problem. The bigger the
> gap between full packages - the less efficient it is for people who
> don't update regularly. The smaller the gap - the more disk space
> required by the mirrors. 'Real-world' numbers need to be gathered to find
> a healthy compromise between the two. The goal of binary diffs is to make
> regular upgrades as painless as possible; security updates are TINY with
> bsdiff. The diff between firefox 1.0.7 and 1.0.8 is over 20 times
> smaller than a normal update. The advantages are obvious, but we need to
> reduce the disadvantages if we're going to see this implemented in my
> lifetime ;).
> Kind Regards,
> James Hall
In a world without walls and fences, who needs Windows and Gates?