zstd compression for packages

Daniel Axtens daniel.axtens at canonical.com
Tue Mar 13 01:07:43 UTC 2018


On Tue, Mar 13, 2018 at 1:43 AM, Balint Reczey <balint.reczey at canonical.com>
wrote:

> Hi Daniel,
>
> On Mon, Mar 12, 2018 at 2:11 PM, Daniel Axtens
> <daniel.axtens at canonical.com> wrote:
> > Hi,
> >
> > I looked into compression algorithms a bit in a previous role, and to be
> > honest I'm quite surprised to see zstd proposed for package storage.
> > zstd, according to its own GitHub repo, is "targeting real-time
> > compression scenarios". It's not really designed to be run at its
> > maximum compression level; it's designed to compress data coming off
> > the wire very quickly - things like log files being streamed to a
> > central server, or, I guess, writing random data to btrfs, where speed
> > is absolutely an issue.
> >
> > Is speed of decompression a big user concern relative to file size? I
> > admit that I am biased - as an Australian, with the crummy internet that
> > my location entails, I'd save much more time if the file were 6% smaller
> > and took 10% longer to decompress than the other way around.
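
To make that trade-off concrete, here is a back-of-envelope sketch; the
package size, link speed, and unpack time are hypothetical numbers chosen
only to show the shape of the calculation:

    # Back-of-envelope: what does a 6% larger file cost on a slow link?
    # All numbers are hypothetical, for illustration only.
    size_mb   = 50     # compressed package size (MB)
    link_mbit = 5      # slow residential link (Mbit/s)
    unpack_s  = 2.0    # time to decompress the smaller file (s)

    # 6% more bytes cost extra download time; even *instant* decompression
    # can save at most the full unpack time.
    extra_download_s = size_mb * 0.06 * 8 / link_mbit   # = 4.8 s
    print(extra_download_s > unpack_s)                  # True: the slow link loses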
>
> Yes, decompression speed is a big issue in some cases. Please consider
> the case of provisioning cloud/container instances, where, after booting
> the image, plenty of packages need to be installed and saving seconds
> matters a lot.
>
> The zstd format also allows parallel decompression, which can make
> package installation even quicker in wall-clock time.
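
For illustration, a minimal sketch of what multi-frame parallelism buys,
using the third-party python-zstandard bindings (this is not the dpkg/apt
code; the frame list stands in for a pzstd-style file that has already
been split at frame boundaries):

    # Sketch: decompress independent zstd frames in parallel.
    # Third-party 'zstandard' bindings; not the actual dpkg/apt code.
    from concurrent.futures import ThreadPoolExecutor
    import zstandard

    def decompress_frame(frame):
        # Frames written by ZstdCompressor.compress() embed their content
        # size, so each one can be decompressed with no shared state.
        return zstandard.ZstdDecompressor().decompress(frame)

    def parallel_decompress(frames):
        # python-zstandard releases the GIL while decompressing, so the
        # threads can genuinely overlap on a multi-core machine.
        with ThreadPoolExecutor() as pool:
            return b"".join(pool.map(decompress_frame, frames))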
>
> Internet connection speed increases by ~50% per year on average
> (according to this [3] study, which matches my experience); compounded,
> that is about 7% (1.5^(1/6) ≈ 1.07) every two months - more than the 6%
> size difference.
>
>
The future is pretty unevenly distributed, and lots of the planet is stuck
on really bad internet still.

AFAICT, [3] is anecdotal rather than a 'study' - it's based on data from
one person living in California, which is not really representative. The
connection speed visualisation from the Akamai State of the Internet
report [4] shows that lots and lots of countries - most of the world! -
have significantly slower internet than that person.

(FWIW, anecdotally, I've never had a residential connection get faster
except when I moved, mostly because the speed of ADSL is pretty much
fixed. Anecdotal reports from users in developing countries and in rural
areas of developed countries are not encouraging either [5].)

Having said that, I'm not unsympathetic to the use case you outline. I am
just saddened to see the trade-offs fall against the interests of people
with worse access to the internet. If I can find you ways of saving at
least as much time without making the files bigger, would you be open to
that?

Regards,
Daniel

[4]
https://www.akamai.com/uk/en/about/our-thinking/state-of-the-internet-report/state-of-the-internet-connectivity-visualization.jsp
[5] https://danluu.com/web-bloat/




> >
> > Did you consider Google's Brotli?
>
> We did consider it but it was less promising.
>
> Cheers,
> Balint
>
> [3] http://xahlee.info/comp/bandwidth.html
>
> >
> > Regards,
> > Daniel
> >
> > On Mon, Mar 12, 2018 at 9:58 PM, Julian Andres Klode
> > <julian.klode at canonical.com> wrote:
> >>
> >> On Mon, Mar 12, 2018 at 11:06:11AM +0100, Julian Andres Klode wrote:
> >> > Hey folks,
> >> >
> >> > We had a coding day in Foundations last week, and Balint and Julian
> >> > added support for zstd compression to dpkg [1] and apt [2].
> >> >
> >> > [1] https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=892664
> >> > [2] https://salsa.debian.org/apt-team/apt/merge_requests/8
> >> >
> >> > Zstd is a compression algorithm developed by Facebook that offers far
> >> > higher decompression speeds than xz or even gzip (at roughly constant
> >> > speed and memory usage across all levels), while offering 19
> >> > compression levels, ranging from level 1 (roughly comparable to gzip
> >> > in size, but much faster) to level 19 (roughly comparable to xz -6):
> >> >
> >> > In our configuration, we run zstd at level 19. For bionic main amd64,
> >> > this causes a size increase of about 6%, from roughly 5.6 to 5.9 GB.
> >> > Installs speed up by about 10%, or, if eatmydata is involved, by up to
> >> > 40% - user time generally by about 50%.
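
As a rough way to see the size side of that trade-off on real data, one
could compare levels directly with the third-party python-zstandard
bindings (the input file here is a placeholder):

    # Sketch: compare zstd level 1 vs level 19 on the same input.
    # Third-party 'zstandard' bindings; 'sample.tar' is a placeholder.
    import zstandard

    data = open("sample.tar", "rb").read()
    for level in (1, 19):
        out = zstandard.ZstdCompressor(level=level).compress(data)
        print("level %2d: %d bytes (%.1f%% of original)"
              % (level, len(out), 100.0 * len(out) / len(data)))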
> >> >
> >> > Our implementations for apt and dpkg support multiple frames as used
> >> > by pzstd, so packages can eventually be compressed and decompressed
> >> > in parallel.
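
The multi-frame idea itself is simple; below is a sketch under assumed
parameters (the 4 MiB chunk size is illustrative, and this is not how
dpkg or pzstd actually frame their output):

    # Sketch: pzstd-style output - split the input into fixed chunks and
    # emit one independent zstd frame per chunk, concatenated. The zstd
    # CLI reads the frames back sequentially; a parallel decoder can fan
    # them out to several cores. Chunk size is an arbitrary assumption.
    import zstandard

    CHUNK = 4 * 1024 * 1024  # 4 MiB per frame (illustrative)

    def compress_multiframe(data, level=19):
        cctx = zstandard.ZstdCompressor(level=level)
        return b"".join(cctx.compress(data[i:i + CHUNK])
                        for i in range(0, len(data), CHUNK))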
> >>
> >> More links:
> >>
> >> PPA:               https://launchpad.net/~canonical-foundations/+archive/ubuntu/zstd-archive
> >> APT merge request: https://salsa.debian.org/apt-team/apt/merge_requests/8
> >> dpkg patches:      https://bugs.debian.org/892664
> >>
> >> I'd also like to talk a bit more about libzstd itself: The package is
> >> currently in universe, but btrfs recently gained support for zstd,
> >> so we already have a copy in the kernel and we need to MIR it anyway
> >> for btrfs-progs.
> >>
> >> --
> >> debian developer - deb.li/jak | jak-linux.org - free software dev
> >> ubuntu core developer                              i speak de, en
> >>
> >> --
>
>
> --
> Balint Reczey
> Ubuntu & Debian Developer
>