Revisiting default initramfs compression

Dave Jones dave.jones at canonical.com
Wed Mar 9 13:24:55 UTC 2022


Hi Michael (and others),

Julian's summarised this near perfectly, but I'll try and add a little 
detail from the data I've gathered [1] (with others' generous help, in 
particular Heinrich for the RISC-V bits):

On Wed, Mar 09, 2022 at 09:46:19AM +0100, Julian Andres Klode wrote:
>On Wed, Mar 09, 2022 at 02:10:57PM +1300, Michael Hudson-Doyle wrote:
[snip]
>> > some time ago, the default compressor for initramfs was changed
>> > from lz4 -9 to zstd -19. This caused significant problems:
>> >
>>
>> Exactly three months later... we still haven't taken any action on this.
>> Time to do something!

Agreed!

>> I have a few questions below but tl;dr: unless there are immediate
>> objections, I'm going to make a change to initramfs-tools to allow the
>> compression level to be configured and set the default to 12 for zstd.
>
>So xypron had a patch to change the default level to 9 for sponsoring
>out for a couple of months now (no idea how that level came up).
>
>We pushed back on that as it does not account for low-memory systems
>which we need to take care of as well.

Yes, zstd -12 is not safe on low-memory systems. In particular, it
fails to run at all on an otherwise entirely idle Pi Zero 2: on arm64
that board has a little more than 200MB of RAM free at runtime, whilst
zstd -T0 -12 on a PC requested ~415MB resident.

In fact, I'm not convinced *any* of zstd's levels are actually useful on
a machine with RAM as limited as the Zero 2's (or the 3A+'s). For
example, with ~200MB of RAM free, if the user is running a daemon that
eats, say, 100MB resident and we start up a compressor that eats 50MB
(as zstd -T0 does at level -1), we stand a fair chance of pushing the
daemon into OOM.
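
A minimal Python sketch of that arithmetic, using only the figures
quoted above. The helper name is hypothetical, and it ignores the
kernel, page cache and the data being compressed, so the real margin
is smaller than it reports:

```python
def headroom_after(free_mb, resident_mb):
    """Return the MB left once every listed process holds its resident set."""
    return free_mb - sum(resident_mb)


# Figures quoted above: ~200MB free, a 100MB daemon, zstd -T0 -1 at ~50MB.
FREE_MB = 200
DAEMON_MB = 100
ZSTD_L1_MB = 50
# From the earlier paragraph: zstd -T0 -12 requested ~415MB resident on a PC.
ZSTD_L12_MB = 415

left_l1 = headroom_after(FREE_MB, [DAEMON_MB, ZSTD_L1_MB])
left_l12 = headroom_after(FREE_MB, [ZSTD_L12_MB])

print(f"zstd -1 alongside the daemon leaves {left_l1}MB")
print(f"zstd -12 on an idle board leaves {left_l12}MB")
```

A negative result means the request cannot be met at all; a small
positive one means anything else that allocates memory during the
rebuild can tip the system into OOM.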

Now, in practice this doesn't matter right now, as I've already
overridden initramfs-tools' default to lz4 in ubuntu-raspi-settings.
However, I think there are adjustments that should be made there too
(and there is the question of whether this is relevant for, say,
minimal-memory cloud instances).

>We then postponed any implementation to after a discussion in
>Frankfurt.
>
>I think the summary from the Frankfurt discussion was:
>
>- lz4 -1 is the right choice for low-memory systems
>- if you have more memory, zstd -1 becomes the best choice
>- pigz is outperforming both a bunch of times
>
>But that's really for waveform to share.

Just to clarify a couple of things here:

Firstly, I actually think lz4 -2 is probably the ideal level for that
compressor. There's a large difference in compression ratio between
lz4 -1 and lz4 -2 across all platforms tested, but no difference in
memory usage, and only a minimal increase in compression and
decompression time. However, lz4 is currently configured to use level
-9, which takes a considerable amount of extra time for little to no
gain in compression ratio (at least with our initramfs inputs anyway).

On machines with more generous RAM allowances, zstd -T0 -1 does appear 
to be the ideal. The incremental gains in compression at higher levels 
are outweighed by the extra time spent compressing (i.e. for our 
initramfs inputs at least, the extra time spent on the compression is 
not gained back on reading the compressed data at I/O speeds typical for 
their respective platforms).
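
The trade-off described here can be sketched in Python as a break-even
check. This is an illustration only: the function names and every
number in the example call are hypothetical (not taken from the
analysis repo), and it deliberately ignores decompression time and how
many boots happen per rebuild:

```python
def read_time_saved(size_small_bytes, size_large_bytes, io_bytes_per_s):
    """Seconds saved by reading the smaller image instead of the larger one."""
    return (size_large_bytes - size_small_bytes) / io_bytes_per_s


def higher_level_pays_off(extra_compress_s, size_low_level, size_high_level,
                          io_bytes_per_s):
    """True if the read time saved exceeds the extra time spent compressing."""
    saved = read_time_saved(size_high_level, size_low_level, io_bytes_per_s)
    return saved > extra_compress_s


# Hypothetical inputs: a higher level shaves 2MB off a 30MB image at the
# cost of 5 extra seconds, on storage that reads at 20MB/s.
print(higher_level_pays_off(5.0, 30_000_000, 28_000_000, 20_000_000))
```

With inputs of that shape the saving is a fraction of a second against
several seconds of extra compression, which is the pattern the
paragraph above describes.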

[snipped some data]

At this point, if you want some data to play with I'd highly recommend 
cloning the following repo and following the instructions in the README:

https://github.com/waveform80/compression

>> Are people really going to run an arm64 userland on a Pi 0?

I have no specific figures on this, but given that Ubuntu downloads for
the Pi skew *heavily* towards arm64 (about 10:1 last time I looked), I
think it's a safe assumption that at least some do.

>> Any "real" solution for pi 0 has to involve doing at least the bulk 
>> of the compression not on the pi, there's no real way around that. 
>> Which is something we should do, but realistically it's not happening 
>> for jammy.

I'm not so sure about that. On a Pi Zero 2 under the arm64 arch, lz4 -2
takes ~8MB of resident RAM, and ~1.9s of time to compress the initramfs
to approximately half its original size. If we're willing to wait a
little longer, lz4 -4 takes 8.7s (and no extra RAM) to get it down to
40% of its original size. The current -9 level is pointless, however,
taking 21s (and no extra RAM) to get to 39%.
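
To put those three data points side by side, here is a small Python
sketch that works out the extra seconds paid per percentage point of
size gained between levels. The figures are the ones quoted above
("approximately half" is entered as 50); the function name is
hypothetical:

```python
# (level, seconds to compress, output as % of original size) for lz4 on a
# Pi Zero 2, taken from the figures quoted above.
LZ4_PI_ZERO_2 = [(2, 1.9, 50.0), (4, 8.7, 40.0), (9, 21.0, 39.0)]


def marginal_costs(samples):
    """Yield (from_level, to_level, extra seconds per percentage point gained).

    Assumes each step actually shrinks the output; equal sizes would divide
    by zero.
    """
    for (lo, t_lo, pct_lo), (hi, t_hi, pct_hi) in zip(samples, samples[1:]):
        yield lo, hi, (t_hi - t_lo) / (pct_lo - pct_hi)


for lo, hi, cost in marginal_costs(LZ4_PI_ZERO_2):
    print(f"lz4 -{lo} -> -{hi}: {cost:.2f}s per percentage point")
```

Going from -2 to -4 costs roughly 0.68s per point, whereas going from
-4 to -9 costs roughly 12.3s for the single remaining point.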

[snip]
>> I think this broadly makes sense. I'll notice that currently
>> initramfs-tools doesn't allow tuning the compression level at all :/
>> Probably fairly routine to add support for that though.

This would indeed be nice, as would the option to support "none" as a 
compression level (I opine a little more on this in the conclusion 
section of the analysis notebook in the repo linked above).

Last time I looked at this, it didn't look difficult to add the ability 
to provide an entirely custom command for compression via the COMPRESS 
option. In fact, I have the feeling it was *intended* to permit this, 
but somewhere along the way either the intention was dropped or it got 
broken by the addition of some extra parameters (which aren't strictly 
necessary). Happy to work on initramfs-tools to add this facility 
though.

>> > Finding the right level between -1 and -19 is hard. The more
>> > cores you have, the less penalty you pay for higher level.

Indeed -- as with compilation it's more about the ratio of cores 
available to memory available than any single metric (we had some 
experience of this on piwheels when all our builders were Pi 3s with 4 
CPU cores, but only a single GB of RAM; several larger packages naively 
assumed "if you've got 4 cores, it's fine to build using all of them" 
before dying horribly with OOM!).
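
A minimal Python sketch of that idea: cap the number of parallel jobs
by memory as well as by core count. The function name and the per-job
memory figure are hypothetical, chosen only to resemble a 4-core board
with 1GB of RAM:

```python
import os


def safe_job_count(cores, mem_available_mb, per_job_mb):
    """Cap parallel jobs by memory as well as by core count (never below 1)."""
    by_memory = mem_available_mb // per_job_mb
    return max(1, min(cores, by_memory))


# Hypothetical figures: 4 cores, 1GB of RAM, jobs needing ~400MB each.
print(safe_job_count(cores=4, mem_available_mb=1024, per_job_mb=400))
# The same check against this machine's core count (memory still assumed).
print(safe_job_count(os.cpu_count() or 1, mem_available_mb=1024, per_job_mb=400))
```

The same reasoning applies to a multi-threaded compressor: the thread
count that the cores would allow is only safe if the memory per thread
fits too.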

[snip]
>During the discussion in Frankfurt, it was argued some
>architectures/boards really need lz4 -1 (and that lz4 -9 makes
>no sense whatsoever).

Yup, as mentioned above I've already overridden the Pi images to use lz4
anyway, but I'd like to change the initramfs-tools default level for lz4
to -2 (most likely), as that seems a preferable default for that
compressor across all archs. I'd also like to change the zstd -T0
default to level -1, as that also seems preferable across all archs,
after playing with the data gathered in the aforementioned repo.

Do let me know if any clarifications are needed (and my apologies in 
advance for the excessive wordiness in the analysis repo!)

Dave.


