Revisiting default initramfs compression

Dan Streetman ddstreet at ieee.org
Wed Mar 9 12:33:46 UTC 2022


On Tue, Mar 8, 2022 at 8:11 PM Michael Hudson-Doyle
<michael.hudson at canonical.com> wrote:
>
> On Thu, 9 Dec 2021 at 06:13, Julian Andres Klode <julian.klode at canonical.com> wrote:
>>
>> Hi all,
>>
>> some time ago, the default compressor for initramfs was changed
>> from lz4 -9 to zstd -19. This caused significant problems:
>
>
> Exactly three months later... we still haven't taken any action on this. Time to do something!
>
> I have a few questions below but tl;dr: unless there are immediate objections, I'm going to make a change to initramfs-tools to allow the compression level to be configured and set the default to 12 for zstd.
>
>> - it is very slow
>> - it uses a lot of memory
>>
>> The former is a problem for everyone, the latter means that
>> zstd just crashes on a Pi Zero.
>>
>> This is an analysis of what we have in terms of time spent,
>> memory spent, and file size achieved, and where we can
>> go from here.
>>
>> # Comparison of different compression levels
>>
>> ## Desktop (ThinkPad T480s, jammy)
>>
>> level    usertime   elapsed memory fileSize
>> lz4         9.65s    11.09s    12M      64M
>> -1          5.69s     6.99s    24M      57M
>> -6         12.59s     8.58s    99M      47M
>> -12        19.85s    10.82s   249M      41M
>> -19        71.29s    26.95s   519M      35M
>>
>> -> I believe that somewhere around -12 is a decent
>>    compromise between size and speed.
>
>
> I would agree that it's certainly better than -19.
>
>>
>> ## Pi 4 (arm64, focal)
>>
>> Times have been measured for mkinitramfs only. A full
>> update-initramfs call spends much more time copying
>> some firmware bits to boot partition with flash-kernel
>>
>> level    usertime   elapsed memory fileSize
>> lz4        21.10s    64.85s    21M      29M
>> -1         13.73s    44.55s    21M      27M
>> -6         26.07s    49.09s    91M      24M
>> -12        48.18s    54.67s   203M      22M
>> -19       130.07s    92.80s   350M      20M
>>
>> -> -6 is essentially free if the Pi 4 is idle. Nice.
>> -> -6 is still 20% of total RAM of a Pi 0
>
>
> Are people really going to run an arm64 userland on a Pi 0?
>
> Any "real" solution for pi 0 has to involve doing at least the bulk of the compression not on the pi, there's no real way around that. Which is something we should do, but realistically it's not happening for jammy.
>
>>
>> -> There's no meaningful difference between -6 and -12
>>    in terms of time elapsed. -6 uses 116% CPU, -12 uses
>>    145% CPU.
>>
>> ## Adaptive compression
>>
>> zstd also supports adaptive compression, compressing as hard as
>> it can while not impacting I/O speed. So hardware with slow I/O
>> like a Pi would compress harder to avoid idling.
>>
>> This is somewhat suboptimal with recent update-initramfs though,
>> as it first writes the cpio archive to the disk and then compresses
>> it rather than doing it in a pipe where that would be more
>> meaningful.
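
A rough sketch of the piped variant, so the compressor sees the I/O backpressure directly instead of reading a finished file. This is not how update-initramfs currently works; the function name, the max level of 12, and the exact cpio invocation are assumptions for illustration only.

```sh
#!/bin/sh
# Hypothetical helper: stream a cpio archive of a staging directory
# straight into zstd in adaptive mode, with no intermediate file.
#   $1 = staging directory, $2 = output image
pack_initrd_adaptive() {
    ( cd "$1" && find . | cpio --quiet -o -H newc ) |
        zstd -q -T0 --adapt=min=1,max=12 > "$2"
}
```

Bounding the range with min/max keeps the resulting size somewhat predictable, though it still varies with how loaded the machine is at the time.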
>>
>> Question: Does zstd --adapt adapt to memory available?
>
>
> While attractive, this does feel a bit risky. We want to be able to make reasonable predictions about the size of the initrd.
>
>>
>> # Way(s) forward
>>
>> To remedy the issue the proposal is to build with
>>
>> - zstd -1 on hardware with 512 MB or less memory
>> - zstd between -1 and -19 on other hardware
>> - zstd -19 during image building
>
>
> I think this broadly makes sense. I'll note that currently initramfs-tools doesn't allow tuning the compression level at all :/ Probably fairly routine to add support for that though.
>
>>
>> Finding the right level between -1 and -19 is hard. The more
>> cores you have, the less penalty you pay for higher level.

Just FYI, initramfs uses the zstd -T0 parameter to 'autodetect' the
number of threads to use for compression, and for whatever reason zstd
is coded to only use the number of *cores*, not the number of logical
processors, so (for example) on my dual-socket 14-core hyperthreaded
system, I have 28 total 'cores', but 56 processors. If I compress with
zstd -T0, it uses only 28 threads.

This results in -T0 being slower than specifying the real number of
processors; with -19 compression it took roughly 50% more time for
compression (in my very limited and very anecdotal tests).

Upstream commit 6a46e38de adds a parameter --auto-threads=logical to
actually use all the logical cores in the system; that's still only in
the 'dev' branch but if compression time really is a major issue it
might be worth looking at that, especially for low-processing-power
systems that do use hyperthreading (or, initramfs could just determine
the number of logical processors itself and pass that value to zstd -T
directly).
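
The second option could look roughly like the sketch below. It assumes getconf _NPROCESSORS_ONLN reports the online logical processors (true on glibc systems); the function name and the file names in the usage comment are made up, and none of this is what initramfs-tools does today.

```sh
#!/bin/sh
# Hypothetical helper: print the number of online logical processors,
# falling back to 1 if it cannot be determined.
logical_cpus() {
    n=$(getconf _NPROCESSORS_ONLN 2>/dev/null) || n=1
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # empty or non-numeric output
    esac
    [ "$n" -ge 1 ] || n=1
    echo "$n"
}

# Illustrative use (initrd.cpio and initrd.img are placeholder paths):
#   zstd -q -12 -T"$(logical_cpus)" -o initrd.img initrd.cpio
```

Passing an explicit count to -T sidesteps the -T0 autodetection entirely, so it works with the zstd versions already in the archive.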

>
>
> You suggested -12 above. How about we try that to start with?
>
> Do you have an idea how to detect "512 MB or less memory"?
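
One possible way, as a minimal sketch: read MemTotal from /proc/meminfo. The function names are hypothetical, and the optional file argument exists only to make the helper testable. Note that MemTotal excludes memory reserved by the firmware and the kernel image, so a board sold as "512 MB" reports somewhat less and still lands under the threshold.

```sh
#!/bin/sh
# Hypothetical helper: print MemTotal in kB from a meminfo-style file
# (defaults to /proc/meminfo).
mem_total_kb() {
    awk '$1 == "MemTotal:" { print $2; exit }' "${1:-/proc/meminfo}"
}

# Succeeds if total memory is 512 MiB (524288 kB) or less.
is_low_memory() {
    kb=$(mem_total_kb "$@")
    [ -n "$kb" ] && [ "$kb" -le 524288 ]
}
```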
>
>>
>> Going for adaptive compression would remove the guess work, but
>> will result in larger images on faster machines. Maybe that's
>> fine, though - they probably have more space on /boot anyway?
>>
>> If we want to aim for 5% of total memory, we should probably
>> aim for something like:
>>
>> -1  on <= 512MB
>> -6  on <= 2 GB (or --adapt=min=1,max=6)
>> -12 on the rest (or --adapt=min=12)
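
The 5% target roughly matches the tables above: 5% of 512 MB is about 26 MB (-1 used 21-24M), and 5% of 2 GB is about 102 MB (-6 used 91-99M). -12 at 203-249M only fits under 5% from around 4-5 GB of RAM upward. The tiers themselves could be expressed as in the sketch below; the function name is hypothetical and the thresholds are just the ones from the list, taken as MiB/GiB.

```sh
#!/bin/sh
# Hypothetical helper: map total memory in kB to a zstd level using the
# tiers proposed above (512 MiB = 524288 kB, 2 GiB = 2097152 kB).
zstd_level_for_mem_kb() {
    if [ "$1" -le 524288 ]; then
        echo 1
    elif [ "$1" -le 2097152 ]; then
        echo 6
    else
        echo 12
    fi
}

# Illustrative use, with the memory size taken from /proc/meminfo:
#   kb=$(awk '$1 == "MemTotal:" { print $2; exit }' /proc/meminfo)
#   zstd -q -"$(zstd_level_for_mem_kb "$kb")" -T0 -o initrd.img initrd.cpio
```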
>>
>> It's clear that in all cases, zstd -1 is at least better than the
>> lz4 -9 we used before; both in terms of space used, and time spent.
>
>
> Yeah that's interesting.
>
>>
>> # Concerns
>>
>> Lowering the compression level will reduce the boot speed by fractions
>> of a second on hardware with fast I/O.
>
>
> Well speaking selfishly, the mean "number of boots per mkinitramfs call" for me is about 1...
>
> Cheers,
> mwh