<div dir="ltr"><div dir="ltr"><br></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Thu, 10 Mar 2022 at 02:24, Dave Jones <<a href="mailto:dave.jones@canonical.com">dave.jones@canonical.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">Hi Michael (and others),<br>
<br>
Julian's summarised this near perfectly, but I'll try and add a little <br>
detail from the data I've gathered [1] (with others' generous help, in <br>
particular Heinrich for the RISC-V bits):<br>
<br>
On Wed, Mar 09, 2022 at 09:46:19AM +0100, Julian Andres Klode wrote:<br>
>On Wed, Mar 09, 2022 at 02:10:57PM +1300, Michael Hudson-Doyle wrote:<br>
[snip]<br>
>> > some time ago, the default compressor for initramfs was changed<br>
>> > from lz4 -9 to zstd -19. This caused significant problems:<br>
>> ><br>
>><br>
>> Exactly three months later... we still haven't taken any action on this.<br>
>> Time to do something!<br>
<br>
Agreed!<br>
<br>
>> I have a few questions below but tl;dr: unless there are immediate<br>
>> objections, I'm going to make a change to initramfs-tools to allow the<br>
>> compression level to be configured and set the default to 12 for zstd.<br>
><br>
>So xypron has had a patch out for sponsoring for a couple of months<br>
>now that changes the default level to 9 (no idea how that level came up).<br>
><br>
>We pushed back on that as it does not account for low-memory systems<br>
>which we need to take care of as well.<br>
<br>
Yes, zstd -12 is not safe on low memory systems. In particular it fails <br>
to even run successfully on an otherwise entirely idle and unloaded Pi <br>
Zero 2 (which, on arm64, has a little more than 200MB of RAM free at <br>
runtime, whilst zstd -T0 -12 on a PC requested ~415MB resident at <br>
runtime).<br>
<br>
In fact, I'm not convinced *any* of zstd's levels are actually useful on <br>
a machine with RAM as limited as the Zero 2 (or 3A+) has. For example, <br>
with ~200MB of RAM free, if the user is running a daemon that eats, say, <br>
100MB resident and we start up a compressor that eats 50MB (as zstd -T0 <br>
does at level -1) we stand a fair chance of pushing the daemon into OOM.<br></blockquote><div><br></div><div>Fair enough.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
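As a rough illustration of the memory arithmetic above (a minimal sketch <br>
only: the 100MB daemon is the hypothetical one from the example, the <br>
~415MB figure was measured on a PC rather than on the Zero 2, and <br>
headroom_mb is just an illustrative helper name):<br>
<pre>
```python
def headroom_mb(free_mb: float, other_resident_mb: float,
                compressor_resident_mb: float) -> float:
    """Memory left over (in MB) once the daemon and compressor are resident."""
    return free_mb - other_resident_mb - compressor_resident_mb


# Approximate free RAM on an idle Pi Zero 2 running arm64, as quoted above.
FREE_MB = 200

# zstd -T0 -1 (~50MB resident) alongside a hypothetical 100MB daemon.
print(headroom_mb(FREE_MB, 100, 50))   # 50

# zstd -T0 -12 (~415MB resident on a PC) on an otherwise idle system.
print(headroom_mb(FREE_MB, 0, 415))    # -215
```
</pre>
A positive result is not a guarantee of safety: 50MB of headroom leaves <br>
very little for page cache or any other transient allocation, which is <br>
why the first case still risks an OOM.<br>
<br>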
Now, in practice this doesn't actually matter right now as I've already <br>
overridden initramfs' default to lz4 in ubuntu-raspi-settings, however I <br>
think there are adjustments that should be made there too (and there is <br>
the question of whether this is relevant for, say, minimal memory cloud <br>
instances).<br>
<br>
>We then postponed any implementation to after a discussion in<br>
>Frankfurt.<br>
><br>
>I think the summary from the Frankfurt discussion was:<br>
><br>
>- lz4 -1 is the right choice for low-memory systems<br>
>- if you have more memory, zstd -1 becomes the best choice<br>
>- pigz outperforms both in a number of cases<br>
><br>
>But that's really for waveform to share.<br>
<br>
Just to clarify a couple of things here:<br>
<br>
Firstly, I actually think lz4 -2 is probably the ideal level for that <br>
compressor. There's a large difference in compression performance <br>
between lz4 -1 and lz4 -2 across all platforms tested, but no difference <br>
in memory usage, and only a minimal increase in compression & <br>
decompression time. However, lz4 is currently configured to use level -9 <br>
which takes a considerable amount of extra time for little to no gain in <br>
compression performance (at least with our initramfs inputs anyway).<br>
<br>
On machines with more generous RAM allowances, zstd -T0 -1 does appear <br>
to be the ideal. The incremental gains in compression at higher levels <br>
are outweighed by the extra time spent compressing (i.e. for our <br>
initramfs inputs at least, the extra time spent on the compression is <br>
not gained back on reading the compressed data at I/O speeds typical for <br>
their respective platforms).<br></blockquote><div><br></div><div>Moving all the way to -1 does feel like quite a change but, well, I'm not opposed to it. </div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
[snipped some data]<br>
<br>
At this point, if you want some data to play with I'd highly recommend <br>
cloning the following repo and following the instructions in the README:<br>
<br>
<a href="https://github.com/waveform80/compression" rel="noreferrer" target="_blank">https://github.com/waveform80/compression</a><br>
<br>
>> Are people really going to run an arm64 userland on a Pi 0?<br>
<br>
I have no specific figures on this, but given that Ubuntu downloads for <br>
the Pi skew *vastly* towards arm64 (about 10:1 last time I looked) I <br>
think it's a safe assumption that at least some do.<br>
<br>
>> Any "real" solution for pi 0 has to involve doing at least the bulk <br>
>> of the compression not on the pi, there's no real way around that. <br>
>> Which is something we should do, but realistically it's not happening <br>
>> for jammy.<br>
<br>
I'm not so sure about that. On a Pi Zero 2 under the arm64 arch, lz4 -2 <br>
takes ~8MB of resident RAM, and ~1.9s of time to compress the initramfs <br>
to approximately half its size. If we're willing to wait a little <br>
longer, lz4 -4 takes 8.7s (and no extra RAM) to get it down to 40% of <br>
its size. The current -9 level is pointless, however, taking 21s (and no <br>
extra RAM) to get to 39%.<br></blockquote><div><br></div><div>Interesting.</div><div> </div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
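To put the lz4 figures just quoted side by side, here is a small sketch <br>
of the marginal cost of each step up in level. The numbers are the <br>
approximate Pi Zero 2 (arm64) ones from the paragraph above, with <br>
"approximately half" taken as 0.50; marginal_gains is an illustrative <br>
helper, not something from the analysis repo:<br>
<pre>
```python
# (level, seconds to compress, output size as a fraction of the input),
# as quoted above for a Pi Zero 2 on arm64; all values are approximate.
LZ4_RESULTS = [
    (2, 1.9, 0.50),
    (4, 8.7, 0.40),
    (9, 21.0, 0.39),
]


def marginal_gains(results):
    """Yield (from_level, to_level, extra_seconds, extra_points_saved)."""
    for (lvl_a, secs_a, frac_a), (lvl_b, secs_b, frac_b) in zip(results, results[1:]):
        extra_seconds = round(secs_b - secs_a, 1)
        points_saved = round((frac_a - frac_b) * 100, 1)
        yield lvl_a, lvl_b, extra_seconds, points_saved


for lvl_a, lvl_b, secs, points in marginal_gains(LZ4_RESULTS):
    print(f"lz4 -{lvl_a} to -{lvl_b}: +{secs}s for {points} percentage points")
```
</pre>
Going from -2 to -4 costs about 6.8s for roughly ten percentage points, <br>
whereas -4 to -9 costs a further 12.3s or so for about one.<br>
<br>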
[snip]<br>
>> I think this broadly makes sense. I'll note that currently<br>
>> initramfs-tools doesn't allow tuning the compression level at all :/<br>
>> Probably fairly routine to add support for that though.<br>
<br>
This would indeed be nice, as would the option to support "none" as a <br>
compression level (I opine a little more on this in the conclusion <br>
section of the analysis notebook in the repo linked above).<br>
<br>
Last time I looked at this, it didn't look difficult to add the ability <br>
to provide an entirely custom command for compression via the COMPRESS <br>
option. In fact, I have the feeling it was *intended* to permit this, <br>
but somewhere along the way either the intention was dropped or it got <br>
broken by the addition of some extra parameters (which aren't strictly <br>
necessary). Happy to work on initramfs-tools to add this facility <br>
though.<br></blockquote><div><br></div><div>Yeah, I wondered about that too when I looked at the code. Whatever we decide the best solution is, the current code is too inflexible.</div><div><br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex">
>> > Finding the right level between -1 and -19 is hard. The more<br>
>> > cores you have, the less penalty you pay for higher level.<br>
<br>
Indeed -- as with compilation it's more about the ratio of cores <br>
available to memory available than any single metric (we had some <br>
experience of this on piwheels when all our builders were Pi 3s with 4 <br>
CPU cores, but only a single GB of RAM; several larger packages naively <br>
assumed "if you've got 4 cores, it's fine to build using all of them" <br>
before dying horribly with OOM!).<br>
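<br>
The cores-to-memory point can be sketched as capping the job count by <br>
whichever resource runs out first. This is only an illustration: <br>
safe_job_count is a made-up helper, and the per-job memory figures are <br>
hypothetical rather than anything measured on piwheels:<br>
<pre>
```python
def safe_job_count(cores: int, available_mb: int, per_job_mb: int) -> int:
    """Parallel jobs that fit in memory, capped by core count, at least 1."""
    fits_in_memory = available_mb // per_job_mb
    return max(1, min(cores, fits_in_memory))


# A Pi 3 as described above: 4 cores, 1GB of RAM. The 600MB-per-job figure
# is hypothetical, standing in for a large package's compile step.
print(safe_job_count(cores=4, available_mb=1024, per_job_mb=600))  # 1

# With a (hypothetical) light 100MB-per-job workload the cores are the limit.
print(safe_job_count(cores=4, available_mb=1024, per_job_mb=100))  # 4
```
</pre>
The same reasoning applies to zstd -T0, which scales its threads to the <br>
core count without regard to how much memory is free.<br>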
<br>
[snip]<br>
>During the discussion in Frankfurt, it was argued some<br>
>architectures/boards really need lz4 -1 (and that lz4 -9 makes<br>
>no sense whatsoever).<br>
<br>
Yup, as mentioned above I've already overridden the Pi images to use lz4 <br>
anyway, but I'd like to change the initramfs-tools default to lz4 -2 <br>
(most likely) as that seems a preferable default for that compressor <br>
across all archs. I'd also like to change zstd -T0 to level -1 as that <br>
also seems preferable across all archs, after playing with the data <br>
gathered in the aforementioned repo.<br>
<br>
Do let me know if any clarifications are needed (and my apologies in <br>
advance for the excessive wordiness in the analysis repo!)<br></blockquote><div><br></div><div>No, all this makes sense, and you have clearly thought more about this than me! So let's get something uploaded please.</div><div><br></div><div>Cheers,</div><div>mwh </div><div><br></div><div> </div></div></div>