Revisiting default initramfs compression

Michael Hudson-Doyle michael.hudson at canonical.com
Tue Mar 15 08:41:28 UTC 2022


On Fri, 11 Mar 2022 at 02:39, Dave Jones <dave.jones at canonical.com> wrote:

> On Thu, Mar 10, 2022 at 12:10:39PM +0100, Julian Andres Klode wrote:
> >On Wed, Mar 09, 2022 at 01:24:55PM +0000, Dave Jones wrote:
> [snip]
> >> Firstly I actually think lz4 -2 is probably the ideal level for that
> >> compressor. There's a large difference in compression performance
> >> between lz4 -1 and lz4 -2 across all platforms tested, but no
> >> difference in memory usage, and only a minimal increase in
> >> compression & decompression time. However, lz4 is currently
> >> configured to use level -9 which takes a considerable amount of extra
> >> time for little to no gain in compression performance (at least with
> >> our initramfs inputs anyway).
> >>
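
A rough sketch of how a level-versus-size-versus-time comparison like the one above can be gathered. This is not the gather.py script from the repo. It assumes Python's standard-library zlib as a stand-in compressor, because lz4 and zstd bindings are not in the standard library. The function name, sample payload and level list are hypothetical.

```python
import time
import zlib


def sweep_levels(data, levels=(1, 2, 6, 9)):
    """Return a list of (level, compressed_size, seconds) tuples.

    zlib is only a stand-in here; its levels do not map onto lz4 or zstd levels.
    """
    results = []
    for level in levels:
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        results.append((level, len(compressed), elapsed))
    return results


if __name__ == "__main__":
    # Hypothetical, highly repetitive payload; a real initramfs compresses far less well.
    sample = b"initramfs-like payload with some repetition\n" * 50000
    for level, size, secs in sweep_levels(sample):
        ratio = 100.0 * size / len(sample)
        print(f"level {level}: {size} bytes ({ratio:.1f}%), {secs:.3f}s")
```

Running the same sweep several times gives a feel for how much of any difference is noise rather than a real effect of the level.
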
> >> On machines with more generous RAM allowances, zstd -T0 -1 does
> >> appear to be the ideal. The incremental gains in compression at
> >> higher levels are outweighed by the extra time spent compressing
> >> (i.e. for our initramfs inputs at least, the extra time spent on the
> >> compression is not gained back on reading the compressed data at I/O
> >> speeds typical for their respective platforms).
> >>
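
The break-even argument in the paragraph above can be written as simple arithmetic: a higher level only pays off if the time saved reading the smaller image exceeds the extra time spent compressing it. A minimal sketch follows. The function name and the numbers in the example are illustrative, not measurements from this thread, and it deliberately ignores any difference in decompression time.

```python
def net_seconds_saved(bytes_saved, read_bytes_per_sec, extra_compress_secs):
    """Positive if one read of the smaller image repays the extra compression time."""
    if read_bytes_per_sec <= 0:
        raise ValueError("read_bytes_per_sec must be positive")
    read_time_saved = bytes_saved / read_bytes_per_sec
    return read_time_saved - extra_compress_secs


if __name__ == "__main__":
    # Illustrative only: 5 MB smaller output, 100 MB/s storage, 2 s extra compression.
    print(net_seconds_saved(5_000_000, 100_000_000, 2.0))
```

With slow storage the balance can tip the other way, which is why the I/O speed typical for each platform matters to the conclusion.
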
> >> [snipped some data]
> >>
> >> At this point, if you want some data to play with I'd highly
> >> recommend cloning the following repo and following the instructions
> >> in the README:
> >>
> >> https://github.com/waveform80/compression
> >
> >I'm still not convinced by the data, as it does not align at all with what I
> >see on my laptop, or does it? It certainly does not _feel_ like it, as I
> >was arguing that -12 makes most sense.
>
> I've included the gather.py script in the compression repo if you want
> to run it against your laptop? The entries can be keyed by an arbitrary
> label (like "Julian's laptop") so they wouldn't clobber the existing PC
> based results.
>
> >I did some remeasurements
> >
> >Compression levels:
> >
> >uncompressed        157MB
> >lz4 -2               75MB (42%)
> >lz4 -9               63MB (40%)
> >zstd -T0 -1          56MB (36%)
> >zstd -T0 -2          52MB (33%)
> >zstd -T0 -3          47MB (30%)
> >zstd -T0 -6          45MB (29%)
> >zstd -T0 -12         40MB (22%)
> >
> >I don't know where 19 is, but a switch to lz4 -2 would roughly double
> >the size, that's for sure, so how would this affect /boot size?
>
> Sorry -- I should have clarified:
>
> I'm definitely *not* suggesting we switch PC users from zstd to lz4. As
> you point out, that would balloon the size of the compressed initrd. My
> one concern there was that even at level -1, zstd still uses a *teensy*
> bit too much RAM for my comfort on the extremely limited Pi Zero 2 and
> hence that we should be (and indeed, are) using lz4 instead there.
>
> When lz4 is in use (as on the Pi images on jammy), it is currently
> hard-coded to use level -9 which (as you observe below) is quite silly
> in spending a great deal more time achieving effectively no extra
> compression. In other words, my desire for lz4 -2 applies to the Pi
> images on Jammy alone and nothing else (it's a change that, if made,
> should *not* be back-ported to Focal for the reasons you've noted).
>
> >Looking at size, zstd clearly is the correct choice, if we reverted to
> >lz4 -2, sizes would even grow relative to older lz4 -9 choice, meaning
> >those users upgrading from focal run out of boot space.
> >
> >Ignoring non-LTS users for a moment, we essentially need to find a
> >compressor that accommodates the size increase in kernel initramfs due
> >to new code and stuff, and I think zstd -1 does that reasonably well.
>
> Agreed.
>
> >Times spent (compressor/total update-initramfs)
> >                   user        system      total
> >lz4 -2             0.3/ 6.2s   0.1/ 2.6s   0.3/ 8.2s (3% of update-initramfs time)
> >lz4 -9             4.8/10.8s   0.1/ 2.6s   4.9/12.9s <- this is totally silly
> >zstd -T0 -1        0.7/ 5.6s   0.1/ 1.7s   0.2/ 6.2s (um, faster than lz4?)
> >zstd -T0 -1        0.7/ 7.1s   0.1/ 3.5s   0.2/ 9.3s (um, much slower in 2nd run)
> >zstd -T0 -2        0.9/ 7.1s   0.1/ 3.0s   0.3/ 8.8s (more noise than difference)
> >zstd -T0 -3        1.6/ 7.8s   0.1/ 2.9s   0.5/ 8.8s
> >zstd     -3        0.9/ 7.2s   0.1/ 3.1s   0.8/ 9.5s
> >zstd     -3        0.9/ 7.7s   0.1/ 3.8s   0.8/10.7s (noise, lots of noise)
> >zstd -T0 -6        6.2/12.8s   0.1/ 3.9s   1.7/11.4s
> >zstd -T0 -12      13.1/19.7s   0.2/ 3.4s   4.0/13.0s
> >
> >It shows us that looking at the compressor does not tell us all the
> >story; for low-level zstd and lz4 values, you will absolutely not
> >notice the time spent compressing; in fact, there is more noise from
> >I/O or whatever despite the laptop essentially idling.
>
> Indeed; this is one of the reasons I stuck to the pure (de)compression
> times in my analysis and ignored the rest of update-initramfs as there's
> also *huge* variety across the architectures there (Pi I/O time is
> vastly different to a PC, and of course the inputs vary in size as well
> as the Pi has a much smaller default initrd after Juerg split the kernel
> modules in an -extras package).
>
> >There's no way I can figure out if zstd -3 performs worse than zstd -T0
> >-1, as its runtime varies by 50%.
> >
> >We also need to consider initrds we prebuild on images, and things like
> >combined kernel.efi binaries: they are built once and used hundreds of
> >thousands of times, so they need *special* configuration.
>
> Agreed.
>
> In fact, looking at the analysis figures, the LZMA compressors (xz,
> lzip, etc.) consistently beat zstd at producing smaller output at higher
> levels. For pre-built images it *may* be worth re-considering those
> algorithms, but we'd need to measure the decompression performance.
>
> (For the record, it's definitely not worth considering these algorithms
> for non-pre-built images; they may compress really well, but they're
> also reaaaaaallllly sloooooow!)
>
> There are some decompression figures for these in the analysis db but they
> were gathered using the userspace applications and I've no idea if those
> figures would be pertinent to kernel initrd decompression.
>
> >But my conclusion now is that I think zstd -1 or zstd -2 or whatever is
> >probably a safe choice for users coming from focal in that it does not
> >grow their initrds, so it's probably a good default.
>
> Yup, sounds good to me!
>

So where is the debdiff? :-) If no one else has time I can probably work on
this, but if someone else has done this already, even better.

Cheers,
mwh


> >One thing we should work on is performing the compression in parallel
> >to the CPIO building, this should reduce I/O wait times and offer more
> >meaningful parallelization. But not sure how feasible that is - I don't
> >just mean cpio | compressor, but also running the scripts, and copying
> >them to the output, more like scripts | cpio files from stdin |
> >compress.
>
> Sounds like a good plan.
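
To make the "compress while the archive is still being produced" idea concrete, here is a minimal sketch of a producer thread feeding a streaming compressor through a bounded queue. It is only an illustration of the shape of the pipeline, not how update-initramfs works: zlib from the Python standard library stands in for zstd or lz4, the producer stands in for the "run scripts and emit cpio data" stage, and all function names are hypothetical.

```python
import queue
import threading
import zlib


def produce_chunks(out_q, chunks):
    """Stand-in for the stage that runs scripts and emits archive data."""
    for chunk in chunks:
        out_q.put(chunk)
    out_q.put(None)  # sentinel: no more data


def compress_stream(in_q, level=1):
    """Compress chunks as they arrive instead of waiting for the full archive."""
    comp = zlib.compressobj(level)
    parts = []
    while True:
        chunk = in_q.get()
        if chunk is None:
            break
        parts.append(comp.compress(chunk))
    parts.append(comp.flush())
    return b"".join(parts)


def build_compressed(chunks, level=1):
    """Run the producer in a thread while the caller's thread compresses."""
    q = queue.Queue(maxsize=8)  # bounded so the producer cannot run far ahead
    producer = threading.Thread(target=produce_chunks, args=(q, chunks))
    producer.start()
    data = compress_stream(q, level)
    producer.join()
    return data


if __name__ == "__main__":
    fake_archive = [b"script output\n" * 1000, b"module data\n" * 1000]
    blob = build_compressed(fake_archive)
    print(len(b"".join(fake_archive)), "->", len(blob), "bytes")
```

Whether this helps in practice depends on how much of the wall-clock time is I/O wait in the producing stage, which would need measuring.
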