Amendment: Update on reducing initramfs size and speed up

Benjamin Drung bdrung at ubuntu.com
Fri Aug 11 20:20:57 UTC 2023


On Fri, 2023-08-11 at 19:32 +0000, Benjamin Drung wrote:
> On Mon, 2023-07-31 at 22:49 +0000, Benjamin Drung wrote:
> > On Mon, 2023-07-31 at 20:45 +0100, Dimitri John Ledkov wrote:
> > > On Mon, 31 Jul 2023 at 20:41, Benjamin Drung <bdrung at ubuntu.com> wrote:
> > > > 
> > > > On Thu, 2023-07-27 at 11:51 +1200, Michael Hudson-Doyle wrote:
> > > > > 
> > > > > 
> > > > > On Thu, 27 Jul 2023 at 09:21, Benjamin Drung <bdrung at ubuntu.com>
> > > > > wrote:
> > > > > > On Wed, 2023-07-26 at 17:53 +0200, Benjamin Drung wrote:
> > > > > > > Hi all,
> > > > > > > 
> > > > > > > > A few weeks ago, I posted an idea for how to reduce the initramfs
> > > > > > > > size and speed up the generation:
> > > > > > > 
> > > > > > > https://lists.ubuntu.com/archives/ubuntu-devel/2023-July/042652.html
> > > > > > > 
> > > > > > > > This post sparked a lively discussion. The initial idea was
> > > > > > > > ditched for a better solution: mkinitramfs will put all
> > > > > > > > compressed files (kernel modules and firmware files) into a cpio
> > > > > > > > archive that is not compressed (because compressing compressed
> > > > > > > > files makes no sense). All other files will be added to a cpio
> > > > > > > > archive that gets compressed. As next steps, the kernel modules
> > > > > > > > and firmware files need to be shipped compressed.
> > > > > > > 
> > > > > > > > After several iterations of the implementation and review by
> > > > > > > > Dave Jones, I just uploaded initramfs-tools 0.142ubuntu8 to
> > > > > > > > mantic that puts compressed kernel modules and firmware files in
> > > > > > > > an uncompressed cpio (https://launchpad.net/bugs/2028567).
> > > > > > > 
> > > > > > > I created/updated the follow-up tickets and added my patches to
> > > > > > > them:
> > > > > > > 
> > > > > > > Ship kernel modules Zstd compressed
> > > > > > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2028568
> > > > > > > 
> > > > > > > compress firmware in /lib/firmware
> > > > > > > https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1942260
> > > > > > > 
> > > > > > > And without further ado, here come the benchmark results:
> > > > > > > 
> > > > > > > > The benchmarks were done either on an AMD Ryzen 7 5700G with
> > > > > > > > schroot and overlay on tmpfs or on the hardware mentioned. All
> > > > > > > > tests were running the latest Ubuntu mantic development release.
> > > > > > > 
> > > > > > > * minimal: schroot with linux-image-generic initramfs-tools zstd
> > > > > > > > * full: minimal + busybox-initramfs cryptsetup-initramfs
> > > > > > > >    isc-dhcp-client kbd lvm2 mdadm ntfs-3g plymouth
> > > > > > > >    plymouth-theme-spinner
> > > > > > > * nvidia: full + linux-headers-generic nvidia-driver-525
> > > > > > > * nvidia fw: nvidia + compressed /lib/firmware/nvidia/525.125.06/
> > > > > > > * VisionFive 2: VisionFive 2 RISC-V board
> > > > > > > * RPi Zero 2: Raspberry Pi Zero 2 ARM board (running armhf)
> > > > > > > 
> > > > > > > "next" means using kernel/firmware/initramfs from ppa:bdrung/ppa
> > > > > > > i.e.
> > > > > > > * initramfs-tools 0.142ubuntu7bd4
> > > > > > > * linux 6.3.0-7.7bd2
> > > > > > > * linux-firmware 20230629.gitee91452d-0ubuntu1bd1
> > > > > > > 
> > > > > > > > >                | build   | size               | uncompressed
> > > > > > > > size  |
> > > > > > > > > test           | time    | in bytes  | in MiB | in bytes  | in
> > > > > > > > MiB |
> > > > > > > > > ----------------|---------|-----------|--------|---------------
> > > > > > > > -----|
> > > > > > > > > minimal        | 4.30 s  |  66701818 |  63.6  | 224087608 |
> > > > > > > > 213.7  |
> > > > > > > > > minimal next   | 4.54 s  |  59935186 |  57.2  |  67738810 |
> > > > > > > > 64.6  |
> > > > > > > > > full           | 7.15 s  | 118007038 | 112.5  | 387976843 |
> > > > > > > > 370.0  |
> > > > > > > > > full next      | 7.29 s  | 106937908 | 102.0  | 128610985 |
> > > > > > > > 122.7  |
> > > > > > > > > nvidia         | 7.04 s  | 209200523 | 199.5  | 513554279 |
> > > > > > > > 489.8  |
> > > > > > > > > nvidia next    | 7.21 s  | 195246287 | 186.2  | 235288174 |
> > > > > > > > 224.4  |
> > > > > > > > > nvidia fw next | 7.16 s  | 191329102 | 182.5  | 213078234 |
> > > > > > > > 203.2  |
> > > > > > > > > VisionFive 2   | 142.9 s | 121895035 | 116.2  | 411160836 |
> > > > > > > > 392.1  |
> > > > > > > > > VF 2 next      | 126.7 s | 111651453 | 106.5  | 134120804 |
> > > > > > > > 127.9  |
> > > > > > > > > RPi Zero 2     | 109.5 s |  39803044 |  40.0  |  69592789 |
> > > > > > > > 66.4  |
> > > > > > > > > RPi Zero 2 ²   | 103.5 s |  39804499 |  40.0  |  69592789 |
> > > > > > > > 66.4  |
> > > > > > > > > RPi Zero 2 next| 101.2 s |  31463352 |  30.0  |  41145762 |
> > > > > > > > 39.2  |
> > > > > > > 
> > > > > > > ² Updated initramfs-tools (but no compressed modules or firmware)
> > > > > > > 
> > > > > > > The build time was averaged over five runs.
> > > > > > > 
> > > > > > > > | improvement  | build time | size   | uncompressed size |
> > > > > > > > |--------------|------------|--------|-------------------|
> > > > > > > > | minimal      |  105.6 %   | 89.9 % |      30.2 %       |
> > > > > > > > | full         |  102.0 %   | 90.6 % |      33.1 %       |
> > > > > > > > | nvidia       |  101.7 %   | 91.5 % |      41.5 %       |
> > > > > > > > | VisionFive 2 |   88.7 %   | 91.6 % |      32.6 %       |
> > > > > > > > | RPi Zero 2   |   92.4 %   | 79.0 % |      59.1 %       |
> > > > > > > 
> > > > > > > > Building the initramfs takes more CPU cycles (see tests on
> > > > > > > > tmpfs), but saves time on disk IO. Dave Jones saw much bigger
> > > > > > > > time savings on his Raspberry Pis, but his tests were on lunar.
> > > > > > > 
> > > > > > > Build time influence:
> > > > > > > + add_directories plus uniq take several milliseconds
> > > > > > > > + depmod on compressed kernel modules takes hundreds of
> > > > > > >    milliseconds longer
> > > > > > > - copying smaller kernel modules (due to compression) is faster
> > > > > > > - cpio archive that needs to be compressed is smaller
> > > > > > > - not storing intermediate cpio archives saves time
> > > > > > > 
> > > > > > > > Saving 10 to 20 percent on the initramfs size and only needing
> > > > > > > > half or a third of the size when unpacked (i.e. needed memory
> > > > > > > > during boot) is a good improvement.
> > > > > > 
> > > > > > > The smaller initramfs overall size (less to load into memory and
> > > > > > > unpack) and the smaller compressed cpio (less to decompress) have
> > > > > > > a positive effect on the boot speed, especially on systems with
> > > > > > > slow CPU and/or slow IO.
> > > > > > 
> > > > > > > When looking at the "kernel" time from systemd-analyze, the
> > > > > > > improvement ranges from 1.62s - 1.36s = 0.26s in a VM on my
> > > > > > > desktop to a heavily noticeable 37.9s - 16.5s = 21.4s on the
> > > > > > > VisionFive 2 RISC-V board.
> > > > > > 
> > > > > 
> > > > > 
> > > > > This is good stuff. It's a bit of a shame that the build time for the
> > > > > initramfs hasn't improved much. I guess it's not as dominated by
> > > > > compression time as I thought?
> > > > > 
> > > > > Do you have any thoughts about making it faster? I know I once ran
> > > > > 'strace -ff mkinitramfs' and ended up with tens or hundreds of
> > > > > thousands of trace files so not having everything done by a billion
> > > > > tiny shell scripts would help, but I don't know how much.
> > > > 
> > > > Good questions. I sprinkled mkinitramfs with "date -Ins" to see where
> > > > most time is spent. I ran the "full next" test case in a chroot on my
> > > > laptop. mkinitramfs took 19.71 seconds.
> > 
> > Correction: The environment was not "full next" (using zst compressed
> > kernel modules and firmware files) but only using "full" with the latest
> > packages from mantic (uncompressed modules + firmware). "full next"
> > takes only 17.6 seconds on my laptop and the numbers below would be
> > slightly different.
> > 
> > > > The most time consuming parts:
> > > > 
> > > > * 10.13 s (51.4%) auto_add_modules
> > > > * 7.20 s (36.5%) run_scripts_optional /usr/share/initramfs-tools/hooks
> > > > * 1.27 s (6.4%) final { cat; cpio; cpio|compress } > $outfile
> > > > 
> > > > The remaining 1.11 seconds are spread over the remaining parts.
> > > > 
> > > > The following hooks in /usr/share/initramfs-tools/hooks took the longest:
> > > > 
> > > > * 4.56s (63.3%) framebuffer
> > > > * 0.87s (12.1%) plymouth
> > > > * 0.81s (11.3%) cryptroot
> > > > * 0.23s (3.2%) lvm2
> > > > * 0.18s (2.5%) udev
> > > > * 0.17s (2.4%) mdadm
> > > > 
> > > > The remaining 0.38 seconds are spread over the remaining dozen hooks.
> > > > 
> > > > So we should focus on auto_add_modules and the slowest scripts in
> > > > /usr/share/initramfs-tools/hooks to improve the execution time.
> > > > 
> > > > The framebuffer hook just calls copy_modules_dir and manual_add_modules
> > > > multiple times. auto_add_modules calls copy_modules_dir multiple times
> > > > and manual_add_modules. copy_modules_dir calls find and then
> > > > manual_add_modules.
> > > > 
> > > > So most time will be spent in manual_add_modules. This function calls
> > > > modprobe on the modules, copies the modules, and calls add_firmware on
> > > > the firmware files listed by "modinfo -F firmware".
> > > > 
> > > > Any ideas to cut that time down? Using a cache for the modprobe/modinfo
> > > > calls?
> > > 
> > > In core-initrd, the above shell script functions are not used, instead
> > > /usr/lib/dracut/dracut-install is used which can resolve individual
> > > modules, subtrees, and wildcards with firmware pretty quickly. Was not
> > > benchmarked. I wonder if you can experiment to assemble a long list of
> > > requested modules / patterns / module dirs, and do a single call to
> > > dracut-install to do them all in one go, and see if that helps things?
> > 
> > I had a look at the source code of /usr/lib/dracut/dracut-install. This
> > binary is written in C and it looked promising. So I did a quick draft
> > by just replacing the content of manual_add_modules by this one-liner:
> > 
> > /usr/lib/dracut/dracut-install -D "$DESTDIR" \
> >     --kerneldir "/lib/modules/${version}" -m "$@"
> > 
> > (Note: We would need to set the environment variable
> > DRACUT_FIRMWARE_PATH in case $version != "$(uname -r)")
> > 
> > This small change cuts the execution time of "full next" (i.e. current
> > mantic + zst kernel modules and firmware files) from 17.6 seconds down
> > to 9.9 seconds. Nearly halving the execution time is a big improvement!
> > 
> > I diffed the result of lsinitramfs and the dracut-install initramfs had
> > 12 fewer modules. It also printed 3 warnings:
> > 
> > dracut-install: Failed to find module 'nvmem-imx-ocotp'
> > dracut-install: Failed to find module 'pl330'
> > dracut-install: Failed to find module 'fbcon'
> > 
> > So a little bit of polishing is needed. Depending on dracut-core just
> > for using dracut-install looks okay to me.
> > 
> > That's enough from my side today. I will head to bed now and rescue
> > it from the cat.
> 
> Another night to further investigate. I found the reason for the
> difference. dracut-install will stop at the first missing module. You
> have to specify -o to continue. With -o the resulting initramfs is
> identical.
> 
> It reduces the update-initramfs execution time by 40% (from 18.3 to 11.0
> seconds) on my laptop.
> 
> I created https://launchpad.net/bugs/2031185 to get this change
> released. The problem is that dracut-core is in universe. So a MIR is
> needed...

Tweaking the code to only call dracut-install on the combined list of
kernel modules reduces the execution time further from 11.0 seconds to
10.4 seconds (43% total execution time reduction).
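
A minimal sketch of the "combined list" idea, to illustrate the shape of
such a change. It is not the actual initramfs-tools code: the helper names
(queue_modules, unique_modules, install_queued_modules) and the
DRACUT_INSTALL override variable are hypothetical. The dracut-install
options (-D, --kerneldir, -o, -m) and the DESTDIR and version variables are
the ones quoted earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helpers; a sketch, not the initramfs-tools implementation.

# Newline-separated list of requested kernel modules.
MODULE_LIST=""

# Queue module names instead of installing each one immediately.
queue_modules() {
    for mod in "$@"; do
        MODULE_LIST="${MODULE_LIST}${mod}
"
    done
}

# Print the queued module names, one per line, without duplicates.
unique_modules() {
    printf '%s' "$MODULE_LIST" | sort -u
}

# Install all queued modules with a single dracut-install call.
# Assumes module names contain no whitespace or glob characters.
# DRACUT_INSTALL is a hypothetical override, handy for dry runs.
install_queued_modules() {
    # shellcheck disable=SC2046
    set -- $(unique_modules)
    [ "$#" -gt 0 ] || return 0
    # -o: keep going when a module is missing (see the earlier mail).
    ${DRACUT_INSTALL:-/usr/lib/dracut/dracut-install} -D "$DESTDIR" \
        --kerneldir "/lib/modules/${version}" -o -m "$@"
}
```

With something like this, the hooks would only call queue_modules, and
install_queued_modules would run once near the end. Setting
DRACUT_INSTALL=echo prints the command line arguments instead of copying
any files.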

-- 
Benjamin Drung
Debian & Ubuntu Developer


