Amendment: Update on reducing initramfs size and speed up
Adrien Nader
adrien at notk.org
Thu Jul 27 07:41:58 UTC 2023
On Thu, Jul 27, 2023, Michael Hudson-Doyle wrote:
> On Thu, 27 Jul 2023 at 09:21, Benjamin Drung <bdrung at ubuntu.com> wrote:
>
> > On Wed, 2023-07-26 at 17:53 +0200, Benjamin Drung wrote:
> > > Hi all,
> > >
> > > A few weeks ago, I posted an idea how to reduce the initramfs size and
> > > speed up the generation:
> > >
> > > https://lists.ubuntu.com/archives/ubuntu-devel/2023-July/042652.html
> > >
> > > This post sparked a lively discussion. The initial idea was ditched for
> > > a better solution: mkinitramfs will put all compressed files (kernel
> > > modules and firmware files) into a cpio archive that is not compressed
> > > (because compressing compressed files makes no sense). All other files
> > > will be added to a cpio archive that gets compressed. As next steps, the
> > > kernel modules and firmware files need to be shipped compressed.
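The split described in the quoted paragraph can be sketched roughly as below. This is only an illustration, not the actual mkinitramfs logic: the suffix list, the helper name `split_for_archives` and the example paths are all assumptions made for the sketch.

```python
from pathlib import PurePosixPath

# Assumption: these suffixes mark files that are already compressed.
# The real mkinitramfs implementation may use a different test.
COMPRESSED_SUFFIXES = {".zst", ".xz", ".gz"}


def split_for_archives(paths):
    """Return (precompressed, others), preserving the input order."""
    precompressed, others = [], []
    for path in paths:
        if PurePosixPath(path).suffix in COMPRESSED_SUFFIXES:
            precompressed.append(path)
        else:
            others.append(path)
    return precompressed, others


# Hypothetical file list, for illustration only.
files = [
    "usr/lib/modules/6.3.0-7-generic/kernel/fs/btrfs/btrfs.ko.zst",
    "usr/lib/firmware/example/blob.bin.zst",
    "usr/bin/busybox",
    "scripts/init-top/udev",
]
precompressed, others = split_for_archives(files)
print("goes into the uncompressed cpio:", precompressed)
print("goes into the compressed cpio:", others)
```

The first list would go into the cpio archive that is left uncompressed and the second into the archive that gets compressed.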
> > >
> > > After several iterations of the implementation and review by Dave
> > > Jones, I just uploaded initramfs-tools 0.142ubuntu8 to mantic that puts
> > > compressed kernel modules and firmware files in an uncompressed cpio
> > > (https://launchpad.net/bugs/2028567).
> > >
> > > I created/updated the follow-up tickets and added my patches to them:
> > >
> > > Ship kernel modules Zstd compressed
> > > https://bugs.launchpad.net/ubuntu/+source/linux/+bug/2028568
> > >
> > > compress firmware in /lib/firmware
> > > https://bugs.launchpad.net/ubuntu/+source/linux-firmware/+bug/1942260
> > >
> > > And without further ado, here come the benchmark results:
> > >
> > > The benchmarks were done either on an AMD Ryzen 7 5700G with schroot and
> > > overlay on tmpfs or on the hardware mentioned. All tests were running
> > > the latest Ubuntu mantic development release.
> > >
> > > * minimal: schroot with linux-image-generic initramfs-tools zstd
> > > * full: minimal + busybox-initramfs cryptsetup-initramfs
> > > isc-dhcp-client kbd lvm2 mdadm ntfs-3g plymouth plymouth-theme-spinner
> > > * nvidia: full + linux-headers-generic nvidia-driver-525
> > > * nvidia fw: nvidia + compressed /lib/firmware/nvidia/525.125.06/
> > > * VisionFive 2: VisionFive 2 RISC-V board
> > > * RPi Zero 2: Raspberry Pi Zero 2 ARM board (running armhf)
> > >
> > > "next" means using kernel/firmware/initramfs from ppa:bdrung/ppa i.e.
> > > * initramfs-tools 0.142ubuntu7bd4
> > > * linux 6.3.0-7.7bd2
> > > * linux-firmware 20230629.gitee91452d-0ubuntu1bd1
> > >
> > > > |                 | build   | size               | uncompressed size  |
> > > > | test            | time    | in bytes  | in MiB | in bytes  | in MiB |
> > > > |-----------------|---------|-----------|--------|-----------|--------|
> > > > | minimal         | 4.30 s  | 66701818  | 63.6   | 224087608 | 213.7  |
> > > > | minimal next    | 4.54 s  | 59935186  | 57.2   | 67738810  | 64.6   |
> > > > | full            | 7.15 s  | 118007038 | 112.5  | 387976843 | 370.0  |
> > > > | full next       | 7.29 s  | 106937908 | 102.0  | 128610985 | 122.7  |
> > > > | nvidia          | 7.04 s  | 209200523 | 199.5  | 513554279 | 489.8  |
> > > > | nvidia next     | 7.21 s  | 195246287 | 186.2  | 235288174 | 224.4  |
> > > > | nvidia fw next  | 7.16 s  | 191329102 | 182.5  | 213078234 | 203.2  |
> > > > | VisionFive 2    | 142.9 s | 121895035 | 116.2  | 411160836 | 392.1  |
> > > > | VF 2 next       | 126.7 s | 111651453 | 106.5  | 134120804 | 127.9  |
> > > > | RPi Zero 2      | 109.5 s | 39803044  | 38.0   | 69592789  | 66.4   |
> > > > | RPi Zero 2 ²    | 103.5 s | 39804499  | 38.0   | 69592789  | 66.4   |
> > > > | RPi Zero 2 next | 101.2 s | 31463352  | 30.0   | 41145762  | 39.2   |
> > >
> > > ² Updated initramfs-tools (but no compressed modules or firmware)
> > >
> > > The build time was averaged over five runs.
> > >
> > > > | improvement  | build time | size   | uncompressed size |
> > > > |--------------|------------|--------|-------------------|
> > > > | minimal      | 105.6 %    | 89.9 % | 30.2 %            |
> > > > | full         | 102.0 %    | 90.6 % | 33.1 %            |
> > > > | nvidia       | 101.7 %    | 91.5 % | 41.5 %            |
> > > > | VisionFive 2 | 88.7 %     | 91.6 % | 32.6 %            |
> > > > | RPi Zero 2   | 92.4 %     | 79.0 % | 59.1 %            |
> > >
> > > Building the initramfs takes more CPU cycles (see tests on tmpfs), but
> > > saves time on disk IO. Dave Jones saw much bigger time savings on his
> > > Raspberry Pis but his tests were on lunar.
> > >
> > > Build time influence:
> > > + add_directories plus uniq take several milliseconds
> > > + depmod on compressed kernel modules takes hundreds of
> > > milliseconds longer
> > > - copying smaller kernel modules (due to compression) is faster
> > > - cpio archive that needs to be compressed is smaller
> > > - not storing intermediate cpio archives saves time
> > >
> > > Saving 10 to 20 percent on the initramfs size and only needing half or a
> > > third of the size when unpacked (i.e. needed memory during boot) is a
> > > good improvement.
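As a quick cross-check of the percentages, the "next" figures can be expressed as a share of the baseline using the byte counts from the first table. The sketch below copies two rows only; the helper name `percent_of_baseline` is made up for the illustration.

```python
# (bytes before, bytes after) copied from the "size" and
# "uncompressed size" columns of the first table in the quoted mail.
sizes = {
    "minimal": ((66701818, 59935186), (224087608, 67738810)),
    "RPi Zero 2": ((39803044, 31463352), (69592789, 41145762)),
}


def percent_of_baseline(before, after):
    """Size of the 'next' initramfs as a percentage of the baseline."""
    return round(100 * after / before, 1)


for name, (packed, unpacked) in sizes.items():
    print(name, percent_of_baseline(*packed), percent_of_baseline(*unpacked))
```

For these two rows the results match the values listed in the improvement table (89.9 % and 30.2 % for minimal, 79.0 % and 59.1 % for the RPi Zero 2).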
> >
> > The smaller initramfs overall size (less to load into memory and unpack)
> > and the smaller compressed cpio (less to decompress) have a positive
> > effect on the boot speed, especially on systems with slow CPU and/or
> > slow IO.
> >
> > When looking at the "kernel" time from systemd-analyze, the improvement
> > ranges from 1.62s - 1.36s = 0.26s in a VM on my desktop to a heavily
> > noticeable 37.9s - 16.5s = 21.4s on the VisionFive 2 RISC-V board.
> >
>
> This is good stuff. It's a bit of a shame that the build time for the
> initramfs hasn't improved much. I guess it's not as dominated by
> compression time as I thought?
Compression is typically slower on inputs that compress poorly. That can
explain some of the low speedup.
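To illustrate the size side of this (not the timing), here is a small sketch showing that a second compression pass over already-compressed data gains essentially nothing. It uses zlib from the Python standard library purely as a stand-in for zstd, and the input is synthetic text, so the exact numbers say nothing about real initramfs content.

```python
import random
import zlib

rng = random.Random(0)  # fixed seed so the run is repeatable
words = ["initramfs", "module", "firmware", "cpio", "kernel", "zstd"]
# Synthetic, highly compressible input (assumption: a stand-in for plain files).
plain = " ".join(rng.choice(words) for _ in range(40_000)).encode()

once = zlib.compress(plain, 6)   # first pass: large gain expected
twice = zlib.compress(once, 6)   # second pass over compressed data

print(f"plain:            {len(plain)} bytes")
print(f"compressed once:  {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

The second pass still costs CPU time while the output stays about the same size, which is the reason for keeping pre-compressed modules and firmware out of the compressed cpio.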
> Do you have any thoughts about making it faster? I know I once ran 'strace
> -ff mkinitramfs' and ended up with tens or hundreds of thousands of trace
> files so not having everything done by a billion tiny shell scripts would
> help, but I don't know how much.
Without performance data it's too early to discuss that, but if that's
the case, it raises the question of how much effort we would want to put
into improving the scripts, and whether we would rather optimize the
shell script or rewrite it (or parts of it).
In any case, what would be an acceptable overhead for the script
compared to compression and I/O? Something like 20 or 30% at most?
> I wonder if we can make depmod incremental somehow?
I find that hundreds of milliseconds sounds like a lot. On my laptop,
modules are ~100KB on average, and that would make the decompression
speed less than 500KB/s, while it's closer to 1.5GB/s on my laptop. An
RPi Zero 2 is much slower, but not that much slower. I wouldn't be
surprised if there were some low-hanging fruit and maybe some
initialization work that is done repeatedly.
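The back-of-the-envelope numbers above can be written out as follows. The 0.2 s figure is an assumption standing in for "hundreds of milliseconds", and the sketch assumes that time is spent on a single average-sized module; both are only there to make the arithmetic concrete.

```python
KB = 1000
GB = 1000**3

module_size = 100 * KB      # average module size mentioned above, in bytes
assumed_extra_time = 0.2    # seconds; hypothetical reading of "hundreds of milliseconds"
laptop_speed = 1.5 * GB     # decompression speed mentioned above, in bytes/s

# Speed implied if one average module really took that long to decompress.
implied_speed = module_size / assumed_extra_time
# Time one average module should take at the laptop's decompression speed.
expected_time = module_size / laptop_speed

print(f"implied speed: {implied_speed / KB:.0f} KB/s")
print(f"expected time per module: {expected_time * 1e6:.0f} µs")
```

Under these assumptions the implied speed is 500 KB/s, while decompressing one module at 1.5 GB/s should take well under a millisecond.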
--
Adrien