Optimized kernel builds: the straight dope
Daniel
crassico at gmail.com
Tue Aug 15 03:02:09 BST 2006
Last time I tested this was on the Dapper release, and the -k7 kernel booted
my Athlon more than 5 seconds faster than the -386 one. I'm the kind of user
who reboots a lot, and I found those seconds pretty precious.
Just my half cent.
On 8/14/06, Matt Zimmerman <mdz at ubuntu.com> wrote:
>
> == Background ==
>
> You've all seen our kernel lineup before. In the beginning, on i386 we
> had linux-386, linux-686, linux-k7, linux-686-smp, linux-k7-smp. Later,
> we added specialized server kernels, and folded SMP support into
> linux-686 and linux-k7.
>
> Since Linux has always offered CPU-specific optimizations, it's been
> taken for granted that this offered enough of a performance benefit to
> make all of this maintenance effort worthwhile. A lot has happened to
> the kernel since the early days, though, and for some time, it has been
> capable of loading these optimizations at runtime. Even when you use
> the -386 kernel, you get the benefit of many CPU-specific optimizations
> automatically. This is great news for integrators, like Ubuntu, because
> we want to provide everyone with the best experience out of the box,
> and as you know, there is only room for one kernel on the CD, not for a
> stack of redundant ones. Many users spend time and bandwidth quotas
> downloading these optimized kernels in hopes of squeezing the most
> performance out of their hardware.
>
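> For a concrete sense of what loading optimizations at runtime means:
> the kernel probes the CPU at boot and, broadly speaking, patches in
> optimized code paths where the hardware supports them. You can see
> what it detected in /proc/cpuinfo. A quick Python sketch (purely
> illustrative, not part of any kernel tooling):
>
>     import re
>
>     # The x86 "flags" line in /proc/cpuinfo lists every CPU feature
>     # the kernel detected at boot, whatever image flavor is running.
>     with open("/proc/cpuinfo") as f:
>         match = re.search(r"^flags\s*:\s*(.*)$", f.read(), re.MULTILINE)
>     flags = set(match.group(1).split()) if match else set()
>
>     # A few features a -386 image can still exploit at runtime:
>     for feature in ("cmov", "mmx", "sse", "sse2"):
>         print(feature, "yes" if feature in flags else "no")
>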
> This raises the question: do we still need these old-fashioned builds?
> Experiments have shown that users who are told that their system will
> run faster will say that it "feels" faster whether there is a
> measurable difference or not. For fun, try it with an unsuspecting test
> subject: tell them that you'll "optimize" their system to make it a
> little bit faster, make some do-nothing changes to it, then see if they
> notice a difference. The fact is, our observations of performance are
> highly subjective, which is why we need to rely on hard data.
>
> == Data ==
>
> Enter Ben Collins, our kernel team lead, who has put together a
> performance test to answer that question, covering both the i386 and
> amd64 architectures. The results are attached in the form of an email
> from him. Following that is a README which explains how to interpret
> the numerical figures.
>
> No benchmark says it all. They're all biased toward specific workloads,
> and very few users run a homogeneous workload, especially not desktop
> users. This particular benchmark attempts to measure system
> responsiveness, a key factor in overall performance (real and
> perceived) for desktop workloads, which are largely interactive.
>
> == Conclusion ==
>
> Having read over it, I think the numbers are fairly compelling. The
> difference in performance between -386 and -686 is insignificant; the
> measurements are all within a reasonable error range, and within that
> range, -686 was slower as often as it was faster.
>
> My recommendation is that we phase out these additional kernel builds,
> which I expect will save us a great deal of developer time, buildd
> time, archive storage and bandwidth.
>
> I'm interested to hear (objective, reasoned) feedback on this data and my
> conclusions from the members of this list.
>
> --
> - mdz
>
>
>
> ---------- Forwarded message ----------
> From: Ben Collins <bcollins at ubuntu.com>
> To: mdz at ubuntu.com
> Date: Mon, 14 Aug 2006 19:58:22 -0400
> Subject: Benchmarks between generic and cpu specific kernel images
> The test was performed on an Intel Pentium 4, 2 GHz system with 256 MB
> of RAM. I consider this about average for our users. The test I ran was
> "contest", which basically builds a kernel under several different load
> types. It's meant specifically for comparing different kernels on the
> same machine. Each of the lines below represents the average of 3
> kernel builds under that particular type of stress. The first two are
> baseline results (cacherun is simply no_load without clearing the
> mem/disk cache; for all the other results, the mem/disk cache is
> cleared before starting).
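>
> In outline, each measurement works roughly like the sketch below (a
> Python illustration, not contest itself, which is a C program; the
> kernel tree path and make target are placeholders):
>
>     import subprocess, time
>
>     def drop_caches():
>         # One way to clear the mem/disk caches on recent kernels:
>         # flush dirty pages, then drop the page/dentry/inode caches
>         # (requires root).
>         subprocess.run(["sync"], check=True)
>         with open("/proc/sys/vm/drop_caches", "w") as f:
>             f.write("3\n")
>
>     def timed_build():
>         # Time one build of a kernel tree; the elapsed seconds
>         # become the Time column below.
>         start = time.time()
>         subprocess.run(["make", "-C", "linux-source", "vmlinux"],
>                        check=True, capture_output=True)
>         return time.time() - start
>
>     # Three runs per load type, caches cleared before each run;
>     # cacherun skips the clearing to get a warm-cache baseline.
>     times = []
>     for _ in range(3):
>         drop_caches()
>         times.append(timed_build())
>     print("average: %.0fs" % (sum(times) / len(times)))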
>
> Here are the results:
>
> no_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3    73   87.7      0.0    0.0   1.00
> 2.6.17-6-686       3    73   89.0      0.0    0.0   1.00
>
> cacherun:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3    65   98.5      0.0    0.0   0.89
> 2.6.17-6-686       3    66   97.0      0.0    0.0   0.90
>
> process_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3   143   44.8    260.0   52.4   1.96
> 2.6.17-6-686       3   144   44.4    230.0   52.8   1.97
>
> ctar_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3   100   67.0      8.7   10.0   1.37
> 2.6.17-6-686       3    97   69.1      6.3    9.3   1.33
>
> xtar_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3    96   67.7      2.0    4.2   1.32
> 2.6.17-6-686       3    96   68.8      1.7    4.1   1.32
>
> io_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3   102   64.7     11.3    5.9   1.40
> 2.6.17-6-686       3   103   66.0     12.1    8.3   1.41
>
> io_other:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3    94   71.3     20.9   12.8   1.29
> 2.6.17-6-686       3    97   70.1     21.3   16.3   1.33
>
> read_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3    91   72.5      2.3    2.2   1.25
> 2.6.17-6-686       3    92   72.8      2.3    2.2   1.26
>
> list_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3    87   75.9      0.0    5.7   1.19
> 2.6.17-6-686       3    87   77.0      0.0    6.9   1.19
>
> mem_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3   327   20.5     74.0    0.6   4.48
> 2.6.17-6-686       3   390   17.7     80.7    0.8   5.34
>
> dbench_load:
> Kernel        [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-6-386       3    83   78.3  52193.3   18.1   1.14
> 2.6.17-6-686       3    92   71.7  47986.7   23.9   1.26
>
>
> Using this as a guide, you can see that the difference is negligible.
> The -686 targeted kernel does not gain or lose any significant
> performance on this test system. mem_load was the only noticeable
> change, where the -686 kernel shows worse performance. This may be
> specific to my system. The individual results are here:
>
> 2.6.17-6-386 58 9 157 764901 883 mem_load 0 2 158 182219 5893 7100
> 2.6.17-6-386 58 10 381 780220 4597 mem_load 0 3 382 191532 8976 7600
> 2.6.17-6-386 59 10 445 780181 6654 mem_load 0 2 446 191709 8501 7500
> 2.6.17-6-686 58 11 244 771704 1849 mem_load 0 3 244 192658 7983 7600
> 2.6.17-6-686 58 12 445 788730 6357 mem_load 1 3 446 198200 9831 8100
> 2.6.17-6-686 59 12 481 793676 6401 mem_load 1 3 482 200208 11464 8500
>
> In the 4th column you can see that the first run for each kernel is a
> clear outlier. The results after the first run seem more likely to be
> correct.
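>
> A quick way to quantify that is to compare the mean and spread of the
> run times with and without the first run (a Python sketch; this
> assumes the 4th column of those raw lines is the elapsed time in
> seconds):
>
>     from statistics import mean, stdev
>
>     # 4th-column values from the raw mem_load lines above
>     # (assumed: elapsed seconds per run)
>     runs_386 = [157, 381, 445]
>     runs_686 = [244, 445, 481]
>
>     for name, runs in (("-386", runs_386), ("-686", runs_686)):
>         # The cold first run drags the mean down and inflates the
>         # spread; dropping it gives a steadier comparison.
>         warm = runs[1:]
>         print(name, "all: ", round(mean(runs)), round(stdev(runs)))
>         print(name, "warm:", round(mean(warm)), round(stdev(warm)))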
>
> I also performed the same test on an SMP Xeon machine (2 x Core2Duo @
> 3 GHz), using the amd64-generic and amd64-xeon kernel images. Here are
> the results for that:
>
> no_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    49  357.1      0.0    0.0   1.00
> 2.6.17-7-amd64-xeon          3    47  370.2      0.0    0.0   1.00
>
> cacherun:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    48  362.5      0.0    0.0   0.98
> 2.6.17-7-amd64-xeon          3    47  368.1      0.0    0.0   1.00
>
> process_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    60  285.0    131.0   95.1   1.22
> 2.6.17-7-amd64-xeon          3    60  285.0    130.0   96.7   1.28
>
> ctar_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    55  320.0     15.7   25.5   1.12
> 2.6.17-7-amd64-xeon          3    54  325.9     15.3   25.9   1.15
>
> xtar_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    51  343.1      1.7    5.9   1.04
> 2.6.17-7-amd64-xeon          3    53  330.2      2.3    7.5   1.13
>
> io_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3   101  171.3      3.8    5.9   2.06
> 2.6.17-7-amd64-xeon          3    97  179.4      3.7    5.6   2.06
>
> io_other:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3   115  150.4      3.8    5.9   2.35
> 2.6.17-7-amd64-xeon          3   103  168.0      3.7    5.5   2.19
>
> read_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    89  194.4      0.4    0.0   1.82
> 2.6.17-7-amd64-xeon          3    93  186.0      0.4    0.0   1.98
>
> list_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    49  357.1      0.0    2.0   1.00
> 2.6.17-7-amd64-xeon          3    53  328.3      0.0    0.0   1.13
>
> mem_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    54  338.9   3137.0   38.9   1.10
> 2.6.17-7-amd64-xeon          3    54  337.0   3155.3   38.9   1.15
>
> dbench_load:
> Kernel                  [runs]  Time   CPU%    Loads  LCPU%  Ratio
> 2.6.17-7-amd64-generic       3    84  211.9      0.0    0.0   1.71
> 2.6.17-7-amd64-xeon          3    86  207.0      0.0    0.0   1.83
>
> The only significant difference was in the io_other performance. The
> io_other test involves some random I/O operations on a large file. I
> think the difference here is real. However, given that our -server
> kernel will likely get back this small loss, that should be enough to
> warrant still converting to generic kernels for amd64 and i386.
>
> I can make this happen using linux-meta to handle the upgrades (the
> next kernel upload is an ABI bump, so it would be perfect timing, along
> with lrm and linux-meta).
>
> Question is, do we want...
>
> linux-image-2.6.17-7-generic
> (or just)
> linux-image-2.6.17-7
>
> ...for both i386 and amd64? I prefer the no-flavor version, but it would
> require some slight changes in the build system.
>
> I want to change the amd64 kernels to be less amd64ish, and have just
> two kernels:
>
> linux-image-2.6.17-7{,-generic}
> and
> linux-image-2.6.17-7-server
>
> That way there's more consistency in the naming with i386, and i386
> would just have the extra -server-bigiron.
>
> --
> Ubuntu - http://www.ubuntu.com/
> Debian - http://www.debian.org/
> Linux 1394 - http://www.linux1394.org/
> SwissDisk - http://www.swissdisk.com/