APPLIED[N/U]: [PATCH 0/1][N/U] Enable lowlatency settings in the generic kernel

Andrea Righi andrea.righi at canonical.com
Thu Mar 7 09:54:10 UTC 2024


On Fri, Jan 26, 2024 at 05:06:08PM +0100, Andrea Righi wrote:
> BugLink: https://bugs.launchpad.net/bugs/2051342
> 
> [Impact]
> 
> Ubuntu provides the "lowlatency" kernel: a kernel optimized for
> applications that have special "low latency" requirements.
> 
> Currently, this kernel does not include any specific UBUNTU SAUCE
> patches to address the extra "low latency" requirements; the only
> difference is a small subset of .config options.
> 
> Almost all these options are now configurable either at boot-time or
> even at run-time, with the only exception of CONFIG_HZ (250 in the
> generic kernel vs 1000 in the lowlatency kernel).
> 
> Maintaining a separate kernel for a single config option seems like
> overkill, and it carries a significant cost in engineering hours, build
> time, regression testing time and resources. Not to mention the risk of the
> low-latency kernel falling behind and not being perfectly in sync with
> the latest generic kernel.
> 
> Enabling the low-latency settings in the generic kernel has been
> evaluated before, but it was never finalized due to the potential
> risk of performance regressions in CPU-intensive applications
> (increasing HZ from 250 to 1000 may introduce more kernel jitter in
> number crunching workloads). The outcome of the original proposal
> resulted in a re-classification of the lowlatency kernel as a
> desktop-oriented kernel, enabling additional low latency features (LP:
> #2023007).
> 
> As we are approaching the release of the new Ubuntu 24.04 we may want to
> re-consider merging the low-latency settings in the generic kernel
> again.
> 
> The following is a detailed analysis of the specific low-latency features:
> 
> - CONFIG_NO_HZ_FULL=y: enable access to "Full tickless mode" (shut down
>   the clock tick when possible on all the enabled CPUs that are either
>   idle or running a single task, reducing the kernel jitter that the
>   periodic clock tick causes in running tasks; it must be enabled at
>   boot time by passing `nohz_full=<cpu_list>`). This can actually help
>   CPU-intensive workloads and it could provide much more benefit than
>   the CONFIG_HZ difference (since it can potentially shut down any
>   kernel jitter on specific CPUs). This one should really be enabled
>   anyway, considering that it is configurable at boot time
> 
> - CONFIG_RCU_NOCB_CPU=y: move RCU callbacks from softirq context to
>   kthread context (reduce time spent in softirqs with preemption
>   disabled to improve the overall system responsiveness, at the cost of
>   introducing a potential performance penalty, because RCU callbacks are
>   now processed by kernel threads); this should be enabled as well,
>   since it is configurable at boot time (via the rcu_nocbs=<cpu_list>
>   parameter)
> 
>  - CONFIG_RCU_LAZY=y: batch RCU callbacks and then flush them after a
>    timed delay instead of executing them immediately (can provide 5~10%
>    power-savings for idle or lightly-loaded systems, this is extremely
>    useful for laptops / portable devices -
>    https://lore.kernel.org/lkml/20221016162305.2489629-3-joel at joelfernandes.org/);
>    this has the potential to introduce significant performance
>    regressions, but in the Noble kernel we already have a SAUCE patch
>    that allows enabling/disabling this option at boot time (see LP:
>    #2045492), and by default it will be disabled
>    (CONFIG_RCU_LAZY_DEFAULT_OFF=y)
> 
>  - CONFIG_HZ=1000: last but not least, the only option that is *only*
>    tunable at compile time. As already mentioned, there is a potential
>    risk of regressions for CPU-intensive applications, but they can be
>    mitigated (and maybe even turned into gains) with NO_HZ_FULL.
>    On the other hand, HZ=1000 can improve system responsiveness, which
>    means most desktop and server applications will benefit from
>    this (most server workloads are I/O-bound rather than CPU-bound,
>    so they can benefit from having a kernel that can react faster
>    at switching tasks), not to mention the benefit for typical
>    end-user applications (gaming, live conferencing, multimedia,
>    etc.).
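
As a rough illustration of what the CONFIG_HZ difference means in practice, the sketch below works out the tick period and the number of periodic tick interrupts per second for both values. It assumes every busy CPU keeps its periodic tick running (that is, nohz_full is not in use), and the CPU count of 8 is an arbitrary example, not a figure from this thread.

```python
# Timer tick arithmetic for the two CONFIG_HZ values discussed above.
# Assumption: each busy CPU takes CONFIG_HZ periodic ticks per second.

def tick_period_ms(hz: int) -> float:
    """Length of one periodic tick in milliseconds for a given CONFIG_HZ."""
    return 1000.0 / hz


def ticks_per_second(hz: int, busy_cpus: int) -> int:
    """Periodic tick interrupts per second summed over all busy CPUs."""
    return hz * busy_cpus


EXAMPLE_CPUS = 8  # hypothetical machine size, chosen only for illustration

for hz in (250, 1000):
    print(
        f"HZ={hz}: one tick every {tick_period_ms(hz):g} ms, "
        f"{ticks_per_second(hz, EXAMPLE_CPUS)} ticks/s on {EXAMPLE_CPUS} busy CPUs"
    )
```

Going from 250 to 1000 cuts the tick period from 4 ms to 1 ms and quadruples the number of tick interrupts a busy CPU has to absorb, which is the source of the extra jitter for number-crunching workloads.
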
> 
> With all of that in place we can provide a kernel that has the
> flexibility to be more responsive, more performant and more power
> efficient (therefore more "generic"), simply by tuning run-time and
> boot-time options.
> 
> Moreover, once these changes are applied we will be able to deprecate
> the lowlatency kernel, saving engineering time and also reducing power
> consumption (required to build the kernel and do all the testing).
> 
> Optionally, we can also provide the optimal "lowlatency" settings as a
> user-space package that would set the proper options in the kernel boot
> command line (GRUB, or similar).
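
A minimal sketch of what such a package might compute, assuming the lowest-numbered CPUs are kept for housekeeping and all the others are handed to both nohz_full and rcu_nocbs. The function name and the one-housekeeping-CPU default are hypothetical; no such package is specified in this thread.

```python
# Build the boot-time options mentioned above for a machine with a given
# number of CPUs. Assumption: CPUs 0..housekeeping_cpus-1 stay as regular
# housekeeping CPUs, the rest become tickless / RCU-offloaded.

def lowlatency_cmdline(total_cpus: int, housekeeping_cpus: int = 1) -> str:
    """Return 'nohz_full=<cpu_list> rcu_nocbs=<cpu_list>' for the isolated CPUs."""
    if not 0 < housekeeping_cpus < total_cpus:
        raise ValueError("need at least one housekeeping CPU and one isolated CPU")
    first = housekeeping_cpus
    last = total_cpus - 1
    # Kernel CPU lists accept a single number or an inclusive range.
    cpu_list = str(first) if first == last else f"{first}-{last}"
    return f"nohz_full={cpu_list} rcu_nocbs={cpu_list}"


print(lowlatency_cmdline(8))
```

On a GRUB-based system the resulting string would typically be appended to GRUB_CMDLINE_LINUX_DEFAULT in /etc/default/grub, followed by running update-grub.
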
> 
> [Test case]
> 
> There are plenty of benchmarks that can prove the validity of each one
> of the settings mentioned above, providing huge benefits in terms of
> system responsiveness.
> 
> However, our main goal here is to mitigate as much as possible the risk
> of regression for CPU-intensive applications, so the test case should
> only be focused on this particular aspect, to evaluate the impact of
> this change in the worst case scenario.
> 
> Test case (CPU-intensive stress test):
> 
>  - stress-ng --matrix $(getconf _NPROCESSORS_ONLN) --timeout 5m --metrics-brief
> 
> Metrics:
> 
>  - measure the bogo ops printed to stdout (not a great metric for
>    real-world applications, but in this case it can show the impact of
>    the additional kernel jitter introduced by the different CONFIG_HZ)
> 
> Results (linux-unstable 6.8.0-2.2, avg of 10 runs of 5min each):
> 
>  - CONFIG_HZ=250            : 17415.60 bogo ops/s
>  - CONFIG_HZ=1000           : 14866.05 bogo ops/s
>  - CONFIG_HZ=1000+nohz_full : 18505.52 bogo ops/s
> 
> Results confirm the theory about the performance drop of CPU-intensive
> workloads (~-14%), but they also confirm the benefit of NO_HZ_FULL
> (~+6%) compared to the current HZ settings.
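
For reference, the percentages can be recomputed from the three averages reported above. This is only a sketch of the arithmetic, taking CONFIG_HZ=250 as the baseline; the helper name is made up for the example.

```python
# Relative change of each configuration against the CONFIG_HZ=250 baseline,
# using the bogo ops/s averages quoted in the results above.

results = {
    "CONFIG_HZ=250": 17415.60,
    "CONFIG_HZ=1000": 14866.05,
    "CONFIG_HZ=1000+nohz_full": 18505.52,
}


def relative_change(value: float, baseline: float) -> float:
    """Percentage difference of value with respect to baseline."""
    return (value - baseline) / baseline * 100.0


baseline = results["CONFIG_HZ=250"]
for name, bogo_ops in results.items():
    print(f"{name:26s}: {relative_change(bogo_ops, baseline):+.1f}%")
```

This gives roughly -14.6% for CONFIG_HZ=1000 alone and +6.3% once nohz_full is added.
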
> 
> Let's also keep in mind that this is the worst case scenario and a very
> specific one, where only HPC / scientific applications can be affected,
> and even in this case we can always compensate and actually get a better
> level of performance by exploiting the nohz_full capability.
> 
> [Fix]
> 
> Enable the .config options mentioned above in the generic kernel (only
> on amd64 and arm64 for now).
> 
> [Regression potential]
> 
> As already covered we may experience performance regressions in
> CPU-intensive (number crunching) applications (such as HPC for example),
> but they can be compensated by the NO_HZ_FULL boot-time option.

Applied to noble/linux and noble/linux-unstable.

-Andrea