APPLIED[N/U]: [PATCH 0/1][N/U] Enable lowlatency settings in the generic kernel
Andrea Righi
andrea.righi at canonical.com
Thu Mar 7 09:54:10 UTC 2024
On Fri, Jan 26, 2024 at 05:06:08PM +0100, Andrea Righi wrote:
> BugLink: https://bugs.launchpad.net/bugs/2051342
>
> [Impact]
>
> Ubuntu provides the "lowlatency" kernel: a kernel optimized for
> applications that have special "low latency" requirements.
>
> Currently, this kernel does not include any specific UBUNTU SAUCE
> patches to address these "low latency" requirements; the only
> difference from the generic kernel is a small subset of .config options.
>
> Almost all these options are now configurable either at boot-time or
> even at run-time, with the only exception of CONFIG_HZ (250 in the
> generic kernel vs 1000 in the lowlatency kernel).
>
> Maintaining a separate kernel for a single config option seems like
> overkill and carries a significant cost in engineering hours, build time,
> regression testing time and resources. Not to mention the risk of the
> low-latency kernel falling behind and not being perfectly in sync with
> the latest generic kernel.
>
> Enabling the low-latency settings in the generic kernel has been
> evaluated before, but it has never been finalized due to the potential
> risk of performance regressions in CPU-intensive applications
> (increasing HZ from 250 to 1000 may introduce more kernel jitter in
> number crunching workloads). The outcome of the original proposal
> resulted in a re-classification of the lowlatency kernel as a
> desktop-oriented kernel, enabling additional low latency features (LP:
> #2023007).
>
> As we are approaching the release of the new Ubuntu 24.04 we may want to
> re-consider merging the low-latency settings in the generic kernel
> again.
>
> The following is a detailed analysis of the specific low-latency features:
>
> - CONFIG_NO_HZ_FULL=y: enable access to "Full tickless mode" (shut down
> the clock tick when possible on all the enabled CPUs that are either
> idle or running a single task; this reduces the kernel jitter that
> the periodic clock tick causes for running tasks, and must be enabled
> at boot time by passing `nohz_full=<cpu_list>`). This can actually
> help CPU-intensive workloads and could provide far more benefit than
> the CONFIG_HZ difference (since it can potentially eliminate kernel
> jitter on specific CPUs), so it should really be enabled anyway,
> considering that it is configurable at boot time
>
> - CONFIG_RCU_NOCB_CPU=y: move RCU callbacks from softirq context to
> kthread context (reduce time spent in softirqs with preemption
> disabled to improve the overall system responsiveness, at the cost of
> introducing a potential performance penalty, because RCU callbacks are
> now processed by kernel threads); this should be enabled as well,
> since it is configurable at boot time (via the rcu_nocbs=<cpu_list>
> parameter)
>
> - CONFIG_RCU_LAZY=y: batch RCU callbacks and then flush them after a
> timed delay instead of executing them immediately (can provide 5~10%
> power-savings for idle or lightly-loaded systems, this is extremely
> useful for laptops / portable devices -
> https://lore.kernel.org/lkml/20221016162305.2489629-3-joel@joelfernandes.org/);
> this has the potential to introduce significant performance
> regressions, but in the Noble kernel we already have a SAUCE patch
> that allows enabling/disabling this option at boot time (see LP:
> #2045492), and by default it will be disabled
> (CONFIG_RCU_LAZY_DEFAULT_OFF=y)
>
> - CONFIG_HZ=1000: last but not least, the only option that is *only*
> tunable at compile time. As already mentioned, there is a potential
> risk of regressions for CPU-intensive applications, but they can be
> mitigated (and the baseline possibly even outperformed) with
> NO_HZ_FULL. On the other hand, HZ=1000 can improve system
> responsiveness, which means most desktop and server applications
> will benefit from this (most server workloads are I/O-bound rather
> than CPU-bound, so they can benefit from a kernel that reacts
> faster when switching tasks), not to mention the benefit for
> typical end-user applications (gaming, live conferencing,
> multimedia, etc.).
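
As a side note, whether the two boot-time options from the list above are active on a running system can be checked by looking at the kernel command line. The following is only a minimal sketch: the helper names are made up for illustration, and the naive whitespace split does not handle quoted parameter values.

```python
def boot_params(cmdline: str) -> dict:
    """Split a kernel command line into {name: value}.

    Bare flags (e.g. "ro") map to an empty string. Quoted values that
    contain spaces are NOT handled; this is a simplification.
    """
    params = {}
    for token in cmdline.split():
        name, _, value = token.partition("=")
        params[name] = value
    return params


def lowlatency_boot_settings(cmdline: str) -> dict:
    """Return the CPU lists passed via nohz_full= and rcu_nocbs= (None if absent)."""
    params = boot_params(cmdline)
    return {key: params.get(key) for key in ("nohz_full", "rcu_nocbs")}


# Hypothetical command line, used only as an example input.
sample = "BOOT_IMAGE=/vmlinuz root=/dev/sda1 ro nohz_full=1-7 rcu_nocbs=1-7"
print(lowlatency_boot_settings(sample))
```

On a live system the input would be the content of /proc/cmdline instead of the sample string.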
>
> With all of that in place we can provide a kernel that has the
> flexibility to be more responsive, more performant and more power
> efficient (therefore more "generic"), simply by tuning run-time and
> boot-time options.
>
> Moreover, once these changes are applied we will be able to deprecate
> the lowlatency kernel, saving engineering time and also reducing power
> consumption (required to build the kernel and do all the testing).
>
> Optionally, we can also provide the optimal "lowlatency" settings as a
> user-space package that would set the proper options in the kernel boot
> command line (GRUB, or similar).
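
To give an idea of what such a user-space package could generate, here is a rough sketch that builds the boot options for a given set of CPUs and wraps them in a GRUB_CMDLINE_LINUX_DEFAULT assignment. The function names and the choice of CPUs are assumptions made for the example, not a description of an existing package.

```python
def cpu_list(cpus) -> str:
    """Format CPU ids in the kernel cpu-list syntax, e.g. [1, 2, 3, 5] -> "1-3,5"."""
    ids = sorted(set(cpus))
    if not ids:
        raise ValueError("at least one CPU is required")
    ranges = []
    for cpu in ids:
        if ranges and cpu == ranges[-1][1] + 1:
            ranges[-1][1] = cpu          # extend the current range
        else:
            ranges.append([cpu, cpu])    # start a new range
    return ",".join(str(a) if a == b else f"{a}-{b}" for a, b in ranges)


def lowlatency_cmdline(cpus) -> str:
    """Boot options enabling full tickless mode and RCU callback offloading."""
    cl = cpu_list(cpus)
    return f"nohz_full={cl} rcu_nocbs={cl}"


def grub_dropin(cpus) -> str:
    """One line appending the options to GRUB's default kernel command line."""
    return ('GRUB_CMDLINE_LINUX_DEFAULT='
            f'"$GRUB_CMDLINE_LINUX_DEFAULT {lowlatency_cmdline(cpus)}"\n')


# Example: an 8-CPU system, leaving CPU 0 for housekeeping (an assumption).
print(grub_dropin(range(1, 8)), end="")
```

The generated line would still need to be installed in the GRUB configuration and followed by update-grub; that part is left out here.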
>
> [Test case]
>
> There are plenty of benchmarks that can prove the validity of each one
> of the settings mentioned above, showing significant benefits in terms
> of system responsiveness.
>
> However, our main goal here is to mitigate as much as possible the risk
> of regression for CPU-intensive applications, so the test case should
> only be focused on this particular aspect, to evaluate the impact of
> this change in the worst case scenario.
>
> Test case (CPU-intensive stress test):
>
> - stress-ng --matrix $(getconf _NPROCESSORS_ONLN) --timeout 5m --metrics-brief
>
> Metrics:
>
> - measure the bogo ops printed to stdout (not a great metric for
> real-world applications, but in this case it can show the impact of
> the additional kernel jitter introduced by the different CONFIG_HZ)
>
> Results (linux-unstable 6.8.0-2.2, avg of 10 runs of 5min each):
>
> - CONFIG_HZ=250 : 17415.60 bogo ops/s
> - CONFIG_HZ=1000 : 14866.05 bogo ops/s
> - CONFIG_HZ=1000+nohz_full : 18505.52 bogo ops/s
>
> The results confirm the expected performance drop for CPU-intensive
> workloads at HZ=1000 (about -14.6%), but also confirm the benefit of
> NO_HZ_FULL (about +6.3%) compared to the current HZ settings.
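
For reference, the percentages can be recomputed from the three averages reported above. This is just the arithmetic, using the HZ=250 result as the baseline.

```python
# Average bogo ops/s from the results above (linux-unstable 6.8.0-2.2).
results = {
    "CONFIG_HZ=250": 17415.60,
    "CONFIG_HZ=1000": 14866.05,
    "CONFIG_HZ=1000+nohz_full": 18505.52,
}


def relative_change(value: float, baseline: float) -> float:
    """Percentage change of value with respect to baseline."""
    return (value - baseline) / baseline * 100.0


baseline = results["CONFIG_HZ=250"]
for name, ops in results.items():
    print(f"{name:26s} {ops:10.2f} bogo ops/s  {relative_change(ops, baseline):+.1f}%")
```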
>
> Let's also keep in mind that this is the worst case scenario and a very
> specific one, where only HPC / scientific applications can be affected,
> and even in this case we can always compensate and actually get
> better performance by exploiting the nohz_full capability.
>
> [Fix]
>
> Enable the .config options mentioned above in the generic kernel (only
> on amd64 and arm64 for now).
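
A quick way to verify that a given kernel build carries these options is to compare its .config against the expected values. The sketch below assumes the usual `NAME=value` config format; the helper names are hypothetical and the expected values are the ones listed in this email.

```python
# Options discussed in this email and the values expected in the generic kernel.
WANTED = {
    "CONFIG_HZ": "1000",
    "CONFIG_NO_HZ_FULL": "y",
    "CONFIG_RCU_NOCB_CPU": "y",
    "CONFIG_RCU_LAZY": "y",
    "CONFIG_RCU_LAZY_DEFAULT_OFF": "y",
}


def parse_config(text: str) -> dict:
    """Parse kernel .config text into {option: value}, skipping comments."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # "# CONFIG_FOO is not set" lines are treated as absent
        name, _, value = line.partition("=")
        options[name] = value
    return options


def missing_options(text: str, wanted: dict = WANTED) -> dict:
    """Return {option: actual value or None} for every option that differs."""
    have = parse_config(text)
    return {name: have.get(name) for name, value in wanted.items()
            if have.get(name) != value}


# Made-up config fragment, used only as an example input.
sample = """\
CONFIG_HZ=250
CONFIG_NO_HZ_FULL=y
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_LAZY is not set
"""
print(missing_options(sample))
```

On an installed Ubuntu kernel the input would typically come from /boot/config-<version>.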
>
> [Regression potential]
>
> As already covered, we may experience performance regressions in
> CPU-intensive (number crunching) applications (such as HPC),
> but they can be compensated for by the NO_HZ_FULL boot-time option.
Applied to noble/linux and noble/linux-unstable.
-Andrea