ACK: [SRU][Xenial][PATCH] Fix for LP:#1747896

Kleber Souza kleber.souza at canonical.com
Tue Feb 27 10:27:51 UTC 2018


On 02/07/18 13:06, Gavin Guo wrote:
> BugLink: https://bugs.launchpad.net/bugs/1747896
> 
> [Impact]
> The CPU utilization stays high, and the flamegraph[1] shows that the
> CPU is busy updating load averages in the for loop inside the
> update_blocked_averages() function. In addition, an OOM occurs because
> the fully decayed cfs_rqs are not released.
> 
> [Fix]
> commit a9e7f6544b9cebdae54d29f87a7ba2a83c0471b5
> Author: Tejun Heo <tj at kernel.org>
> Date:   Tue Apr 25 17:43:50 2017 -0700
> 
> sched/fair: Fix O(nr_cgroups) in load balance path
>     
> Currently, rq->leaf_cfs_rq_list is a traversal ordered list of all
> live cfs_rqs which have ever been active on the CPU; unfortunately,
> this makes update_blocked_averages() O(# total cgroups) which isn't
> scalable at all.
>     
> This shows up as a small CPU consumption and scheduling latency
> increase in the load balancing path in systems with CPU controller
> enabled across most cgroups.  In an edge case where temporary cgroups
> were leaking, this caused the kernel to consume good several tens of
> percents of CPU cycles running update_blocked_averages(), each run
> taking multiple millisecs.
>     
> This patch fixes the issue by taking empty and fully decayed cfs_rqs
> off the rq->leaf_cfs_rq_list.
> 
> [Test]
> 1). Run the following script:
> #!/bin/bash
> 
> for i in $(seq 1 10); do
>         ( for j in $(seq 1 3000); do ssh -S none u@localhost date; done; echo "done $i" ) &
> done
> 
> 2). Observe the number of cfs_rqs:
> $ watch -n1 "grep cfs_rq /proc/sched_debug | wc -l"
> 
> 3). Observe the CPU utilization:
> $ sudo htop
> 
> With the patched kernel[2], the CPU utilization is back to normal, the
> number of cfs_rqs decreases periodically, and memory usage stays
> bounded.
> 
> [Reference]
> [1]. http://kernel.ubuntu.com/~gavinguo/168887/2018-01-31_07-38-45.perf.data.svg
> [2]. https://launchpad.net/~mimi0213kimo/+archive/ubuntu/cfs-rq-clean
> 
> Tejun Heo (1):
>   sched/fair: Fix O(nr_cgroups) in load balance path
> 
>  kernel/sched/fair.c | 51 +++++++++++++++++++++++++++++++++++++++------------
>  1 file changed, 39 insertions(+), 12 deletions(-)
> 

Good test results, changes look good.
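
For anyone following along, the core of the change is that
update_blocked_averages() used to walk every cfs_rq that had ever been
active on the CPU, so short-lived cgroups kept inflating the walk. The
patch unlinks entries whose load has fully decayed from
rq->leaf_cfs_rq_list, so later passes only touch entries still carrying
load. Below is a rough userspace sketch of that pattern (hypothetical
structures and names, not the actual kernel code from the patch):

#include <stdio.h>
#include <stdlib.h>

/* Stand-in for a cfs_rq on rq->leaf_cfs_rq_list. */
struct fake_cfs_rq {
        unsigned long blocked_load;     /* stand-in for the blocked load average */
        struct fake_cfs_rq *next;
        struct fake_cfs_rq **pprev;     /* pointer to whatever points at us */
};

static struct fake_cfs_rq *leaf_list;   /* stand-in for rq->leaf_cfs_rq_list */

static void leaf_list_add(struct fake_cfs_rq *cfs_rq)
{
        cfs_rq->next = leaf_list;
        cfs_rq->pprev = &leaf_list;
        if (leaf_list)
                leaf_list->pprev = &cfs_rq->next;
        leaf_list = cfs_rq;
}

static void leaf_list_del(struct fake_cfs_rq *cfs_rq)
{
        *cfs_rq->pprev = cfs_rq->next;
        if (cfs_rq->next)
                cfs_rq->next->pprev = cfs_rq->pprev;
}

/* One load-balance pass: decay every entry, prune the fully decayed ones. */
static void update_blocked_averages_sketch(void)
{
        struct fake_cfs_rq *cfs_rq = leaf_list;

        while (cfs_rq) {
                struct fake_cfs_rq *next = cfs_rq->next;

                cfs_rq->blocked_load /= 2;      /* crude stand-in for PELT decay */
                if (cfs_rq->blocked_load == 0) {
                        /* The gist of the fix: stop walking fully decayed
                         * entries.  In the kernel the cfs_rq itself is freed
                         * later via cgroup teardown; here we just free it. */
                        leaf_list_del(cfs_rq);
                        free(cfs_rq);
                }
                cfs_rq = next;
        }
}

int main(void)
{
        /* Simulate many mostly-idle cgroups whose load decays to zero. */
        for (int i = 0; i < 8; i++) {
                struct fake_cfs_rq *cfs_rq = calloc(1, sizeof(*cfs_rq));
                cfs_rq->blocked_load = 1UL << i;
                leaf_list_add(cfs_rq);
        }

        for (int pass = 0; pass < 10; pass++) {
                unsigned long walked = 0;

                for (struct fake_cfs_rq *p = leaf_list; p; p = p->next)
                        walked++;
                printf("pass %d: %lu entries on the leaf list\n", pass, walked);
                update_blocked_averages_sketch();
        }
        return 0;
}

The list shrinks as entries fully decay, which is why the cfs_rq count
in /proc/sched_debug drops periodically on the patched kernel.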

Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>



