ACK/cmnt: [SRU][PATCH 0/2][Bionic] Fix aggressive CFS throttling

Khalid Elmously khalid.elmously at canonical.com
Fri Oct 18 05:34:54 UTC 2019


On 2019-10-17 14:57:00 , Kleber Souza wrote:
> On 16.10.19 23:15, Khalid Elmously wrote:
> > BugLink: https://bugs.launchpad.net/bugs/1832151
> 
> The bug report seems outdated. It says that 512ac999d275 (sched/fair: Fix bandwidth
> timer clock drift condition) is the fix for the bug, but the patches submitted are
> follow-up fixes for it. Maybe it needs an update? It would also be nice to have it
> use the SRU template.
> 
> > 
> > "sched/fair: Fix bandwidth timer clock drift condition" enabled CPU bandwidth "slice expiry" behaviour in cpu-local silos (something that should have been in effect all along but was basically completely broken). The undesired side-effect of this expiry being enabled is that threads of highly-threaded non-CPU-bound applications get throttled even when the application isn't using its full quota.
> > 
> > This fix eliminates the problem by removing cpu-local slice expiry altogether. A small pre-requisite patch makes the fix apply nicely.
> > 
> > A derivative 4.15 cloud kernel was tested with this fix and approved by the cloud provider, and I've tested this fix on the master 4.15 with positive results.
> 
> Do you have some more information on how this was tested for regressions on bionic
> master kernel? What are the regression potentials?
> 
> Would this fix be needed for all newer series as well?
> 

Thanks for the review. I've updated the bug with more information.

I have not been able to reproduce this problem on the disco kernel, though I'm not entirely sure why. I haven't heard of it occurring on any later kernels either, so I just left it at that.

I have now targeted the bug to Eoan and Disco so that we remember to investigate whether they need the fix too.

For now, the patch is nominated for Bionic only.
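
For anyone trying to reproduce this, here is a minimal sketch of one way to measure the throttling described above. It assumes the cgroup v1 cpu controller, whose cpu.stat file reports nr_periods, nr_throttled and throttled_time. The helper names and the sample values are hypothetical and are not taken from the bug or from the test runs.

```python
def parse_cpu_stat(text):
    """Parse the contents of a cgroup v1 cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        # Each valid line is "<key> <integer value>"; skip anything else.
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_ratio(stats):
    """Return the fraction of enforcement periods in which the group was throttled."""
    periods = stats.get("nr_periods", 0)
    if periods == 0:
        return 0.0
    return stats.get("nr_throttled", 0) / periods


# Hypothetical sample contents, for illustration only.
SAMPLE = "nr_periods 1000\nnr_throttled 250\nthrottled_time 5000000000\n"

if __name__ == "__main__":
    print(throttled_ratio(parse_cpu_stat(SAMPLE)))
```

On a real system the text would come from the cpu.stat file of the affected cgroup (the mount point varies between setups). A high ratio while the application stays well below its quota would match the symptom reported in the bug.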

> > 
> > More info in Launchpad and Salesforce.
> > 
> > 
> > 
> > Dave Chiluk (1):
> >   sched/fair: Fix low cpu usage with high throttling by removing
> >     expiration of cpu-local slices
> > 
> > Patrick Bellasi (1):
> >   sched/fair: Add lsub_positive() and use it consistently
> > 
> >  Documentation/scheduler/sched-bwc.txt | 74 +++++++++++++++++++-----
> >  kernel/sched/fair.c                   | 83 ++++++---------------------
> >  kernel/sched/sched.h                  |  4 --
> >  3 files changed, 79 insertions(+), 82 deletions(-)
> > 
> 
> The backport looks good, fix is upstream and has been tested to fix
> the issue.
> 
> I'm ACK'ing it for now, but I would like to get some more information about
> regression tests and potential before applying it.
> 
> Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>


