ACK/cmnt: [SRU][PATCH 0/2][Bionic] Fix aggressive CFS throttling

Khalid Elmously khalid.elmously at canonical.com
Mon Oct 21 09:15:55 UTC 2019


On 2019-10-21 11:05:17 , Juerg Haefliger wrote:
> On Fri, 18 Oct 2019 01:34:54 -0400
> Khalid Elmously <khalid.elmously at canonical.com> wrote:
> 
> > On 2019-10-17 14:57:00 , Kleber Souza wrote:
> > > On 16.10.19 23:15, Khalid Elmously wrote:  
> > > > BugLink: https://bugs.launchpad.net/bugs/1832151  
> > > 
> > > > The bug report seems outdated. It says that 512ac999d275 (sched/fair: Fix bandwidth
> > > > timer clock drift condition) is the fix for the bug, but the patches submitted are
> > > > follow-up fixes for it. Maybe it needs an update? It would also be nice for it to
> > > > include the SRU template.
> > >   
> > > > 
> > > > > "sched/fair: Fix bandwidth timer clock drift condition" enabled CPU bandwidth "slice expiry" behaviour in cpu-local silos (something that should have been in effect all along but was basically completely broken). The undesired side-effect of this expiry being enabled is that threads of highly-threaded non-CPU-bound applications get throttled even when the application isn't using its full quota.
> > > > 
> > > > This fix eliminates the problem by removing cpu-local slice expiry altogether. A small pre-requisite patch makes the fix apply nicely.
> > > > 
> > > > A derivative 4.15 cloud kernel was tested with this fix and approved by the cloud provider, and I've tested this fix on the master 4.15 with positive results.  
> > > 
> > > Do you have some more information on how this was tested for regressions on bionic
> > > master kernel? What are the regression potentials?
> > > 
> > > Would this fix be needed for all newer series as well?
> > >   
> > 
> > Thanks for the review. I've updated the bug with more information.
> > 
> > > I have not been able to reproduce this problem on the disco kernel, though I'm not entirely sure why. I haven't heard of the problem on any later kernels either, so I just left it at that.
> > 
> > > I have now targeted the bug to E and D so that I remember to investigate whether they need the fix too.
> > 
> > For now, the patch is nominated for Bionic only.
> 
> The commit contains a 'Fixes' tag for a commit that is in both E and D so we
> should apply it there as well.
> 

Sure. I've targeted the bug for D and E and will apply the fix there and test it.

But I don't see a reason to delay this patch/SRU for that. Reviewing the patch for just B for now would be helpful.

Thanks
Khaled


> ...Juerg
> 
> > 
> > 
> >  
> > > > 
> > > > > More info in Launchpad and Salesforce.
> > > > 
> > > > 
> > > > 
> > > > Dave Chiluk (1):
> > > >   sched/fair: Fix low cpu usage with high throttling by removing
> > > >     expiration of cpu-local slices
> > > > 
> > > > Patrick Bellasi (1):
> > > >   sched/fair: Add lsub_positive() and use it consistently
> > > > 
> > > >  Documentation/scheduler/sched-bwc.txt | 74 +++++++++++++++++++-----
> > > >  kernel/sched/fair.c                   | 83 ++++++---------------------
> > > >  kernel/sched/sched.h                  |  4 --
> > > >  3 files changed, 79 insertions(+), 82 deletions(-)
> > > >   
> > > 
> > > The backport looks good, fix is upstream and has been tested to fix
> > > the issue.
> > > 
> > > I'm ACK'ing it for now, but I would like to get some more information about
> > > regression tests and potential before applying it.
> > > 
> > > Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>  
> > 
> 




