ACK/cmnt: [SRU][PATCH 0/2][Bionic] Fix aggressive CFS throttling
Khalid Elmously
khalid.elmously at canonical.com
Fri Oct 18 05:34:54 UTC 2019
On 2019-10-17 14:57:00 , Kleber Souza wrote:
> On 16.10.19 23:15, Khalid Elmously wrote:
> > BugLink: https://bugs.launchpad.net/bugs/1832151
>
> The bug report seems to be outdated. It says that 512ac999d275 (sched/fair: Fix bandwidth
> timer clock drift condition) is the fix for the bug, but the patches submitted are follow-up
> fixes for that commit. Maybe the report needs an update? It would also be nice to have it
> use the SRU template.
>
> >
> > "sched/fair: Fix bandwidth timer clock drift condition" enabled CPU bandwidth "slice expiry" behaviour in cpu-local silos (something that should have been in effect all along but was basically completely broken). The undesired side-effect of this expiry being enabled is that threads of highly-threaded non-CPU-bound applications get throttled even when the application isn't using its full quota.
> >
> > This fix eliminates the problem by removing cpu-local slice expiry altogether. A small prerequisite patch makes the fix apply cleanly.
> >
> > A derivative 4.15 cloud kernel was tested with this fix and approved by the cloud provider, and I've tested this fix on the master 4.15 with positive results.
>
> Do you have some more information on how this was tested for regressions on bionic
> master kernel? What are the regression potentials?
>
> Would this fix be needed for all newer series as well?
>
Thanks for the review. I've updated the bug with more information.
I have not been able to reproduce this problem on the disco kernel, though I'm not entirely sure why. I haven't heard of the problem on any later kernels either, so I left it at that.
I have now targeted the bug to Eoan and Disco as well, so that we remember to investigate whether they need the fix too.
For now, the patch is nominated for Bionic only.
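
For anyone who wants to check whether a workload is being throttled below its quota, here is a minimal sketch. It assumes the cgroup v1 cpu controller, whose cpu.stat file reports nr_periods, nr_throttled and throttled_time. The helper names parse_cpu_stat() and throttled_fraction() are made up for illustration and are not part of the patches or of any test that was actually run.

```python
def parse_cpu_stat(text):
    """Parse the 'key value' lines of a cgroup v1 cpu.stat file into a dict of ints."""
    stats = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            stats[parts[0]] = int(parts[1])
    return stats


def throttled_fraction(before, after):
    """Fraction of enforcement periods in which the group was throttled
    between two cpu.stat samples (0.0 if no period elapsed)."""
    periods = after["nr_periods"] - before["nr_periods"]
    throttled = after["nr_throttled"] - before["nr_throttled"]
    return throttled / periods if periods > 0 else 0.0


# Sample contents; on a real system these would be read from the group's
# cpu.stat file (assumed location: under the cpu controller mount point).
before = parse_cpu_stat("nr_periods 100\nnr_throttled 10\nthrottled_time 5000\n")
after = parse_cpu_stat("nr_periods 300\nnr_throttled 60\nthrottled_time 9000\n")
print(throttled_fraction(before, after))
```

Sampling cpu.stat before and after a run, and comparing the throttled fraction against the CPU time the group actually consumed, is one way to see throttling that occurs even though the quota is not exhausted.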
> >
> > More info in launchpad and salesforce.
> >
> >
> >
> > Dave Chiluk (1):
> > sched/fair: Fix low cpu usage with high throttling by removing
> > expiration of cpu-local slices
> >
> > Patrick Bellasi (1):
> > sched/fair: Add lsub_positive() and use it consistently
> >
> > Documentation/scheduler/sched-bwc.txt | 74 +++++++++++++++++++-----
> > kernel/sched/fair.c | 83 ++++++---------------------
> > kernel/sched/sched.h | 4 --
> > 3 files changed, 79 insertions(+), 82 deletions(-)
> >
>
> The backport looks good, fix is upstream and has been tested to fix
> the issue.
>
> I'm ACK'ing it for now, but I would like to get some more information about
> regression tests and potential before applying it.
>
> Acked-by: Kleber Sacilotto de Souza <kleber.souza at canonical.com>