Fwd: [084/152] sched: Cure more NO_HZ load average woes
Chase Douglas
chase.douglas at canonical.com
Mon Jan 24 15:00:56 UTC 2011
On 01/24/2011 09:50 AM, Stefan Bader wrote:
> On 01/06/2011 04:09 PM, Chase Douglas wrote:
>> Hi all,
>>
>> I received this notification of a stable patch for .36 that should fix
>> the load avg bugs once and for all. A recap:
>>
>> I found a bug in the load avg calculation and got a fix pushed upstream.
>> This was thrown into lucid and maverick. Unfortunately, it caused a
>> regression for our xen kernels, so it was removed from maverick ec2
>> IIRC. Maybe from others too? This is the commit hash for ubuntu-maverick
>> master:
>>
>> 74f5187ac873042f502227701ed1727e7c5fbfa9
>>
>> I believe this patch should be reenabled for all lucid and maverick
>> kernels, and the following patch should be applied on top. I'm not sure
>> how everything is falling out with the new stable queue process, so I'm
>> forwarding this to the list just to be sure it's seen.
>>
>> Thanks!
>>
>
> I must admit that the maths are a bit beyond my understanding. Though given that
> the first half is in Maverick and the second went as a stable update for .36,
> this seems to be the right thing to do.
>
> Acked-by: Stefan Bader <stefan.bader at canonical.com>
>
> This has now also been reported as a bug:
>
> https://bugs.launchpad.net/ubuntu/+source/linux/+bug/706592
>
> A secondary question is what to do for Lucid. As the patch did not apply
> cleanly there, I checked what is currently in Lucid, and it seems the first
> part is not there. Instead there is another patch:
>
> commit 0d843425672f4d2dc99b9004409aae503ef4d39f
> Author: Chase Douglas <chase.douglas at canonical.com>
> Date: Thu Apr 8 12:02:11 2010 -0400
>
> sched: update load count only once per cpu in 10 tick update window
>
> This does not show up upstream and I think I remember vaguely that this one was
> replaced by the first upstream patch that is in Maverick.
>
> So I guess the action required there would be to revert the Ubuntu specific
> patch and apply both halves of the upstream solution. Still, it would be quite
> nice to have some way of verifying this. Does anybody already have some sort of
> testcase for this?

I wrote a testcase for the original bug:

http://lkml.org/lkml/2010/3/29/170

As for the second bug, I'm not sure what a good testcase is.

-- Chase