[REGRESSION 2.6.30][PATCH v3] sched: update load count only once per cpu in 10 tick update window

Chase Douglas chase.douglas at canonical.com
Thu Apr 22 15:35:11 UTC 2010

On Thu, Apr 22, 2010 at 9:18 AM, Chase Douglas
<chase.douglas at canonical.com> wrote:
> On Thu, Apr 22, 2010 at 7:08 AM, Peter Zijlstra <peterz at infradead.org> wrote:
>> On Tue, 2010-04-13 at 16:19 -0700, Chase Douglas wrote:
>>> There's a period of 10 ticks where calc_load_tasks is updated by all the
>>> cpus for the load avg. Usually all the cpus do this during the first
>>> tick. If any cpus go idle, calc_load_tasks is decremented accordingly.
>>> However, if they wake up, calc_load_tasks is not incremented. Thus, if
>>> cpus go idle during the 10 tick period, calc_load_tasks may be
>>> decremented to a non-representative value. This issue can lead to
>>> systems having a load avg of exactly 0, even though the real load avg
>>> could theoretically be up to NR_CPUS.
>>> After each cpu updates the count, this change defers further
>>> calc_load_tasks accounting until after the 10 tick update window.
>>> A few points:
>>> * A global atomic deferral counter, and not per-cpu vars, is needed
>>>   because a cpu may go NOHZ idle and not be able to update the global
>>>   calc_load_tasks variable for subsequent load calculations.
>>> * It is not enough to add calls to account for the load when a cpu is
>>>   awakened:
>>>   - Load avg calculation must be independent of cpu load.
>>>   - If a cpu is awakened by one task, but then has more scheduled before
>>>     the end of the update window, only the first task will be accounted.
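
To make the scenario in the quoted description concrete, here is a toy
model in Python. It is only a sketch of the accounting as described in
this message, not the kernel code: the function name, the two-cpu setup,
the tick numbers, and the `deferred` variable are all made up for
illustration.

```python
def sample_load(defer_idle_updates):
    """Return calc_load_tasks as sampled at the end of one update window.

    Toy model only: two cpus, one task each, hypothetical tick numbers.
    """
    calc_load_tasks = 0   # global count used for the load avg
    deferred = 0          # hypothetical global deferral counter
    reported = [0, 0]     # what each cpu last folded into the global count

    # Tick 0: both cpus are running one task and report it.
    for cpu in range(2):
        calc_load_tasks += 1 - reported[cpu]
        reported[cpu] = 1

    # Tick 3: cpu 1 goes idle inside the window and reports a delta of -1.
    delta = 0 - reported[1]
    reported[1] = 0
    if defer_idle_updates:
        deferred += delta          # hold the delta back until the window ends
    else:
        calc_load_tasks += delta   # applied immediately, as in the bug report

    # Tick 5: cpu 1 wakes up and runs a task again; nothing is reported.

    # Tick 10: the load avg calculation samples the global count.
    sampled = calc_load_tasks

    # After the window: fold any deferred deltas into the global count.
    calc_load_tasks += deferred
    return sampled


print(sample_load(defer_idle_updates=False))  # 1, although 2 tasks are running
print(sample_load(defer_idle_updates=True))   # 2
```

With more cpus briefly idling inside the window, the undeferred count in
this model can drop all the way to 0, which matches the symptom above.
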
>> Ok, so delaying the whole ILB angle for now, the below is a similar
>> approach to yours but with a more explicit code flow.
>> Does that work for you?
> This looks good. I'll run my test case to make sure it fixes the
> scenario we hit, and then I'll ack it when I've confirmed it works.

I've run my test case and it seems to push the load avg numbers as expected.

Acked-by: Chase Douglas <chase.douglas at canonical.com>

BTW, I noticed some trailing whitespace, so I ran it through checkpatch.pl:

ERROR: trailing whitespace
#44: FILE: kernel/sched.c:2936:
+       $
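
For anyone who wants a quick check for this class of error without
running the full script, a minimal sketch in Python follows. It is not
how checkpatch.pl is implemented; the function name is hypothetical, and
it only looks at added lines in a unified diff.

```python
def trailing_whitespace_lines(diff_text):
    """Return (line_number, text) for added diff lines ending in whitespace.

    line_number counts lines within diff_text itself, starting at 1.
    """
    hits = []
    for number, line in enumerate(diff_text.splitlines(), start=1):
        # Added lines start with '+'; skip the '+++ b/file' header line.
        is_added = line.startswith("+") and not line.startswith("+++")
        if is_added and line != line.rstrip():
            hits.append((number, line))
    return hits


sample = (
    "--- a/kernel/sched.c\n"
    "+++ b/kernel/sched.c\n"
    "@@ -1,2 +1,3 @@\n"
    " context\n"
    "+       \n"
    "+\tok = 1;\n"
)
for number, line in trailing_whitespace_lines(sample):
    print(f"trailing whitespace at diff line {number}: {line!r}")
```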


-- Chase

