[PATCH 2/2] sched: Try to catch cpu_power being set to 0

Stefan Bader stefan.bader at canonical.com
Thu Jan 20 16:11:26 UTC 2011


On 01/19/2011 03:20 PM, Andy Whitcroft wrote:
> On Tue, Jan 18, 2011 at 04:34:23PM +0100, Stefan Bader wrote:
>> This is an optional change to try catching the culprit which changes
>> cpu_power to 0 (should never happen) and causes divide by zero crashes
>> later on in the scheduler code.
>>
>> BugLink: http://bugs.launchpad.net/bugs/614853
>>
>> Signed-off-by: Stefan Bader <stefan.bader at canonical.com>
>> ---
>>  kernel/sched.c |   12 +++++++++++-
>>  1 files changed, 11 insertions(+), 1 deletions(-)
>>
>> diff --git a/kernel/sched.c b/kernel/sched.c
>> index d4a4b14..7ef70c0 100644
>> --- a/kernel/sched.c
>> +++ b/kernel/sched.c
>> @@ -3713,6 +3713,7 @@ static void update_cpu_power(struct sched_domain *sd, int cpu)
>>  	unsigned long weight = sd->span_weight;
>>  	unsigned long power = SCHED_LOAD_SCALE;
>>  	struct sched_group *sdg = sd->groups;
>> +	unsigned long scale_rt;
>>  
>>  	if (sched_feat(ARCH_POWER))
>>  		power *= arch_scale_freq_power(sd, cpu);
>> @@ -3730,12 +3731,18 @@ static void update_cpu_power(struct sched_domain *sd, int cpu)
>>  		power >>= SCHED_LOAD_SHIFT;
>>  	}
>>  
>> -	power *= scale_rt_power(cpu);
>> +	scale_rt = scale_rt_power(cpu);
>> +	power *= scale_rt;
>> +
>>  	power >>= SCHED_LOAD_SHIFT;
>>  
>>  	if (!power)
>>  		power = 1;
>>  
>> +	if (WARN_ON((long) power <= 0))
>> +		printk(KERN_ERR "cpu_power = %ld; scale_rt = %ld\n",
>> +			power, scale_rt);
>> +
> 
> This is meant to catch power == 0, but just above, power is zapped
> to 1 if it is 0.  Did we want to catch the 0, i.e. should this go above
> the power zapping?  I guess this is more a check for >2^32?
> 
>>  	sdg->cpu_power = power;
>>  }
>>  
>> @@ -3759,6 +3766,9 @@ static void update_group_power(struct sched_domain *sd, int cpu)
>>  	} while (group != child->groups);
>>  
>>  	sdg->cpu_power = power;
>> +
>> +	if (WARN_ON((long) power <= 0))
>> +		printk(KERN_ERR "cpu_power = %ld\n", power);
> 
> This is a little odd; I assume it is really checking that power is
> not >2^32 and != 0.
> 
> Otherwise I suspect the intent is reasonable.  If this has been boot
> tested I guess it is ok.  Certainly patch 1/2 is very much a
> paper-over-the-bug type of fix, so something needs to be added to try
> and catch it.
> 
> Acked-by: Andy Whitcroft <apw at canonical.com>
> 
> -apw

Ok, so without changing those checks, I just tried Lucid ec2 kernels for 32
and 64 bit with all three recently posted patches applied, and they still boot
without any complaints. So it seems that (even if the checks make questionable
sense) adding them does not cause any pain.
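
To illustrate the point raised in review, here is a minimal user-space sketch
(not part of the patch) of what the (long) power <= 0 test can and cannot
catch. The helper names warn_condition() and cpu_power_tail() and the SIGN_BIT
macro are made up for this example, SCHED_LOAD_SHIFT is assumed to be 10, and
the unsigned-to-signed cast is assumed to wrap the way gcc does it.

```c
#include <limits.h>

/* Assumed value for illustration; the kernel provides its own definition. */
#define SCHED_LOAD_SHIFT 10

/* Smallest unsigned long with the top bit set (2^31 or 2^63). */
#define SIGN_BIT ((unsigned long)LONG_MAX + 1UL)

/*
 * The condition used in both WARN_ON() calls of the patch.
 * Converting a value above LONG_MAX to long is implementation-defined
 * in C; this sketch assumes the usual two's complement wrap.
 */
static int warn_condition(unsigned long power)
{
	return (long)power <= 0;
}

/*
 * Hypothetical model of the tail of update_cpu_power():
 * scale, shift right, then clamp 0 to 1.
 */
static unsigned long cpu_power_tail(unsigned long power, unsigned long scale_rt)
{
	power *= scale_rt;
	power >>= SCHED_LOAD_SHIFT;	/* clears the top SCHED_LOAD_SHIFT bits */

	if (!power)
		power = 1;		/* 0 never survives past this point */

	return power;
}
```

Under these assumptions warn_condition() is true for 0 and for any value with
the top bit set (SIGN_BIT and up), but a value returned by cpu_power_tail() is
never 0 and never has the top bit set. So in this model only the check in
update_group_power(), which tests the summed power directly, would have a
chance to fire.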

-Stefan



