[PATCH 2/2] sched: Try to catch cpu_power being set to 0

Andy Whitcroft apw at canonical.com
Wed Jan 19 14:20:04 UTC 2011


On Tue, Jan 18, 2011 at 04:34:23PM +0100, Stefan Bader wrote:
> This is an optional change to try catching the culprit which changes
> cpu_power to 0 (should never happen) and causes divide by zero crashes
> later on in the scheduler code.
> 
> BugLink: http://bugs.launchpad.net/bugs/614853
> 
> Signed-off-by: Stefan Bader <stefan.bader at canonical.com>
> ---
>  kernel/sched.c |   12 +++++++++++-
>  1 files changed, 11 insertions(+), 1 deletions(-)
> 
> diff --git a/kernel/sched.c b/kernel/sched.c
> index d4a4b14..7ef70c0 100644
> --- a/kernel/sched.c
> +++ b/kernel/sched.c
> @@ -3713,6 +3713,7 @@ static void update_cpu_power(struct sched_domain *sd, int cpu)
>  	unsigned long weight = sd->span_weight;
>  	unsigned long power = SCHED_LOAD_SCALE;
>  	struct sched_group *sdg = sd->groups;
> +	unsigned long scale_rt;
>  
>  	if (sched_feat(ARCH_POWER))
>  		power *= arch_scale_freq_power(sd, cpu);
> @@ -3730,12 +3731,18 @@ static void update_cpu_power(struct sched_domain *sd, int cpu)
>  		power >>= SCHED_LOAD_SHIFT;
>  	}
>  
> -	power *= scale_rt_power(cpu);
> +	scale_rt = scale_rt_power(cpu);
> +	power *= scale_rt;
> +
>  	power >>= SCHED_LOAD_SHIFT;
>  
>  	if (!power)
>  		power = 1;
>  
> +	if (WARN_ON((long) power <= 0))
> +		printk(KERN_ERR "cpu_power = %ld; scale_rt = %ld\n",
> +			power, scale_rt);
> +

This is meant to catch power == 0, but just above it power is zapped
to 1 if it is 0.  Did we want to catch the 0, i.e. should this check go
above the power zapping?  Or I guess this is more about checking for >2^32?
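
To make the ordering question concrete, here is a minimal userspace
sketch of the same test placed before and after the zero clamp.  The
helper names are hypothetical and not part of the patch, and it assumes
the usual gcc behaviour where converting an out-of-range unsigned long
to long wraps (the C standard leaves that implementation-defined).

```c
#include <limits.h>

/* Hypothetical helper: the patch's test applied BEFORE the zero clamp.
 * Flags power == 0 as well as any value with the top bit set. */
static int flagged_before_clamp(unsigned long power)
{
	return (long) power <= 0;
}

/* Hypothetical helper: mirrors the order in the patch, where zero is
 * clamped to 1 first.  Zero can no longer be flagged here; only values
 * with the top bit set (negative when viewed as long) still are. */
static int flagged_after_clamp(unsigned long power)
{
	if (!power)
		power = 1;
	return (long) power <= 0;
}

/* Smallest unsigned long value that is negative when cast to long
 * (assumes the wrapping conversion noted above). */
static unsigned long top_bit_value(void)
{
	return 1UL << (sizeof(unsigned long) * CHAR_BIT - 1);
}
```

With these, flagged_before_clamp(0) is true while flagged_after_clamp(0)
is false, and both are true for top_bit_value().  So, under the stated
assumption, the check as placed can only fire on an overflowed value,
never on the zero itself.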

>  	sdg->cpu_power = power;
>  }
>  
> @@ -3759,6 +3766,9 @@ static void update_group_power(struct sched_domain *sd, int cpu)
>  	} while (group != child->groups);
>  
>  	sdg->cpu_power = power;
> +
> +	if (WARN_ON((long) power <= 0))
> +		printk(KERN_ERR "cpu_power = %ld\n", power);

This is a little odd; I assume it's really checking that power is
not >2^32 and != 0.
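
As a rough illustration of what this second check can see, the sketch
below assumes the loop ahead of this hunk accumulates the child groups'
cpu_power into power (only the tail of that loop is visible in the
diff).  The function names and the array standing in for the kernel's
circular group list are hypothetical, and the cast again assumes gcc's
wrapping conversion from unsigned long to long.

```c
#include <stddef.h>

/* Hypothetical model of the summation: add up the cpu_power of each
 * child group.  An array replaces the circular sched_group list. */
static unsigned long sum_group_power(const unsigned long *child_power,
				     size_t n)
{
	unsigned long power = 0;
	size_t i;

	for (i = 0; i < n; i++)
		power += child_power[i];
	return power;
}

/* The patch's test on the summed value.  There is no zero clamp on this
 * path, so it flags both a sum of 0 and a sum with the top bit set. */
static int group_power_suspect(unsigned long power)
{
	return (long) power <= 0;
}
```

Unlike the update_cpu_power() case there is no clamp in between here, so
in this model a group whose children all report 0 would trip the
warning, as would a sum large enough to set the top bit.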

Otherwise I suspect the intent is reasonable.  If this has been boot
tested I guess it is ok.  Certainly 1/2 is very much a paper-over-the-bug
type of fix, so something needs to be added to try and catch it.

Acked-by: Andy Whitcroft <apw at canonical.com>

-apw



