[apparmor] [PATCH] [parser] Fix jobs not scaling up to meet available resources when cpus are brought online during compilation

Tue Apr 5 21:16:01 UTC 2016

On 04/05/2016 01:51 PM, Christian Boltz wrote:
> Hello,
> 
> On Tuesday, 5 April 2016, 13:22:19 CEST, Seth Arnold wrote:
>> On Tue, Apr 05, 2016 at 12:37:07PM -0700, John Johansen wrote:
>>> Enable dynamically scaling max jobs if new resources are brought
>>> online
>>>
>>> BugLink: http://bugs.launchpad.net/bugs/1566490
>>>
>>> This patch enables the parser to scale the max jobs if new resources
>>> are being brought online by the scheduler.
>>>
>>> It only enables the scaling check if there is a difference between
>>> the maximum number of cpus (CONF) and the number of online (ONLN)
>>> cpus.
>>>
>>> Instead of checking for more resources regardless of whether the
>>> online cpu count is increasing, it limits its checking to a maximum
>>> of MAX CPUS + 1 - ONLN cpus times, with each check coming after
>>> fork spawns a new work unit, giving the scheduler a chance to bring
>>> new cpus online before the next check.  The +1 ensures the checks
>>> will be done at least once after the scheduling task sleeps waiting
>>> for its children, giving the scheduler an extra chance to bring cpus
>>> online.
> 
> Will it also reduce the number of processes if some CPUs are sent to 
> vacation?
> 
It does not. This is specifically addressing the case of the hotplug
governor (and a few other ones in use on mobile devices), which offlines
cpus when load is low, and then brings them back online as load ramps up.

I was also trying to minimize the cost of the check by limiting the
number of times we call out to check how many cpus are available. It's
extra overhead that really isn't needed on the devices where we are
seeing this problem, so the simple solution of just checking every
time isn't ideal.
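
As a rough sketch of the bounded check (not the code from the patch; the
struct and function names here are hypothetical, and it assumes the glibc
sysconf() names _SC_NPROCESSORS_CONF and _SC_NPROCESSORS_ONLN):

```c
#include <unistd.h>

/* Hypothetical bookkeeping for this sketch; not taken from the patch. */
struct job_scaler {
    long jobs_max;     /* current limit on concurrent jobs */
    long checks_left;  /* remaining rescans; 0 means scaling is disabled */
};

/* Set the initial limit and the rescan budget from the CONF and ONLN counts. */
void scaler_init(struct job_scaler *s, long conf, long onln)
{
    long online = onln > 0 ? onln : 1;

    s->jobs_max = online;
    /* Only rescan if some cpus are offline; budget is CONF + 1 - ONLN. */
    s->checks_left = conf > online ? conf + 1 - online : 0;
}

/* Spend one unit of budget and raise the limit if more cpus are online. */
void scaler_rescan(struct job_scaler *s, long onln_now)
{
    if (s->checks_left <= 0)
        return;
    s->checks_left--;
    if (onln_now > s->jobs_max)
        s->jobs_max = onln_now;
}

/* Call after each fork; skips the sysconf() call once the budget is spent. */
void scaler_rescan_sys(struct job_scaler *s)
{
    if (s->checks_left > 0)
        scaler_rescan(s, sysconf(_SC_NPROCESSORS_ONLN));
}
```

The limit only ever grows in this sketch, and once the budget reaches zero
no further sysconf() calls are made.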

The reverse case of cpus going offline while load is high seems somewhat
degenerate, and is a case where I am willing to live with a few
extra processes hanging around. Hopefully it's not a common case
and would only result in one or two extra processes.

>>> Signed-off-by: John Johansen <john.johansen at canonical.com>
>>
>> This feels more complicated than it could be but I must admit I can't
>> suggest any modifications to the algorithm to simplify it.
> 
> It sounds too simple, and it might start too many jobs in some cases, 
> but - why not use the total number of CPUs from the beginning instead of 
> the currently online CPUs?
> 
> The only possible disadvantage is running "too many" jobs - would that 
> do any harm?
> 
It does; too many jobs actually slow things down. I am however willing
to revisit this when we manage to convert to true threading instead of the
fork model we are using today.  Then we could preallocate all possible
threads and just not use them if it would cause contention.

Note also that this patch does not deal with cgroups and the actual
number of cpus available to be used, which could be less than what is
online. I need to spend some time evaluating the best solution
for doing this.

We could use pthread_getaffinity_np(), which is probably the best solution
since we are already linking against pthreads because of the library, but
we may want to go directly to sched_getaffinity(), or maybe there is
something else I haven't hit yet.
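
For illustration only, a minimal sketch of the sched_getaffinity() route
(usable_cpus() is a hypothetical helper, not something in the parser). It
counts the cpus in the calling process's affinity mask and falls back to
the online count if the call fails. The affinity mask reflects cpuset
restrictions, but not cpu bandwidth quotas, so this would only cover part
of the cgroups case:

```c
#define _GNU_SOURCE
#include <sched.h>
#include <unistd.h>

/*
 * Hypothetical helper: number of cpus this process is allowed to run on.
 * Falls back to the online cpu count (at least 1) if the affinity mask
 * cannot be read, e.g. when the mask is larger than cpu_set_t.
 */
long usable_cpus(void)
{
    cpu_set_t set;
    long n;

    CPU_ZERO(&set);
    if (sched_getaffinity(0, sizeof(set), &set) == 0)
        return CPU_COUNT(&set);

    n = sysconf(_SC_NPROCESSORS_ONLN);
    return n > 0 ? n : 1;
}
```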