Fwd: 50 Watt idle power regression bisected to Linux-3.10
Tim Gardner
timg at tpi.com
Sun Dec 8 17:20:10 UTC 2013
I thought you might find this an interesting thread to follow. It may
have an impact on more than just big Xeon platforms.
-------- Original Message --------
Subject: 50 Watt idle power regression bisected to Linux-3.10
Date: Sat, 7 Dec 2013 03:00:05 -0500
From: Len Brown <lenb at kernel.org>
To: tglx at linutronix.de, Peter Zijlstra <peterz at infradead.org>
CC: Linux PM list <linux-pm at vger.kernel.org>,
"linux-kernel at vger.kernel.org" <linux-kernel at vger.kernel.org>, Jeremy
Eder <jeder at redhat.com>, x86 at kernel.org
Hello Thomas,
An idle WSM-EX box (40 Xeon cores) runs 50 Watts hotter after this patch:
commit 7d1a941731fabf27e5fb6edbebb79fe856edb4e5
Author: Thomas Gleixner <tglx at linutronix.de>
Date: Thu Mar 21 22:50:03 2013 +0100
x86: Use generic idle loop
i.e., the commit before this patch (aba92c9e2cf3042bf6efc68fa2e4235ba01bf499)
runs at 50 watts less, as do Linux 3.7, 3.8 and 3.9.
The difference is that the good kernels allow about 98% residence
in the package C6 state, while the bad kernel is so noisy that it
gets into pc6 0% of the time.
(indeed, even core C6 is reduced to about 50% from over 99%)
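For readers unfamiliar with residency figures such as "98% in pc6", the following is a minimal sketch of the arithmetic, not the tool used for the measurements above. It assumes two free-running counters sampled together: a time-stamp counter and a counter that advances only while the C-state is resident, both ticking at the same rate. The struct and function names are hypothetical.

```c
#include <stdint.h>

/* Hypothetical snapshot of two free-running counters taken at one instant. */
struct residency_sample {
    uint64_t tsc;     /* total elapsed ticks */
    uint64_t cstate;  /* ticks accumulated while resident in the C-state */
};

/*
 * Percentage of the interval between two snapshots spent in the C-state.
 * Unsigned subtraction keeps the deltas correct across a single counter wrap.
 */
static double residency_percent(struct residency_sample before,
                                struct residency_sample after)
{
    uint64_t elapsed = after.tsc - before.tsc;
    uint64_t resident = after.cstate - before.cstate;

    if (elapsed == 0)
        return 0.0;  /* no time passed, so no meaningful ratio */
    return 100.0 * (double)resident / (double)elapsed;
}
```

With these definitions, a package that accumulates 980 resident ticks over 1000 elapsed ticks reports 98%, and a package that never reaches the state reports 0%.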
No, Linux-3.13-rc3 does not fix this issue, even though it contains
the following patch, claiming to address an issue with the commit above:
commit ea8117478918a4734586d35ff530721b682425be
Author: Peter Zijlstra <peterz at infradead.org>
Date: Wed Sep 11 12:43:13 2013 +0200
sched, idle: Fix the idle polling state logic
Mike reported that commit 7d1a9417 ("x86: Use generic idle loop")
regressed several workloads and caused excessive reschedule
interrupts.
The patch in question failed to notice that the x86 code had an
inverted sense of the polling state versus the new generic code (x86:
default polling, generic: default !polling).
Fix the two prominent x86 mwait based idle drivers and introduce a few
new generic polling helpers (fixing the wrong smp_mb__after_clear_bit
usage).
Also switch the idle routines to using tif_need_resched() which is an
immediate TIF_NEED_RESCHED test as opposed to need_resched which will
end up being slightly different.
Reported-by: Mike Galbraith <bitbucket at online.de>
Signed-off-by: Peter Zijlstra <peterz at infradead.org>
Cc: lenb at kernel.org
Cc: tglx at linutronix.de
Link: http://lkml.kernel.org/n/tip-nc03imb0etuefmzybzj7sprf@git.kernel.org
Signed-off-by: Ingo Molnar <mingo at kernel.org>
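The "inverted sense of the polling state" described in the commit message above can be illustrated with a toy model. This is only a sketch under stated assumptions, not kernel code: the struct, the function names, and the single `polling` flag are hypothetical stand-ins. The assumption is that a waker skips the reschedule interrupt when the target idle CPU advertises that it is polling, and sends one otherwise.

```c
#include <stdbool.h>

/* Hypothetical per-CPU idle state; not the kernel's data structures. */
struct toy_cpu {
    bool polling;       /* idle loop watches need_resched by itself */
    bool need_resched;  /* set by a waker to request a reschedule */
    unsigned ipis;      /* reschedule interrupts this CPU received */
};

/* Waker side: set the flag, and interrupt only a CPU that is not polling. */
static void toy_wake(struct toy_cpu *cpu)
{
    cpu->need_resched = true;
    if (!cpu->polling)
        cpu->ipis++;
}

/*
 * Idle entry for a driver that relies on the surrounding loop's default
 * and never sets the polling flag itself. With default_polling == true
 * (the old x86 convention) wakeups need no interrupt; with
 * default_polling == false (the generic convention) every wakeup costs one.
 */
static void toy_idle_enter(struct toy_cpu *cpu, bool default_polling)
{
    cpu->need_resched = false;
    cpu->polling = default_polling;
}
```

In this model the same driver code receives zero interrupts under one default and one interrupt per wakeup under the other, which is the kind of mismatch the commit message attributes to the generic idle loop conversion.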
How shall we proceed?
thanks,
-Len Brown, Intel Open Source Technology Center