[Vivid][PATCH 2/2] powerpc: add running_clock for powerpc to prevent spurious softlockup warnings

Chris J Arges chris.j.arges at canonical.com
Tue Mar 31 20:28:44 UTC 2015

From: Cyril Bur <cyrilbur at gmail.com>

BugLink: http://bugs.launchpad.net/bugs/1427075

On POWER8 virtualised kernels the VTB register can be read to have a view
of time that only increases while the guest is running.  This will prevent
guests from seeing time jump if a guest is paused for significant amounts
of time.
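
To illustrate why a paused guest trips the watchdog, here is a minimal
user-space sketch; it is not kernel code. The names `toy_softlockup`,
`toy_wall_elapsed`, `toy_running_elapsed` and `TOY_THRESH_NS` are
hypothetical, and the 20 second threshold is only an illustrative value.
It assumes the watchdog does nothing more than compare two timestamps.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical threshold for this sketch: 20 seconds in nanoseconds. */
#define TOY_THRESH_NS (20ULL * 1000000000ULL)

/*
 * Toy watchdog check: report a lockup when more than TOY_THRESH_NS has
 * elapsed since the watchdog was last touched, according to whichever
 * clock the caller sampled.
 */
static bool toy_softlockup(uint64_t touch_ns, uint64_t now_ns)
{
	return now_ns - touch_ns > TOY_THRESH_NS;
}

/*
 * Model a guest that runs for run_ns and is paused by the host for
 * paused_ns. A wall-style clock advances across the pause.
 */
static uint64_t toy_wall_elapsed(uint64_t run_ns, uint64_t paused_ns)
{
	return run_ns + paused_ns;
}

/* A running-style clock (the view the VTB gives) ignores the pause. */
static uint64_t toy_running_elapsed(uint64_t run_ns, uint64_t paused_ns)
{
	(void)paused_ns;	/* paused time is invisible to the guest */
	return run_ns;
}
```

With one second of real work and a 60 second pause, the wall-style clock
crosses the threshold while the running-style clock does not, which is
the spurious warning this patch aims to avoid.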

On virtualised kernels on POWER7 and below, stolen time is subtracted from
local_clock as a best-effort approximation.  This will not eliminate
spurious warnings in the case of a suspended guest, but may reduce the
occurrence of spurious softlockups due to host overcommit.

Bare metal kernels should avoid reading the VTB, as KVM does not restore
sane values when not executing.  The approximation is fine there, as host
kernels won't observe any stolen time.
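
The three cases above can be summarised in a small user-space model.
This is a sketch under stated assumptions, not the patch itself:
`struct toy_cpu` and `toy_running_clock` are hypothetical names, the two
booleans stand in for the firmware and CPU feature checks, and all times
are assumed to be already converted to nanoseconds (the real code scales
timebase ticks and subtracts the boot-time value).

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical snapshot of the inputs the running clock consults. */
struct toy_cpu {
	bool is_lpar;		/* running as a guest (stands in for FW_FEATURE_LPAR) */
	bool has_vtb;		/* POWER8 or later (stands in for CPU_FTR_ARCH_207S) */
	uint64_t vtb_ns;	/* guest-only running time, in ns */
	uint64_t local_ns;	/* local_clock()-style time, in ns */
	uint64_t steal_ns;	/* accumulated stolen time, in ns */
};

static uint64_t toy_running_clock(const struct toy_cpu *c)
{
	/* Only a guest on VTB-capable hardware may trust the VTB. */
	if (c->is_lpar && c->has_vtb)
		return c->vtb_ns;

	/* Otherwise fall back to local time minus stolen time. */
	return c->local_ns - c->steal_ns;
}
```

A bare metal host takes the fallback path even on POWER8 hardware; since
it sees no stolen time, the result is simply its local clock.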

[akpm at linux-foundation.org: coding-style fixes]
Signed-off-by: Cyril Bur <cyrilbur at gmail.com>
Cc: Michael Ellerman <mpe at ellerman.id.au>
Cc: Andrew Jones <drjones at redhat.com>
Acked-by: Don Zickus <dzickus at redhat.com>
Cc: Ingo Molnar <mingo at kernel.org>
Cc: Ulrich Obergfell <uobergfe at redhat.com>
Cc: chai wen <chaiw.fnst at cn.fujitsu.com>
Cc: Fabian Frederick <fabf at skynet.be>
Cc: Aaron Tomlin <atomlin at redhat.com>
Cc: Ben Zhang <benzh at chromium.org>
Cc: Martin Schwidefsky <schwidefsky at de.ibm.com>
Cc: John Stultz <john.stultz at linaro.org>
Cc: Thomas Gleixner <tglx at linutronix.de>
Signed-off-by: Andrew Morton <akpm at linux-foundation.org>
Signed-off-by: Linus Torvalds <torvalds at linux-foundation.org>

(cherry picked from commit 4be1b29795d692d512bb67b770665d6f8ea5cb0b)
Signed-off-by: Chris J Arges <chris.j.arges at canonical.com>
---
 arch/powerpc/kernel/time.c | 32 ++++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)

diff --git a/arch/powerpc/kernel/time.c b/arch/powerpc/kernel/time.c
index fa7c4f1..7316dd1 100644
--- a/arch/powerpc/kernel/time.c
+++ b/arch/powerpc/kernel/time.c
@@ -621,6 +621,38 @@ unsigned long long sched_clock(void)
 	return mulhdu(get_tb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
 }
 
+
+#ifdef CONFIG_PPC_PSERIES
+
+/*
+ * Running clock - attempts to give a view of time passing for a virtualised
+ * kernels.
+ * Uses the VTB register if available otherwise a next best guess.
+ */
+unsigned long long running_clock(void)
+{
+	/*
+	 * Don't read the VTB as a host since KVM does not switch in host
+	 * timebase into the VTB when it takes a guest off the CPU, reading the
+	 * VTB would result in reading 'last switched out' guest VTB.
+	 *
+	 * Host kernels are often compiled with CONFIG_PPC_PSERIES checked, it
+	 * would be unsafe to rely only on the #ifdef above.
+	 */
+	if (firmware_has_feature(FW_FEATURE_LPAR) &&
+	    cpu_has_feature(CPU_FTR_ARCH_207S))
+		return mulhdu(get_vtb() - boot_tb, tb_to_ns_scale) << tb_to_ns_shift;
+
+	/*
+	 * This is a next best approximation without a VTB.
+	 * On a host which is running bare metal there should never be any stolen
+	 * time and on a host which doesn't do any virtualisation TB *should* equal
+	 * VTB so it makes no difference anyway.
+	 */
+	return local_clock() - cputime_to_nsecs(kcpustat_this_cpu->cpustat[CPUTIME_STEAL]);
+}
+#endif
+
 static int __init get_freq(char *name, int cells, unsigned long *val)
 {
 	struct device_node *cpu;
