[SRU][J/L][PATCH 1/1] x86/xen/time: prefer tsc as clocksource when it is invariant

Krister Johansen kjlx at templeofstupid.com
Sat Aug 26 01:00:22 UTC 2023


On Fri, Aug 25, 2023 at 05:57:30PM -0700, Krister Johansen wrote:
> BugLink: http://bugs.launchpad.net/bugs/2033122
> 
> Kvm elects to use tsc instead of kvm-clock when it can detect that the
> TSC is invariant.
> 
> (As of commit 7539b174aef4 ("x86: kvmguest: use TSC clocksource if
> invariant TSC is exposed")).
> 
> Notable cloud vendors[1] and performance engineers[2] recommend that Xen
> users preferentially select tsc over xen-clocksource due the performance
> penalty incurred by the latter.  These articles are persuasive and
> tailored to specific use cases.  In order to understand the tradeoffs
> around this choice more fully, this author had to reference the
> documented[3] complexities around the Xen configuration, as well as the
> kernel's clocksource selection algorithm.  Many users may not attempt
> this to correctly configure the right clock source in their guest.
> 
> The approach taken in the kvm-clock module spares users this confusion,
> where possible.
> 
> Both the Intel SDM[4] and the Xen tsc documentation explain that marking
> a tsc as invariant means that it should be considered stable by the OS
> and is elibile to be used as a wall clock source.
> 
> In order to obtain better out-of-the-box performance, and reduce the
> need for user tuning, follow kvm's approach and decrease the xen clock
> rating so that tsc is preferable, if it is invariant, stable, and the
> tsc will never be emulated.
> 
> [1] https://aws.amazon.com/premiumsupport/knowledge-center/manage-ec2-linux-clock-source/
> [2] https://www.brendangregg.com/blog/2021-09-26/the-speed-of-time.html
> [3] https://xenbits.xen.org/docs/unstable/man/xen-tscmode.7.html
> [4] Intel 64 and IA-32 Architectures Sofware Developer's Manual Volume
>     3b: System Programming Guide, Part 2, Section 17.17.1, Invariant TSC
> 
> Signed-off-by: Krister Johansen <kjlx at templeofstupid.com>
> Code-reviewed-by: David Reaver <me at davidreaver.com>
> Reviewed-by: Juergen Gross <jgross at suse.com>
> Link: https://lore.kernel.org/r/20221216162118.GB2633@templeofstupid.com
> Signed-off-by: Juergen Gross <jgross at suse.com>
> (cherry picked from commit 99a7bcafbd0d04555074554573019096a8c10450)

Apologies, this should be:

(cherry picked from commit caea091e48ed9d3951506507abf26e9918d08e35)

> Signed-off-by: Krister Johansen <kjlx at templeofstupid.com>
> ---
>  arch/x86/xen/time.c | 38 +++++++++++++++++++++++++++++++++++++-
>  1 file changed, 37 insertions(+), 1 deletion(-)
> 
> diff --git a/arch/x86/xen/time.c b/arch/x86/xen/time.c
> index cc11dd2e2f48..062120dd8b98 100644
> --- a/arch/x86/xen/time.c
> +++ b/arch/x86/xen/time.c
> @@ -474,15 +474,51 @@ static void xen_setup_vsyscall_time_info(void)
>  	xen_clocksource.vdso_clock_mode = VDSO_CLOCKMODE_PVCLOCK;
>  }
>  
> +/*
> + * Check if it is possible to safely use the tsc as a clocksource.  This is
> + * only true if the hypervisor notifies the guest that its tsc is invariant,
> + * the tsc is stable, and the tsc instruction will never be emulated.
> + */
> +static int __init xen_tsc_safe_clocksource(void)
> +{
> +	u32 eax, ebx, ecx, edx;
> +
> +	if (!(boot_cpu_has(X86_FEATURE_CONSTANT_TSC)))
> +		return 0;
> +
> +	if (!(boot_cpu_has(X86_FEATURE_NONSTOP_TSC)))
> +		return 0;
> +
> +	if (check_tsc_unstable())
> +		return 0;
> +
> +	/* Leaf 4, sub-leaf 0 (0x40000x03) */
> +	cpuid_count(xen_cpuid_base() + 3, 0, &eax, &ebx, &ecx, &edx);
> +
> +	/* tsc_mode = no_emulate (2) */
> +	if (ebx != 2)
> +		return 0;
> +
> +	return 1;
> +}
> +
>  static void __init xen_time_init(void)
>  {
>  	struct pvclock_vcpu_time_info *pvti;
>  	int cpu = smp_processor_id();
>  	struct timespec64 tp;
>  
> -	/* As Dom0 is never moved, no penalty on using TSC there */
> +	/*
> +	 * As Dom0 is never moved, no penalty on using TSC there.
> +	 *
> +	 * If it is possible for the guest to determine that the tsc is a safe
> +	 * clocksource, then set xen_clocksource rating below that of the tsc
> +	 * so that the system prefers tsc instead.
> +	 */
>  	if (xen_initial_domain())
>  		xen_clocksource.rating = 275;
> +	else if (xen_tsc_safe_clocksource())
> +		xen_clocksource.rating = 299;
>  
>  	clocksource_register_hz(&xen_clocksource, NSEC_PER_SEC);
>  
> -- 
> 2.25.1
> 



More information about the kernel-team mailing list