[RFC Quantal] Replace pv-ops fix for legacy Xen

Tim Gardner tim.gardner at canonical.com
Thu Jun 14 19:06:28 UTC 2012


On 06/14/2012 10:46 AM, Stefan Bader wrote:
> For hysterical reasons we carry the following patch for running
> as paravirtualized guests on Xen (on Amazon EC2):
> 
>   commit b187be83798aded25e58cb40f52ffc9a1452879d
>   Author: John Johansen <john.johansen at canonical.com>
>   Date:   Tue Jul 27 06:06:07 2010 -0700
> 
>     UBUNTU: SAUCE: fix pv-ops for legacy Xen
> 
> A short (ok, maybe not) background:
> 
> Back in Lucid times there was a problem which caused paravirtualized
> guests with pv-ops enabled to crash early on boot when running on
> EC2 (not always but sometimes).
> 
> The reason was found later (Maverick) to be that some older versions
> of the Xen hypervisor (not sure which exactly) would crash a PV guest
> when that tried to write unexpected/-supported values into CR4. The
> problematic one apparently OSXSAVE.
> 
> So the hack (taken from Fedora) put some code in to filter out that
> bit before it was written into CR4 by a function that is part of the
> pv-ops (iow it only is used on a paravirt guest).
> 
> During last UDS time, there happened to be an upstream discussion
> about this because kernels that had this hack would cause problems
> when running on newer Xen hypervisor versions (and of course CPUs
> that actually support this feature).
> 
> That turned out to be a glibc problem where checks were not complete
> and thus only looking at XSAVE but not OSXSAVE reported through CR4.
> I believe that problem now has been fixed, but still it sounded like
> the way this is handled could be improved.
> 
> In order to be save about EC2 we need to prevent OSXSAVE to be written
> into CR4 on "older" Xen versions. But running on newer versions with
> the right support and features it would be nice not to cripple
> it all the time.
> 
> So this patch would replace the old hack. It hooks into the same function
> which would disable XSAVE/OSXSAVE when it would be manually forced off
> (by masking off the feature flags from the cpuid results).
> The change will enforce this when running on a Xen hypervisor before
> version 4 and only when running as a PV guest. I also added a small
> change to the banner printing code, so it is possible to see when the
> cpuid capping takes place.
> 
> I booted a Quantal kernel with that on two EC2 instances (and one
> even happened to be on an old 3.0.3-rc5 version of Xen). Of course
> neither of them seemed to claim the XSAVE feature in the first 
> place... And it would likely require a lot of time to accidentally
> hit a host that does, and none of my machines and home does either.
> 
> -Stefan
> 
> 
> From 78d874169b045410c41c0f3f2135743aa49d2d28 Mon Sep 17 00:00:00 2001
> From: Stefan Bader <stefan.bader at canonical.com>
> Date: Mon, 4 Jun 2012 19:12:51 +0200
> Subject: [PATCH 2/2] UBUNTU: SAUCE: Mask CR4 writes on older Xen hypervisors
> 
> Older Xen hypervisors (like RHEL5 versions found to be used
> on Amazon's EC2) did have a bug which would crash the domain
> when trying to write unsupported CR values.
> Newer versions do handle this correctly. But some probes (in
> particular one that tries to pass the OSXSAVE bit set) were
> causing pv-ops kernels to crash early on boot when running on EC2.
> 
> We were using a patch that did always mask off OSXSAVE when
> running under Xen PV but since this may limit performance for
> hypervisor/HW combinations that do support the AVX instructions,
> a slightly more targetted approach would be good.
> 
> This patch tries to mask off the XSAVE and OSXSAVE bits when
> running as a PV guest on a host prior Xen 4.
> 
> Signed-off-by: Stefan Bader <stefan.bader at canonical.com>
> ---
>  arch/x86/xen/enlighten.c |   13 +++++++++++++
>  1 file changed, 13 insertions(+)
> 
> diff --git a/arch/x86/xen/enlighten.c b/arch/x86/xen/enlighten.c
> index e74df95..7eaa415 100644
> --- a/arch/x86/xen/enlighten.c
> +++ b/arch/x86/xen/enlighten.c
> @@ -207,6 +207,8 @@ static void __init xen_banner(void)
>  	printk(KERN_INFO "Xen version: %d.%d%s%s\n",
>  	       version >> 16, version & 0xffff, extra.extraversion,
>  	       xen_feature(XENFEAT_mmu_pt_update_preserve_ad) ? " (preserve-AD)" : "");
> +	if (xen_pv_domain() && ((version >> 16) < 4))
> +		printk(KERN_INFO "Forcing xsave off due to Xen version.\n");
>  }
>  
>  static __read_mostly unsigned int cpuid_leaf1_edx_mask = ~0;
> @@ -329,6 +331,7 @@ static bool __init xen_check_mwait(void)
>  }
>  static void __init xen_init_cpuid_mask(void)
>  {
> +	unsigned version = HYPERVISOR_xen_version(XENVER_version, NULL);
>  	unsigned int ax, bx, cx, dx;
>  	unsigned int xsave_mask;
>  
> @@ -353,6 +356,16 @@ static void __init xen_init_cpuid_mask(void)
>  	/* Xen will set CR4.OSXSAVE if supported and not disabled by force */
>  	if ((cx & xsave_mask) != xsave_mask)
>  		cpuid_leaf1_ecx_mask &= ~xsave_mask; /* disable XSAVE & OSXSAVE */
> +	/*
> +	 * This seems not to work with some early Xen versions when
> +	 * the cpuid information claims support but trying to write
> +	 * this into CR4 causes the pv guest to crash.
> +	 * Since we don't know the exact version, assume all versions
> +	 * prior Xen 4.x to be broken.
> +	 */
> +	if (xen_pv_domain() && ((version >> 16) < 4))
> +		cpuid_leaf1_ecx_mask &= ~xsave_mask;
> +
>  	if (xen_check_mwait())
>  		cpuid_leaf1_ecx_set_mask = (1 << (X86_FEATURE_MWAIT % 32));
>  }

Just a nit, how about making the version check a documented macro
instead of open coding '(version >> 16) < 4)' twice ?

Otherwise, ACK.

rtg
-- 
Tim Gardner tim.gardner at canonical.com




More information about the kernel-team mailing list