APPLIED: [SRU][Xenial][PATCH 0/1] smpboot: don't call topology_sane() when Sub-NUMA-Clustering is enabled

Khaled Elmously khalid.elmously at canonical.com
Tue Jun 30 03:06:57 UTC 2020


On 2020-06-08 16:17:35, Matthew Ruffell wrote:
> BugLink: https://bugs.launchpad.net/bugs/1882478
> 
> [Impact]
> 
> Intel Skylake server processors and onward have a different Last Level Cache
> (LLC) topology than earlier processors, and such processors have a new feature
> called Sub-NUMA-Clustering (SNC), which is similar to the existing
> Cluster-On-Die (CoD) feature that earlier server processors have.
> 
> Sub-NUMA-Clustering divides the system into two "slices", each of which is
> allocated half the CPU cores, half the Last Level Cache and one memory
> controller. Each slice is enumerated as a NUMA node.
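
Because each slice shows up as a NUMA node, the resulting CPU-to-node layout
can be inspected from user space. A minimal Python sketch, assuming the usual
sysfs files /sys/devices/system/node/node*/cpulist are present; the helper
names parse_cpulist and cpus_per_node are made up for illustration:

```python
import glob
import os
import re


def parse_cpulist(text):
    """Expand a kernel cpulist string such as "0-6,14-20" into a list of ints."""
    cpus = []
    for part in text.strip().split(","):
        if not part:
            continue  # empty list, e.g. a node without CPUs
        if "-" in part:
            lo, hi = part.split("-")
            cpus.extend(range(int(lo), int(hi) + 1))
        else:
            cpus.append(int(part))
    return cpus


def cpus_per_node(root="/sys/devices/system/node"):
    """Map NUMA node number -> list of CPU numbers, read from sysfs."""
    nodes = {}
    for path in glob.glob(os.path.join(root, "node[0-9]*")):
        match = re.search(r"node(\d+)$", path)
        if not match:
            continue
        with open(os.path.join(path, "cpulist")) as f:
            nodes[int(match.group(1))] = parse_cpulist(f.read())
    return nodes


if __name__ == "__main__":
    for node, cpus in sorted(cpus_per_node().items()):
        print(f"node {node}: {cpus}")
```

On a machine with SNC enabled, one would expect each node to list roughly half
of the CPUs of the package.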
> 
> The difference between Sub-NUMA-Clustering and Cluster-On-Die is how the Last
> Level Cache is exposed to each NUMA node. CoD had the same cache line present in
> each half of the LLC. In SNC, each cache line is only present in its respective
> slice. Because of this, the semantics around accessing the LLC change, with a
> process accessing NUMA-local memory only seeing half the LLC capacity.
> 
> On systems with Sub-NUMA-Clustering enabled, on the Xenial 4.4 and Bionic 4.15
> kernels we see the following oops during NUMA node enumeration:
> 
> .... node #0, CPUs: #1 #2 #3 #4 #5 #6
> .... node #1, CPUs: #7
> sched: CPU #7's llc-sibling CPU #0 is not on the same node! [node: 1 != 0]. Ignoring dependency.
> WARNING: CPU: 7 PID: 0 at /build/linux-hwe-F5opqf/linux-hwe-4.15.0/arch/x86/kernel/smpboot.c:375 topology_sane.isra.4+0x6c/0x70
> Modules linked in:
> CPU: 7 PID: 0 Comm: swapper/7 Not tainted 4.15.0-47-generic #50~16.04.1-Ubuntu
> Hardware name: HPE ProLiant DL360 Gen10/ProLiant DL360 Gen10, BIOS U32 10/02/2018
> RIP: 0010:topology_sane.isra.4+0x6c/0x70
> Call Trace:
> set_cpu_sibling_map+0x153/0x540
> start_secondary+0xb2/0x200
> secondary_startup_64+0xa5/0xb0
> #8 #9 #10 #11 #12 #13
> .... node #0, CPUs: #14 #15 #16 #17 #18 #19 #20
> .... node #1, CPUs: #21 #22 #23 #24 #25 #26 #27
> smp: Brought up 2 nodes, 28 CPUs 
> 
> This was with an Intel Xeon Gold 5120 CPU on a HP DL360 Gen10.
> 
> The oops happens because topology_sane() expects CPUs that share a Last Level
> Cache to be on the same NUMA node, which is no longer the case with SNC.
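
The check behind the warning can be modelled in a few lines. What follows is a
simplified user-space Python sketch, not the kernel code: the Cpu record and
topology_sane_model() are hypothetical names, and the LLC ids are assumed
values chosen so that both CPUs report the same LLC while sitting on different
nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cpu:
    index: int   # logical CPU number
    node: int    # NUMA node the CPU is enumerated on
    llc_id: int  # identifier of the Last Level Cache the CPU reports


def topology_sane_model(c, o):
    """Return True if two LLC siblings sit on the same NUMA node.

    Mirrors the wording of the boot warning: when the nodes differ,
    print a message and report the topology as not sane.
    """
    if c.node != o.node:
        print(f"sched: CPU #{c.index}'s llc-sibling CPU #{o.index} is not on "
              f"the same node! [node: {c.node} != {o.node}]. Ignoring dependency.")
        return False
    return True


# Assumed SNC layout: one shared LLC id, but CPU 7 is on node 1.
cpu0 = Cpu(index=0, node=0, llc_id=0)
cpu7 = Cpu(index=7, node=1, llc_id=0)

if cpu0.llc_id == cpu7.llc_id:
    topology_sane_model(cpu7, cpu0)  # prints the message seen in the log
```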
> 
> [Fix]
> 
> The fix comes in the form of the following upstream commit, which landed in
> Linux 4.17:
> 
> commit 1340ccfa9a9afefdbab90d7935d4ed19817e37c2
> Author: Alison Schofield <alison.schofield at intel.com>
> Date: Fri Apr 6 17:21:30 2018 -0700
> Subject: x86,sched: Allow topologies where NUMA nodes share an LLC
> Link: https://github.com/torvalds/linux/commit/1340ccfa9a9afefdbab90d7935d4ed19817e37c2 
> 
> The commit adds a check for this particular family of Intel processors, and if
> the CPU family matches, it simply skips the call to topology_sane().
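
The control flow described above can be sketched as follows. This is a hedged
Python model rather than the C from the commit: match_llc_model(), the
is_snc_cpu flag and the warn callback are illustrative names, and the exact
way the upstream patch identifies affected processors is not reproduced here.

```python
def match_llc_model(c, o, is_snc_cpu, warn):
    """Decide whether CPUs c and o should be treated as LLC siblings.

    c and o are dicts with "node" and "llc_id" keys (llc_id None = unknown).
    is_snc_cpu stands in for the processor check added by the fix.
    warn is called when the sanity check would have fired.
    """
    # No valid LLC id for this CPU: never a match.
    if c["llc_id"] is None:
        return False
    # Different LLC ids: not siblings.
    if c["llc_id"] != o["llc_id"]:
        return False
    # New step: on SNC-capable parts, a shared LLC across nodes is
    # expected, so report "not siblings" without running the sanity check.
    if c["node"] != o["node"] and is_snc_cpu:
        return False
    # Otherwise fall through to the sanity check, as before the fix.
    if c["node"] != o["node"]:
        warn(c, o)
        return False
    return True


warnings = []
cpu0 = {"node": 0, "llc_id": 0}
cpu7 = {"node": 1, "llc_id": 0}

# Without the processor check the warning fires; with it, it does not.
match_llc_model(cpu7, cpu0, is_snc_cpu=False, warn=lambda c, o: warnings.append((c, o)))
match_llc_model(cpu7, cpu0, is_snc_cpu=True, warn=lambda c, o: warnings.append((c, o)))
print(len(warnings))  # 1
```

In both cases the two CPUs end up not being treated as LLC siblings; the only
difference in this model is whether the warning is emitted.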
> 
> The commit needs minor backports to Xenial 4.4 and Bionic 4.15, with the only
> remarks being re-arranging #includes and small context fixups.
> 
> [Testcase]
> 
> Unfortunately, this is hardware specific. To test this, you need an Intel Skylake
> server processor which supports Sub-NUMA-Clustering.
> 
> We have a customer with an Intel Xeon Gold 5120 CPU on a HP DL360 Gen10 that has
> successfully tested the below test kernels, with good results.
> 
> Xenial 4.4 ppa:
> https://launchpad.net/~mruffell/+archive/ubuntu/sf280048-test-ga
> 
> Xenial 4.15 HWE ppa:
> https://launchpad.net/~mruffell/+archive/ubuntu/sf280048-test-hwe
> 
> Running the test kernel, the oops does not reproduce:
> 
> smp: Bringing up secondary CPUs ...
> x86: Booting SMP configuration:
> .... node #0, CPUs: #1
> NMI watchdog: Enabled. Permanently consumes one hw-PMU counter.
> #2 #3 #4 #5 #6
> .... node #1, CPUs: #7 #8 #9 #10 #11 #12 #13
> .... node #0, CPUs: #14 #15 #16 #17 #18 #19 #20
> .... node #1, CPUs: #21 #22 #23 #24 #25 #26 #27
> smp: Brought up 2 nodes, 28 CPUs
> smpboot: Max logical packages: 1
> smpboot: Total of 28 processors activated
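
Checking a boot log for the warning can be automated. A small Python sketch,
assuming the log has been saved as text (for example from dmesg); the function
name find_llc_warnings and the regular expression are illustrative and only
match the message format quoted earlier in this mail:

```python
import re

LLC_WARNING = re.compile(
    r"sched: CPU #(\d+)'s llc-sibling CPU #(\d+) is not on the same node! "
    r"\[node: (\d+) != (\d+)\]"
)


def find_llc_warnings(log_text):
    """Return (cpu, sibling, cpu_node, sibling_node) tuples found in log_text."""
    return [tuple(int(g) for g in m.groups())
            for m in LLC_WARNING.finditer(log_text)]


# Sample input built from the log lines quoted in this mail.
sample = (
    ".... node #1, CPUs: #7\n"
    "sched: CPU #7's llc-sibling CPU #0 is not on the same node! "
    "[node: 1 != 0]. Ignoring dependency.\n"
)
print(find_llc_warnings(sample))  # [(7, 0, 1, 0)]
```

An empty result for a boot log from an SNC-enabled machine would suggest the
warning did not fire during that boot.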
> 
> [Regression Potential]
> 
> The commit modifies a small section of smpboot code, which every machine will
> execute on boot. The majority of the commit breaks up a large if statement into
> smaller blocks than it was previously, and adds an extra if statement to check
> for a specific processor family.
> 
> If a regression were to occur, some machines would wrongly make or skip their
> calls to topology_sane(), which, in the worst case, would result in an oops
> message and slightly degraded performance. The system would still function normally.
> 
> The commit has been present since 4.17-rc2 and is present in Eoan and Focal.
> There are no fixup commits, and no additional processor families have been
> added since.
> 
> Because of the small re-arrangement in logic, and the addition of a processor
> family check, these changes are fairly minor, and I don't think they will cause
> any regressions.
> 
> Alison Schofield (1):
>   x86,sched: Allow topologies where NUMA nodes share an LLC
> 
>  arch/x86/kernel/smpboot.c | 42 +++++++++++++++++++++++++++++++++++----
>  1 file changed, 38 insertions(+), 4 deletions(-)
> 
> -- 
> 2.25.1
> 
> 


