[Bug 1413540] Re: soft lockup issues with nested KVM VMs running tempest
Chris J Arges
1413540 at bugs.launchpad.net
Fri Jan 23 20:32:38 UTC 2015
Let's concentrate on the hang without KSM in this bug. I've split the
KSM-with-nested-virt issue out into bug 1414153.
** Summary changed:
- issues with KSM enabled for nested KVM VMs
+ soft lockup issues with nested KVM VMs running tempest
** No longer affects: qemu (Ubuntu)
** Description changed:
+
+ [Impact]
+ Users of nested KVM for testing OpenStack hit soft lockups such as the following (qemu-system-x86 is stuck in madvise() splitting a transparent huge page while waiting on remote TLB-flush IPIs):
+ [74180.076007] BUG: soft lockup - CPU#1 stuck for 22s! [qemu-system-x86:14590]
+ <snip>
+ [74180.076007] Call Trace:
+ [74180.076007] [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
+ [74180.076007] [<ffffffff810dbf75>] smp_call_function_single+0xe5/0x190
+ [74180.076007] [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
+ [74180.076007] [<ffffffffa00c4300>] ? rmap_write_protect+0x80/0x80 [kvm]
+ [74180.076007] [<ffffffff810dc3a6>] smp_call_function_many+0x286/0x2d0
+ [74180.076007] [<ffffffff8105c7a0>] ? leave_mm+0x80/0x80
+ [74180.076007] [<ffffffff8105c8f7>] native_flush_tlb_others+0x37/0x40
+ [74180.076007] [<ffffffff8105c9cb>] flush_tlb_mm_range+0x5b/0x230
+ [74180.076007] [<ffffffff8105b80d>] pmdp_splitting_flush+0x3d/0x50
+ [74180.076007] [<ffffffff811ac95b>] __split_huge_page+0xdb/0x720
+ [74180.076007] [<ffffffff811ad008>] split_huge_page_to_list+0x68/0xd0
+ [74180.076007] [<ffffffff811ad9a6>] __split_huge_page_pmd+0x136/0x330
+ [74180.076007] [<ffffffff8117728d>] unmap_page_range+0x7dd/0x810
+ [74180.076007] [<ffffffffa00a66b5>] ? kvm_mmu_notifier_invalidate_range_start+0x75/0x90 [kvm]
+ [74180.076007] [<ffffffff81177341>] unmap_single_vma+0x81/0xf0
+ [74180.076007] [<ffffffff811784ed>] zap_page_range+0xed/0x150
+ [74180.076007] [<ffffffff8108ed74>] ? hrtimer_start_range_ns+0x14/0x20
+ [74180.076007] [<ffffffff81174fbf>] SyS_madvise+0x3bf/0x850
+ [74180.076007] [<ffffffff810db841>] ? SyS_futex+0x71/0x150
+ [74180.076007] [<ffffffff8173186d>] system_call_fastpath+0x1a/0x1f
+
+ [Test Case]
+ - Deploy OpenStack on OpenStack
+ - Run tempest on the L1 cloud
+ - Check the kernel logs of the L1 nova-compute nodes (see the sketch below)
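+
+ For example, on each L1 nova-compute node (a sketch; the exact log path and how you reach the nodes are assumptions, not part of this report):
+
+ # Look for soft-lockup messages in the kernel ring buffer
+ $ dmesg | grep -i 'soft lockup'
+ # ...and in the persistent kernel log
+ $ grep -i 'soft lockup' /var/log/kern.log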
+
+ --
+
+ Original Description:
+
Installing qemu-kvm in a VM enables KSM.
I have encountered this problem on trusty:
$ lsb_release -a
Distributor ID: Ubuntu
Description: Ubuntu 14.04.1 LTS
Release: 14.04
Codename: trusty
$ uname -a
Linux juju-gema-machine-2 3.13.0-40-generic #69-Ubuntu SMP Thu Nov 13 17:53:56 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
The way to see the behaviour:
1) $ more /sys/kernel/mm/ksm/run
0
2) $ sudo apt-get install qemu-kvm
3) $ more /sys/kernel/mm/ksm/run
1
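To rule KSM in or out while reproducing, it can be switched off at runtime and, on Ubuntu, persistently via the qemu-kvm defaults file. A sketch (assuming the trusty qemu-kvm packaging reads KSM_ENABLED from /etc/default/qemu-kvm):
# Disable KSM immediately
$ echo 0 | sudo tee /sys/kernel/mm/ksm/run
# Keep it disabled across reboots (Ubuntu qemu-kvm packaging)
$ sudo sed -i 's/KSM_ENABLED=1/KSM_ENABLED=0/' /etc/default/qemu-kvm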
To see the soft lockups, deploy a cloud on a virtualised environment like ctsstack and run tempest on it (at least twice); the compute nodes of the virtualised deployment will eventually stop responding with:
[24096.072003] BUG: soft lockup - CPU#0 stuck for 23s! [qemu-system-x86:24791]
[24124.072003] BUG: soft lockup - CPU#0 stuck for 23s! [qemu-system-x86:24791]
[24152.072002] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
[24180.072003] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
[24208.072004] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
[24236.072004] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
[24264.072003] BUG: soft lockup - CPU#0 stuck for 22s! [qemu-system-x86:24791]
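When a node is stuck like this, a generic way to gather more context (not part of the original report; requires sysrq to be enabled) is to dump backtraces for all active CPUs:
$ echo 1 | sudo tee /proc/sys/kernel/sysrq   # enable all sysrq functions
$ echo l | sudo tee /proc/sysrq-trigger      # backtrace all active CPUs
$ dmesg | tail -n 50                         # the backtraces land in the kernel log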
I am not sure whether the problem is that we are enabling KSM inside a VM
or that nested KSM is not behaving properly. Either way, I can easily
reproduce it; please contact me if you need further details.
--
You received this bug notification because you are a member of Ubuntu
Server Team, which is subscribed to qemu in Ubuntu.
https://bugs.launchpad.net/bugs/1413540
Title:
soft lockup issues with nested KVM VMs running tempest
To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1413540/+subscriptions