[Bug 1353939] Re: Rescue fails with 'Failed to terminate process: Device or resource busy' in the n-cpu log
Liang Chen
liang.chen at canonical.com
Tue Jul 21 21:07:25 UTC 2015
** Description changed:
+ [Impact]
+
+ * Users may sometimes fail to shutdown an instance if the associated qemu
+ process is on uninterruptable sleep (typically IO).
+
+ [Test Case]
+
+ * 1. create some IO load in a VM
+ 2. look at the associated qemu, make sure it has STAT D in ps output
+ 3. shutdown the instance
+ 4. with the patch in place, nova will retry calling libvirt to shutdown
+ the instance 3 times to wait for the signal to be delivered to the
+ qemu process.
+
+ [Regression Potential]
+
+ * None
+
+
message: "Failed to terminate process" AND
message:'InstanceNotRescuable' AND message: 'Exception during message
handling' AND tags:"screen-n-cpu.txt"
The above log stash-query reports back only the failed jobs, the 'Failed to terminate process' close other failed rescue tests,
but tempest does not always reports them as an error at the end.
message: "Failed to terminate process" AND tags:"screen-n-cpu.txt"
Usual console log:
Details: (ServerRescueTestJSON:test_rescue_unrescue_instance) Server 0573094d-53da-40a5-948a-747d181462f5 failed to reach RESCUE status and task state "None" within the required time (196 s). Current status: SHUTOFF. Current task state: None.
http://logs.openstack.org/82/107982/2/gate/gate-tempest-dsvm-postgres-
full/90726cb/console.html#_2014-08-07_03_50_26_520
Usual n-cpu exception:
http://logs.openstack.org/82/107982/2/gate/gate-tempest-dsvm-postgres-full/90726cb/logs/screen-n-cpu.txt.gz#_2014-08-07_03_32_02_855
2014-08-07 03:32:02.855 ERROR oslo.messaging.rpc.dispatcher [req-39ce7a3d-5ceb-41f5-8f9f-face7e608bd1 ServerRescueTestJSON-2035684545 ServerRescueTestJSON-1017508309] Exception during message handling: Instance 0573094d-53da-40a5-948a-747d181462f5 cannot be rescued: Driver Error: Failed to terminate process 26425 with SIGKILL: Device or resource busy
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 408, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 88, in wrapped
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher payload)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 71, in wrapped
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return f(self, context, *args, **kw)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 292, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher pass
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 278, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 342, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 320, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info())
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 308, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 3094, in rescue_instance
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher reason=_("Driver Error: %s") % unicode(e))
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher InstanceNotRescuable: Instance 0573094d-53da-40a5-948a-747d181462f5 cannot be rescued: Driver Error: Failed to terminate process 26425 with SIGKILL: Device or resource busy
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher
** Also affects: nova (Ubuntu)
Importance: Undecided
Status: New
** Tags added: patch
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1353939
Title:
Rescue fails with 'Failed to terminate process: Device or resource
busy' in the n-cpu log
Status in OpenStack Compute (nova):
Fix Released
Status in nova package in Ubuntu:
New
Bug description:
[Impact]
* Users may sometimes fail to shutdown an instance if the associated qemu
process is on uninterruptable sleep (typically IO).
[Test Case]
* 1. create some IO load in a VM
2. look at the associated qemu, make sure it has STAT D in ps output
3. shutdown the instance
4. with the patch in place, nova will retry calling libvirt to shutdown
the instance 3 times to wait for the signal to be delivered to the
qemu process.
[Regression Potential]
* None
message: "Failed to terminate process" AND
message:'InstanceNotRescuable' AND message: 'Exception during message
handling' AND tags:"screen-n-cpu.txt"
The above log stash-query reports back only the failed jobs, the 'Failed to terminate process' close other failed rescue tests,
but tempest does not always reports them as an error at the end.
message: "Failed to terminate process" AND tags:"screen-n-cpu.txt"
Usual console log:
Details: (ServerRescueTestJSON:test_rescue_unrescue_instance) Server 0573094d-53da-40a5-948a-747d181462f5 failed to reach RESCUE status and task state "None" within the required time (196 s). Current status: SHUTOFF. Current task state: None.
http://logs.openstack.org/82/107982/2/gate/gate-tempest-dsvm-postgres-
full/90726cb/console.html#_2014-08-07_03_50_26_520
Usual n-cpu exception:
http://logs.openstack.org/82/107982/2/gate/gate-tempest-dsvm-postgres-full/90726cb/logs/screen-n-cpu.txt.gz#_2014-08-07_03_32_02_855
2014-08-07 03:32:02.855 ERROR oslo.messaging.rpc.dispatcher [req-39ce7a3d-5ceb-41f5-8f9f-face7e608bd1 ServerRescueTestJSON-2035684545 ServerRescueTestJSON-1017508309] Exception during message handling: Instance 0573094d-53da-40a5-948a-747d181462f5 cannot be rescued: Driver Error: Failed to terminate process 26425 with SIGKILL: Device or resource busy
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher Traceback (most recent call last):
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 134, in _dispatch_and_reply
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher incoming.message))
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 177, in _dispatch
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return self._do_dispatch(endpoint, method, ctxt, args)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/usr/local/lib/python2.7/dist-packages/oslo/messaging/rpc/dispatcher.py", line 123, in _do_dispatch
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher result = getattr(endpoint, method)(ctxt, **new_args)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 408, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 88, in wrapped
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher payload)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/exception.py", line 71, in wrapped
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return f(self, context, *args, **kw)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 292, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher pass
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 278, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 342, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 320, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher kwargs['instance'], e, sys.exc_info())
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/openstack/common/excutils.py", line 82, in __exit__
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher six.reraise(self.type_, self.value, self.tb)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 308, in decorated_function
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher return function(self, context, *args, **kwargs)
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher File "/opt/stack/new/nova/nova/compute/manager.py", line 3094, in rescue_instance
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher reason=_("Driver Error: %s") % unicode(e))
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher InstanceNotRescuable: Instance 0573094d-53da-40a5-948a-747d181462f5 cannot be rescued: Driver Error: Failed to terminate process 26425 with SIGKILL: Device or resource busy
2014-08-07 03:32:02.855 22829 TRACE oslo.messaging.rpc.dispatcher
To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1353939/+subscriptions
More information about the Ubuntu-openstack-bugs
mailing list