[Bug 1782517] Re: Failed to recover stopped instance
Rikimaru Honjo
1782517 at bugs.launchpad.net
Tue Apr 13 06:07:34 UTC 2021
Hi,
I confirmed that this issue is fixed in the following versions of masakari.
I am updating the tags accordingly.
10.0.0-0ubuntu2
9.0.0-0ubuntu0.20.04.4
latest stable/victoria branch(with devstack)
latest stable/ussuri branch(with devstack)
latest stable/train branch(with devstack)
latest stable/stein branch(with devstack)
(Sorry, I don't have time to test victoria/ussuri/train/stein deb
packages. I used stable branches instead of packages.)
** Tags removed: verification-needed-focal verification-needed-groovy verification-stein-needed verification-train-needed verification-ussuri-needed verification-victoria-needed
** Tags added: verification-done-focal verification-done-groovy verification-stein-done verification-train-done verification-ussuri-done verification-victoria-done
** Tags removed: verification-stein-done
** Tags added: verification-stein-needed
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1782517
Title:
Failed to recover stopped instance
Status in Ubuntu Cloud Archive:
Fix Released
Status in Ubuntu Cloud Archive stein series:
Fix Committed
Status in Ubuntu Cloud Archive train series:
Fix Committed
Status in Ubuntu Cloud Archive ussuri series:
Fix Committed
Status in Ubuntu Cloud Archive victoria series:
Fix Committed
Status in Ubuntu Cloud Archive wallaby series:
Fix Released
Status in masakari:
Fix Released
Status in masakari train series:
Fix Released
Status in masakari ussuri series:
Fix Released
Status in masakari victoria series:
Fix Released
Status in masakari wallaby series:
Fix Released
Status in masakari package in Ubuntu:
Fix Released
Status in masakari source package in Focal:
Fix Committed
Status in masakari source package in Groovy:
Fix Committed
Status in masakari source package in Hirsute:
Fix Released
Bug description:
[Error]
Recovery from a host failure failed when there was an instance in the stopped state on the failed host.
As a result, the notification status became "failed".
(The instance's vm_state after evacuation became "stopped".)
I used the latest version of masakari.
[Cause of error]
Masakari tries to call the stop API after evacuating an instance.
However, the evacuate API itself stops the instance at the end if the original vm_state is stopped.
So a 409 error occurred when masakari called the stop API after evacuating.
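The failure mode described above can be reproduced with a small, self-contained sketch. This is not masakari's code and not the actual patch: FakeCompute, Conflict, recover_buggy and recover_guarded are hypothetical names invented for illustration, and the guard shown is only one possible way to avoid the redundant stop call.

```python
class Conflict(Exception):
    """Stands in for an HTTP 409 response from the compute API (assumption)."""


class FakeCompute:
    """Hypothetical in-memory stand-in for the compute API, not the Nova client."""

    def __init__(self, vm_state):
        self.vm_state = vm_state
        self.stop_calls = 0

    def evacuate(self):
        # Per the bug description: evacuation leaves the instance stopped
        # when its original vm_state was "stopped".
        if self.vm_state != "stopped":
            self.vm_state = "active"

    def stop(self):
        self.stop_calls += 1
        if self.vm_state == "stopped":
            # Stopping an already stopped instance is rejected.
            raise Conflict("409: instance is already stopped")
        self.vm_state = "stopped"


def recover_buggy(compute):
    """Always calls stop after evacuating a stopped instance."""
    original = compute.vm_state
    compute.evacuate()
    if original == "stopped":
        compute.stop()  # raises Conflict: evacuate already stopped it


def recover_guarded(compute):
    """Calls stop only if the instance is not already stopped."""
    original = compute.vm_state
    compute.evacuate()
    if original == "stopped" and compute.vm_state != "stopped":
        compute.stop()


if __name__ == "__main__":
    try:
        recover_buggy(FakeCompute("stopped"))
    except Conflict as exc:
        print("buggy flow failed:", exc)

    guarded = FakeCompute("stopped")
    recover_guarded(guarded)
    print("guarded flow ended in state:", guarded.vm_state)
```

In the buggy flow the Conflict propagates out of the recovery, which corresponds to the notification ending up as "failed"; in the guarded flow the stop API is never called.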
== Ubuntu SRU Details below ==
[Impact]
See above
[Test Case]
For focal:
Test with an actual juju deployed masakari openstack deployment and ensure the reported bug is fixed on host failure.
For all other releases the fix can be verified with an LXD container for the corresponding release:
$ sudo apt install python3-masakari
$ cd /usr/lib/python3/dist-packages
$ python3 -m unittest masakari.tests.unit.engine.drivers.taskflow.test_host_failure_flow.HostFailureTestCase.test_host_failure_flow_for_stopped_instances
The unit test will be successful on a patched deployment and will fail
on an unpatched one with a mismatch error in
test_host_failure_flow_for_stopped_instances.
[Where problems could occur]
Any regressions in this fix will likely result in failures similar to the one reported in this bug, that is, a failure to recover an instance on host failure. The patch is a small, targeted change with a good unit test, and the code is unchanged across the backports, which helps mitigate regression potential.
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1782517/+subscriptions