[Bug 1782517] Re: Failed to recover stopped instance

Corey Bryant 1782517 at bugs.launchpad.net
Thu Mar 25 20:58:50 UTC 2021


For Ubuntu and the cloud archive this has been uploaded to groovy and
focal unapproved queues as well as the staging PPAs for train and stein:

https://launchpad.net/ubuntu/groovy/+queue?queue_state=1&queue_text=masakari
https://launchpad.net/ubuntu/focal/+queue?queue_state=1&queue_text=masakari
https://launchpad.net/~ubuntu-cloud-archive/+archive/ubuntu/train-staging/+packages?field.name_filter=masakari
https://launchpad.net/~ubuntu-cloud-archive/+archive/ubuntu/stein-staging/+packages?field.name_filter=masakari

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1782517

Title:
  Failed to recover stopped instance

Status in Ubuntu Cloud Archive:
  Fix Released
Status in Ubuntu Cloud Archive stein series:
  Triaged
Status in Ubuntu Cloud Archive train series:
  Triaged
Status in Ubuntu Cloud Archive ussuri series:
  Triaged
Status in Ubuntu Cloud Archive victoria series:
  Triaged
Status in Ubuntu Cloud Archive wallaby series:
  Fix Released
Status in masakari:
  Fix Released
Status in masakari train series:
  Fix Released
Status in masakari ussuri series:
  Fix Released
Status in masakari victoria series:
  Fix Released
Status in masakari wallaby series:
  Fix Released
Status in masakari package in Ubuntu:
  Fix Released
Status in masakari source package in Focal:
  Triaged
Status in masakari source package in Groovy:
  Triaged
Status in masakari source package in Hirsute:
  Fix Released

Bug description:
  [Error]
  Recovering host-failure was failed when there was stopped state instance on the failed host.
  As a result, notification status became "failed".
  (Instance's vm_state after evacuation became "stopped".)

  I used the latest version of masakari.

  [Cause of error]
  Masakari will try to call stop API after evacuating.
  But, evacuate API stops the instance at the end if the original vm_state is stopped.
  So 409 error was occurred when masakari called stop API after evacuating.

  == Ubuntu SRU Details below ==
  [Impact]
  See above

  [Test Case]
  For focal:
  Test with an actual juju deployed masakari openstack deployment and ensure the reported bug is fixed on host failure.

  For all other releases the fix can be verified with an LXD container for the corresponding release:
  $ sudo apt install python3-masakari
  $ cd /usr/lib/python3/dist-packages
  $ python3 -m unittest masakari.tests.unit.engine.drivers.taskflow.test_host_failure_flow.HostFailureTestCase.test_host_failure_flow_for_stopped_instances

  The unit test will be successful on a patched deployment and will fail
  with a mismatch error in test_host_failure_flow_for_stopped_instances.

  [Where problems coud occur]
  Any regressions in this fix will likely result in similar failures to what was reported in this bug, resulting in a failure to recover an instance on host failure. The patch is a small, targeted change with a good unit test and the code is unchanged across the backports which helps mitigate regression potential.

To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1782517/+subscriptions



More information about the Ubuntu-openstack-bugs mailing list