[Bug 1535918] Re: instance.host not updated on evacuation
Seyeong Kim
seyeong.kim at canonical.com
Mon Aug 21 12:46:08 UTC 2017
** Description changed:
[Impact]
- Affected to Xenial Mitaka, UCA Mitaka
+ I created several VM instances and confirmed they were all in ACTIVE state.
+ Right after checking them, I shut down nova-compute on their host (to simulate a host failure for this test).
+ Then I tried to evacuate them to the other host, but the evacuation failed and the instances ended up in ERROR state.
+ After some testing and analysis, I found that the two commits listed below are related (please refer to the [Others] section).
+ In this context, migration_context is a DB field used to pass information during migration or evacuation.
- just after creating vm and state ACTIVE,
+ For [1]: this patch gets the host info from migration_context; if
+ migration_context is missing or malformed, the migration fails. With
+ only this patch applied, migration_context is still empty, so [2] is
+ also needed. I adjusted the self.client.prepare part in rpcapi.py from
+ the original patch, which was changed in newer versions; because that
+ change is tied to newer functionality, I kept Mitaka's function call for this issue.
- When evacuating it, it is failed with ERROR state.
+ For [2]: this patch moves the recreate check into the earlier if condition, so rebuild_claim is called to create a migration_context when the instance is being recreated, not only when it is newly scheduled. I adjusted the test code that came up during the backport process and seemed to be needed; anyone backporting or cherry-picking related code will find it already exists.
+ As the tests showed, applying only one of the two patches does not fix this issue.
[Test case]
In the environment below,
+
http://pastebin.ubuntu.com/25337153/
- Network configuration is important in this case, because I tested
- different configuration. but couldn't reproduce it.
-
+ Network configuration matters in this case: when I tested with a different configuration, I could not reproduce the issue.
Reproduction test script (based on Juju):
http://pastebin.ubuntu.com/25360805/
[Regression Potential]
- These backports are about evacuation,
- for a5b920a197c70d2ae08a1e1335d979857f923b4f, This gets host info from migration_context. if migration_context is abnormal or empty, migration would be fail. actually, with only this patch, migration_context is empty. so 0f2d87 is needed. I touched self.client.prepare part in rpcapi.py from original patch which is replaced on newer version. because it is related newer functionality, I remained mitaka's function call for this issue.
- for 0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5, This moves recreation check code to former if condition. and it will make rebuild_claim to create migration_context when recreate state. I adjusted test code which are pop up from backport process and seems to be needed. Someone want to backport or cherrypick code related to this, they could find it is already exist.
+ Existing ACTIVE VMs and newly created VMs are not affected by this
+ change, because the modified code paths are only exercised during
+ migration or evacuation. If a host has ACTIVE VMs alongside VMs that
+ went to ERROR state because of this issue, upgrading the package should
+ not affect any of them; after retrying the evacuation of a problematic
+ VM, its state should go from ERROR back to ACTIVE. I tested this
+ scenario in a simple environment, but the possibility of issues in a
+ complex, busy environment still needs to be considered.
[Others]
For testing, I had to apply two commits, one of which comes from
-
https://bugs.launchpad.net/nova/+bug/1686041
Related Patches.
- 1. https://github.com/openstack/nova/commit/a5b920a197c70d2ae08a1e1335d979857f923b4f
-
- 2. https://github.com/openstack/nova/commit/0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5 ( backported to newton from below original)
- - https://github.com/openstack/nova/commit/a2b0824aca5cb4a2ae579f625327c51ed0414d35 ( original)
-
+ [1] https://github.com/openstack/nova/commit/a5b920a197c70d2ae08a1e1335d979857f923b4f
+ [2] https://github.com/openstack/nova/commit/0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5 (backported to Newton from the original below)
+ - https://github.com/openstack/nova/commit/a2b0824aca5cb4a2ae579f625327c51ed0414d35 (original)
[Original description]
I'm working on the nova-powervm driver for Mitaka and trying to add
support for evacuation.
The problem I'm hitting is that instance.host is not updated when the
compute driver is called to spawn the instance on the destination host.
It is still set to the source host. It's not until after the spawn
completes that the compute manager updates instance.host to reflect the
destination host.
The nova-powervm driver uses the instance events callback mechanism
during VIF plugging to determine when Neutron has finished provisioning
the network. The instance events code sends the event to instance.host and
hence is sending the event to the source host (which is down). This
causes the spawn to fail and also causes weirdness when the source host
gets the events when it's powered back up.
To temporarily work around the problem, I hacked in setting
instance.host = CONF.host; instance.save() in the compute driver but
that's not a good solution.
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to Ubuntu Cloud Archive.
https://bugs.launchpad.net/bugs/1535918
Title:
instance.host not updated on evacuation
Status in Ubuntu Cloud Archive:
Fix Released
Status in OpenStack Compute (nova):
Fix Released
Status in nova-powervm:
Fix Released
Status in nova package in Ubuntu:
New
Bug description:
[Impact]
I created several VM instances and confirmed they were all in ACTIVE state.
Right after checking them, I shut down nova-compute on their host (to simulate a host failure for this test).
Then I tried to evacuate them to the other host, but the evacuation failed and the instances ended up in ERROR state.
After some testing and analysis, I found that the two commits listed below are related (please refer to the [Others] section).
In this context, migration_context is a DB field used to pass information during migration or evacuation.
For [1]: this patch gets the host info from migration_context; if
migration_context is missing or malformed, the migration fails. With
only this patch applied, migration_context is still empty, so [2] is
also needed. I adjusted the self.client.prepare part in rpcapi.py from
the original patch, which was changed in newer versions; because that
change is tied to newer functionality, I kept Mitaka's function call for this issue.
For [2]: this patch moves the recreate check into the earlier if condition, so rebuild_claim is called to create a migration_context when the instance is being recreated, not only when it is newly scheduled. I adjusted the test code that came up during the backport process and seemed to be needed; anyone backporting or cherry-picking related code will find it already exists.
As the tests showed, applying only one of the two patches does not fix this issue.
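To make the role of migration_context concrete, here is a minimal
sketch of the idea behind [1] and [2] (illustrative only, not the
actual nova code; the function name pick_event_host and the attribute
name dest_compute are invented for this example):

    # Illustrative sketch only -- not the actual nova implementation.
    # [1]: prefer the destination host recorded in the migration context,
    #      because during an evacuation instance.host still names the
    #      source host, which is down.
    # [2]: this fallback only helps if the rebuild/evacuate path actually
    #      created a migration_context, which is what the rebuild claim
    #      change ensures.
    def pick_event_host(instance):
        """Choose which compute host should receive external events."""
        mig_ctx = getattr(instance, 'migration_context', None)
        dest = getattr(mig_ctx, 'dest_compute', None) if mig_ctx else None
        if dest:
            return dest          # evacuation in progress: use the destination
        return instance.host     # normal case: the host that owns the VM

Without [2], mig_ctx above stays empty during an evacuation, so the code
still falls back to the down source host; that is why applying only [1]
did not fix the failure in my tests.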
[Test case]
In the environment below,
http://pastebin.ubuntu.com/25337153/
Network configuration matters in this case: when I tested with a different configuration, I could not reproduce the issue.
Reproduction test script (based on Juju):
http://pastebin.ubuntu.com/25360805/
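For reference, the evacuation step can be driven like this with
python-novaclient (a minimal sketch; credentials, the instance name and
the target host are placeholders, and the pastebin script above remains
the authoritative reproduction):

    import time

    from keystoneauth1.identity import v3
    from keystoneauth1 import session
    from novaclient import client

    # Placeholder credentials -- replace with values for the test cloud.
    auth = v3.Password(auth_url='http://keystone:5000/v3',
                       username='admin', password='secret',
                       project_name='admin',
                       user_domain_name='Default',
                       project_domain_name='Default')
    nova = client.Client('2', session=session.Session(auth=auth))

    server = nova.servers.find(name='test-vm')   # instance that was ACTIVE
    # Out of band: stop nova-compute on the host running test-vm and wait
    # until nova reports that service as down, then evacuate.
    nova.servers.evacuate(server, host='other-compute-host')

    for _ in range(60):
        status = nova.servers.get(server.id).status
        print(status)
        if status in ('ACTIVE', 'ERROR'):        # ERROR reproduces the bug
            break
        time.sleep(5)

Without the two patches the instance ends up in ERROR state; with both
applied it should reach ACTIVE on the destination host.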
[Regression Potential]
Existing ACTIVE VMs and newly created VMs are not affected by this
change, because the modified code paths are only exercised during
migration or evacuation. If a host has ACTIVE VMs alongside VMs that
went to ERROR state because of this issue, upgrading the package should
not affect any of them; after retrying the evacuation of a problematic
VM, its state should go from ERROR back to ACTIVE. I tested this
scenario in a simple environment, but the possibility of issues in a
complex, busy environment still needs to be considered.
[Others]
For testing, I had to apply two commits, one of which comes from
https://bugs.launchpad.net/nova/+bug/1686041
Related Patches.
[1] https://github.com/openstack/nova/commit/a5b920a197c70d2ae08a1e1335d979857f923b4f
[2] https://github.com/openstack/nova/commit/0f2d87416eff1e96c0fbf0f4b08bf6b6b22246d5 (backported to Newton from the original below)
 - https://github.com/openstack/nova/commit/a2b0824aca5cb4a2ae579f625327c51ed0414d35 (original)
[Original description]
I'm working on the nova-powervm driver for Mitaka and trying to add
support for evacuation.
The problem I'm hitting is that instance.host is not updated when the
compute driver is called to spawn the instance on the destination
host. It is still set to the source host. It's not until after the
spawn completes that the compute manager updates instance.host to
reflect the destination host.
The nova-powervm driver uses the instance events callback mechanism
during VIF plugging to determine when Neutron has finished provisioning
the network. The instance events code sends the event to instance.host
and hence is sending the event to the source host (which is down).
This causes the spawn to fail and also causes weirdness when the
source host gets the events when it's powered back up.
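The failure mode described above can be modelled with a small,
self-contained toy example (this is not nova or nova-powervm code, just
an illustration of what routing events by instance.host means during an
evacuation):

    # Toy model of the event routing problem -- not actual nova code.
    class Instance(object):
        def __init__(self, host):
            self.host = host     # still the source host during evacuation

    def route_external_event(instance, event, hosts_up):
        """Deliver an external event to the compute named in instance.host."""
        target = instance.host
        if target not in hosts_up:
            raise RuntimeError('%s sent to down host %s' % (event, target))
        return target

    instance = Instance(host='compute-1')        # source host, currently down
    hosts_up = {'compute-2'}                     # destination of the evacuation

    try:
        route_external_event(instance, 'network-vif-plugged', hosts_up)
    except RuntimeError as exc:
        # The spawn on compute-2 never receives the event and times out.
        print(exc)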
To temporarily work around the problem, I hacked in setting
instance.host = CONF.host; instance.save() in the compute driver but
that's not a good solution.
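For clarity, the temporary hack amounts to something like the following
in the driver's spawn path (a rough sketch, not code that should be
merged; the class name is made up, and CONF.host is the destination
compute's configured hostname):

    # Temporary hack only -- not a proper fix.
    from oslo_config import cfg

    CONF = cfg.CONF

    class PowerVMDriverSketch(object):
        def spawn(self, context, instance, image_meta, injected_files,
                  admin_password, network_info=None, block_device_info=None):
            # Force the instance record to point at this (destination) host
            # so that network-vif-plugged events are delivered here instead
            # of to the down source host.
            instance.host = CONF.host
            instance.save()
            # ... continue with the normal spawn / plug VIF logic ...

The patches referenced above take the other approach of consulting the
migration_context when routing events, rather than mutating
instance.host in the driver.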
To manage notifications about this bug go to:
https://bugs.launchpad.net/cloud-archive/+bug/1535918/+subscriptions
More information about the Ubuntu-openstack-bugs
mailing list