[Bug 1452641] Re: Static Ceph mon IP addresses in connection_info can prevent VM startup

Corey Bryant corey.bryant at canonical.com
Mon Jun 11 19:50:02 UTC 2018


Just to summarize my understanding, and perhaps clarify for others, this
bug is focused on stale connection_info for rbd volumes (not rbd
images). rbd images have a related issue during live migration that is
being handled in a separate bug (see comment 12 above).

Focusing on connection_info for rbd volumes now (and thanks to Matt
Riedemann's comments for the tips here). connection_info appears to be
properly refreshed for live migration in pre_live_migration() where
_get_instance_block_device_info() is called with refresh_conn_info=True
(see comment 9 above and
https://github.com/openstack/nova/blob/stable/queens/nova/compute/manager.py#L5977).

Is the fix as simple as flipping refresh_conn_info=False to True for
some of the other calls to _get_instance_block_device_info()? Below is
an audit of the _get_instance_block_device_info() calls.

Calls to _get_instance_block_device_info() with refresh_conn_info=False:
  _destroy_evacuated_instances()
  _init_instance()
  _resume_guests_state()
  _shutdown_instance()
  _power_on()
  _do_rebuild_instance()
  reboot_instance()
  revert_resize()
  _resize_instance()
  resume_instance()
  shelve_offload_instance()
  check_can_live_migrate_source()
  _do_live_migration()
  _post_live_migration()
  post_live_migration_at_destination()
  rollback_live_migration_at_destination()

Calls to _get_instance_block_device_info() with refresh_conn_info=True:
  finish_revert_resize()
  _finish_resize()
  pre_live_migration()

Based on xavpaice's comments in (see comment 13 above -- "... existing, running, instances were fine, fresh new instances were fine, but when we stopped instances via nova, then started them again, they failed to start ..."), it would seem that the following should also have refresh_conn_info=True:
  _power_on()              # solves xavpaice's scenario?
  _do_rebuild_instance()
  reboot_instance()

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to nova in Ubuntu.
https://bugs.launchpad.net/bugs/1452641

Title:
  Static Ceph mon IP addresses in connection_info can prevent VM startup

Status in OpenStack Compute (nova):
  Confirmed
Status in nova package in Ubuntu:
  New

Bug description:
  The Cinder rbd driver extracts the IP addresses of the Ceph mon servers from the Ceph mon map when the instance/volume connection is established. This info is then stored in nova's block-device-mapping table and is never re-validated down the line. 
  Changing the Ceph mon servers' IP adresses will prevent the instance from booting as the stale connection info will enter the instance's XML. One idea to fix this would be to use the information from ceph.conf, which should be an alias or a loadblancer, directly.

To manage notifications about this bug go to:
https://bugs.launchpad.net/nova/+bug/1452641/+subscriptions



More information about the Ubuntu-openstack-bugs mailing list