[Bug 1983716] Re: Improve performances of glance when using rbd backend

Cedric Lemarchand 1983716 at bugs.launchpad.net
Wed Nov 23 14:08:33 UTC 2022


Hello,

Upgrading the package from 2.0.0-0ubuntu3 to 2.0.0-0ubuntu4 on a 3-node
HA cluster ran fine.

The glance-api process has been restarted and several tests have been
run against glance (new image and snapshot creation/deletion). So far,
within the scope of our installation (glance on rbd), everything works
smoothly.

-- 
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to python-glance-store in Ubuntu.
https://bugs.launchpad.net/bugs/1983716

Title:
  Improve performances of glance when using rbd backend

Status in python-glance-store package in Ubuntu:
  Invalid
Status in python-glance-store source package in Focal:
  Fix Committed

Bug description:
  [Impact]

  This affects image upload performance, specifically for the instance
  snapshot use case, where the upload step to glance takes a very long
  time with a large ephemeral volume (tens or hundreds of GB).

  Backporting the fix will improve performance of image upload to glance
  and thus reduce the whole snapshot duration.

  When the image size is not known, which is the case when new images
  are uploaded to glance, and when the glance backend is Ceph, the rbd
  volume needs to be grown step by step during the upload. This fix
  increases the size of those steps in order to reduce the number of
  resize calls on the Ceph backend.
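
To illustrate why larger steps matter, here is a rough model of the resize count for a 50GB upload. This is a simplified sketch, not the glance_store code: the 8 MiB write chunk, the 8 MiB starting step and the doubling policy are assumptions made for illustration; only the 8GB cap is taken from this report.

```python
MiB = 1024 ** 2
GiB = 1024 ** 3


def count_resizes(total_bytes, chunk_size, first_step, max_step, grow):
    """Count the resize calls needed to write total_bytes chunk by chunk.

    Hypothetical model: the volume is grown by `step` bytes whenever the
    next chunk does not fit. With grow=True the step doubles after each
    resize, capped at max_step.
    """
    allocated = 0  # provisioned size of the simulated volume
    written = 0    # bytes written so far
    step = first_step
    resizes = 0
    while written < total_bytes:
        length = min(chunk_size, total_bytes - written)
        # Grow the volume until the next chunk fits.
        while written + length > allocated:
            allocated += step
            resizes += 1
            if grow:
                step = min(step * 2, max_step)
        written += length
    return resizes


# Assumed parameters: 8 MiB chunks, 50 GiB of data.
fixed = count_resizes(50 * GiB, 8 * MiB, 8 * MiB, 8 * MiB, grow=False)
growing = count_resizes(50 * GiB, 8 * MiB, 8 * MiB, 8 * GiB, grow=True)
print(f"fixed step: {fixed} resizes, growing step: {growing} resizes")
```

Under these assumptions the fixed step needs one resize per chunk, while the growing step needs only a handful of resizes for the same upload.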

  [Test Plan]

  On a functional OpenStack Ussuri cloud running on Focal:

  1) Initial snapshot time measurement without the fix:
  - spawn an instance with an ephemeral root volume and fill it with ~50GB of data: dd if=/dev/urandom of=~/random bs=1M count=50k
  - snapshot the instance, then look for the string "seconds to snapshot" in /var/log/nova/nova-compute.log on the Nova host where the instance is running:
  '''
  nova-compute.log.53.gz:2022-07-11 09:54:11.298 3656801 INFO nova.compute.manager [req-0c9e71e9-c17e-4069-aa68-f7928fab9166 f9ec6328f6646c4c9310ff86ff6c45fca1ead9845dfa8a8dc6c4e461e5355a75 385521b179ea48068fbe5b8ccc3c396c - 24d8399e5ee54c8484cdbf79b8ee7394 24d8399e5ee54c8484cdbf79b8ee7394] [instance: 067acb11-34e6-4626-9c33-e7afa4294dbf] Took 866.04 seconds to snapshot the instance on the hypervisor.
  '''
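
The measurement above can be pulled out of the log with a small script. This is a minimal sketch: the regular expression and the function name are illustrative, and the sample line is an abbreviated copy of the log entry quoted above.

```python
import re

# Matches the instance UUID and the duration in the nova-compute message.
SNAPSHOT_RE = re.compile(
    r"\[instance: (?P<instance>[0-9a-f-]+)\] "
    r"Took (?P<seconds>\d+(?:\.\d+)?) seconds to snapshot"
)


def snapshot_durations(lines):
    """Return (instance_uuid, seconds) for each matching log line."""
    results = []
    for line in lines:
        match = SNAPSHOT_RE.search(line)
        if match:
            results.append(
                (match.group("instance"), float(match.group("seconds")))
            )
    return results


# Abbreviated sample: the request-id bracket of the real line is omitted.
sample = (
    "2022-07-11 09:54:11.298 3656801 INFO nova.compute.manager "
    "[instance: 067acb11-34e6-4626-9c33-e7afa4294dbf] "
    "Took 866.04 seconds to snapshot the instance on the hypervisor."
)
print(snapshot_durations([sample]))
```

For rotated logs such as nova-compute.log.53.gz, the lines can be read with gzip.open(path, "rt") and passed to the same function.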

  2) On the glance-api controller, manually patch python-glance-store 2.0.0:
  - check glance version:

  dpkg -l |grep glance
  ii glance 2:20.2.0-0ubuntu1 all OpenStack Image Registry and Delivery Service - Daemons
  ii glance-api 2:20.2.0-0ubuntu1 all OpenStack Image Registry and Delivery Service - API
  ii glance-common 2:20.2.0-0ubuntu1 all OpenStack Image Registry and Delivery Service - Common
  ii python3-glance 2:20.2.0-0ubuntu1 all OpenStack Image Registry and Delivery Service - Python 3 library
  ii python3-glance-store 2.0.0-0ubuntu3 all OpenStack Image Service store library - Python 3.x
  ii python3-glanceclient 1:3.1.1-0ubuntu1 all Client library for Openstack glance server - Python 3.x

  - git clone https://opendev.org/openstack/glance_store.git -b stable/ussuri /usr/lib/python3/dist-packages/glance_store_trunk/
  - cd /usr/lib/python3/dist-packages/glance_store_trunk/ && git checkout tags/2.0.0 && git cherry-pick ca0c58b
  - systemctl stop glance-api.service
  - mv /usr/lib/python3/dist-packages/glance_store /usr/lib/python3/dist-packages/glance_store_orig && ln -s /usr/lib/python3/dist-packages/glance_store_trunk/glance_store /usr/lib/python3/dist-packages/glance_store
  - systemctl start glance-api.service

  3) Redo step 1)

  The time taken to complete the whole snapshot should be between ~15%
  and ~30% shorter. Ensure there is no bottleneck on the data path from
  the hypervisor drives to the Ceph cluster.

  [Other Info]

  As the Ceph cluster (and more specifically the RADOS layer underneath
  RBD) only accounts for written bytes, raising the resize step to 8GB
  is not an issue, since the provisioned image size is not accounted
  for. If the cluster is close to full, the error will happen during
  the upload, not on the resize.

  
  [original description]

  
  Hello,

  In order to significantly improve the performance of image uploads on
  the rbd store, it would be great if commit [1] could be backported
  from 2.0.1 to the focal package (currently 2.0.0-0ubuntu3).

  Beyond plain image upload, the real use case here is to speed up
  instance snapshots. Benchmarks between 2.0.0 and 2.0.1 report a
  performance gain of ~30%: the snapshot time drops from 230 to 165
  seconds with a 10GB image (the metric shows up in nova-compute.log on
  the host where the snapshot occurs).
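
As a quick check of the figure quoted above, the relative gain can be computed from the two timings. The helper name is illustrative only.

```python
def relative_gain(before_s, after_s):
    """Return the reduction in duration as a percentage of the original."""
    return (before_s - after_s) / before_s * 100.0


# Timings reported above for a 10GB image, in seconds.
gain = relative_gain(230, 165)
print(f"{gain:.1f}% shorter")
```

This gives a reduction of a little over 28%, consistent with the ~30% stated above.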

  [1] commit ca0c58b52756058b6d51bf6a47aeac3d525c1e16 (HEAD ->
  stable/ussuri, tag: ussuri-em, tag: 2.0.1, origin/stable/ussuri)

To manage notifications about this bug go to:
https://bugs.launchpad.net/ubuntu/+source/python-glance-store/+bug/1983716/+subscriptions



