[Bug 2081742] Re: cinder powermax driver hangs when creating multiple image volumes
Andre Ruiz
2081742 at bugs.launchpad.net
Mon Dec 9 17:35:41 UTC 2024
** Attachment added: "logs-2024-12-06.tgz"
https://bugs.launchpad.net/charm-cinder/+bug/2081742/+attachment/5843153/+files/logs-2024-12-06.tgz
--
You received this bug notification because you are a member of Ubuntu
OpenStack, which is subscribed to cinder in Ubuntu.
https://bugs.launchpad.net/bugs/2081742
Title:
cinder powermax driver hangs when creating multiple image volumes
Status in OpenStack Cinder Charm:
In Progress
Status in Cinder:
In Progress
Status in cinder package in Ubuntu:
Incomplete
Status in cinder source package in Jammy:
Incomplete
Bug description:
This bug is specific to the PowerMax driver in cinder. Technically it should be filed against the cinder-powermax charm, but that charm has not been onboarded yet, so it is filed against generic cinder for now.
Note: this driver also covers the VMAX line, and the storage in
question is a VMAX-250F (all flash) using FC HBAs.
Note2: A Unisphere appliance has been installed in the customer
environment and seems to be working well (the driver talks to
Unisphere and not directly to the storage). So far, there is no reason
to believe the Unisphere installation has any issues -- it is also used
by other systems without problems. I already checked resource
utilization (CPU, memory, etc.) in the VM where Unisphere is running and
it seems ok.
The installation is Charmed OpenStack, with Ubuntu Jammy and OpenStack
Yoga. The cinder charm is yoga/stable rev 699 and the cinder package
version is 2:20.3.1-0ubuntu1.4.
Cinder.conf has been configured (via a subordinate cinder driver
charm) with a backend like this:
=================================8<----------------------------------------
[cinder-vmax]
volume_driver = cinder.volume.drivers.dell_emc.powermax.fc.PowerMaxFCDriver
volume_backend_name = vmax_fc
san_ip = vmplnx5287.<redacted>
san_login = openstack
san_password = <redacted>
powermax_array = 0005978xxxxx
powermax_port_groups = [OS-FC-PG]
powermax_service_level = Diamond
powermax_srp = SRP_1
retries = 800
u4p_failover_autofailback = True
use_multipath_for_image_xfer = True
vmax_workload = NONE
=================================8<----------------------------------------
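As a quick sanity check of the backend section above, here is a minimal sketch that parses a subset of it with Python's standard configparser and prints a couple of values. Cinder itself reads its configuration through oslo.config, not configparser, so this only illustrates the settings and is not how the service loads them. The load_backend helper is hypothetical, and only non-redacted options from the report are included.

```python
import configparser

# Subset of the backend section from the report (redacted options left out).
BACKEND_CONF = """
[cinder-vmax]
volume_driver = cinder.volume.drivers.dell_emc.powermax.fc.PowerMaxFCDriver
volume_backend_name = vmax_fc
powermax_port_groups = [OS-FC-PG]
powermax_service_level = Diamond
powermax_srp = SRP_1
retries = 800
use_multipath_for_image_xfer = True
"""


def load_backend(text, section="cinder-vmax"):
    """Return one backend section as a plain dict (hypothetical helper)."""
    # interpolation=None keeps any '%' in values from being expanded.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(text)
    return dict(parser[section])


backend = load_backend(BACKEND_CONF)
print(backend["volume_backend_name"], backend["retries"])
```

All values come back as strings, so a numeric option such as retries would need an explicit int() before being compared.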
A volume type has been created in OpenStack to use this backend.
Notes about the setup:
- FC (Fibre Channel) HBAs seem to be working fine. As a test, the customer can create volumes in the storage system and present them to the hosts without issues (Ubuntu finds them and they can be formatted, mounted, and used).
- The cinder driver seems to be communicating fine with Unisphere: the logs show that it logs in and issues all the API calls without problems.
- Cinder-volume is running on the bare-metal hosts because of the FC cards (the service is disabled in the main cinder charm and a separate charm just for cinder-volume is added to the hosts).
- If you create volumes inside OpenStack using "volume create --type vmax_fc", it works: volumes get created on the storage side without errors (you can check that with the storage management software).
- If you try to create a batch of volumes (say 10 volumes) at the same time, they all get created without issues.
- Those volumes created above can be assigned to VMs and used normally.
- If you create a volume based on an image ("volume create --type vmax_fc --image xxxx"), it works, regardless of whether the volume is created by itself or as part of a VM/instance creation. The volume gets created, it gets mounted somewhere by cinder-volume, and the image is written to the volume.
- The image-based creation is relatively fast: the whole process (creating the volume, presenting it to a host, downloading the image from glance, writing the image to it) takes about 20 to 30 seconds. The image in question is an Ubuntu Jammy image in QCOW2 format of about 600MB. (I know about performance issues with big images, and that RAW is preferred especially when both image and volume are in ceph, but in this case the image is not in ceph, so the download needs to happen anyway, and the image is small and the download is fast.)
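The create commands described in the notes above can be sketched as a small helper that builds the "openstack volume create" command line and submits several of them at once. This is only an illustration: the helper names, the volume name prefix, and the 10 GB size are assumptions (the report does not state a size), and "xxxx" is the same image placeholder used above.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def build_create_cmd(name, size_gb, volume_type="vmax_fc", image=None):
    """Build an 'openstack volume create' command (hypothetical helper)."""
    cmd = ["openstack", "volume", "create",
           "--type", volume_type, "--size", str(size_gb)]
    if image is not None:
        # Image-backed volume: cinder downloads the image and writes it out.
        cmd += ["--image", image]
    cmd.append(name)
    return cmd


def create_batch(count, size_gb, image=None, prefix="test-vol",
                 run=subprocess.run):
    """Submit `count` create requests concurrently, one thread per request."""
    cmds = [build_create_cmd(f"{prefix}-{i}", size_gb, image=image)
            for i in range(count)]
    with ThreadPoolExecutor(max_workers=count) as pool:
        # check=False: collect every result instead of raising on the first
        # non-zero exit code.
        return list(pool.map(lambda c: run(c, check=False), cmds))


# Example (not executed here; needs the openstack client and credentials):
# create_batch(10, size_gb=10)                 # batch of plain volumes
# create_batch(10, size_gb=10, image="xxxx")   # batch of image volumes
```

The run parameter exists only so the command runner can be swapped out, for example to print the commands instead of executing them.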
The issue happens when I try to create a batch (e.g. 10) of volumes based
on images. Most volumes, if not all, get stuck and never finish. It is
not a matter of being slow, they just block forever. The first few volumes
block in "downloading" state and the rest block in "creating" state
while waiting for slots in the queue.
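To tell "slow" apart from "blocked" when reproducing this, one option is to sample each volume's status and age and flag anything that stays in "creating" or "downloading" well past the 20 to 30 seconds a single image volume normally takes. The sketch below is only that idea in code: the VolumeSample type, the find_stuck helper, and the 600 second threshold are assumptions, and the samples would have to be collected separately (for example from "openstack volume list").

```python
from dataclasses import dataclass

# Transitional states named in this report.
IN_PROGRESS = {"creating", "downloading"}


@dataclass
class VolumeSample:
    """One observation of a volume (hypothetical type)."""
    name: str
    status: str
    age_seconds: float  # time since the create request was submitted


def find_stuck(samples, threshold_seconds=600):
    """Return names of volumes still in a transitional state past the threshold."""
    return [s.name for s in samples
            if s.status in IN_PROGRESS and s.age_seconds > threshold_seconds]


samples = [
    VolumeSample("vol-0", "available", 900),
    VolumeSample("vol-1", "downloading", 900),
    VolumeSample("vol-2", "creating", 900),
    VolumeSample("vol-3", "creating", 15),
]
print(find_stuck(samples))
```

The threshold is deliberately far above the normal creation time so that a merely queued volume is not reported as blocked.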
The problem seems related only to 1) being based on an image and 2)
being a batch.
The customer reports that he had this same issue when testing
Kolla/Ansible Openstack before installing Charmed Openstack -- so it
may be cinder code and not the charms.
To manage notifications about this bug go to:
https://bugs.launchpad.net/charm-cinder/+bug/2081742/+subscriptions