[kolla-ansible][nova] volume creation time

Eugen Block eblock at nde.ag
Wed Feb 9 08:12:11 UTC 2022


I haven't used iscsi as a backend yet, but for HDDs the speed looks  
comparable: on a system with an HDD ceph backend, creating a volume  
from a 2 GB image takes about 40 seconds. As you see, the download  
is quite slow, the conversion a little faster:

Image download 541.00 MB at 28.14 MB/s
Converted 2252.00 MB image at 172.08 MB/s

At those rates the 2 GB image needs roughly 19 seconds to download  
and 13 to convert, which matches the ~40 second total. Scaled up by  
a factor of 10 (20 GB) I would probably end up with creation times  
similar to yours. Just for comparison, this is almost the same image  
(also 2 GB) in a different ceph cluster where I mounted the cinder  
conversion path from cephfs, SSD pool:

Image download 555.12 MB at 41.34 MB/s
Converted 2252.00 MB image at 769.17 MB/s

This volume was created within 20 seconds. You might also want to  
tweak these options:

block_device_allocate_retries = 300
block_device_allocate_retries_interval = 10

These are the defaults:

block_device_allocate_retries = 60
block_device_allocate_retries_interval = 3
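
In a kolla-ansible deployment you could apply them through a config  
override (a sketch, assuming the usual /etc/kolla/config merge  
directory; adjust paths to your setup):

# /etc/kolla/config/nova.conf
[DEFAULT]
# up to 300 retries, 10 seconds apart (= 3000 seconds max wait)
block_device_allocate_retries = 300
block_device_allocate_retries_interval = 10

followed by something like "kolla-ansible -i multinode reconfigure  
--tags nova".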

This would fit your error message:

> Volume be0f28eb-1045-4687-8bdb-5a6d385be6fa did not finish being  
> created even after we waited 187 seconds or 61 attempts. And its  
> status is downloading.

It tried 60 times with a 3 second interval (61 attempts over ~187  
seconds), and apparently that's not enough. Can you see any  
bottlenecks in the network or disk utilization which would slow down  
the download?
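
For example (a rough sketch; run it on the storage node while a  
volume is being created, device and interface selection is up to you):

iostat -x 5    # per-device utilization and I/O latency
sar -n DEV 5   # per-interface network throughput

If a disk sits near 100% utilization or a NIC is saturated, that's  
your bottleneck.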


Quoting Franck VEDEL <franck.vedel at univ-grenoble-alpes.fr>:

> Hi Eugen,
> thanks for your help
> We have 3 servers (s1, s2, s3) and an iscsi bay attached to s3.
> Multinode:
> [control]
> s1
> s2
>
> [compute]
> s1
> s2
> s3
>
> [storage]
> s3
>
> on s1: more /etc/kolla/globals.yml
> ...
> enable_cinder: "yes"
> enable_cinder_backend_iscsi: "yes"
> enable_cinder_backend_lvm: "yes"
> enable_iscsid: "yes"
> cinder_volume_group: "cinder-volumes"
> ...
> enable_glance_image_cache: "yes"
> glance_cache_max_size: "21474836480"
> glance_file_datadir_volume: "/images/"
> ...
>
> on s3:  /images is on the iscsi bay
> mount |grep images
> /dev/mapper/VG--IMAGES-LV--IMAGES on /images type xfs  
> (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,sunit=1024,swidth=1024,noquota)
>
> lsblk
> sdf                           8:80   0  500G  0 disk
> └─mpathc                    253:3    0  500G  0 mpath
>   └─mpathc1                 253:4    0  500G  0 part
>     └─VG--IMAGES-LV--IMAGES 253:5    0  500G  0 lvm   /images
>
>
> ls -l /images:
> drwxr-x---. 5 42415 42415 4096  6 févr. 18:40 image-cache
> drwxr-x---. 2 42415 42415 4096  4 févr. 15:16 images
> drwxr-x---. 2 42415 42415    6 22 nov.  12:03 staging
> drwxr-x---. 2 42415 42415    6 22 nov.  12:03 tasks_work_dir
>
> ls -l /images/image-cache
> total 71646760
> -rw-r-----. 1 42415 42415   360841216  2 déc.  11:52 3e3aada8-7610-4c55-b116-a12db68f8ea4
> -rw-r-----. 1 42415 42415   237436928 28 nov.  16:56 6419642b-fcbd-4e5d-9c77-46a48d2af93f
> -rw-r-----. 1 42415 42415 10975379456 26 nov.  14:59 7490e914-8001-4d56-baea-fabf80f425e1
> -rw-r-----. 1 42415 42415 21474836480 22 nov.  16:46 7fc7f9a6-ab0e-45cf-9c29-7e59f6aa68a5
> -rw-r-----. 1 42415 42415  2694512640 15 déc.  18:07 890fd2e8-2fac-42c6-956b-6b10f2253a56
> -rw-r-----. 1 42415 42415 12048400384  1 déc.  17:04 9a235763-ff0c-40fd-9a8d-7cdca3d3e9ce
> -rw-r-----. 1 42415 42415  5949227008 15 déc.  20:41 9cbba37b-1de1-482a-87f2-631d2143cd46
> -rw-r-----. 1 42415 42415   566994944  6 déc.  12:32 b6e29dd9-a66d-4569-a222-6fc0bd9b1b11
> -rw-r-----. 1 42415 42415   578748416  2 déc.  11:24 c40953ee-4b39-43a5-8f6c-b48a046c38e9
> -rw-r-----. 1 42415 42415    16300544 27 janv. 12:19 c88630c7-a7c6-44ff-bfa0-e5af4b1720e3
> -rw-r-----. 1 42415 42415       12288  6 févr. 18:40 cache.db
> -rw-r-----. 1 42415 42415 12324503552  1 déc.  07:50 e0d4fddd-5aa7-4177-a1d6-e6b4c56f12e8
> -rw-r-----. 1 42415 42415  6139084800 22 nov.  15:05 eda93204-9846-4216-a6e8-c29977fdcf2f
> -rw-r-----. 1 42415 42415           0 22 nov.  12:03 image_cache_db_init
> drwxr-x---. 2 42415 42415           6 27 janv. 12:19 incomplete
> drwxr-x---. 2 42415 42415           6 22 nov.  12:03 invalid
> drwxr-x---. 2 42415 42415           6 22 nov.  12:03 queue
>
> on s1
> openstack image list
> +--------------------------------------+-----------------------------+--------+
> | ID                                   | Name                        | Status |
> +--------------------------------------+-----------------------------+--------+
> ...
> | 7fc7f9a6-ab0e-45cf-9c29-7e59f6aa68a5 | rocky8.4                    | active |
> ...
> | 7490e914-8001-4d56-baea-fabf80f425e1 | win10_2104                  | active |
> ...
> +--------------------------------------+-----------------------------+--------+
>
>
> openstack image show 7fc7f9a6-ab0e-45cf-9c29-7e59f6aa68a5
> disk_format      | raw
>
> when I try to add an instance from this image (2G RAM, 40G HDD):
> [Error : Build of instance baa06bef-9628-407f-8bae-500ef7bce065  
> aborted: Volume be0f28eb-1045-4687-8bdb-5a6d385be6fa did not finish  
> being created even after we waited 187 seconds or 61 attempts. And  
> its status is downloading.
>
> it’s impossible. I have to create the volume from the image first,  
> and then create the instance from the volume.
>
> Is that normal?
>
>
> Franck
>
>> On 7 Feb 2022, at 10:55, Eugen Block <eblock at nde.ag> wrote:
>>
>> Hi Franck,
>>
>> although it's a different topic from your original question, I  
>> wanted to comment on the volume creation time (maybe a new thread  
>> would make sense). What is your storage back end? If it is ceph,  
>> are your images in raw format? Otherwise cinder has to download the  
>> image from glance (to /var/lib/cinder/conversion) and convert it,  
>> then upload it back to ceph. It's similar with nova: nova stores  
>> base images in /var/lib/nova/instances/_base to prevent the compute  
>> nodes from downloading them every time. This may save some time for  
>> the download, but the upload has to happen anyway. And if you don't  
>> use shared storage for nova (e.g. for live-migration) you may find  
>> that some compute nodes create an instance more quickly because  
>> they only have to upload, while others first have to download,  
>> convert and then upload.
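>>
>> Under the hood that conversion is a qemu-img call, roughly (a  
>> sketch, the temporary file name is made up):
>>
>> qemu-img convert -O raw \
>>   /var/lib/cinder/conversion/tmpXYZ \
>>   /var/lib/cinder/conversion/tmpXYZ.converted
>>
>> which is why raw images can skip the whole download/convert/upload  
>> cycle when glance and cinder share the same ceph cluster.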
>>
>> You would see the conversion in the logs of cinder:
>>
>> INFO cinder.image.image_utils  
>> [req-f2062570-4006-464b-a1f5-d0d5ac34670d  
>> d71f59600f1c40c394022738d4864915 31b9b4900a4d4bdaabaf263d0b4021be -  
>> - -] Converted 2252.00 MB image at 757.52 MB/s
>>
>> Hope this helps.
>>
>> Eugen
>>
>>
>> Quoting Franck VEDEL <franck.vedel at univ-grenoble-alpes.fr>:
>>
>>> Sunday morning: my openstack works…. PHEW.
>>> The "kolla-ansible -i multinode mariadb_recovery" command (which  
>>> is magic anyway) fixed the problem, and then the mariadb and nova  
>>> containers started.
>>> Once the problems between my serv3 and the iscsi bay were solved  
>>> and the glance container restarted, everything seems to work.
>>>
>>>> 4 minutes to create a 20GB empty volume seems too long to me. For  
>>>> an actual 20GB image, it's going to depend on the speed of the  
>>>> backing storage tech.
>>> 4 minutes is for a volume created from an image. I will look at  
>>> this problem again next summer, and I will retry changing the MTU  
>>> value.
>>>
>>> Thanks a lot, really
>>>
>>>
>>> Franck
>>>
>>>> On 5 Feb 2022, at 17:08, Laurent Dumont  
>>>> <laurentfdumont at gmail.com> wrote:
>>>>
>>>> Any chance to revert the switches + servers back? That would  
>>>> indicate that the MTU was the issue.
>>>> Don't ping the iscsi bay; ping between the controllers in  
>>>> Openstack, they are the ones running mariadb/galera.
>>>> Since icmp packets are small, they might not trigger the MTU  
>>>> issues. Can you try something like "ping -s 8972 -M do -c 4  
>>>> $mariadb_host_2" from $mariadb_host_1?
>>>> What is your network setup on the servers? Two ports in a bond?  
>>>> Did you change both the physical interface MTU and the bond  
>>>> interface itself?
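>>>> For example (interface names are just placeholders), something like
>>>>
>>>> ip link show team0 | grep mtu
>>>> ip link show team0-port1 | grep mtu
>>>>
>>>> should report mtu 9000 on both the team device and its member  
>>>> ports.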
>>>>
>>>> 4 minutes to create a 20GB empty volume seems too long to me. For  
>>>> an actual 20GB image, it's going to depend on the speed of the  
>>>> backing storage tech.
>>>>
>>>> On Sat, Feb 5, 2022 at 1:51 AM Franck VEDEL  
>>>> <franck.vedel at univ-grenoble-alpes.fr> wrote:
>>>> Thanks for your help.
>>>>
>>>>
>>>>> What was the starting value for MTU?
>>>> 1500
>>>>> What was the starting value changed to for MTU?
>>>> 9000
>>>>> Can ping between all your controllers?
>>>> yes, all containers start except nova-conductor, nova-scheduler, mariadb
>>>>
>>>>
>>>>> Do you just have two controllers running mariadb?
>>>> yes
>>>>> How did you change MTU?
>>>>
>>>> On the 3 servers:
>>>> nmcli connection modify team0-port1 802-3-ethernet.mtu 9000
>>>> nmcli connection modify team1-port2 802-3-ethernet.mtu 9000
>>>> nmcli connection modify team0 team.runner lacp ethernet.mtu 9000
>>>> nmcli con down team0
>>>> nmcli con down team1
>>>>
>>>>
>>>>> Was the change reverted at the network level as well (switches  
>>>>> need to be configured at the same or a higher MTU value than  
>>>>> the servers)?
>>>> I didn’t change the MTU on the network (switches), but ping -s  
>>>> 10.0.5.117 (iscsi bay) was working from serv3.
>>>>
>>>> I changed the MTU value because the creation of volumes takes a  
>>>> lot of time (4 minutes for 20G), which is too long for what I  
>>>> want to do; the patience of the students decreases with the years.
>>>>
>>>> Franck
>>>>
>>>>> On 4 Feb 2022, at 23:12, Laurent Dumont  
>>>>> <laurentfdumont at gmail.com> wrote:
>>>>>
>>>>> What was the starting value for MTU?
>>>>> What was the starting value changed to for MTU?
>>>>> Can ping between all your controllers?
>>>>> Do you just have two controllers running mariadb?
>>>>> How did you change MTU?
>>>>> Was the change reverted at the network level as well (switches  
>>>>> need to be configured at the same or a higher MTU value than  
>>>>> the servers)?
>>>>> 4567 seems to be the port for galera (clustering for mariadb)
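>>>>> A quick check (the IPs are taken from your error output below),  
>>>>> something like:
>>>>>
>>>>> nc -zv 10.0.5.109 4567
>>>>> nc -zv 10.0.5.110 4567
>>>>>
>>>>> run from each controller would tell you whether the galera port  
>>>>> is reachable at all.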
>>>>> On Fri, Feb 4, 2022 at 11:52 AM Franck VEDEL  
>>>>> <franck.vedel at univ-grenoble-alpes.fr> wrote:
>>>>> Hello,
>>>>> I am in an emergency, quite a catastrophic situation, because I  
>>>>> do not know what to do.
>>>>>
>>>>> I have an Openstack cluster with 3 servers (serv1, serv2,  
>>>>> serv3). It was doing so well…
>>>>>
>>>>>
>>>>> A network admin came to me and told me to change the MTU on the  
>>>>> network cards. I knew it shouldn't be done... I shouldn't have  
>>>>> done it.
>>>>> I did it.
>>>>> Of course, it didn't work as expected. I went back to my  
>>>>> starting configuration and now I have a big problem with  
>>>>> mariadb, which is set up on serv1 and serv2.
>>>>>
>>>>> Here are my errors:
>>>>>
>>>>>
>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP: failed to open gcomm  
>>>>> backend connection: 110: failed to reach primary view: 110  
>>>>> (Connection timed out)
>>>>> 	 at gcomm/src/pc.cpp:connect():160
>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP:  
>>>>> gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open backend  
>>>>> connection: -110 (Connection timed out)
>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP:  
>>>>> gcs/src/gcs.cpp:gcs_open():1475: Failed to open channel  
>>>>> 'openstack' at 'gcomm://10.0.5.109:4567,10.0.5.110:4567':  
>>>>> -110 (Connection timed out)
>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP: gcs connect failed:  
>>>>> Connection timed out
>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP:  
>>>>> wsrep::connect(gcomm://10.0.5.109:4567,10.0.5.110:4567)  
>>>>> failed: 7
>>>>> 2022-02-04 17:40:36 0 [ERROR] Aborting
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> I do not know what to do. My installation was done with  
>>>>> kolla-ansible, and the mariadb docker container restarts every  
>>>>> 30 seconds.
>>>>>
>>>>> Can the "kolla-ansible reconfigure mariadb" command be a solution?
>>>>> Could the command "kolla-ansible mariadb_recovery" be a solution?
>>>>>
>>>>> Thanks in advance if you can help me.
>>>>>
>>>>>
>>>>>
>>>>> Franck
>>>>>
>>>>>
>>>>