[kolla-ansible][nova] volume creation time

Eugen Block eblock at nde.ag
Mon Feb 14 13:30:12 UTC 2022


Hi,

> my OpenStack is operational and must absolutely stay OK for several
> months (student projects), so I'm not touching it...
>
> In France we have the saying "Scalded cat fears cold water".

I understand :-)

> But I'm putting all this information aside. I will have to look at
> this more closely, especially when I change storage by switching to
> Ceph.

Although Ceph might be challenging at the beginning, I'm a big fan of
it and can only encourage you to do that.


Quoting Franck VEDEL <franck.vedel at univ-grenoble-alpes.fr>:

> Thank you for your help.
>
> my OpenStack is operational and must absolutely stay OK for several
> months (student projects), so I'm not touching it...
>
> In France we have the saying "Scalded cat fears cold water".
>
> But I'm putting all this information aside. I will have to look at
> this more closely, especially when I change storage by switching to
> Ceph.
> Thanks again Eugen
>
> Franck
>> On 9 Feb 2022, at 09:12, Eugen Block <eblock at nde.ag> wrote:
>>
>> I haven't used iSCSI as a backend yet, but for HDDs the speed looks
>> comparable. On a system with an HDD Ceph backend, creating a volume
>> from a 2 GB image takes about 40 seconds; as you see, the download
>> is quite slow and the conversion is a little faster:
>>
>> Image download 541.00 MB at 28.14 MB/s
>> Converted 2252.00 MB image at 172.08 MB/s
>>
>> With a factor of 10 (20 GB) I would probably end up with creation
>> times similar to yours. Just for comparison, this is almost the same
>> image (also 2 GB) in a different Ceph cluster where I mounted the
>> cinder conversion path from CephFS, on an SSD pool:
>>
>> Image download 555.12 MB at 41.34 MB/s
>> Converted 2252.00 MB image at 769.17 MB/s
>>
>> This volume was created within 20 seconds. You might also want to  
>> tweak these options:
>>
>> block_device_allocate_retries = 300
>> block_device_allocate_retries_interval = 10
>>
>> These are the defaults:
>>
>> block_device_allocate_retries = 60
>> block_device_allocate_retries_interval = 3
>>
>> This would fit your error message:
>>
>>> Volume be0f28eb-1045-4687-8bdb-5a6d385be6fa did not finish being  
>>> created even after we waited 187 seconds or 61 attempts. And its  
>>> status is downloading.
>>
>> It tried 60 times with a 3-second interval; apparently that's not
>> enough. Can you see any bottlenecks in network or disk utilization
>> which would slow down the download?
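>>
>> (For raising those timeouts in a kolla-ansible deployment, the usual
>> approach would be a nova config override plus a reconfigure; a
>> minimal sketch, with the override path and inventory name assumed:
>>
>>   # /etc/kolla/config/nova.conf -- merged into the generated nova.conf
>>   [DEFAULT]
>>   block_device_allocate_retries = 300
>>   block_device_allocate_retries_interval = 10
>>
>>   kolla-ansible -i multinode reconfigure --tags nova
>> )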
>>
>>
>> Quoting Franck VEDEL <franck.vedel at univ-grenoble-alpes.fr>:
>>
>>> Hi Eugen,
>>> thanks for your help
>>> We have 3 servers (s1, s2, s3) and an iSCSI bay attached to s3.
>>> Multinode:
>>> [control]
>>> s1
>>> s2
>>>
>>> [compute]
>>> s1
>>> s2
>>> s3
>>>
>>> [storage]
>>> s3
>>>
>>> on s1: more /etc/kolla/globals.yml
>>> ...
>>> enable_cinder: "yes"
>>> enable_cinder_backend_iscsi: "yes"
>>> enable_cinder_backend_lvm: "yes"
>>> enable_iscsid: "yes"
>>> cinder_volume_group: "cinder-volumes"
>>> ...
>>> enable_glance_image_cache: "yes"
>>> glance_cache_max_size: "21474836480"
>>> glance_file_datadir_volume: "/images/"
>>> ...
>>>
>>> on s3: /images is on the iSCSI bay
>>> mount |grep images
>>> /dev/mapper/VG--IMAGES-LV--IMAGES on /images type xfs  
>>> (rw,relatime,seclabel,attr2,inode64,logbufs=8,logbsize=32k,sunit=1024,swidth=1024,noquota)
>>>
>>> lsblk
>>> sdf                          8:80   0  500G  0 disk
>>> └─mpathc                   253:3    0  500G  0 mpath
>>>   └─mpathc1                253:4    0  500G  0 part
>>>     └─VG--IMAGES-LV--IMAGES 253:5   0  500G  0 lvm   /images
>>>
>>>
>>> ls -l /images:
>>> drwxr-x---. 5 42415 42415 4096  6 févr. 18:40 image-cache
>>> drwxr-x---. 2 42415 42415 4096  4 févr. 15:16 images
>>> drwxr-x---. 2 42415 42415    6 22 nov.  12:03 staging
>>> drwxr-x---. 2 42415 42415    6 22 nov.  12:03 tasks_work_dir
>>>
>>> ls -l /images/image-cache
>>> total 71646760
>>> -rw-r-----. 1 42415 42415   360841216  2 déc.  11:52 3e3aada8-7610-4c55-b116-a12db68f8ea4
>>> -rw-r-----. 1 42415 42415   237436928 28 nov.  16:56 6419642b-fcbd-4e5d-9c77-46a48d2af93f
>>> -rw-r-----. 1 42415 42415 10975379456 26 nov.  14:59 7490e914-8001-4d56-baea-fabf80f425e1
>>> -rw-r-----. 1 42415 42415 21474836480 22 nov.  16:46 7fc7f9a6-ab0e-45cf-9c29-7e59f6aa68a5
>>> -rw-r-----. 1 42415 42415  2694512640 15 déc.  18:07 890fd2e8-2fac-42c6-956b-6b10f2253a56
>>> -rw-r-----. 1 42415 42415 12048400384  1 déc.  17:04 9a235763-ff0c-40fd-9a8d-7cdca3d3e9ce
>>> -rw-r-----. 1 42415 42415  5949227008 15 déc.  20:41 9cbba37b-1de1-482a-87f2-631d2143cd46
>>> -rw-r-----. 1 42415 42415   566994944  6 déc.  12:32 b6e29dd9-a66d-4569-a222-6fc0bd9b1b11
>>> -rw-r-----. 1 42415 42415   578748416  2 déc.  11:24 c40953ee-4b39-43a5-8f6c-b48a046c38e9
>>> -rw-r-----. 1 42415 42415    16300544 27 janv. 12:19 c88630c7-a7c6-44ff-bfa0-e5af4b1720e3
>>> -rw-r-----. 1 42415 42415       12288  6 févr. 18:40 cache.db
>>> -rw-r-----. 1 42415 42415 12324503552  1 déc.  07:50 e0d4fddd-5aa7-4177-a1d6-e6b4c56f12e8
>>> -rw-r-----. 1 42415 42415  6139084800 22 nov.  15:05 eda93204-9846-4216-a6e8-c29977fdcf2f
>>> -rw-r-----. 1 42415 42415           0 22 nov.  12:03 image_cache_db_init
>>> drwxr-x---. 2 42415 42415           6 27 janv. 12:19 incomplete
>>> drwxr-x---. 2 42415 42415           6 22 nov.  12:03 invalid
>>> drwxr-x---. 2 42415 42415           6 22 nov.  12:03 queue
>>>
>>> on s1
>>> openstack image list
>>> +--------------------------------------+------------+--------+
>>> | ID                                   | Name       | Status |
>>> +--------------------------------------+------------+--------+
>>> …..
>>> | 7fc7f9a6-ab0e-45cf-9c29-7e59f6aa68a5 | rocky8.4   | active |
>>> ….
>>> | 7490e914-8001-4d56-baea-fabf80f425e1 | win10_2104 | active |
>>> ….
>>> +--------------------------------------+------------+--------+
>>>
>>>
>>> openstack image show 7fc7f9a6-ab0e-45cf-9c29-7e59f6aa68a5
>>> disk_format      | raw
>>>
>>> when I try to create an instance from this image (2 GB RAM, 40 GB HDD):
>>> [Error : Build of instance baa06bef-9628-407f-8bae-500ef7bce065  
>>> aborted: Volume be0f28eb-1045-4687-8bdb-5a6d385be6fa did not  
>>> finish being created even after we waited 187 seconds or 61  
>>> attempts. And its status is downloading.
>>>
>>> it's impossible. I have to create the volume from the image first,
>>> and then create the instance from the volume.
>>>
>>> Is that normal?
>>>
>>>
>>> Franck
>>>
>>>> On 7 Feb 2022, at 10:55, Eugen Block <eblock at nde.ag> wrote:
>>>>
>>>> Hi Franck,
>>>>
>>>> although it's a different topic than your original question, I
>>>> wanted to comment on the volume creation time (maybe a new thread
>>>> would make sense). What is your storage back end? If it is Ceph,
>>>> are your images in raw format? Otherwise cinder has to download
>>>> the image from glance (to /var/lib/cinder/conversion) and convert
>>>> it, then upload it back to Ceph. It's similar with nova: nova
>>>> stores base images in /var/lib/nova/instances/_base to prevent
>>>> the compute nodes from downloading them every time. This may save
>>>> some time on the download, but the upload has to happen anyway.
>>>> And if you don't use shared storage for nova (e.g. for
>>>> live-migration), you may find that some compute nodes are quicker
>>>> at creating an instance because they only have to upload, while
>>>> others first have to download, convert and then upload.
>>>>
>>>> You would see the conversion in the logs of cinder:
>>>>
>>>> INFO cinder.image.image_utils  
>>>> [req-f2062570-4006-464b-a1f5-d0d5ac34670d  
>>>> d71f59600f1c40c394022738d4864915 31b9b4900a4d4bdaabaf263d0b4021be  
>>>> - - -] Converted 2252.00 MB image at 757.52 MB/s
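>>>>
>>>> (One way to avoid that conversion step entirely is to upload images
>>>> to glance in raw format in the first place; a rough sketch, file
>>>> names assumed:
>>>>
>>>>   qemu-img convert -p -f qcow2 -O raw myimage.qcow2 myimage.raw
>>>>   openstack image create --disk-format raw --container-format bare \
>>>>     --file myimage.raw myimage-raw
>>>>
>>>> The trade-off is that raw images are larger to store and transfer.)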
>>>>
>>>> Hope this helps.
>>>>
>>>> Eugen
>>>>
>>>>
>>>> Quoting Franck VEDEL <franck.vedel at univ-grenoble-alpes.fr>:
>>>>
>>>>> Sunday morning: my OpenStack works…. Phew.
>>>>> The "kolla-ansible -i multinode mariadb_recovery" command
>>>>> (which is magic anyway) fixed the problem, and then the mariadb
>>>>> and nova containers started.
>>>>> Once I had solved the problems between my serv3 and the iSCSI bay
>>>>> and restarted the glance container, everything seems to work.
>>>>>
>>>>>> 4 minutes to create a 20GB empty volume seems too long to me.  
>>>>>> For an actual 20GB image, it's going to depend on the speed of  
>>>>>> the backing storage tech.
>>>>> The 4 minutes is for a volume created from an image. I will look at
>>>>> this problem next summer; I will retry changing the MTU value.
>>>>>
>>>>> Thanks a lot, really
>>>>>
>>>>>
>>>>> Franck
>>>>>
>>>>>> On 5 Feb 2022, at 17:08, Laurent Dumont
>>>>>> <laurentfdumont at gmail.com> wrote:
>>>>>>
>>>>>> Any chance to revert the switches + servers back? That would
>>>>>> indicate whether MTU was the issue.
>>>>>> Don't ping the iSCSI bay; ping between the controllers in
>>>>>> OpenStack, they are the ones running mariadb/galera.
>>>>>> Since ICMP packets are small by default, they might not trigger
>>>>>> the MTU issue. Can you try something like "ping -s 8972 -M do -c 4
>>>>>> $mariadb_host_2" from $mariadb_host_1? (8972 bytes of ICMP payload
>>>>>> plus 28 bytes of IP/ICMP headers gives a 9000-byte packet, and
>>>>>> -M do forbids fragmentation.)
>>>>>> What is your network setup on the servers? Two ports in a bond?
>>>>>> Did you change both the physical interface MTU and the bond
>>>>>> interface itself?
>>>>>>
>>>>>> 4 minutes to create a 20GB empty volume seems too long to me.  
>>>>>> For an actual 20GB image, it's going to depend on the speed of  
>>>>>> the backing storage tech.
>>>>>>
>>>>>> On Sat, Feb 5, 2022 at 1:51 AM Franck VEDEL
>>>>>> <franck.vedel at univ-grenoble-alpes.fr> wrote:
>>>>>> Thanks for your help.
>>>>>>
>>>>>>
>>>>>>> What was the starting value for MTU?
>>>>>> 1500
>>>>>>> What was the starting value changed to for MTU?
>>>>>> 9000
>>>>>>> Can ping between all your controllers?
>>>>>> yes, all containers start except nova-conductor, nova-scheduler, mariadb
>>>>>>
>>>>>>
>>>>>>> Do you just have two controllers running mariadb?
>>>>>> yes
>>>>>>> How did you change MTU?
>>>>>>
>>>>>> On the 3 servers:
>>>>>> nmcli connection modify team0-port1 802-3-ethernet.mtu 9000
>>>>>> nmcli connection modify team1-port2 802-3-ethernet.mtu 9000
>>>>>> nmcli connection modify type team0 team.runner lack ethernet.mtu 9000
>>>>>> nmcli con down team0
>>>>>> nmcli con down team1
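>>>>>>
>>>>>> (To check which MTU is actually active on the interfaces after such
>>>>>> a change, something like the following could be used; a sketch,
>>>>>> interface/connection names assumed:
>>>>>>
>>>>>>   ip link show team0 | grep -o 'mtu [0-9]*'
>>>>>>   nmcli -g 802-3-ethernet.mtu connection show team0
>>>>>> )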
>>>>>>
>>>>>>
>>>>>>> Was the change reverted at the network level as well? (Switches
>>>>>>> need to be configured with an MTU equal to or higher than the
>>>>>>> servers.)
>>>>>> I didn't change the MTU on the network (switches), but ping -s
>>>>>> 10.0.5.117 (the iSCSI bay) was working from serv3.
>>>>>>
>>>>>> I changed the MTU value because I find that creating volumes
>>>>>> takes a lot of time (4 minutes for 20 GB, which is too long for
>>>>>> what I want to do; the patience of students decreases over the
>>>>>> years).
>>>>>>
>>>>>> Franck
>>>>>>
>>>>>>> On 4 Feb 2022, at 23:12, Laurent Dumont
>>>>>>> <laurentfdumont at gmail.com> wrote:
>>>>>>>
>>>>>>> What was the starting value for MTU?
>>>>>>> What was the starting value changed to for MTU?
>>>>>>> Can ping between all your controllers?
>>>>>>> Do you just have two controllers running mariadb?
>>>>>>> How did you change MTU?
>>>>>>> Was the change reverted at the network level as well? (Switches
>>>>>>> need to be configured with an MTU equal to or higher than the
>>>>>>> servers.)
>>>>>>> 4567 seems to be the port for galera (clustering for mariadb).
>>>>>>> On Fri, Feb 4, 2022 at 11:52 AM Franck VEDEL
>>>>>>> <franck.vedel at univ-grenoble-alpes.fr> wrote:
>>>>>>> Hello,
>>>>>>> I am in an emergency situation, quite a catastrophic one,
>>>>>>> because I do not know what to do.
>>>>>>>
>>>>>>> I have an OpenStack cluster with 3 servers (serv1, serv2,
>>>>>>> serv3). It was doing so well…
>>>>>>>
>>>>>>>
>>>>>>> A network admin came to me and told me to change the MTU on the
>>>>>>> network cards. I knew it shouldn't be done... I shouldn't have done it.
>>>>>>> I did it.
>>>>>>> Of course, it didn't work as expected. I went back to my
>>>>>>> starting configuration, and now I have a big problem with
>>>>>>> mariadb, which is set up on serv1 and serv2.
>>>>>>>
>>>>>>> Here are my errors:
>>>>>>>
>>>>>>>
>>>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP: failed to open gcomm  
>>>>>>> backend connection: 110: failed to reach primary view: 110  
>>>>>>> (Connection timed out)
>>>>>>> 	 at gcomm/src/pc.cpp:connect():160
>>>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP:  
>>>>>>> gcs/src/gcs_core.cpp:gcs_core_open():209: Failed to open  
>>>>>>> backend connection: -110 (Connection timed out)
>>>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP:  
>>>>>>> gcs/src/gcs.cpp:gcs_open():1475: Failed to open channel  
>>>>>>> 'openstack' at 'gcomm://10.0.5.109:4567,10.0.5.110:4567':
>>>>>>> -110 (Connection timed out)
>>>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP: gcs connect failed:  
>>>>>>> Connection timed out
>>>>>>> 2022-02-04 17:40:36 0 [ERROR] WSREP:  
>>>>>>> wsrep::connect(gcomm://10.0.5.109:4567,10.0.5.110:4567)
>>>>>>> failed: 7
>>>>>>> 2022-02-04 17:40:36 0 [ERROR] Aborting
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I do not know what to do. My installation is done with
>>>>>>> kolla-ansible, and the mariadb docker container restarts every 30 seconds.
>>>>>>>
>>>>>>> Can the "kolla-ansible reconfigure mariadb" command be a solution?
>>>>>>> Could the command "kolla-ansible mariadb recovery" be a solution?
>>>>>>>
>>>>>>> Thanks in advance if you can help me.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Franck
>>>>>>>
>>>>>>>
>>>>>>
>>>>
>>>>
>>>>
>>>>
>>
>>





