On 13/01, Ignazio Cassano wrote:
Hello Gorka, here you can find more details: https://paste.openstack.org/show/812091/
Many thanks,
Ignazio
Hi,

Given the reported data from the backends, which is:

nfsgold1:
  max_over_subscription_ratio = 20.0
  total_capacity_gb = 1945.6
  free_capacity_gb = 609.68
  reserved_percentage = 0
  allocated_capacity_gb = 0

nfsgold2:
  max_over_subscription_ratio = 20.0
  total_capacity_gb = 972.8
  free_capacity_gb = 970.36
  reserved_percentage = 0
  allocated_capacity_gb = 0

Since those backends are not reporting provisioned_capacity_gb, it is assigned the same value as allocated_capacity_gb, which is the sum of existing volumes in Cinder for that backend. I see it is reporting 0 here, so I assume you are using the same storage pool for multiple things and you still don't have volumes in Cinder.

The calculation the scheduler does for the weighing is as follows:

  virtual-free-capacity = total_capacity_gb * max_over_subscription_ratio
                          - provisioned_capacity
                          - math.floor(total_capacity_gb * reserved_percentage)

Which results in:

  nfsgold1 = 1945.6 * 20.0 - 0 - math.floor(1945.6 * 0) = 38,912
  nfsgold2 = 972.8 * 20.0 - 0 - math.floor(972.8 * 0) = 19,456

So nfsgold1 returns the greater value and therefore wins the weighing. nfsgold2 will only be used once there is no longer space in nfsgold1 and it fails the filtering.

If you look at the debug logs you should see that they describe which backends start the filtering, which ones pass each filter, and then which ones are weighed and which one wins.

I see that the NetApp driver has a way to report the provisioned capacity (netapp_driver_reports_provisioned_capacity) that may be able to help you.

Another way to resolve the issue may be to use an exclusive pool in the backend.

Cheers,
Gorka.
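The weighing arithmetic in the reply above can be reproduced with a short Python sketch. It follows the formula exactly as written in the message and uses the capacity figures reported for the two backends. The function name `virtual_free_capacity` and the `backends` dictionary are illustrative only and are not taken from the Cinder source; `provisioned_capacity_gb` is set to 0 because, as explained above, it falls back to `allocated_capacity_gb`.

```python
import math


def virtual_free_capacity(total_capacity_gb, max_over_subscription_ratio,
                          provisioned_capacity_gb, reserved_percentage):
    """Virtual free capacity, following the formula quoted in the message."""
    reserved = math.floor(total_capacity_gb * reserved_percentage)
    return (total_capacity_gb * max_over_subscription_ratio
            - provisioned_capacity_gb
            - reserved)


# Figures reported by the two backends. provisioned_capacity_gb is not
# reported, so it takes the value of allocated_capacity_gb (0 here).
backends = {
    "nfsgold1": dict(total_capacity_gb=1945.6,
                     max_over_subscription_ratio=20.0,
                     provisioned_capacity_gb=0,
                     reserved_percentage=0),
    "nfsgold2": dict(total_capacity_gb=972.8,
                     max_over_subscription_ratio=20.0,
                     provisioned_capacity_gb=0,
                     reserved_percentage=0),
}

scores = {name: virtual_free_capacity(**stats)
          for name, stats in backends.items()}
winner = max(scores, key=scores.get)

for name, score in scores.items():
    print(f"{name}: {score:.1f}")
print("highest virtual free capacity:", winner)
```

Note that free_capacity_gb does not appear in the calculation at all, which is why the nearly full nfsgold1 still scores higher than nfsgold2.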
On Thu, 13 Jan 2022 at 13:08, Gorka Eguileor <geguileo@redhat.com> wrote:
On 13/01, Ignazio Cassano wrote:
Hello, I am using nfsgold volume type.
[root@tst-controller-01 ansible]# cinder type-show nfsgold
+---------------------------------+--------------------------------------+
| Property                        | Value                                |
+---------------------------------+--------------------------------------+
| description                     | None                                 |
| extra_specs                     | volume_backend_name : nfsgold        |
| id                              | fd8b1cc8-4c3a-490d-bc95-29e491f850cc |
| is_public                       | True                                 |
| name                            | nfsgold                              |
| os-volume-type-access:is_public | True                                 |
| qos_specs_id                    | None                                 |
+---------------------------------+--------------------------------------+
cinder get-pools
+----------+--------------------------------------------------------------------+
| Property | Value                                                              |
+----------+--------------------------------------------------------------------+
| name     | cinder-cluster-1@nfsgold2#10.102.189.156:/svm_tstcinder_cl2_volssd |
+----------+--------------------------------------------------------------------+
+----------+--------------------------------------------------------------------+
| Property | Value                                                              |
+----------+--------------------------------------------------------------------+
| name     | cinder-cluster-1@nfsgold1#10.102.189.155:/svm_tstcinder_cl1_volssd |
+----------+--------------------------------------------------------------------+
Hi,
We would need to see the details of the pools to see additional information:
$ cinder get-pools --detail
I noticed that nfsgold2 is also used when nfsgold1 is almost full. I expected the volume to be created on the share with more available space.
Ignazio
Then the capacity filtering seems to be working as expected (we can confirm this by looking at the debug logs and checking whether both backends pass the filtering). You should see in the logs that both of them pass the filtering and are valid for creating volumes.
The thing we'd have to look into is the weighing phase, where the scheduler is selecting nfsgold1 as the best option.
I assume you haven't changed the defaults in the configuration options "scheduler_default_weighers" or in "scheduler_weight_handler".
So it must be using the "CapacityWeigher". Are you using default values for the "capacity_weight_multiplier" and "allocated_capacity_weight_multiplier" config options?
When using the defaults, the capacity weigher should spread volumes instead of stacking them.
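To illustrate what spreading versus stacking means here, the following is a minimal sketch, not Cinder's actual code. It assumes a simplified model in which each backend's raw score is normalized to the range 0 to 1, multiplied by the weigher's multiplier, and the highest result wins. The function name `pick_backend` and its `multiplier` parameter are hypothetical, and the raw scores are the virtual free capacities computed for the two backends in this thread.

```python
def pick_backend(raw_scores, multiplier=1.0):
    """Pick the backend with the highest weighted score.

    Simplified model (an assumption, not Cinder's implementation):
    raw scores are normalized to [0, 1], then scaled by the multiplier.
    """
    lowest = min(raw_scores.values())
    span = max(raw_scores.values()) - lowest
    weighted = {
        name: multiplier * ((score - lowest) / span if span else 0.0)
        for name, score in raw_scores.items()
    }
    return max(weighted, key=weighted.get)


# Virtual free capacity per backend, as computed in this thread.
raw_scores = {"nfsgold1": 38912.0, "nfsgold2": 19456.0}

# Positive multiplier: the backend with the most virtual free capacity wins.
spread_choice = pick_backend(raw_scores, multiplier=1.0)

# Negative multiplier: the ordering is inverted, so the fuller backend wins.
stack_choice = pick_backend(raw_scores, multiplier=-1.0)

print("positive multiplier picks:", spread_choice)
print("negative multiplier picks:", stack_choice)
```

Under this model, spreading only distributes volumes evenly when the raw scores reflect real usage. With both backends reporting 0 provisioned capacity, the larger share keeps the higher score regardless of how full it is.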
I still think that the best way to debug this is to view the debug logs. In Stein you should be able to dynamically change the logging level of the scheduler services to debug without restarting the services, and then changing it back to info.
Cheers, Gorka.
On Thu, 13 Jan 2022 at 12:03, Gorka Eguileor <geguileo@redhat.com> wrote:
On 13/01, Ignazio Cassano wrote:
Hello, I am using OpenStack Stein on CentOS 7 with the NetApp ONTAP driver. It seems the capacity filter is not working, and volumes are always created on the first share, where less space is available. My configuration is posted here:
enabled_backends = nfsgold1, nfsgold2
[nfsgold1]
nas_secure_file_operations = false
nas_secure_file_permissions = false
volume_driver = cinder.volume.drivers.netapp.common.NetAppDriver
netapp_storage_family = ontap_cluster
netapp_storage_protocol = nfs
netapp_vserver = svm-tstcinder2-cl1
netapp_server_hostname = faspod2.csi.it
netapp_server_port = 80
netapp_login = apimanager
netapp_password = password
nfs_shares_config = /etc/cinder/nfsgold1_shares
volume_backend_name = nfsgold
#nfs_mount_options = lookupcache=pos
nfs_mount_options = lookupcache=pos

[nfsgold2]
nas_secure_file_operations = false
nas_secure_file_permissions = false
volume_driver = cinder.volume.drivers.netapp.common.NetAppDriver
netapp_storage_family = ontap_cluster
netapp_storage_protocol = nfs
netapp_vserver = svm-tstcinder2-cl2
netapp_server_hostname = faspod2.csi.it
netapp_server_port = 80
netapp_login = apimanager
netapp_password = password
nfs_shares_config = /etc/cinder/nfsgold2_shares
volume_backend_name = nfsgold
#nfs_mount_options = lookupcache=pos
nfs_mount_options = lookupcache=pos
Volumes are always created on nfsgold1, even when it has less space available than the nfsgold2 share.
Thanks,
Ignazio
Hi,
What volume type are you using to create the volumes? If you don't define it, the default from the cinder.conf file will be used.
What are the extra specs of the volume type?
What pool info are the NetApp backends reporting?
It's usually a good idea to enable debugging on the schedulers and look at the details of how they are making the filtering and weighing decisions.
Cheers, Gorka.