cinder.volume.drivers.rbd connecting
On 12/05, ManuParra wrote:
Hi, we have run into problems when creating volumes to attach to VMs. To see what was happening, I enabled debug mode (Debug=True) for Cinder in the cinder.conf file. When I try to create a new volume, I get the following in the log:
"DEBUG cinder.volume.drivers.rbd connecting to (conf=/etc/ceph/ceph.conf, timeout=-1) _do_conn /usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py:431"
I’m using OpenStack Train and Ceph Octopus. When I check with "openstack volume service list", I get:
+------------------+----------------------+------+---------+-------+----------------------------+
| Binary           | Host                 | Zone | Status  | State | Updated At                 |
+------------------+----------------------+------+---------+-------+----------------------------+
| cinder-scheduler | spsrc-controller-1   | nova | enabled | up    | 2021-05-11T10:06:39.000000 |
| cinder-scheduler | spsrc-controller-2   | nova | enabled | up    | 2021-05-11T10:06:47.000000 |
| cinder-scheduler | spsrc-controller-3   | nova | enabled | up    | 2021-05-11T10:06:39.000000 |
| cinder-volume    | rbd:volumes@ceph-rbd | nova | enabled | down  | 2021-04-11T10:48:42.000000 |
| cinder-backup    | spsrc-mon-2          | nova | enabled | up    | 2021-05-11T10:06:47.000000 |
| cinder-backup    | spsrc-mon-1          | nova | enabled | up    | 2021-05-11T10:06:44.000000 |
| cinder-backup    | spsrc-mon-3          | nova | enabled | up    | 2021-05-11T10:06:47.000000 |
+------------------+----------------------+------+---------+-------+----------------------------+
So cinder-volume is down.
I compared the "cinder-backup" Ceph config with the "cinder-volume" one, and they are identical, so why does only one of them work?
diff /etc/kolla/cinder-backup/ceph.conf /etc/kolla/cinder-volume/ceph.conf
I went inside the "cinder_volume" container:
docker exec -it cinder_volume /bin/bash
Listing the Cinder volumes from there works:
rbd -p cinder.volumes --id cinder -k /etc/ceph/ceph.client.cinder.keyring ls
Any ideas?
Kind regards.
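For readers who want to spot this condition automatically, here is a minimal sketch that scans the text of a service list like the one in the message above and reports rows whose State is "down". The sample rows are copied from that table (trimmed to three services); the helper name down_services is hypothetical and not part of any OpenStack tool.

```python
# Sample rows copied from the service list in the message (three of the seven).
TABLE = """\
+------------------+----------------------+------+---------+-------+----------------------------+
| Binary           | Host                 | Zone | Status  | State | Updated At                 |
+------------------+----------------------+------+---------+-------+----------------------------+
| cinder-scheduler | spsrc-controller-1   | nova | enabled | up    | 2021-05-11T10:06:39.000000 |
| cinder-volume    | rbd:volumes@ceph-rbd | nova | enabled | down  | 2021-04-11T10:48:42.000000 |
| cinder-backup    | spsrc-mon-2          | nova | enabled | up    | 2021-05-11T10:06:47.000000 |
+------------------+----------------------+------+---------+-------+----------------------------+
"""


def down_services(table_text):
    """Return (binary, host, updated_at) for every row whose State is 'down'."""
    result = []
    for line in table_text.splitlines():
        if not line.startswith("|"):
            continue  # skip the +----+ border lines and blank lines
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        if len(cells) != 6 or cells[0] == "Binary":
            continue  # skip the header row and anything malformed
        binary, host, _zone, _status, state, updated_at = cells
        if state == "down":
            result.append((binary, host, updated_at))
    return result


for binary, host, updated_at in down_services(TABLE):
    print(f"{binary} on {host} is down, last updated {updated_at}")
```

Printing the "Updated At" value alongside the state is deliberate: in the table above, the cinder-volume row was last updated a month before the other services.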
Hi,
Cinder volume could be down because the stats polling is taking too long.
If that's the case, then you can set:
rbd_exclusive_cinder_pool = true
in your driver's section in cinder.conf to fix it.
Cheers, Gorka.
Hi Gorka, thank you very much for your help. We looked at that option, but before testing it we worked on the following (it was discussed in another message on the list, where you also pointed it out). This was the solution proposed by my colleague Sebastian:
We restarted the "cinder-volume" and "cinder-scheduler" services with "debug=True" and got the same debug message again:
2021-05-15 23:15:27.091 31 DEBUG cinder.volume.drivers.rbd [req-f43e30ae-2bdc-4690-9c1b-3e58081fdc9e - - - - -] connecting to cinder@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py:431
Then, I had a look at the docs looking for "timeout" configuration options:
https://docs.openstack.org/cinder/train/configuration/block-storage/drivers/ceph-rbd-volume-driver.html#driver-options
"rados_connect_timeout = -1; (Integer) Timeout value (in seconds) used when connecting to ceph cluster. If value < 0, no timeout is set and default librados value is used."
I added it to the "cinder.conf" file for the "cinder-volume" service with: "rados_connect_timeout=15".
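As a rough illustration of where the option lives, the sketch below parses a made-up cinder.conf fragment with Python's standard configparser and reads rados_connect_timeout from the backend section. The section name ceph-rbd is an assumption taken from the host string rbd:volumes@ceph-rbd in the service list, and the helper connect_timeout is hypothetical; Cinder itself reads its configuration through oslo.config, not configparser.

```python
import configparser

# Hypothetical cinder.conf fragment. The backend section name "ceph-rbd" is an
# assumption; use whatever name your enabled_backends entry points to.
CINDER_CONF = """\
[DEFAULT]
debug = True
enabled_backends = ceph-rbd

[ceph-rbd]
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rados_connect_timeout = 15
"""


def connect_timeout(conf_text, backend):
    """Return rados_connect_timeout for a backend section, or -1 if unset."""
    # Treat [DEFAULT] as an ordinary section so its values are not silently
    # inherited by the backend section during the lookup.
    parser = configparser.ConfigParser(
        default_section="__unused__", interpolation=None
    )
    parser.read_string(conf_text)
    # -1 is the documented default: no timeout is set by Cinder.
    return parser.getint(backend, "rados_connect_timeout", fallback=-1)


timeout = connect_timeout(CINDER_CONF, "ceph-rbd")
if timeout < 0:
    print("no timeout set, librados default is used")
else:
    print(f"rados_connect_timeout = {timeout} seconds")
```

A quick check like this only confirms that the option sits in the section you expect; the service still has to be restarted for the change to take effect.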
Before this change the "cinder-volume" logs ended with this message:
2021-05-15 23:02:48.821 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Starting volume driver RBDDriver (1.2.0)
After the change:
2021-05-15 23:02:48.821 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Starting volume driver RBDDriver (1.2.0)
2021-05-15 23:04:23.180 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Driver initialization completed successfully.
2021-05-15 23:04:23.190 31 INFO cinder.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Initiating service 12 cleanup
2021-05-15 23:04:23.196 31 INFO cinder.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Service 12 cleanup completed.
2021-05-15 23:04:23.315 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Initializing RPC dependent components of volume driver RBDDriver (1.2.0)
2021-05-15 23:05:10.381 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Driver post RPC initialization completed successfully.
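The timestamps in these log lines also show how long each startup phase took. A small sketch, using only the three timestamps quoted above (the helper elapsed_seconds is hypothetical):

```python
from datetime import datetime

# Timestamp layout used at the start of the Cinder log lines quoted above.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def elapsed_seconds(start, end):
    """Seconds between two log timestamps given as strings in FMT."""
    delta = datetime.strptime(end, FMT) - datetime.strptime(start, FMT)
    return delta.total_seconds()


driver_start = "2021-05-15 23:02:48.821"   # Starting volume driver RBDDriver
init_done = "2021-05-15 23:04:23.180"      # Driver initialization completed
post_rpc_done = "2021-05-15 23:05:10.381"  # Driver post RPC initialization completed

print(f"driver init:   {elapsed_seconds(driver_start, init_done):.3f} s")
print(f"full startup:  {elapsed_seconds(driver_start, post_rpc_done):.3f} s")
```

With these values, driver initialization took roughly 94 seconds and the whole startup roughly 142 seconds.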
And now the service is reported as "up" in "openstack volume service list" and we can successfully create Ceph volumes. Manu will do more validation tests today to confirm.
So it looks like the "cinder-volume" service didn't start up properly in the first place, and that's why the service was "down".
Kind regards.
On 18/05, ManuParra wrote:
Hi Gorka, thank you very much for your help. We looked at that option, but before testing it we worked on the following (it was discussed in another message on the list, where you also pointed it out). This was the solution proposed by my colleague Sebastian:

Hi,
That's good to know. Maybe we need to explore changing the default Cinder settings.
Cheers, Gorka.