Restart cinder-volume with Ceph RBD
Dear OpenStack community,

I encountered a problem a few days ago: when creating new volumes with

    openstack volume create --size 20 testmv

the volume creation status shows an error. The error log detail indicates:

    Schedule allocate volume: Could not find any available weighted backend

The Cinder log in turn indicates:

    volume service is down - host: rbd:volumes@ceph-rbd

I check which state the services are in with "openstack volume service list" and I see that this is indeed the case:

    | cinder-volume | rbd:volumes@ceph-rbd | nova | enabled | down | 2021-04-29T09:48:42.000000 |

And it has been stopped since 2021-04-29! I have checked Ceph (monitors, managers, OSDs, etc.) and there are no problems with the Ceph backend; everything is apparently working. This happened after an uncontrolled outage. So my question is: how do I restart only cinder-volume? (I also have cinder-backup and cinder-scheduler, but they are OK.)

Thank you very much in advance. Regards,
ManuParra
On 11 May 2021, at 22:30, Eugen Block <eblock@nde.ag> wrote:

Hi,

so restart the volume service ;-)

    systemctl restart openstack-cinder-volume.service
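Since it emerges later in this thread that the deployment is kolla-ansible-based, that systemd unit may not exist on the controllers; the service runs in a container instead. A sketch of the equivalent restart, assuming kolla's default container name:

    # containerized (kolla-ansible) deployment; container name is the kolla default
    docker restart cinder_volume
    docker logs --tail 100 cinder_volume   # confirm the driver starts cleanly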
On Tue, May 11, 2021 at 6:05 PM, ManuParra <mparra@iaa.es> wrote:

Thanks, I have restarted the service, but after a few minutes the cinder-volume service goes down again when I check it with "openstack volume service list". The host/service that contains the cinder volumes is rbd:volumes@ceph-rbd, that is, RBD in Ceph, so the problem does not seem to come from Cinder but rather from Ceph or from the RBD (Ceph) pools that store the volumes. I have checked Ceph and the status of everything is correct, with no errors or warnings. The error I have is that Cinder can't connect to rbd:volumes@ceph-rbd. Any further suggestions? Thanks in advance. Kind regards.
Laurent Dumont <laurentfdumont@gmail.com> wrote:

The default error messages for cinder-volume can be pretty vague. I would suggest enabling debug for Cinder, restarting the service, and watching the error logs when the service goes from up to down. That should be in the cinder-volume logs.
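For reference, a minimal sketch of what that suggestion amounts to, using the standard oslo.log option (the config path below is the usual default; adjust for your deployment):

    # /etc/cinder/cinder.conf
    [DEFAULT]
    debug = true   # revert to false once the issue is diagnosed

Restart cinder-volume and cinder-scheduler afterwards so the option takes effect.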
On 12 May 2021, at 03:43, Dominic L. Hilsbos <DHilsbos@PerformAir.com> wrote:

Is this a new cluster, or one that has been running for a while? Did you just set up the integration with Ceph?

This part doesn't look right to me: "rbd:volumes@ceph-rbd". For me (Victoria / Nautilus) this looks like <cinder-volume-host>@<name>. The name is configured in cinder.conf with a [<name>] section, and enabled_backends=<name> in the [DEFAULT] section. The cinder-volume host is something that resolves to the host running openstack-cinder-volume.service.

What version of OpenStack, and what version of Ceph are you running?

Thank you,

Dominic L. Hilsbos, MBA
Vice President - Information Technology
Perform Air International Inc.
DHilsbos@PerformAir.com
www.PerformAir.com
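To make the naming concrete, a sketch of the cinder.conf layout Dominic describes (the option and section names are real; the values are placeholders, not this deployment's):

    [DEFAULT]
    enabled_backends = ceph-rbd   # <name>; the service appears as <host>@ceph-rbd

    [ceph-rbd]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    rbd_pool = volumes
    rbd_user = cinder
    rbd_ceph_conf = /etc/ceph/ceph.conf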
ManuParra <mparra@iaa.es> wrote:

Hi Laurent, I enabled debug mode for cinder-volume and cinder-scheduler, and I now have the following in the debug log (the same line is repeated three times):

    DEBUG cinder.volume.drivers.rbd [req-a0cb90b6-ca5d-496c-9a0b-e2296f1946ca - - - - -] connecting to cinder@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py:431

Every time a new volume is requested, cinder-volume is called, which is backed by a ceph-rbd pool. I have restarted all Cinder services on the three controller/monitor nodes I have, and also restarted all Ceph daemons, but "openstack volume service list" still shows:

    +------------------+----------------------+------+---------+-------+----------------------------+
    | Binary           | Host                 | Zone | Status  | State | Updated At                 |
    +------------------+----------------------+------+---------+-------+----------------------------+
    | cinder-scheduler | spsrc-contr-1        | nova | enabled | up    | 2021-05-11T10:06:39.000000 |
    | cinder-scheduler | spsrc-contr-2        | nova | enabled | up    | 2021-05-11T10:06:47.000000 |
    | cinder-scheduler | spsrc-contr-3        | nova | enabled | up    | 2021-05-11T10:06:39.000000 |
    | cinder-volume    | rbd:volumes@ceph-rbd | nova | enabled | down  | 2021-05-11T10:48:42.000000 |
    | cinder-backup    | spsrc-mon-2          | nova | enabled | up    | 2021-05-11T10:06:47.000000 |
    | cinder-backup    | spsrc-mon-1          | nova | enabled | up    | 2021-05-11T10:06:44.000000 |
    | cinder-backup    | spsrc-mon-3          | nova | enabled | up    | 2021-05-11T10:06:47.000000 |
    +------------------+----------------------+------+---------+-------+----------------------------+

cinder-volume is down, and I cannot create new volumes to attach to a VM. Kind regards.
ManuParra <mparra@iaa.es> wrote:

Hello Dominic, the integration with Ceph was already done, and apparently everything works; I can create shares with Manila on CephFS, but not block storage volumes for Cinder. The OpenStack version is Train and Ceph is Octopus. If I check the Ceph pools, I see that there is indeed a pool called cinder.volumes, which is the one that connects Cinder with Ceph. Regards.
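A quick way to double-check the pool and the Cinder client's access from a controller (standard Ceph CLI; the pool and user names match the ones quoted elsewhere in this thread):

    ceph osd lspools                       # should list cinder.volumes
    ceph auth get client.cinder            # caps should allow rwx on that pool
    rbd -p cinder.volumes --id cinder ls   # list volumes as the cinder user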
On 12 May 2021, at 11:49, Gorka Eguileor <geguileo@redhat.com> wrote:

Hi,

You are most likely using an older release, have a high number of Cinder RBD volumes, and have not changed the configuration option "rbd_exclusive_cinder_pool" from its default "false" value.

Please add the following to your driver's section in cinder.conf:

    rbd_exclusive_cinder_pool = true

and restart the service.

Cheers,
Gorka.
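For context on why this option matters (my understanding, not spelled out above): when it is false, the RBD driver enumerates every image in the pool to compute usage for each periodic stats report, and with many volumes that scan can take longer than the heartbeat interval, making the service appear down. The option goes in the backend section named by enabled_backends; a sketch assuming the [ceph-rbd] section shown later in this thread:

    [ceph-rbd]
    # ...existing driver options...
    rbd_exclusive_cinder_pool = true   # pool is used only by Cinder; skip per-image scans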
ManuParra <mparra@iaa.es> wrote:

Hi Gorka, let me show the Cinder config:

    [ceph-rbd]
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_user = cinder
    backend_host = rbd:volumes
    rbd_pool = cinder.volumes
    volume_backend_name = ceph-rbd
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    ...

So, using rbd_exclusive_cinder_pool=true, the pool will be used just for volumes? But the log is saying there is no connection to the backend_host. Regards.
On 12 May 2021, at 23:34, Gorka Eguileor <geguileo@redhat.com> wrote:

Hi,

Your backend_host doesn't have a valid hostname; please set a proper hostname in that configuration option.

The next thing you need is for the cinder-volume service to be running correctly before making any requests.

I would try adding rbd_exclusive_cinder_pool=true, tailing the volume logs, and restarting the service. See if the logs show any ERROR-level entries. I would also check the service list output right after the service is restarted; if it's up, I would check it again after 2 minutes.

Cheers,
Gorka.
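A concrete version of that check sequence (log path and unit name assumed from a standard package-based install; adjust for containers):

    tail -f /var/log/cinder/volume.log &               # watch for ERROR entries
    systemctl restart openstack-cinder-volume.service
    openstack volume service list                      # right after the restart
    sleep 120
    openstack volume service list                      # again after 2 minutes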
ManuParra <mparra@iaa.es> wrote:

Hi Gorka again. Yes, the first thing is to find out why Cinder can't connect to that host (Ceph is actually set up for HA), so that's the way to go. I mention this because, from the beginning of our setup, it has always been like this, with that hostname, and there was no problem. As for the errors, the strangest thing is that in Monasca I have not found any error log, only warnings about "volume service is down. (host: rbd:volumes@ceph-rbd)" and info messages, which is even stranger. Regards.
On 13 May 2021, at 09:37, Gorka Eguileor <geguileo@redhat.com> wrote:

Have you tried the configuration change I recommended?
On 13 May 2021, at 12:55, ManuParra <mparra@iaa.es> wrote:

Hello Gorka, not yet. Let me update the Cinder configuration, add the option, restart Cinder, and I'll report back on the status. Do you recommend anything else to try in the same cycle? Regards.
On 13 May 2021, at 13:12, Tobias Urdin <tobias.urdin@binero.com> wrote:

Hello,

I just saw that you are running Ceph Octopus with the Train release, and I wanted to let you know that we saw issues with the os-brick version shipped with Train not supporting the Ceph Octopus client version. For our Ceph cluster running Octopus we had to keep the client version on Nautilus until upgrading to Victoria, which included a newer version of os-brick.

Maybe this is unrelated to your issue, but I just wanted to put it out there.

Best regards,
Tobias
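A quick way to see which client versions are actually in play on the cinder-volume host (the package names below are an assumption for an RPM-based install; adjust for your distro):

    ceph --version                           # Ceph client version
    rpm -q python3-os-brick python3-cinder   # packaged versions, if RPM-based
    pip3 show os-brick                       # if installed via pip instead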
On Sat, May 15, 2021 at 4:16 AM, Sebastian Luna Valero <sebastian.luna.valero@gmail.com> wrote:

Hi All,

Thanks for your inputs so far. I am also trying to help Manu with this issue.

The "cinder-volume" service was working properly with the existing configuration. However, after a power outage the service is no longer reported as "up".

Looking at the source code, the service status is reported as "down" by "cinder-scheduler" here: https://github.com/openstack/cinder/blob/stable/train/cinder/scheduler/host_... with the message:

    WARNING cinder.scheduler.host_manager [req-<>- default default] volume service is down. (host: rbd:volumes@ceph-rbd)

I printed out the "service" tuple (https://github.com/openstack/cinder/blob/stable/train/cinder/scheduler/host_...) and we get:

    2021-05-15 09:57:24.918 7 WARNING cinder.scheduler.host_manager [<> - default default] Service(active_backend_id=None,availability_zone='nova',binary='cinder-volume',cluster=<?>,cluster_name=None,created_at=2020-06-12T07:53:42Z,deleted=False,deleted_at=None,disabled=False,disabled_reason=None,frozen=False,host='rbd:volumes@ceph-rbd',id=12,modified_at=None,object_current_version='1.38',replication_status='disabled',report_count=8067424,rpc_current_version='3.16',topic='cinder-volume',updated_at=2021-05-12T15:37:52Z,uuid='604668e8-c2e7-46ed-a2b8-086e588079ac')

Cinder is configured with a Ceph RBD backend, as explained in https://github.com/openstack/kolla-ansible/blob/stable/train/doc/source/refe... That's where the "backend_host=rbd:volumes" configuration comes from.

We are using 3 controller nodes for OpenStack and 3 monitor nodes for Ceph. The Ceph cluster doesn't report any errors. The "cinder-volume" containers don't report any errors. Moreover, when we go inside a "cinder-volume" container, we are able to list the existing volumes with:

    rbd -p cinder.volumes --id cinder -k /etc/ceph/ceph.client.cinder.keyring ls

So the connection to the Ceph cluster works. Why is "cinder-scheduler" reporting that the backend is down?

Many thanks,
Sebastian
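Worth noting for this diagnosis: the scheduler's up/down decision is based purely on the service's heartbeat row in the database (the updated_at field in the tuple above), not on a live probe of Ceph. The relevant cinder.conf options and their usual defaults (stated from memory, so treat the exact values as an assumption):

    [DEFAULT]
    report_interval = 10     # seconds between cinder-volume heartbeat updates
    service_down_time = 60   # heartbeat older than this marks the service "down"

So "volume service is down" here means cinder-volume stopped writing heartbeats, which can happen even while Ceph itself is perfectly healthy.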
On Sat, 15 May 2021 at 19:40, Laurent Dumont <laurentfdumont@gmail.com> wrote:

That is a bit strange. I don't use the Ceph backend, so I don't know any magic tricks.

- I'm surprised that the debug logging level doesn't add anything else. Are there any other lines besides the "connecting" one?
- Can we narrow down the port/IP destination of the Ceph RBD traffic?
- Can we fail over the cinder-volume service to another controller and check the status of the volume service?
- Did the power outage impact the Ceph cluster + network gear + all the controllers?
- Does the content of /etc/ceph/ceph.conf appear to be valid inside the container?

Looking at the code (https://github.com/openstack/cinder/blob/stable/train/cinder/volume/drivers/...), it should raise an exception if there is a timeout when the connection client is built:

    except self.rados.Error:
        msg = _("Error connecting to ceph cluster.")
        LOG.exception(msg)
        client.shutdown()
        raise exception.VolumeBackendAPIException(data=msg)
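On the port/IP question: Ceph monitors listen on 6789 (msgr v1) and 3300 (msgr v2) by default, so a sketch of a reachability check from inside the cinder-volume container (<mon-ip> is a placeholder for one of the monitor addresses):

    ss -tn | grep -E ':6789|:3300'   # existing RADOS sessions, if any
    nc -zv <mon-ip> 6789             # basic TCP reachability to a monitor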
Sebastian Luna Valero <sebastian.luna.valero@gmail.com> wrote:

Thanks, Laurent. Long story short, we have been able to bring the "cinder-volume" service back up.

We restarted the "cinder-volume" and "cinder-scheduler" services with "debug=True" and got back the same debug message:

    2021-05-15 23:15:27.091 31 DEBUG cinder.volume.drivers.rbd [req-f43e30ae-2bdc-4690-9c1b-3e58081fdc9e - - - - -] connecting to cinder@ceph (conf=/etc/ceph/ceph.conf, timeout=-1). _do_conn /usr/lib/python3.6/site-packages/cinder/volume/drivers/rbd.py:431

Then I had a look at the docs for "timeout" configuration options (https://docs.openstack.org/cinder/train/configuration/block-storage/drivers/...):

    rados_connect_timeout = -1
    (Integer) Timeout value (in seconds) used when connecting to ceph cluster. If value < 0, no timeout is set and default librados value is used.

I added "rados_connect_timeout=15" to the "cinder.conf" file for the "cinder-volume" service. Before this change the "cinder-volume" logs ended with this message:

    2021-05-15 23:02:48.821 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Starting volume driver RBDDriver (1.2.0)

After the change:

    2021-05-15 23:02:48.821 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Starting volume driver RBDDriver (1.2.0)
    2021-05-15 23:04:23.180 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Driver initialization completed successfully.
    2021-05-15 23:04:23.190 31 INFO cinder.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Initiating service 12 cleanup
    2021-05-15 23:04:23.196 31 INFO cinder.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Service 12 cleanup completed.
    2021-05-15 23:04:23.315 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Initializing RPC dependent components of volume driver RBDDriver (1.2.0)
    2021-05-15 23:05:10.381 31 INFO cinder.volume.manager [req-6e8f9f46-ee34-4925-9fc8-dea8729d0d93 - - - - -] Driver post RPC initialization completed successfully.

And now the service is reported as "up" in "openstack volume service list" and we can successfully create Ceph volumes. Manu will do more validation tests today to confirm.

So it looks like the "cinder-volume" service didn't start up properly in the first place, and that's why the service was "down". Why did adding "rados_connect_timeout=15" to cinder.conf solve the issue? I honestly don't know; it was a matter of luck to try this out. If anyone knows the reason, we would love to know more.

Thank you very much again for your kind help!

Best regards,
Sebastian
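For anyone landing here later, the backend section that resulted from this thread would look roughly like this (a sketch combining the options quoted above; the values are this deployment's, not universal defaults):

    [ceph-rbd]
    volume_driver = cinder.volume.drivers.rbd.RBDDriver
    volume_backend_name = ceph-rbd
    rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_user = cinder
    rbd_pool = cinder.volumes
    backend_host = rbd:volumes
    rados_connect_timeout = 15        # fail fast instead of blocking forever
    rbd_exclusive_cinder_pool = true  # skip per-image pool scans in stats reports

One plausible reading of the fix, offered as a guess rather than a confirmed root cause: with timeout=-1 the initial RADOS connect can block indefinitely if a monitor endpoint left unhealthy by the outage never answers, so driver initialization never completes and the service never starts heartbeating; with a 15-second timeout the client gives up on the dead endpoint, retries, and initialization eventually succeeds. Note the roughly 95 seconds between "Starting volume driver" and "Driver initialization completed" in the log above, which is consistent with a few connect attempts timing out before one succeeds.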
On Sat, 15 May 2021 at 19:40, Laurent Dumont <laurentfdumont@gmail.com> wrote:

That is a bit strange. I don't use the Ceph backend, so I don't know any magic tricks.
- I'm surprised that the debug logging level doesn't add anything else. Are there any other lines besides the "connecting" one?
- Can we narrow down the port/IP destination for the Ceph RBD traffic? (A probe sketch follows below.)
- Can we fail over the cinder-volume service to another controller and check the status of the volume service?
- Did the power outage impact the Ceph cluster + network gear + all the controllers?
- Does the content of /etc/ceph/ceph.conf appear to be valid inside the container?
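For the port/IP question above, a minimal probe sketch in Python; the monitor addresses are placeholders (read the real ones from mon_host in /etc/ceph/ceph.conf inside the cinder-volume container), and 3300/6789 are the standard msgr2/msgr1 monitor ports:

import socket

# Placeholder monitor addresses; substitute the mon_host entries
# from /etc/ceph/ceph.conf.
MONITORS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]

def reachable(host, port, timeout=5):
    # True if a TCP connection to host:port succeeds within the timeout.
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

for mon in MONITORS:
    for port in (3300, 6789):  # msgr2 and msgr1 monitor ports
        state = "open" if reachable(mon, port) else "unreachable"
        print(f"{mon}:{port} {state}")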
Looking at the code - https://github.com/openstack/cinder/blob/stable/train/cinder/volume/drivers/...
It should raise an exception if there is a timeout when the connection client is built.
except self.rados.Error:
    msg = _("Error connecting to ceph cluster.")
    LOG.exception(msg)
    client.shutdown()
    raise exception.VolumeBackendAPIException(data=msg)
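For context, a hedged sketch of how the driver's connection helper applies rados_connect_timeout before reaching the except block above; it is modeled on the Train-era _do_conn in cinder/volume/drivers/rbd.py and simplified, so treat the exact librados option names as assumptions:

import rados

def do_conn(conf_path, user, timeout):
    # Build the librados client the way the driver does.
    client = rados.Rados(conffile=conf_path, rados_id=user)
    if timeout >= 0:
        # librados expects string values; a negative timeout means
        # "leave the librados defaults alone" (hence the -1 in the logs).
        client.conf_set('rados_osd_op_timeout', str(timeout))
        client.conf_set('rados_mon_op_timeout', str(timeout))
        client.conf_set('client_mount_timeout', str(timeout))
    try:
        client.connect()
    except rados.Error:
        client.shutdown()
        raise
    return client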
On Sat, May 15, 2021 at 4:16 AM Sebastian Luna Valero <sebastian.luna.valero@gmail.com> wrote:
Hi All,
Thanks for your inputs so far. I am also trying to help Manu with this issue.
The "cinder-volume" service was working properly with the existing configuration. However, after a power outage the service is no longer reported as "up".
Looking at the source code, the service status is reported as "down" by "cinder-scheduler" here:
https://github.com/openstack/cinder/blob/stable/train/cinder/scheduler/host_...
With message: "WARNING cinder.scheduler.host_manager [req-<>- default default] volume service is down. (host: rbd:volumes@ceph-rbd)"
I printed out the "service" tuple https://github.com/openstack/cinder/blob/stable/train/cinder/scheduler/host_... and we get:
"2021-05-15 09:57:24.918 7 WARNING cinder.scheduler.host_manager [<> - default default] Service(active_backend_id=None,availability_zone='nova',binary='cinder-volume',cluster=<?>,cluster_name=None,created_at=2020-06-12T07:53:42Z,deleted=False,deleted_at=None,disabled=False,disabled_reason=None,frozen=False,host='rbd:volumes@ceph-rbd ',id=12,modified_at=None,object_current_version='1.38',replication_status='disabled',report_count=8067424,rpc_current_version='3.16',topic='cinder-volume',updated_at=2021-05-12T15:37:52Z,uuid='604668e8-c2e7-46ed-a2b8-086e588079ac')"
Cinder is configured with a Ceph RBD backend, as explained in https://github.com/openstack/kolla-ansible/blob/stable/train/doc/source/refe...
That's where the "backend_host=rbd:volumes" configuration is coming from.
We are using 3 controller nodes for OpenStack and 3 monitor nodes for Ceph.
The Ceph cluster doesn't report any error. The "cinder-volume" containers don't report any error. Moreover, when we go inside the "cinder-volume" container we are able to list existing volumes with:
rbd -p cinder.volumes --id cinder -k /etc/ceph/ceph.client.cinder.keyring ls
So the connection to the Ceph cluster works.
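To rule out the rbd CLI behaving differently from cinder's own Python bindings, the same check can be done with librados/librbd from inside the container; a sketch, with the pool, user, and keyring taken from the command above:

import rados
import rbd

cluster = rados.Rados(conffile='/etc/ceph/ceph.conf', rados_id='cinder',
                      conf={'keyring': '/etc/ceph/ceph.client.cinder.keyring'})
cluster.connect(timeout=5)  # fail fast instead of hanging like the driver did
try:
    ioctx = cluster.open_ioctx('cinder.volumes')
    try:
        print(rbd.RBD().list(ioctx))  # same listing as "rbd ... ls"
    finally:
        ioctx.close()
finally:
    cluster.shutdown()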
Why is "cinder-scheduler" reporting the that the backend Ceph cluster is down?
Many thanks, Sebastian
On Thu, 13 May 2021 at 13:12, Tobias Urdin <tobias.urdin@binero.com> wrote:
Hello,
I just saw that you are running Ceph Octopus with the Train release and wanted to let you know that we saw issues with the os-brick version shipped with Train not supporting the Ceph Octopus client version.

So for our Ceph cluster running Octopus we had to keep the client version on Nautilus until we upgraded to Victoria, which includes a newer version of os-brick.
Maybe this is unrelated to your issue but just wanted to put it out there.
Best regards Tobias
On 13 May 2021, at 12:55, ManuParra <mparra@iaa.es> wrote:
Hello Gorka, not yet; let me update the cinder configuration, add the option, restart cinder, and I'll report back on the status. Do you recommend anything else to try for this cycle? Regards.
On 13 May 2021, at 09:37, Gorka Eguileor <geguileo@redhat.com> wrote:
On 13/05, ManuParra wrote:
> Hi Gorka again, yes, the first thing is to know why we can't connect to that host (Ceph is actually set up for HA), so that's the way to go. I mention this because it has been set up like that, with that hostname, since the beginning of our deployment, and there has never been a problem.
>
> As for the errors, the strangest thing is that in Monasca I have not found any error log entries, only the warning "volume service is down. (host: rbd:volumes@ceph-rbd)" and info messages, which is even stranger.
Have you tried the configuration change I recommended?
> Regards.
On 12 May 2021, at 23:34, Gorka Eguileor <geguileo@redhat.com> wrote:

On 12/05, ManuParra wrote:
> Hi Gorka, let me show the cinder config:
>
> [ceph-rbd]
> rbd_ceph_conf = /etc/ceph/ceph.conf
> rbd_user = cinder
> backend_host = rbd:volumes
> rbd_pool = cinder.volumes
> volume_backend_name = ceph-rbd
> volume_driver = cinder.volume.drivers.rbd.RBDDriver
> …
>
> So, with rbd_exclusive_cinder_pool=True the pool will be used just for volumes? But the log is saying there is no connection to the backend_host.

Hi,

Your backend_host doesn't have a valid hostname; please set a proper hostname in that configuration option.

Then the next thing you need is the cinder-volume service running correctly before making any requests.

I would try adding rbd_exclusive_cinder_pool=true, then tailing the volume logs and restarting the service.

See if the logs show any ERROR level entries.

I would also check the service-list output right after the service is restarted; if it's up, I would check it again after 2 minutes.

Cheers,
Gorka.
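For readers wondering why rbd_exclusive_cinder_pool matters here: on Train-era releases, leaving it at its default of false makes the driver's periodic stats update walk every image in the pool to compute provisioned capacity, and with many volumes that walk can take long enough to starve the service's heartbeat, so the scheduler marks it down. A hedged, much-simplified sketch of that expensive path (not the actual driver code):

import rbd

def total_provisioned_bytes(ioctx):
    # This is the O(number-of-volumes) work the driver can skip when
    # rbd_exclusive_cinder_pool = true, since then all usage in the
    # pool is known to belong to cinder.
    total = 0
    for name in rbd.RBD().list(ioctx):
        with rbd.Image(ioctx, name, read_only=True) as image:
            total += image.size()
    return total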
Glad to know it was resolved! It's a bit weird that explicitly setting the parameter works, but good to know!
participants (7)
- DHilsbos@performair.com
- Eugen Block
- Gorka Eguileor
- Laurent Dumont
- ManuParra
- Sebastian Luna Valero
- Tobias Urdin