Restart cinder-volume with Ceph RBD

Sebastian Luna Valero sebastian.luna.valero at gmail.com
Sat May 15 08:08:24 UTC 2021


Hi All,

Thanks for your inputs so far. I am also trying to help Manu with this
issue.

The "cinder-volume" service was working properly with the existing
configuration. However, after a power outage the service is no longer
reported as "up".

Looking at the source code, the service status is reported as "down" by
"cinder-scheduler" here:

https://github.com/openstack/cinder/blob/stable/train/cinder/scheduler/host_manager.py#L618

With message: "WARNING cinder.scheduler.host_manager [req-<>- default
default] volume service is down. (host: rbd:volumes@ceph-rbd)"

I printed out the "service" tuple
https://github.com/openstack/cinder/blob/stable/train/cinder/scheduler/host_manager.py#L615
and we get:

"2021-05-15 09:57:24.918 7 WARNING cinder.scheduler.host_manager [<> -
default default]
Service(active_backend_id=None,availability_zone='nova',binary='cinder-volume',cluster=<?>,cluster_name=None,created_at=2020-06-12T07:53:42Z,deleted=False,deleted_at=None,disabled=False,disabled_reason=None,frozen=False,host='rbd:volumes@ceph-rbd',id=12,modified_at=None,object_current_version='1.38',replication_status='disabled',report_count=8067424,rpc_current_version='3.16',topic='cinder-volume',updated_at=2021-05-12T15:37:52Z,uuid='604668e8-c2e7-46ed-a2b8-086e588079ac')"
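
For context, as far as I understand the check behind that warning, the
scheduler simply compares the service's last heartbeat ("updated_at") against
the "service_down_time" option (60 seconds by default, if I remember
correctly). A simplified sketch of that logic, not the exact Train code:

    from datetime import datetime, timedelta

    SERVICE_DOWN_TIME = 60  # seconds; cinder's default for service_down_time

    def service_is_up(service):
        """Rough equivalent of the check applied to each service row."""
        last_heartbeat = service['updated_at'] or service['created_at']
        # The service counts as "up" only if its heartbeat is recent enough.
        return abs(datetime.utcnow() - last_heartbeat) <= timedelta(
            seconds=SERVICE_DOWN_TIME)

Note that in the tuple above "updated_at" is 2021-05-12 while the log line is
from 2021-05-15, so this check fails regardless of the state of the Ceph
cluster; it only reflects whether cinder-volume keeps sending heartbeats.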

Cinder is configured with a Ceph RBD backend, as explained in
https://github.com/openstack/kolla-ansible/blob/stable/train/doc/source/reference/storage/external-ceph-guide.rst#cinder

That's where the "backend_host=rbd:volumes" configuration is coming from.

We are using 3 controller nodes for OpenStack and 3 monitor nodes for Ceph.

The Ceph cluster doesn't report any error. The "cinder-volume" containers
don't report any error. Moreover, when we go inside the "cinder-volume"
container we are able to list existing volumes with:

rbd -p cinder.volumes --id cinder -k /etc/ceph/ceph.client.cinder.keyring ls

So the connection to the Ceph cluster works.
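
For what it's worth, the same check can be reproduced with the Python
bindings the RBD driver itself uses (a minimal sketch, assuming python-rados
and python-rbd are installed in the container; pool, user and keyring are the
same ones used in the CLI command above):

    import rados
    import rbd

    # Connect as the "cinder" client, exactly like the rbd CLI call above.
    cluster = rados.Rados(conffile='/etc/ceph/ceph.conf',
                          rados_id='cinder',
                          conf={'keyring': '/etc/ceph/ceph.client.cinder.keyring'})
    cluster.connect()
    ioctx = cluster.open_ioctx('cinder.volumes')
    try:
        print(rbd.RBD().list(ioctx))  # equivalent to "rbd -p cinder.volumes ... ls"
    finally:
        ioctx.close()
        cluster.shutdown()

If that also returns the volume list, connectivity from the container through
the bindings is fine as well.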

Why is "cinder-scheduler" reporting the that the backend Ceph cluster is
down?

Many thanks,
Sebastian


On Thu, 13 May 2021 at 13:12, Tobias Urdin <tobias.urdin at binero.com> wrote:

> Hello,
>
> I just saw that you are running Ceph Octopus with the Train release and wanted
> to let you know that we saw issues with the os-brick version shipped with
> Train not supporting the Ceph Octopus client version.
>
> So for our Ceph cluster running Octopus we had to keep the client version
> on Nautilus until upgrading to Victoria, which included a newer version of
> os-brick.
>
> Maybe this is unrelated to your issue but just wanted to put it out there.
>
> Best regards
> Tobias
>
> > On 13 May 2021, at 12:55, ManuParra <mparra at iaa.es> wrote:
> >
> > Hello Gorka, not yet; let me update the cinder configuration, add the
> option, restart cinder, and I'll update the status.
> > Do you recommend anything else to try in this cycle?
> > Regards.
> >
> >> On 13 May 2021, at 09:37, Gorka Eguileor <geguileo at redhat.com> wrote:
> >>
> >>> On 13/05, ManuParra wrote:
> >>> Hi Gorka again. Yes, the first thing is to find out why it can't connect
> to that host (Ceph is actually set up for HA), so that's the way to approach
> it. I mention this because, from the very beginning of our setup, it has
> always been like that, with that hostname, and there has never been a
> problem.
> >>>
> >>> As for the errors, the strangest thing is that in Monasca I have not
> found any error log, only the warning "volume service is down. (host:
> rbd:volumes@ceph-rbd)" and info messages, which is even stranger.
> >>
> >> Have you tried the configuration change I recommended?
> >>
> >>
> >>>
> >>> Regards.
> >>>
> >>>> On 12 May 2021, at 23:34, Gorka Eguileor <geguileo at redhat.com> wrote:
> >>>>
> >>>> On 12/05, ManuParra wrote:
> >>>>> Hi Gorka, let me show the cinder config:
> >>>>>
> >>>>> [ceph-rbd]
> >>>>> rbd_ceph_conf = /etc/ceph/ceph.conf
> >>>>> rbd_user = cinder
> >>>>> backend_host = rbd:volumes
> >>>>> rbd_pool = cinder.volumes
> >>>>> volume_backend_name = ceph-rbd
> >>>>> volume_driver = cinder.volume.drivers.rbd.RBDDriver
> >>>>> …
> >>>>>
> >>>>> So, with rbd_exclusive_cinder_pool=True the pool will be treated as used
> just for volumes? But the log says there is no connection to the backend_host.
> >>>>
> >>>> Hi,
> >>>>
> >>>> Your backend_host doesn't have a valid hostname; please set a proper
> >>>> hostname in that configuration option.
> >>>>
> >>>> Then the next thing you need to have is the cinder-volume service
> >>>> running correctly before making any requests.
> >>>>
> >>>> I would try adding rbd_exclusive_cinder_pool=true then tailing the
> >>>> volume logs, and restarting the service.
> >>>>
> >>>> See if the logs show any ERROR level entries.
> >>>>
> >>>> I would also check the service-list output right after the service is
> >>>> restarted; if it's up, I would check it again after 2 minutes.
> >>>>
> >>>> Cheers,
> >>>> Gorka.
> >>>>
> >>>>
> >>>>>
> >>>>> Regards.
> >>>>>
> >>>>>
> >>>>>> On 12 May 2021, at 11:49, Gorka Eguileor <geguileo at redhat.com>
> wrote:
> >>>>>>
> >>>>>> On 12/05, ManuParra wrote:
> >>>>>>> Thanks, I have restarted the service and I see that after a few
> minutes the cinder-volume service goes down again when I check it with the
> command "openstack volume service list".
> >>>>>>> The host/service that contains the cinder volumes is
> rbd:volumes@ceph-rbd, which is RBD in Ceph, so the problem does not come
> from Cinder but rather from Ceph or from the RBD (Ceph) pools that store the
> volumes. I have checked Ceph and the status of everything is correct, no
> errors or warnings.
> >>>>>>> The error I have is that cinder can't connect to
> rbd:volumes@ceph-rbd. Any further suggestions? Thanks in advance.
> >>>>>>> Kind regards.
> >>>>>>>
> >>>>>>
> >>>>>> Hi,
> >>>>>>
> >>>>>> You are most likely using an older release, have a high number of
> cinder RBD volumes, and have not changed the configuration option
> "rbd_exclusive_cinder_pool" from its default "false" value.
> >>>>>>
> >>>>>> Please add to your driver's section in cinder.conf the following:
> >>>>>>
> >>>>>> rbd_exclusive_cinder_pool = true
> >>>>>>
> >>>>>>
> >>>>>> And restart the service.
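> >>>>>>
> >>>>>> (For context, as I understand it, with the default
> >>>>>> rbd_exclusive_cinder_pool=false the driver computes provisioned capacity
> >>>>>> by opening every image in the pool during its periodic stats update;
> >>>>>> with many volumes that can take longer than service_down_time, so the
> >>>>>> service looks down. A rough sketch of that work, not the actual driver
> >>>>>> code:
> >>>>>>
> >>>>>>     import rbd
> >>>>>>
> >>>>>>     def provisioned_capacity_gb(ioctx):
> >>>>>>         total_bytes = 0
> >>>>>>         # Every image in the pool is opened just to read its size.
> >>>>>>         for name in rbd.RBD().list(ioctx):
> >>>>>>             with rbd.Image(ioctx, name, read_only=True) as image:
> >>>>>>                 total_bytes += image.size()
> >>>>>>         return total_bytes / (1024 ** 3)
> >>>>>>
> >>>>>> Setting the option to true skips this scan, since Cinder then assumes it
> >>>>>> is the only user of the pool and relies on its own accounting instead.)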
> >>>>>>
> >>>>>> Cheers,
> >>>>>> Gorka.
> >>>>>>
> >>>>>>>> On 11 May 2021, at 22:30, Eugen Block <eblock at nde.ag> wrote:
> >>>>>>>>
> >>>>>>>> Hi,
> >>>>>>>>
> >>>>>>>> so restart the volume service ;-)
> >>>>>>>>
> >>>>>>>> systemctl restart openstack-cinder-volume.service
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> Zitat von ManuParra <mparra at iaa.es>:
> >>>>>>>>
> >>>>>>>>> Dear OpenStack community,
> >>>>>>>>>
> >>>>>>>>> I encountered a problem a few days ago: when creating new volumes
> with:
> >>>>>>>>>
> >>>>>>>>> "openstack volume create --size 20 testmv"
> >>>>>>>>>
> >>>>>>>>> the volume creation status shows an error.  If I go to the error
> log detail it indicates:
> >>>>>>>>>
> >>>>>>>>> "Schedule allocate volume: Could not find any available weighted
> backend".
> >>>>>>>>>
> >>>>>>>>> Then I go to the cinder log and indeed it indicates:
> >>>>>>>>>
> >>>>>>>>> "volume service is down - host: rbd:volumes at ceph-rbd”.
> >>>>>>>>>
> >>>>>>>>> I check with:
> >>>>>>>>>
> >>>>>>>>> "openstack volume service list”  in which state are the services
> and I see that indeed this happens:
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>> | cinder-volume | rbd:volumes@ceph-rbd | nova | enabled | down | 2021-04-29T09:48:42.000000 |
> >>>>>>>>>
> >>>>>>>>> And it has been down since 2021-04-29!
> >>>>>>>>>
> >>>>>>>>> I have checked Ceph (monitors, managers, OSDs, etc.) and there are
> no problems with the Ceph backend; everything is apparently working.
> >>>>>>>>>
> >>>>>>>>> This happened after an uncontrolled outage. So my question is: how
> do I restart only cinder-volume (I also have cinder-backup and
> cinder-scheduler, but they are OK)?
> >>>>>>>>>
> >>>>>>>>> Thank you very much in advance. Regards.
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>
> >>>>>>>
> >>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >>
> >>
> >
> >
>

