[kolla] All services status DOWN after re-launching whole cluster.

Erik McCormick emccormick at cirrusseven.com
Tue Feb 4 13:19:36 UTC 2020


On Tue, Feb 4, 2020, 7:20 AM Eddie Yen <missile0407 at gmail.com> wrote:

> Hi everyone,
> We have a Kolla OpenStack site with 3 HCI (Controller+Compute) nodes and 3
> Storage (Ceph OSD) nodes, without internet access. We shut it down a few
> days ago for the CNY holidays.
> Today we brought the whole cluster back up. First we hit an issue where the
> MariaDB containers kept restarting, which we fixed using the
> mariadb_recovery command.
> After that we checked the status of each service and found that all
> services shown under Admin > System > System Information are DOWN.
> Strangely, no MariaDB, AMQP connection, or other errors appear in the logs
> of the downed services.
> We tried rebooting each server, but the situation stayed the same. Then we
> found that the RabbitMQ log was not updating; the last entry is still from
> the date we shut down. Logging in to the RabbitMQ container and running
> "rabbitmqctl status" shows connection refused, and accessing its web
> manager at <VIP>:15672 in a browser just gives a "503 Service unavailable"
> message. Also, nothing is listening on port 5672.

Any chance you have a NIC that didn't come up? What is in the log of the
container itself (i.e. docker logs rabbitmq)?
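
To narrow down whether this is a network problem or RabbitMQ itself not
starting, it may help to probe the two ports mentioned above (5672 for AMQP,
15672 for the web manager) on each controller's own address rather than only
through the VIP. The following is a minimal sketch, not a Kolla tool: the
function names are made up for illustration, and you would pass in your own
controller addresses.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False


def check_rabbitmq_ports(hosts, ports=(5672, 15672)):
    """Map each (host, port) pair to whether it accepts TCP connections.

    The default ports are the AMQP and web manager ports from the report;
    the host list is an assumption supplied by the caller.
    """
    return {(h, p): port_is_open(h, p) for h in hosts for p in ports}
```

Calling check_rabbitmq_ports() with the controller addresses returns a dict
keyed by (host, port). If every controller shows both ports closed while the
hosts are otherwise reachable, that points at the RabbitMQ containers rather
than at a NIC, and the container log is the next place to look.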

> I searched for this issue on the internet but found very little information.
> One suggested solution is to delete some files in the mnesia folder;
> another is to remove the rabbitmq container and its volume and then
> re-deploy. But I am not sure about either. Does anyone know how to solve it?
> Many thanks,
> Eddie.

