[kolla][cinder] cinder containers api, volume, backup unhealthy

30 May 2023

Hi,

After replacing my control nodes with new ones (all bare-metal), the 
cinder containers no longer start.

I checked the logs on one of the control nodes and see this in the 
api-error.log:

2023-05-30 21:31:36.350636 Timeout when reading response headers from daemon process 'cinder-api': /var/www/cgi-bin/cinder/cinder-wsgi
2023-05-30 21:31:37.827101 mod_wsgi (pid=22): Failed to exec Python script file '/var/www/cgi-bin/cinder/cinder-wsgi'.
2023-05-30 21:31:37.827168 mod_wsgi (pid=22): Exception occurred processing WSGI script '/var/www/cgi-bin/cinder/cinder-wsgi'.
2023-05-30 21:31:37.828005 Traceback (most recent call last):
2023-05-30 21:31:37.828046   File "/var/www/cgi-bin/cinder/cinder-wsgi", line 52, in <module>
2023-05-30 21:31:37.828053     application = initialize_application()
2023-05-30 21:31:37.828058   File "/var/lib/kolla/venv/lib/python3.6/site-packages/cinder/wsgi/wsgi.py", line 44, in initialize_application
2023-05-30 21:31:37.828063     coordination.COORDINATOR.start()
2023-05-30 21:31:37.828068   File "/var/lib/kolla/venv/lib/python3.6/site-packages/cinder/coordination.py", line 86, in start
2023-05-30 21:31:37.828071     self.coordinator.start(start_heart=True)
2023-05-30 21:31:37.828075   File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/coordination.py", line 689, in start
2023-05-30 21:31:37.828078     super(CoordinationDriverWithExecutor, self).start(start_heart)
2023-05-30 21:31:37.828083   File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/coordination.py", line 426, in start
2023-05-30 21:31:37.828086     self._start()
2023-05-30 21:31:37.828090   File "/var/lib/kolla/venv/lib/python3.6/site-packages/tooz/drivers/etcd3gw.py", line 224, in _start
2023-05-30 21:31:37.828093     self._membership_lease = self.client.lease(self.membership_timeout)
2023-05-30 21:31:37.828098   File "/var/lib/kolla/venv/lib/python3.6/site-packages/etcd3gw/client.py", line 122, in lease
2023-05-30 21:31:37.828111     json={"TTL": ttl, "ID": 0})
2023-05-30 21:31:37.828116   File "/var/lib/kolla/venv/lib/python3.6/site-packages/etcd3gw/client.py", line 88, in post
2023-05-30 21:31:37.828123     resp.reason
2023-05-30 21:31:37.828154 etcd3gw.exceptions.ConnectionTimeoutError: Gateway Time-out
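
The last frames of the traceback show cinder-api failing while it requests a membership lease from etcd through etcd3gw, with the JSON body {"TTL": ttl, "ID": 0}. A rough way to repeat that single request outside of cinder is sketched below, using only the Python standard library. Everything not visible in the log is an assumption: the host (192.0.2.10 is a placeholder address), port 2379, the "/v3/" API prefix (older etcd releases expose "/v3alpha/" or "/v3beta/" instead), the TTL of 30 seconds, and the helper names lease_grant_request and probe_lease_grant, which are made up for illustration.

```python
import json
import urllib.error
import urllib.request


def lease_grant_request(host, port=2379, ttl=30, api_path="/v3/", scheme="http"):
    """Build the URL and JSON body for an etcd lease grant request.

    host, port, ttl and api_path are assumptions; adjust them to the
    coordination backend_url configured in cinder.conf.
    """
    url = f"{scheme}://{host}:{port}{api_path}lease/grant"
    body = {"TTL": ttl, "ID": 0}  # same payload shape as in the traceback
    return url, body


def probe_lease_grant(host, timeout=5.0, **kwargs):
    """POST the lease grant and return (HTTP status, response text).

    Connection-level failures (refused, unreachable, socket timeout)
    are not caught here and propagate to the caller.
    """
    url, body = lease_grant_request(host, **kwargs)
    req = urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    try:
        with urllib.request.urlopen(req, timeout=timeout) as resp:
            return resp.status, resp.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as exc:
        # A 504 here would match the "Gateway Time-out" reason in the log.
        return exc.code, exc.read().decode("utf-8", "replace")
```

Calling probe_lease_grant("192.0.2.10") with the real etcd address, once from the control node and once from inside the cinder_api container, might help narrow down whether the timeout comes from etcd itself or from something sitting between the container and etcd.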

All other containers are working just fine. Even the cinder_scheduler 
container works fine.

So far I have tried the following:

Removed the cinder containers, including their volumes, from one control node

Ran mariadb_recovery

Rebooted all control nodes

Ran kolla-ansible reconfigure --tags cinder,nova

Any help is highly appreciated.

Cheers,

Oliver

Oliver Weinmann
