Cinder API timeout on single-control node
Eugen Block
eblock at nde.ag
Wed Sep 1 12:02:25 UTC 2021
Hi *,
since your last responses regarding RabbitMQ were quite helpful, I
would like to ask a different question about the same environment. It's
an older OpenStack release (Pike) running with only one control node.
There were already lots of volumes (way more than 1000) in that cloud,
but after a bunch more were added to one project (I'm not sure how many
exactly), the whole Cinder API became extremely slow. Both Horizon and
the CLI run into timeouts:
[Wed Sep 01 13:18:52.109178 2021] [wsgi:error] [pid 60440] [client <IP>:58474] Timeout when reading response headers from daemon process 'horizon': /srv/www/openstack-dashboard/openstack_dashboard/wsgi/django.wsgi, referer: http://<control>/project/volumes/
[Wed Sep 01 13:18:53.664714 2021] [wsgi:error] [pid 13007] Not Found: /favicon.ico
Sometimes volume creation succeeds if you just retry, but it often
fails. The dashboard shows a "504 Gateway Timeout" after two minutes
(or after four minutes, once I had increased the timeout in the Apache
dashboard config).
The timeout occurs even when I open the Volumes tab of an empty
project.
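To put a number on how often this happens, a minimal sketch like the
following could count the mod_wsgi timeouts per daemon process in the
Apache error log. It relies only on the message format shown in the
excerpt above. The helper name count_timeouts and the sample text are
illustrative assumptions, and the location of the real error log is
deliberately left out because it depends on the distribution.

```python
import re
from collections import Counter

# Matches the mod_wsgi timeout message from the error log excerpt and
# captures the quoted daemon process name. \s+ tolerates a line break
# directly before the quoted name.
TIMEOUT_RE = re.compile(
    r"Timeout when reading response headers from daemon process\s+'([^']+)'"
)


def count_timeouts(log_text):
    """Return a Counter mapping daemon process name -> number of timeouts."""
    return Counter(TIMEOUT_RE.findall(log_text))


# Sample input copied from the excerpt; a real run would read the
# Apache error log file instead (path is installation-specific).
sample = (
    "[Wed Sep 01 13:18:52.109178 2021] [wsgi:error] [pid 60440] [client "
    "<IP>:58474] Timeout when reading response headers from daemon process "
    "'horizon': /srv/www/openstack-dashboard/openstack_dashboard/wsgi/"
    "django.wsgi, referer: http://<control>/project/volumes/\n"
    "[Wed Sep 01 13:18:53.664714 2021] [wsgi:error] [pid 13007] Not Found: "
    "/favicon.ico\n"
)

for daemon, hits in count_timeouts(sample).items():
    print(f"{daemon}: {hits} timeout(s)")
```

Running this against the log before and after a configuration change
would at least show whether the change made a difference, without
having to watch the dashboard by hand.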
A couple of weeks ago I already noticed some performance issues with
the Cinder API when there are lots of attached volumes; many
"available" volumes don't seem to slow things down. But since then the
total number of volumes has doubled. At the moment there are more than
960 attached volumes across all projects and more than 750 detached
volumes. I searched cinder.conf for any helpful setting, but I'm not
sure which would actually help. And since it's a production cloud, I
would like to avoid restarting services all the time just to try
something. Maybe some of you can point me in the right direction? I
would appreciate any help!
If there's more information I can provide, just let me know.
Thanks!
Eugen