[Openstack-operators] Very Slow API operations after Grizzly upgrade.

Jonathan Proulx jon at jonproulx.com
Thu Aug 15 14:45:03 UTC 2013


Hi All,

I have a cloud with a single controller node and 60 compute nodes on Ubuntu
12.04 / cloud archive, and after the upgrade to Grizzly everything seems
painfully slow.

I've had 'nova list' take on the order of one minute to return (there are 65
non-deleted instances and a total of just under 500k instances in the
instances table, but that was true before the upgrade as well).
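
To put numbers on a slowdown like this, the call can be timed over a few
runs.  The following is only a minimal sketch, assuming the nova CLI is on
the PATH and credentials are already set in the environment; the helper
names time_command and summarize are hypothetical and not part of any
OpenStack tool.

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=3):
    """Run cmd `runs` times; return each run's wall-clock duration in seconds."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        # Output is discarded; only the elapsed time matters here.
        subprocess.run(
            cmd,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return (min, median, max) of a list of durations."""
    return min(durations), statistics.median(durations), max(durations)


# Example (assumes a working nova client):
#   print(summarize(time_command(["nova", "list"], runs=5)))
```

Comparing the median against the same measurement on a pre-upgrade system,
if one is still available, would show whether the API path itself got
slower.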

The controller node is 4x busier with this tiny load of a single user and a
few VMs than it averaged in production with 1,500 VMs, dozens of users, and
VMs starting every 6 seconds on average.

This has me a little worried, but the system is so over-spec'ed that I can't
see it as my current problem: the previous average was 5% CPU utilization,
so now I'm only at 20%.  All the databases fit comfortably in memory with
plenty of room for caching, so my disk I/O is virtually nothing.

Not quite sure where to start.  I'd like to blame conductor for serializing
database access, but I really hope any service could handle at least one
rack of servers before you need to scale out.  Besides the poor user
experience of sluggish responses, I'm also getting timeouts if I try to
start a few tens of servers at once, and the usual workflow around here
often involves hundreds.

Has anyone had similar problems, and/or does anyone have suggestions of
where else to look for bottlenecks?

-Jon