[openstack-dev] [Openstack][Nova] Slow Nova API Calls from periodic-tempest-devstack-vm-check-hourly

Mark McLoughlin markmc at redhat.com
Mon Jan 7 06:46:56 UTC 2013


On Fri, 2013-01-04 at 11:58 -0800, Joe Gordon wrote:
> Conclusion
> =========
> 
> Nova REST calls take way long, even in small scale devstack/tempest
> testing.  We should spend resources on fixing this for Grizzly.  I would
> like to start a dialog on how we can fix this and see if anyone wants to
> volunteer help.  Below are a few possible ideas of things that will help us
> make the REST calls faster.
> 
> * Continuously track REST API times of devstack/tempest jobs, to identify
> slow API calls and watch for trends.  (logstash + statsd?)
> * Identify the underlying cause of the slow performance.  It looks like
> nova.db.api may be partially responsible, but not the only issue?  What
> else is a bottleneck?
> * Enable millisecond level time logging
> https://review.openstack.org/#/c/18933/
> * Add an option in devstack to enable python profiling?  What's the best way
> to do this?
> * Make more REST API related code asynchronous (via RPC)

So the summary is that we're randomly seeing a multi-second delay in
nova-api between the time an HTTP request is received and the time we
send an AMQP message?

Suspecting the DB calls is a good first guess. The ability to log slow
DB calls would be nice. Perhaps MySQL has something we can use here,
assuming we log some sort of client ID on the nova-api side that we can
use to correlate with the MySQL logs.
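
As a rough illustration of the client-side half of this, here is a minimal
sketch that times a DB call and logs it, tagged with a request ID, when it
exceeds a threshold. The names log_if_slow and fetch_instances, the
instances table and the 0.5 second threshold are all hypothetical, and
sqlite3 stands in for MySQL only so that the example is self-contained; this
is not how nova.db.api is written. On the server side, MySQL's slow query
log (slow_query_log and long_query_time) could provide the other half.

```python
import contextlib
import logging
import sqlite3
import time

LOG = logging.getLogger("slow_db")


@contextlib.contextmanager
def log_if_slow(description, request_id, threshold=0.5):
    """Warn when the wrapped block takes at least `threshold` seconds.

    `request_id` is a hypothetical per-request identifier that could be
    matched against server-side logs later.
    """
    start = time.monotonic()
    try:
        yield
    finally:
        elapsed = time.monotonic() - start
        if elapsed >= threshold:
            LOG.warning("slow DB call [%s] %s took %.3fs",
                        request_id, description, elapsed)


def fetch_instances(conn, request_id, threshold=0.5):
    """Example query wrapped in the slow-call logger (hypothetical schema)."""
    query = "SELECT id, name FROM instances"
    with log_if_slow(query, request_id, threshold):
        return conn.execute(query).fetchall()
```

With a threshold of 0.0 every call is logged, which is a quick way to check
that the request ID really shows up in the log line.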

The external processes we run are another place to look. Again, logging
any process that takes longer than a certain threshold to run might be
good.
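
A minimal sketch of that idea, assuming a plain wrapper around
subprocess.run rather than nova's own process execution helper. The name
timed_execute and the 1 second default threshold are made up for
illustration.

```python
import logging
import subprocess
import sys
import time

LOG = logging.getLogger("slow_exec")


def timed_execute(cmd, threshold=1.0):
    """Run `cmd` (a list of arguments) and warn if it takes too long.

    Returns the CompletedProcess and the elapsed wall-clock time in seconds.
    """
    start = time.monotonic()
    result = subprocess.run(cmd, capture_output=True, text=True)
    elapsed = time.monotonic() - start
    if elapsed >= threshold:
        LOG.warning("slow command %r took %.3fs (exit code %d)",
                    cmd, elapsed, result.returncode)
    return result, elapsed


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    # A child Python process that sleeps stands in for a slow external tool.
    timed_execute([sys.executable, "-c", "import time; time.sleep(0.2)"],
                  threshold=0.1)
```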

I'd also investigate whether some other greenthread is blocking the API
request thread from running. What would be really nice would be some
way of logging when a particular greenthread runs for a long time while
other greenthreads are pending.
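
One way to approximate this is a heartbeat task that sleeps for a short
interval and reports how late it wakes up; a large lag means something else
held the scheduler without yielding. The sketch below uses asyncio from the
standard library purely as a stand-in cooperative scheduler so that it runs
without extra dependencies. It is not eventlet code, and the names
lag_monitor, blocking_task and demo are hypothetical. It also only shows
that something blocked, not which task did. Eventlet itself has
eventlet.debug.hub_blocking_detection, which may be a more direct fit.

```python
import asyncio
import time


async def lag_monitor(interval, threshold, report):
    """Wake every `interval` seconds and report wake-ups that are late.

    `report` is called with the lag in seconds whenever the lag is at
    least `threshold`.
    """
    loop = asyncio.get_running_loop()
    while True:
        expected = loop.time() + interval
        await asyncio.sleep(interval)
        lag = loop.time() - expected
        if lag >= threshold:
            report(lag)


async def blocking_task():
    """Simulate a task that hogs the scheduler with a blocking call."""
    time.sleep(0.3)  # deliberately blocking: never yields to other tasks


async def demo():
    lags = []
    monitor = asyncio.create_task(lag_monitor(0.05, 0.1, lags.append))
    await asyncio.sleep(0.1)   # let the monitor settle
    await blocking_task()      # starves the monitor for about 0.3s
    await asyncio.sleep(0.1)   # give the monitor a chance to report
    monitor.cancel()
    try:
        await monitor
    except asyncio.CancelledError:
        pass
    return lags


if __name__ == "__main__":
    print(asyncio.run(demo()))
```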

It would also be good to collect some system level stats to check whether
the system is simply CPU or I/O bound while these slow calls are
happening. Maybe we could log CPU, memory, disk and network usage
along with the top few processes every tenth of a second.
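
A minimal sketch of the CPU part only, assuming a Linux host where the
aggregate "cpu" line of /proc/stat is available. The function names are
hypothetical, and memory, disk, network and per-process stats are left out
to keep it short.

```python
import time


def parse_cpu_line(stat_text):
    """Return (busy, total) jiffies from the aggregate 'cpu' line."""
    for line in stat_text.splitlines():
        if line.startswith("cpu "):
            # user nice system idle iowait irq softirq steal; the guest
            # fields that may follow are already counted in user/nice.
            fields = [int(v) for v in line.split()[1:9]]
            idle = fields[3] + (fields[4] if len(fields) > 4 else 0)
            total = sum(fields)
            return total - idle, total
    raise ValueError("no aggregate cpu line found")


def cpu_busy_fraction(prev_text, cur_text):
    """Fraction of CPU time spent non-idle between two /proc/stat reads."""
    prev_busy, prev_total = parse_cpu_line(prev_text)
    cur_busy, cur_total = parse_cpu_line(cur_text)
    delta_total = cur_total - prev_total
    if delta_total <= 0:
        return 0.0
    return (cur_busy - prev_busy) / delta_total


def sample_loop(interval=0.1, samples=10, path="/proc/stat"):
    """Print the CPU busy fraction every `interval` seconds (Linux only)."""
    with open(path) as f:
        prev = f.read()
    for _ in range(samples):
        time.sleep(interval)
        with open(path) as f:
            cur = f.read()
        print("%.3f cpu_busy=%.2f" % (time.time(), cpu_busy_fraction(prev, cur)))
        prev = cur
```

Here iowait is counted as idle time, so an I/O bound system would show up
as a low busy fraction; logging iowait separately would be needed to tell
the two cases apart.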

Cheers,
Mark.
