[Openstack] eventlet weirdness

Kapil Thangavelu kapil.thangavelu at canonical.com
Mon Mar 5 15:25:13 UTC 2012


Excerpts from Mark Washenberger's message of 2012-03-04 23:34:03 -0500:
> While we are on the topic of api performance and the database, I have a
> few thoughts I'd like to share.
> 
> TL;DR:
> - we should consider refactoring our wsgi server to leverage multiple
>   processors
> - we could leverage compute-cell database responsibility separation
>   to speed up our api database performance by several orders of magnitude
> 
> I think the main way eventlet holds us back right now is that we have
> such low utilization. The big jump with multiprocessing or threading
> would be the potential to leverage more powerful hardware. Currently
> nova-api probably wouldn't run any faster on bare metal than it would
> run on an m1.tiny. Of course, this isn't an eventlet limitation per se
> but rather we are limiting ourselves to eventlet single-processing
> performance with our wsgi server implementation.


This seems fairly easily remedied without code changes via something like 
gunicorn (in multi-process, single-socket mode as a wsgi frontend), or any generic 
load balancer in front of multiple processes. But it's of limited utility unless the 
individual processes can each handle concurrency greater than 1.
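To make the gunicorn suggestion concrete, here's a minimal sketch of a gunicorn config that runs several worker processes sharing one listening socket, while keeping eventlet green-thread concurrency inside each worker. The port and worker count are assumptions for illustration, not nova defaults.

```python
# gunicorn.conf.py -- hypothetical config for fronting a nova-api style
# WSGI app with multiple worker processes on a single shared socket.
bind = "0.0.0.0:8774"       # assumed listen address/port
workers = 4                  # roughly one process per core
worker_class = "eventlet"    # keep greenlet concurrency within each worker
worker_connections = 1000    # max concurrent greenlets per worker
```

You'd then start it with something like `gunicorn -c gunicorn.conf.py <wsgi module>:application`; the actual WSGI entry point depends on the deployment.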

I'm a bit skeptical about the use of multiprocessing; it imposes its own set of 
constraints and problems. Interestingly, using something like zmq (again with its 
own issues, but more robust imo than multiprocessing) allows for transparent 
movement from single-process ipc to network ipc, without the file handle and 
event loop inheritance concerns of something like multiprocessing.
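The transparency point can be sketched with pyzmq: the same request/reply code works unchanged whether the endpoint is in-process, a local ipc socket, or tcp -- only the endpoint string differs. The endpoints below are illustrative, not anything from nova.

```python
import zmq

ctx = zmq.Context()
endpoint = "inproc://demo"   # swap for "ipc:///tmp/demo.sock" or
                             # "tcp://127.0.0.1:5555" with no other change

# Server side must bind before a client connects on inproc transports.
rep = ctx.socket(zmq.REP)
rep.bind(endpoint)

req = ctx.socket(zmq.REQ)
req.connect(endpoint)

req.send(b"ping")            # REQ/REP enforces strict alternation
rep.send(rep.recv())         # echo the request back
reply = req.recv()
```

The point is that the socket code carries no knowledge of whether its peer is a thread, a process, or another host, which is exactly the property multiprocessing's pipe/fork model lacks.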


> 
> However, the greatest performance improvement I see would come from
> streamlining the database interactions incurred on each nova-api
> request. We have been pretty fast-and-loose with adding database
> and glance calls to the openstack api controllers and compute api.
> I am especially thinking of the extension mechanism, which tends
> to require another database call for each /servers extension a
> deployer chooses to enable.
> 
> But, if we think in ideal terms, each api request should perform
> no more than 1 database call for queries, and no more than 2 db calls
> for commands (validation + initial creation). In addition, I can
> imagine an implementation where these database calls don't have any
> joins, and involve no more than one network roundtrip.
>

Is there any debug tooling around the api endpoints that can identify these calls, 
à la some of the wsgi middleware targeted towards web apps (e.g. debug toolbars)?
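For illustration, a toy version of such tooling could be a WSGI middleware that wraps the app and counts database calls per request. This sketch assumes the app funnels all queries through a single `execute` callable on a db handle -- an assumption for the example, not nova's actual structure.

```python
import time


class QueryCountMiddleware(object):
    """Hypothetical sketch: log per-request DB call counts and latency.

    Assumes all database access goes through `db.execute`, which we wrap
    with a counter. Real tooling would hook the DB driver or ORM instead.
    """

    def __init__(self, app, db):
        self.app = app
        self.count = 0
        real_execute = db.execute

        def counted_execute(*args, **kwargs):
            self.count += 1
            return real_execute(*args, **kwargs)

        db.execute = counted_execute  # shadow with the counting wrapper

    def __call__(self, environ, start_response):
        self.count = 0
        start = time.time()
        result = self.app(environ, start_response)
        elapsed_ms = (time.time() - start) * 1000.0
        print("%s %s: %d db calls, %.1f ms" % (
            environ.get("REQUEST_METHOD", "-"),
            environ.get("PATH_INFO", "-"),
            self.count, elapsed_ms))
        return result
```

A single counter like this is only valid for one request at a time; a concurrency-safe version would stash the count in `environ` or thread/greenlet-local storage.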

 
> Beyond refactoring the way we add in data for response extensions,
> I think the right way to get this database performance is make the
> compute-cells approach the "normal". In this approach, there are
> at least two nova databases, one which lives along with the nova-api
> nodes, and one that lives in a compute cell. The api database is kept
> up to date through asynchronous updates that bubble up from the
> compute cells. With this separation, we are free to tailor the schema
> of the api database to match api performance needs, while we tailor
> the schema of the compute cell database to the operational requirements
> of compute workers. In particular, we can completely denormalize the
> tables in the api database without creating unpleasant side effects
> in the compute manager code. This denormalization both means fewer
> database interactions and fewer joins (which likely matters for larger
> deployments).
> 
> If we partner this streamlining and denormalization approach with
> similar attentions to glance performance and an rpc implementation
> that writes to disk and returns, processing network activities in
> the background, I think we could get most api actions to < 10 ms on
> reasonable hardware. 
> 
> As much as the initial push on compute-cells is about scale, I think
> it could enable major performance improvements directly on its heels
> during the Folsom cycle. This is something I'd love to talk about more
> at the conference if anyone has any interest.
> 

Sounds interesting, but potentially complex, with schema and data drift 
possibilities.

cheers,

Kapil



