[openstack-dev] Moving task flow to conductor - concern about scale

Peter Feiner peter at gridcentric.ca
Fri Jul 19 21:25:27 UTC 2013

On Fri, Jul 19, 2013 at 4:36 PM, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
> This seems to me to be a good example where a library "problem" is leaking into the OpenStack architecture, right? That is IMHO a bad path to go down.
> I like to think of a world where this isn't a problem, design the correct solution there, and fix the eventlet problem instead. Other large applications don't fall back to RPC calls to get around database/eventlet scaling issues afaik.
> Honestly I would almost just want to finally fix the eventlet problem (chris b. I think has been working on it) and design a system that doesn't try to work around a library's shortcomings. But maybe that's too much idealism, idk...

Well, there are two problems that multiple nova-conductor processes
fix. One is the bad interaction between eventlet and native code. The
other is the lack of multiprocessing. That is, once nova-conductor
starts to handle enough requests, enough time will be spent holding
the GIL to make it a bottleneck. In fact, I've had to scale keystone
using multiple processes because of GIL contention (i.e., keystone was
steadily at 100% CPU utilization when I was hitting OpenStack with
enough requests). So multiple processes aren't avoidable. Indeed, other
software that strives for high concurrency, such as Apache, uses
multiple processes to avoid contention for per-process kernel
resources like the mmap semaphore.
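
To make the GIL point concrete, here is a minimal sketch of spreading
CPU-bound Python work over several processes, each with its own
interpreter and its own GIL. It is not nova-conductor or keystone code;
the prime-counting workload and the names count_primes, split_range and
count_primes_parallel are made up purely for illustration.

```python
from concurrent.futures import ProcessPoolExecutor


def is_prime(n):
    """Trial division: deliberately CPU-bound, pure-Python work."""
    if n < 2:
        return False
    i = 2
    while i * i <= n:
        if n % i == 0:
            return False
        i += 1
    return True


def count_primes(lo, hi):
    """Count primes in the half-open range [lo, hi)."""
    return sum(1 for n in range(lo, hi) if is_prime(n))


def split_range(lo, hi, parts):
    """Split [lo, hi) into `parts` contiguous, near-equal sub-ranges."""
    step, extra = divmod(hi - lo, parts)
    bounds = []
    start = lo
    for i in range(parts):
        end = start + step + (1 if i < extra else 0)
        bounds.append((start, end))
        start = end
    return bounds


def count_primes_parallel(lo, hi, workers):
    """Run count_primes on each sub-range in a separate process.

    Threads in one process would serialize on the GIL for this kind of
    work; separate processes do not share a GIL.
    """
    starts, ends = zip(*split_range(lo, hi, workers))
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(count_primes, starts, ends))


if __name__ == "__main__":
    # The guard matters on platforms that start workers by re-importing
    # the main module.
    print(count_primes_parallel(0, 200_000, workers=4))
```

The same idea applies to a service: N worker processes pulling from the
same queue can use N cores, where N threads (or greenthreads) in one
process are limited to roughly one core of Python execution.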

> This doesn't even touch on the synchronization issues that can happen when you start pumping db traffic over a mq. E.g., an update is now queued behind another update, and the second one conflicts with the first; where does resolution happen when an async mq call is used? What about when you have X conductors doing Y reads and Z updates? I don't even want to think about the sync/races there (and so on...). Did you hit / check for any consistency issues in your tests? Consistency issues under high load using multiple conductors scare the bejesus out of me....

If a sequence of updates needs to be atomic, then they should be made
in the same database transaction. Hence nova-conductor's interface
isn't do_some_sql(query); it's a bunch of high-level nova operations
that are implemented using transactions.
