[openstack-dev] Moving task flow to conductor - concern about scale

Peter Feiner peter at gridcentric.ca
Fri Jul 19 14:54:01 UTC 2013

On Fri, Jul 19, 2013 at 10:15 AM, Dan Smith <dms at danplanet.com> wrote:
> > So rather than asking "what doesn't work / might not work in the
> > future" I think the question should be "aside from them both being
> > things that could be described as a conductor - what's the
> > architectural reason for wanting to have these two separate groups of
> > functionality in the same service ?"
> IMHO, the architectural reason is "lack of proliferation of services and
> the added complexity that comes with it." If one expects the
> proxy workload to always overshadow the task workload, then making
> these two things a single service makes things a lot simpler.

I'd like to point out a low-level detail that makes scaling
nova-conductor at the process level extremely compelling: the database
driver blocking the eventlet thread serializes nova's database access.

Since the database connection driver is typically implemented in a
library beyond the purview of eventlet's monkeypatching (i.e., a
native python extension like _mysql.so), blocking database calls will
block all eventlet coroutines. Since most of what nova-conductor does
is access the database, a nova-conductor process's handling of
requests is effectively serial.
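
The effect can be reproduced in miniature without nova or eventlet. The
sketch below is only an analogy: it uses the standard library's asyncio
in place of eventlet, and time.sleep() in place of a blocking call made
inside a native driver. CALL_SECONDS, REQUESTS and the function names are
made up for illustration.

```python
import asyncio
import time

CALL_SECONDS = 0.05  # hypothetical duration of one database call
REQUESTS = 4         # hypothetical number of concurrent requests


async def blocking_request():
    # Like a call into a native driver: the event loop's thread is stuck
    # here, so no other coroutine can run until the call returns.
    time.sleep(CALL_SECONDS)


async def cooperative_request():
    # Like a call the framework can intercept: control is yielded to
    # other coroutines while this one waits.
    await asyncio.sleep(CALL_SECONDS)


async def run_all(request):
    # Start all requests together and time how long the batch takes.
    start = time.perf_counter()
    await asyncio.gather(*(request() for _ in range(REQUESTS)))
    return time.perf_counter() - start


blocking_elapsed = asyncio.run(run_all(blocking_request))
cooperative_elapsed = asyncio.run(run_all(cooperative_request))
print(f"blocking:    {blocking_elapsed:.3f}s")
print(f"cooperative: {cooperative_elapsed:.3f}s")
```

The blocking batch should take roughly REQUESTS times as long as a single
call, while the cooperative batch should take about as long as one call.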

Nova-conductor is the gateway to the database for nova-compute
processes.  So permitting a single nova-conductor process would
effectively serialize all database queries during instance creation,
deletion, periodic instance refreshes, etc. Since these queries are
made frequently (easily 100 times during instance creation) and
while other global locks are held (e.g., in the case of nova-compute's
ResourceTracker), most of what nova-compute does becomes serialized.

In parallel performance experiments I've done, I have found that
running multiple nova-conductor processes is the best way to mitigate
the serialization of blocking database calls. Say I am booting N
instances in parallel (usually up to N=40). If I have a single
nova-conductor process, the duration of each nova-conductor RPC
increases linearly with N, which can add _minutes_ to instance
creation time (i.e., dozens of RPCs, some taking several seconds).
However, if I run N nova-conductor processes in parallel, then the
duration of the nova-conductor RPCs does not increase with N; since
each RPC is most likely handled by a different nova-conductor, serial
execution within each process is moot.
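
A back-of-the-envelope model shows the shape of this result. It is a
sketch under simplifying assumptions, not the experiment itself: all
RPCs arrive at once, every RPC costs the same hypothetical CALL_SECONDS
of database work, RPCs are dispatched round-robin, each worker handles
its RPCs strictly one at a time, and the database server itself is never
the bottleneck.

```python
def rpc_latencies(num_rpcs, num_workers, call_seconds):
    """Latency of each RPC when all arrive at once and workers are serial."""
    # free_at[w] is the time at which worker w finishes its queued work.
    free_at = [0.0] * num_workers
    latencies = []
    for i in range(num_rpcs):
        worker = i % num_workers  # round-robin dispatch (an assumption)
        free_at[worker] += call_seconds
        latencies.append(free_at[worker])
    return latencies


CALL_SECONDS = 0.1  # hypothetical cost of one RPC's database work

for n in (10, 20, 40):
    single = rpc_latencies(n, 1, CALL_SECONDS)  # one conductor process
    scaled = rpc_latencies(n, n, CALL_SECONDS)  # N conductor processes
    print(f"N={n}: worst latency {max(single):.1f}s with 1 worker, "
          f"{max(scaled):.1f}s with {n} workers")
```

In this model the worst-case latency with one worker grows linearly with
N, and with N workers it stays at the cost of a single call.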

Note that there are alternative methods for preventing the eventlet
thread from blocking during database calls. However, none of these
alternatives performed as well as multiple nova-conductor processes:

Instead of using the native database driver like _mysql.so, you can
use a pure-Python driver, like pymysql, by setting
sql_connection=mysql+pymysql://... in the [DEFAULT] section of
/etc/nova/nova.conf, which eventlet will monkeypatch to avoid
blocking. The problem with this approach is the vastly greater CPU
demand of the pure-Python driver compared to the native driver. Since
the pure-Python driver is so much more CPU intensive, the eventlet
thread spends most of its time talking to the database, which is
effectively the problem we had before!

Instead of making database calls from eventlet's thread, you can
submit them to eventlet's pool of worker threads and wait for the
results. Try this by setting dbapi_use_tpool=True in the [DEFAULT]
section of /etc/nova/nova.conf. The problem I found with this approach
was the overhead of synchronizing with the worker threads. In
particular, the time elapsed between the worker thread finishing and
the waiting coroutine being resumed was typically several times
greater than the duration of the database call itself.
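
One way to measure that kind of handoff gap is sketched below. It is
only an analogy built on the standard library: asyncio's
run_in_executor() stands in for eventlet's worker thread pool, and
time.sleep() stands in for the database call. It will not reproduce the
numbers above, and CALL_SECONDS and the function names are made up for
illustration.

```python
import asyncio
import time
from concurrent.futures import ThreadPoolExecutor

CALL_SECONDS = 0.01  # hypothetical duration of one database call


def fake_db_call():
    # Stand-in for a blocking driver call, run in a worker thread.
    time.sleep(CALL_SECONDS)
    return time.perf_counter()  # the moment the worker finished


async def measure(executor):
    loop = asyncio.get_running_loop()
    finished_at = await loop.run_in_executor(executor, fake_db_call)
    resumed_at = time.perf_counter()  # the moment the coroutine resumed
    return resumed_at - finished_at


async def main(samples=20):
    # Take the measurements one after another on an otherwise idle loop.
    with ThreadPoolExecutor(max_workers=4) as executor:
        return [await measure(executor) for _ in range(samples)]


delays = asyncio.run(main())
mean_ms = sum(delays) / len(delays) * 1e3
print(f"mean handoff delay: {mean_ms:.3f} ms "
      f"(call duration: {CALL_SECONDS * 1e3:.0f} ms)")
```

Comparing the printed delay with the call duration gives the ratio that
matters here; under load, with many coroutines competing for the main
thread, the gap would be expected to grow.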
