[nova][dev] Fixing eventlet monkey patching in Nova
mordred at inaugust.com
Fri Mar 15 21:41:20 UTC 2019
On 3/15/19 8:03 PM, Sean Mooney wrote:
> On Fri, 2019-03-15 at 12:15 -0700, melanie witt wrote:
>> On Fri, 15 Mar 2019 18:53:35 +0000 (GMT), Chris Dent
>> <cdent+os at anticdent.org> wrote:
>>> The mod_wsgi/uwsgi side of things strove to be eventlet free, as
>>> eventlet makes for weirdness, and at some point I did some work to
>>> be sure it never showed up in placement while placement was still in
>>> nova. A side effect of that work was that eventlet also didn't need
>>> to show up in the nova-api unless you used the cmd line script
>>> version of it. At the same time I also explored not importing the
>>> world, and was able to get some improvements (mostly by moving
>>> things out of __init__.py in packages that had deeply nested
>>> members), but not as much as I would have liked.
>>> However, as mentioned elsewhere in the thread, we later made the
>>> cell mapping gathering run in parallel, bringing back the need for
>>> eventlet, it seems.
>> Is there an alternative we could use for threading in the API that is
>> compatible with python 2.7? I'd be happy to convert the cells
>> scatter-gather code to use it, if so.
> It's a little heavyweight, but Python multiprocessing or explicit
> threads would work. Taking the multiprocessing example:
>
>     from multiprocessing import Pool
>
>     def f(x):
>         return x*x
>
>     if __name__ == '__main__':
>         p = Pool(5)
>         print(p.map(f, [1, 2, 3]))
>
> You basically create a pool of X processes and map the function across
> the pool. X could either be a fixed parallelism factor or the total
> number of cells. If f is a function that takes the cell and lists the
> instances, then p.map(f, ["cell1", "cell2", ...]) returns the set of
> results from each of the concurrent executions, but it also blocks
> until they are completed.
> Eventlet gives you concurrency, which means we interleave the
> requests, but only one request is executing at any one time. Using
> multiprocessing will give real parallelism but no concurrency, as we
> will block until all the parallel requests are completed.
> You can get the best of both worlds by submitting the requests
> asynchronously, as shown in the later example:
>
>     # launching multiple evaluations asynchronously *may* use more processes
>     multiple_results = [pool.apply_async(os.getpid, ()) for i in range(4)]
>     print([res.get(timeout=1) for res in multiple_results])
>
> This allows you to submit multiple requests to the pool in parallel,
> then retrieve them with a timeout after all requests are submitted,
> allowing us to limit the time we wait if there is a slow or down cell.
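A minimal sketch of the apply_async pattern Sean describes, applied to the cells case (the cell names and the per-cell listing function are made up for illustration, not Nova code):

```python
from multiprocessing import Pool


def list_instances(cell):
    # stand-in for a per-cell DB query; hypothetical, not Nova code
    return [cell + "-inst"]


def scatter_gather(cells, workers=3, timeout=5):
    """Submit one task per cell, then collect results with a per-result
    timeout so one slow or down cell cannot block the whole request."""
    pool = Pool(workers)
    try:
        futures = [(c, pool.apply_async(list_instances, (c,)))
                   for c in cells]
        results = {}
        for cell, fut in futures:
            try:
                results[cell] = fut.get(timeout=timeout)
            except Exception:
                results[cell] = None  # mark the cell as unreachable
        return results
    finally:
        pool.close()
        pool.join()


if __name__ == '__main__':
    print(scatter_gather(["cell1", "cell2"]))
```

Because all tasks are submitted before any result is collected, the per-cell queries run in parallel, and the timeout bounds how long a down cell can stall the gather.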
FWIW, we use explicit threads in Zuul and Nodepool and have been very
happy with them, since they're explicit and don't require weird monkey
patching.
In sdk, there are a few things that need to be done in the background
(when uploading a very large swift object, we split it into chunks and
upload them concurrently). In that case we used
concurrent.futures.ThreadPoolExecutor, which creates a pool of threads
and allows you to submit jobs into it ... very similar to the
pool-of-workers example above:
    executor = concurrent.futures.ThreadPoolExecutor(max_workers=5)
    job_future = executor.submit(some_callable, some_arg, other=arg)
    for completed in concurrent.futures.as_completed([job_future]):
        result = completed.result()
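Applied to the cells scatter-gather case, that executor pattern with an overall timeout might look like this (the listing function and cell names are illustrative placeholders):

```python
import concurrent.futures


def list_instances(cell):
    # stand-in for a per-cell query; illustrative only
    return [cell + "-inst"]


def gather(cells, timeout=5):
    """Fan out one thread per cell and collect whatever finishes
    within the overall timeout."""
    results = {}
    with concurrent.futures.ThreadPoolExecutor(max_workers=5) as executor:
        futures = {executor.submit(list_instances, c): c for c in cells}
        try:
            for fut in concurrent.futures.as_completed(futures,
                                                       timeout=timeout):
                results[futures[fut]] = fut.result()
        except concurrent.futures.TimeoutError:
            pass  # cells that did not answer in time are simply absent
    return results


if __name__ == '__main__':
    print(gather(["cell1", "cell2"]))
```

Threads sidestep the pickling constraints of multiprocessing, which matters if the per-cell results are complex objects, and ThreadPoolExecutor exists in python 2.7 via the futures backport on PyPI.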
> wsgi also provides an additional layer of parallelism, as each
> instance of the api should be serving only one request, and that
> parallelism is managed by uwsgi, or by apache if using mod_wsgi.
> I'm not sure if multiprocessing is warranted in this case, but if we
> did use it we should probably create the pool once and reuse it,
> rather than inline in the scatter-gather function.
> Anyway, that is how I would personally approach this if it was
> identified as a performance issue, with a config argument to control
> the number of processes in the pool, but it would definitely be a
> Train thing as it's a non-trivial change in how this currently works.
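The create-the-pool-once suggestion with a configurable size could be sketched as below; the option name is hypothetical, and real Nova code would read it from nova.conf via oslo.config rather than a module constant:

```python
import multiprocessing

# hypothetical config option standing in for a nova.conf setting
SCATTER_GATHER_WORKERS = 4

_pool = None


def get_pool():
    """Create the worker pool lazily, once, and reuse it across
    scatter-gather calls instead of building a new pool per request."""
    global _pool
    if _pool is None:
        _pool = multiprocessing.Pool(SCATTER_GATHER_WORKERS)
    return _pool


if __name__ == '__main__':
    p = get_pool()
    print(p.map(abs, [-1, -2, 3]))
```

Reusing the pool avoids paying the process fork cost on every API request, which is the main reason inline pool creation in the scatter-gather function would hurt.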