[nova][dev] Fixing eventlet monkey patching in Nova
smooney at redhat.com
Fri Mar 15 20:03:12 UTC 2019
On Fri, 2019-03-15 at 12:15 -0700, melanie witt wrote:
> On Fri, 15 Mar 2019 18:53:35 +0000 (GMT), Chris Dent
> <cdent+os at anticdent.org> wrote:
> > The mod_wsgi/uwsgi side of things strived to be eventlet free as it
> > makes for weirdness, and at some point I did some work to be sure it
> > never showed up in placement  while it was still in nova. A side
> > effect of that work was that it also didn't need to show up in the
> > nova-api unless you used the cmd line script version of it. At the
> > same time I also explored not importing the world, and was able to
> > get some improvements (mostly by moving things out of __init__.py in
packages that had deeply nested members) but not as much as I would
have liked.
However, we later (as mentioned elsewhere in the thread) made
getting cell mappings parallelized, bringing back the need for
eventlet, it seems.
> Is there an alternative we could use for threading in the API that is
> compatible with python 2.7? I'd be happy to convert the cells
> scatter-gather code to use it, if so.
it's a little heavyweight, but python multiprocessing or explicit threads would work.
taking the multiprocessing example:
from multiprocessing import Pool

def f(x):
    return x * x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))
you basically could create a pool of X processes and map the function across the pool.
X could either be a fixed parallelism factor or the total number of cells.
if f is a function that takes a cell and lists its instances, then
p.map(f, ["cell1", "cell2", ...]) returns the set of results from each of the concurrent executions,
but it also blocks until they are all completed.
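as a concrete sketch of that map pattern (list_instances here is a hypothetical stand-in for the real per-cell lookup, not actual nova code):

```python
from multiprocessing import Pool

def list_instances(cell):
    # stand-in for the real per-cell DB query
    return "instances-from-%s" % cell

if __name__ == '__main__':
    pool = Pool(2)
    # map() blocks until every cell has responded
    results = pool.map(list_instances, ["cell1", "cell2"])
    print(results)  # ['instances-from-cell1', 'instances-from-cell2']
```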
eventlet gives you concurrency, which means we interleave the requests but only one request is executing at
any one time. using multiprocessing will give real parallelism but no concurrency, as we block until all the
parallel requests are completed.
you can kind of get the best of both worlds by submitting the requests asynchronously,
as is shown in the later example:
import os
from multiprocessing import Pool

pool = Pool(4)
# launching multiple evaluations asynchronously *may* use more processes
multiple_results = [pool.apply_async(os.getpid, ()) for i in range(4)]
print([res.get(timeout=1) for res in multiple_results])
this allows you to submit multiple requests to the pool in parallel,
then retrieve them with a timeout after all requests are submitted, allowing us to
limit the time we wait if there is a slow or down cell.
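to make the slow-cell handling concrete, here is a hypothetical sketch of the apply_async pattern with a per-cell timeout (query_cell and the cell names are made up for illustration; a real down cell would block instead of sleeping):

```python
import time
from multiprocessing import Pool, TimeoutError

def query_cell(cell):
    # hypothetical stand-in for the per-cell query; cell2 simulates a slow cell
    if cell == "cell2":
        time.sleep(10)
    return "result-from-%s" % cell

if __name__ == '__main__':
    pool = Pool(2)
    # submit all requests first so they run in parallel
    pending = {cell: pool.apply_async(query_cell, (cell,))
               for cell in ["cell1", "cell2"]}
    results = {}
    for cell, res in pending.items():
        try:
            results[cell] = res.get(timeout=1)
        except TimeoutError:
            results[cell] = "cell timed out"
    pool.terminate()
    print(results)
```

the key point is that the timeout bounds how long we wait per result, so one unresponsive cell cannot hang the whole scatter-gather.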
wsgi also provides an additional layer of parallelism, as each instance of the api should be serving only one request,
and that parallelism is managed by uwsgi, or by apache if using mod_wsgi.
i'm not sure if multiprocessing is warranted in this case, but if we did use it we should probably create the pool once
and reuse it rather than creating it inline in the scatter-gather function.
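a minimal sketch of what "create the pool once and reuse it" could look like; every name here (get_pool, scatter_gather, CONF_POOL_SIZE) is hypothetical, not existing nova code:

```python
from multiprocessing import Pool, TimeoutError

# stand-in for a config option controlling pool size
CONF_POOL_SIZE = 5
_pool = None

def get_pool():
    # lazily create the pool once and reuse it across calls
    global _pool
    if _pool is None:
        _pool = Pool(CONF_POOL_SIZE)
    return _pool

def scatter_gather(fn, cells, timeout=30):
    # hypothetical multiprocessing-based scatter-gather:
    # submit all cells first, then collect with a bounded wait
    pending = [(cell, get_pool().apply_async(fn, (cell,))) for cell in cells]
    results = {}
    for cell, res in pending:
        try:
            results[cell] = res.get(timeout=timeout)
        except TimeoutError:
            results[cell] = None  # mark the cell as slow or down
    return results
```

reusing one module-level pool avoids paying the process fork cost on every api request.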
anyway, that is how i would personally approach this if it was identified as a performance issue, with
a config option to control the number of processes in the pool, but it would definitely be a train thing
as it's a non-trivial change in how this currently works.