On Fri, 2019-03-15 at 12:15 -0700, melanie witt wrote:
> On Fri, 15 Mar 2019 18:53:35 +0000 (GMT), Chris Dent <cdent+os@anticdent.org> wrote:
> > The mod_wsgi/uwsgi side of things strove to be eventlet free, as eventlet makes for weirdness, and at some point I did some work to make sure it never showed up in placement [1] while placement was still in nova. A side effect of that work was that eventlet also didn't need to show up in the nova-api unless you used the cmd line script version of it. At the same time I also explored not importing the world, and was able to get some improvements (mostly by moving things out of __init__.py in packages that had deeply nested members), but not as much as I would have liked.
> However, we later (as mentioned elsewhere in the thread) made getting the cell mappings parallel, bringing back the need for eventlet, it seems.
> Is there an alternative we could use for threading in the API that is compatible with python 2.7? I'd be happy to convert the cells scatter-gather code to use it, if so.

It's a little heavyweight, but Python multiprocessing or explicit threads would work.
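For example, a minimal sketch of the explicit-threads option, which works on python 2.7 without eventlet, could look something like the following (all names here are placeholders, not existing nova code):

import threading

# placeholder for the real per-cell instance listing
def _query_cell(cell):
    return "instances-from-%s" % cell

def scatter_gather_with_threads(cells, timeout=60):
    # fan out one thread per cell and collect results in a shared dict
    results = {}

    def worker(cell):
        results[cell] = _query_cell(cell)

    threads = [threading.Thread(target=worker, args=(cell,)) for cell in cells]
    for t in threads:
        t.daemon = True
        t.start()
    for t in threads:
        # wait at most `timeout` per thread so a slow or down cell
        # cannot hang the API request forever
        t.join(timeout)
    return results

if __name__ == '__main__':
    print(scatter_gather_with_threads(["cell1", "cell2", "cell3"]))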
Taking the multiprocessing example from https://docs.python.org/2/library/multiprocessing.html:

from multiprocessing import Pool

def f(x):
    return x*x

if __name__ == '__main__':
    p = Pool(5)
    print(p.map(f, [1, 2, 3]))

you basically create a pool of X processes and map the function across the pool. X could either be a fixed parallelism factor or the total number of cells. If f is a function that takes a cell and lists its instances, then p.map(f, ["cell1", "cell2", ...]) returns the set of results from each of the concurrent executions, but it also blocks until they are all completed.

eventlet gives you concurrency, which means we interleave the requests but only one request is executing at any one time. Using multiprocessing gives real parallelism but no concurrency, as we block until all the parallel requests are completed.

You can get the best of both worlds by submitting the requests asynchronously, as shown in the later example at https://docs.python.org/2/library/multiprocessing.html#using-a-pool-of-worke...:

# launching multiple evaluations asynchronously *may* use more processes
multiple_results = [pool.apply_async(os.getpid, ()) for i in range(4)]
print([res.get(timeout=1) for res in multiple_results])

This allows you to submit multiple requests to the pool in parallel and then retrieve them with a timeout after all requests have been submitted, letting us limit how long we wait if there is a slow or down cell.

wsgi also provides an additional layer of parallelism, as each instance of the api should be serving only one request, and that parallelism is managed by uwsgi, or by apache if using mod_wsgi.

I'm not sure multiprocessing is warranted in this case, but if we did use it we should probably create the pool once and reuse it rather than creating it inline in the scatter-gather function. Anyway, that is how I would personally approach this if it was identified as a performance issue, with a config option to control the number of processes in the pool, but it would definitely be a Train thing as it's a non-trivial change in how this currently works.
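To make the "create the pool once and reuse it" idea concrete, here is a rough, untested sketch of how that could look (everything here, including names like scatter_gather_with_pool and _list_instances, is hypothetical and not existing nova code, and the pool size would really come from a config option):

import multiprocessing

# hypothetical stand-in for the real per-cell query; it has to be a
# top-level function so it can be pickled and sent to the worker processes
def _list_instances(cell):
    return "instances-from-%s" % cell

_POOL_SIZE = 5   # would be driven by a config option in practice
_pool = None

def _get_pool():
    # create the pool once and reuse it across requests instead of
    # building it inline in the scatter-gather function
    global _pool
    if _pool is None:
        _pool = multiprocessing.Pool(_POOL_SIZE)
    return _pool

def scatter_gather_with_pool(cells, timeout=60):
    pool = _get_pool()
    # submit all the per-cell requests first, then collect the results
    async_results = [(cell, pool.apply_async(_list_instances, (cell,)))
                     for cell in cells]
    results = {}
    for cell, res in async_results:
        try:
            # bound how long we wait on each cell so a slow or down cell
            # does not block the API response indefinitely
            results[cell] = res.get(timeout)
        except multiprocessing.TimeoutError:
            results[cell] = None
    return results

if __name__ == '__main__':
    print(scatter_gather_with_pool(["cell1", "cell2", "cell3"]))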
> -melanie