[openstack-dev] Asynchronous programming: replace eventlet with asyncio
Jesse Noller
jesse.noller at RACKSPACE.COM
Thu Feb 6 21:35:48 UTC 2014
On Feb 6, 2014, at 2:32 PM, Kevin Conway <kevinjacobconway at gmail.com> wrote:
> There's an incredibly valid reason why we use green thread abstractions like eventlet and gevent in Python. The CPython implementation is inherently single threaded so we need some other form of concurrency to get the most effective use out of our code. You can "import threading" all you want but it won't work the way you expect it to. If you are considering doing anything threading related in Python then http://www.youtube.com/watch?v=Obt-vMVdM8s is absolutely required watching.
Speaking as a Python core dev: this isn’t *entirely* correct - or incorrect. It’s actually misleading. But I don’t think openstack-dev is where we need to boil the ocean about concurrency in Python…
>
> Green threads give us a powerful way to manage concurrency where it counts: I/O. Everything in openstack is waiting on something else in openstack. That is our natural state of being. If your plan for increasing the number of concurrent requests is "fork more processes" then you're in for a rude awakening when your hosts start kernel panicking from a lack of memory. With green threads, on the other hand, we maintain the use of one process, one thread but are able to manage multiple, concurrent network operations.
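>
> A minimal sketch of that one-process, one-thread model, assuming eventlet is installed and Python 2 (the URLs are just placeholders):
>
>     import eventlet
>     eventlet.monkey_patch()  # patch the stdlib so socket I/O yields to other greenthreads
>
>     import urllib2  # Python 2 stdlib HTTP client
>
>     urls = ['http://example.com/a', 'http://example.com/b', 'http://example.com/c']
>
>     def fetch(url):
>         # blocks only this greenthread; the hub switches to another one while the socket waits
>         return urllib2.urlopen(url).read()
>
>     pool = eventlet.GreenPool(size=100)
>     for body in pool.imap(fetch, urls):
>         print(len(body))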
>
> In the case of API nodes: yes, they should (at most) do some db work and drop a message on the queue. That means they almost exclusively deal with I/O. Expecting your wsgi server to scale that up for you is wrong and is, in fact, the reason we have eventlet in the first place.
>
> What's more, this conversation has turned from "let's use asyncio" to "let's make eventlet work with asyncio". If the aim is to convert eventlet to use the asyncio interface then this seems like a great idea so long as it takes place within the eventlet project and not openstack. I don't see the benefit of shimming in asyncio and a fork/backport of asyncio into any of our code bases if the point is to integrate it into a third party module.
>
> On Thu, Feb 6, 2014 at 12:34 PM, Joshua Harlow <harlowja at yahoo-inc.com> wrote:
> It's a good question. I see OpenStack as mostly consisting of the following 2 groups of applications.
>
> Group 1:
>
> API entrypoints using [apache/nginx]+wsgi (nova-api, glance-api…)
>
> In this group we can just let the underlying framework/app deal with the scaling and just use native wsgi as it was intended. Scale out more [apache/nginx] if you need more requests per second. For any kind of long-term work these apps should be dropping all work to be done on an MQ and letting someone pick that work up to be finished at some future time.
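>
> A rough sketch of that shape, with the stdlib Queue standing in for the real MQ (in an actual service this would go through oslo.messaging, and the payload handling here is made up):
>
>     import json
>     import Queue  # named "queue" on Python 3
>
>     work_queue = Queue.Queue()  # stand-in for the real message queue in this sketch
>
>     def application(environ, start_response):
>         # read and parse the request body; a real API would also authenticate and validate
>         length = int(environ.get('CONTENT_LENGTH') or 0)
>         payload = json.loads(environ['wsgi.input'].read(length) or '{}')
>
>         # drop the long-running work on the queue and return immediately
>         work_queue.put(payload)
>
>         start_response('202 Accepted', [('Content-Type', 'application/json')])
>         return [json.dumps({'status': 'queued'})]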
>
> Group 2:
>
> Workers that pick things up off the MQ. In this area we are allowed to be a little more different and change as we want, but it seems like the simple approach we have been taking is the daemon model (forking N child worker processes). We've also added eventlet in these children (so it becomes more like NxM, where M is the number of greenthreads). For the places where workers are used, has it been beneficial to add those M greenthreads? If we just scaled out to more N (processes), how bad would it be? (I don't have the answers here actually, but it does make you wonder why we couldn't just eliminate eventlet/asyncio altogether and just use more N processes.)
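>
> A toy illustration of that NxM shape (the numbers and the handle() body are made up; real workers pull their tasks off the MQ):
>
>     import os
>
>     import eventlet
>
>     N_PROCESSES = 4       # the daemon model: fork N child workers
>     M_GREENTHREADS = 100  # eventlet adds M greenthreads inside each child
>
>     def handle(task_id):
>         eventlet.sleep(0.1)  # placeholder for work that is mostly I/O (DB, RPC, hypervisor APIs)
>         return task_id
>
>     def worker():
>         pool = eventlet.GreenPool(size=M_GREENTHREADS)
>         for _ in pool.imap(handle, range(1000)):  # tasks would normally come off the MQ
>             pass
>
>     for _ in range(N_PROCESSES):
>         if os.fork() == 0:  # child: run the greenthread pool, then exit
>             worker()
>             os._exit(0)
>
>     for _ in range(N_PROCESSES):  # parent: wait for the children
>         os.wait()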
>
> -Josh
>
> From: Yuriy Taraday <yorik.sar at gmail.com>
>
> Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Date: Thursday, February 6, 2014 at 10:06 AM
>
> To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] Asynchronous programming: replace eventlet with asyncio
>
> Hello.
>
>
> On Tue, Feb 4, 2014 at 5:38 PM, victor stinner <victor.stinner at enovance.com> wrote:
> I would like to replace eventlet with asyncio in OpenStack for asynchronous programming. The new asyncio module has a better design and is less "magical". It is now part of Python 3.4 and is arguably becoming the de facto standard for asynchronous programming in the Python world.
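>
> For comparison, a minimal sketch in the Python 3.4 coroutine style (the hosts are placeholders); the point is that every suspension point is an explicit "yield from" rather than an implicit green switch:
>
>     import asyncio
>
>     @asyncio.coroutine
>     def fetch_status(host):
>         reader, writer = yield from asyncio.open_connection(host, 80)
>         writer.write(b'HEAD / HTTP/1.0\r\nHost: ' + host.encode() + b'\r\n\r\n')
>         status_line = yield from reader.readline()  # suspends here, explicitly
>         writer.close()
>         return host, status_line
>
>     loop = asyncio.get_event_loop()
>     results = loop.run_until_complete(
>         asyncio.gather(fetch_status('example.com'), fetch_status('openstack.org')))
>     print(results)
>     loop.close()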
>
> I think that before doing this big move to yet another asynchronous framework we should ask the main question: Do we need it? Why do we actually need an async framework inside our code?
> There most likely is some historical reason why (almost) every OpenStack project runs all of its processes with the eventlet hub, but I think we should reconsider this now that it's clear we can't go forward with eventlet (because of py3k, mostly) and we're going to put a considerable amount of resources into switching to another async framework.
>
> Let's take Nova for example.
>
> There are two kinds of processes there: nova-api and others.
>
> - the nova-api process forks into a number of workers listening on one socket and running a single greenthread for each incoming request;
> - other services (workers) constantly poll some queue and spawn a greenthread for each incoming request.
>
> Both kinds do basically the same job: receive a request, run a handler in a greenthread. That sounds very much like a job for some application server that does just that and does it well.
> If we remove all dependencies on eventlet or any other async framework, we would not only be able to write Python code without needing to keep in mind that we're running in some reactor (that's why eventlet was chosen over Twisted, IIRC), but we could also forget about all these frameworks altogether.
>
> I suggest an approach like this:
> - for API services, use a dead-simple threaded WSGI server (we have the pieces for one in the stdlib, by the way - in wsgiref);
> - for workers, use a simple threading-based oslo.messaging loop (it's on its way).
>
> Of course, it won't be production-ready. A dumb threaded approach won't scale, but we don't have to write our own scaling here. There are other tools around to do this: Apache httpd, Gunicorn, uWSGI, etc. And they will work better in a production environment than any code we write because they have been proven over time and at huge scale.
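>
> To make the "dead-simple threaded WSGI server" concrete, here is a minimal sketch built from stdlib pieces (wsgiref plus ThreadingMixIn); the app callable is a placeholder:
>
>     from wsgiref.simple_server import make_server, WSGIServer
>     from SocketServer import ThreadingMixIn  # "socketserver" on Python 3
>
>     class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
>         # one OS thread per request: no reactor, no greenthreads, no magic
>         daemon_threads = True
>
>     def app(environ, start_response):
>         start_response('200 OK', [('Content-Type', 'text/plain')])
>         return [b'hello\n']
>
>     if __name__ == '__main__':
>         # fine for development; in production the same "app" callable would be
>         # served by Apache + mod_wsgi, Gunicorn, uWSGI, etc.
>         srv = make_server('0.0.0.0', 8080, app, server_class=ThreadingWSGIServer)
>         srv.serve_forever()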
>
> So once we want to go to production, we can deploy things this way for example:
> - API services can be deployed within the Apache server or any other HTTP server with a WSGI backend (Keystone can already be deployed within Apache);
> - workers can be deployed in any non-HTTP application server; uWSGI is a great example of one that can work in this mode.
>
> With this approach we can leave the burden of process management, load balancing, etc. to the services that are really good at it.
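>
> And the workers side could look roughly like this; the stdlib Queue stands in for the threading-based oslo.messaging loop, and the handler is hypothetical:
>
>     import Queue  # "queue" on Python 3
>     import threading
>
>     task_queue = Queue.Queue()  # stand-in for the oslo.messaging transport in this sketch
>
>     def handle(task):
>         print('handled %r' % (task,))  # a real worker would dispatch to its RPC endpoints here
>
>     def worker_loop():
>         while True:
>             task = task_queue.get()  # blocks this thread only, not the whole process
>             try:
>                 handle(task)
>             finally:
>                 task_queue.task_done()
>
>     for _ in range(10):  # a small pool of plain OS threads instead of greenthreads
>         t = threading.Thread(target=worker_loop)
>         t.daemon = True
>         t.start()
>
>     for i in range(100):
>         task_queue.put({'task_id': i})
>     task_queue.join()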
>
> What do you think about this?
>
> --
>
> Kind regards, Yuriy.
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
> --
> Kevin Conway
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev