<div dir="ltr"><div><span style="font-size:12.8000001907349px">> All production OpenStack applications today are fully serialized to only be able to emit a single query to the database at a time;</span><br></div><div><span style="font-size:12.8000001907349px">True. That's why every deployment configures many (tens of) workers for each significant service.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">> </span><span style="font-size:12.8000001907349px"> When I talk about moving to threads, this is not a "won't help or hurt" kind of issue, at the moment it's a change that will immediately allow massive improvement to the performance of all OpenStack applications instantly.</span></div><div><span style="font-size:12.8000001907349px">I'm not sure it will give much benefit over separate processes. </span></div><div><span style="font-size:12.8000001907349px">We don't configure many workers for gate testing (at least, neutron still doesn't), so there could be an improvement there. But to enable multithreading we would need to fix the same issues that prevented us from configuring multiple workers in the gate, plus possibly more.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">> We need to change the DB library or dump eventlet.</span></div><div><span style="font-size:12.8000001907349px">I'm +1 for the first option.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">The other option, multithreading, will almost certainly bring concurrency issues beyond the database.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">Thanks,</span></div><div><span style="font-size:12.8000001907349px">Eugene.</span></div><div><br></div></div><div class="gmail_extra"><br><div
class="gmail_quote">On Mon, May 11, 2015 at 4:46 PM, Boris Pavlovic <span dir="ltr"><<a href="mailto:boris@pavlovic.me" target="_blank">boris@pavlovic.me</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Mike, <div><br></div><div>Thank you for saying all that you said above. </div><div><br></div><div>Best regards,</div><div>Boris Pavlovic </div></div><div class="HOEnZb"><div class="h5"><div class="gmail_extra"><br><div class="gmail_quote">On Tue, May 12, 2015 at 2:35 AM, Clint Byrum <span dir="ltr"><<a href="mailto:clint@fewbar.com" target="_blank">clint@fewbar.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Excerpts from Mike Bayer's message of 2015-05-11 15:44:30 -0700:<br>

<div><div>><br>

> On 5/11/15 5:25 PM, Robert Collins wrote:<br>

> ><br>

> > Details: Skip over this bit if you know it all already.<br>

> ><br>

> > The GIL is a big factor here: if you want to scale the amount of<br>

> > CPU available to a Python service, you have two routes:<br>

> > A) move work to a different process through some RPC - be that DB's<br>

> > using SQL, other services using oslo.messaging or HTTP - whatever.<br>

> > B) use C extensions to perform work in threads - e.g. openssl context<br>

> > processing.<br>

> ><br>

> > To increase concurrency you can use threads, eventlet, asyncio,<br>

> > twisted etc - because within a single process *all* Python bytecode<br>

> > execution happens inside the GIL lock, so you get at most one CPU for<br>

> > a CPU bound workload. For an IO bound workload, you can fit more work<br>

> > in by context switching within that one CPU capacity. And - the GIL is<br>

> > a poor scheduler, so at the limit - an IO bound workload where the IO<br>

> > backend has more capacity than we have CPU to consume it within our<br>

> > process, you will run into priority inversion and other problems.<br>

> > [This varies by Python release too].<br>

> ><br>

> > request_duration = time_in_cpu + time_blocked<br>

> > request_cpu_utilisation = time_in_cpu/request_duration<br>

> > cpu_utilisation = concurrency * request_cpu_utilisation<br>

> ><br>

> > Assuming that we don't want any one process to spend a lot of time at<br>

> > 100% - to avoid such at-the-limit issues, let's pick, say, 80%<br>

> > utilisation, or a safety factor of 0.2. If a single request consumes<br>

> > 50% of its duration waiting on IO, and 50% of its duration executing<br>

> > bytecode, we can only run one such request concurrently without<br>

> > hitting 100% utilisation. (2*0.5 CPU == 1). For a request that spends<br>

> > 75% of its duration waiting on IO and 25% on CPU, we can run 3 such<br>

> > requests concurrently without exceeding our target of 80% utilisation:<br>

> > (3*0.25=0.75).<br>

> ><br>

> > What we have today in our standard architecture for OpenStack is<br>

> > optimised for IO bound workloads: waiting on the<br>

> > network/subprocesses/disk/libvirt etc. Running high numbers of<br>

> > eventlet handlers in a single process only works when the majority of<br>

> > the work being done by a handler is IO.<br>

><br>

> Everything stated here is great, however in our situation there is one<br>

> unfortunate fact which renders it completely incorrect at the moment.<br>

> I'm still puzzled why we are getting into deep think sessions about the<br>

> vagaries of the GIL and async when there is essentially a full-on<br>

> red-alert performance blocker rendering all of this discussion useless,<br>

> so I must again remind us: what we have *today* in OpenStack is *as<br>

> completely un-optimized as you can possibly be*.<br>

><br>

> The most GIL-heavy nightmare CPU bound task you can imagine running on<br>

> 25 threads on a ten year old Pentium will run better than the OpenStack<br>

> we have today, because we are running a C-based, non-eventlet patched DB<br>

> library within a single OS thread that happens to use eventlet, but the<br>

> use of eventlet is totally pointless because right now it blocks<br>

> completely on all database IO.   All production OpenStack applications<br>

> today are fully serialized to only be able to emit a single query to the<br>

> database at a time; for each message sent, the entire application blocks<br>

> an order of magnitude more than it would under the GIL waiting for the<br>

> database library to send a message to MySQL, waiting for MySQL to send a<br>

> response including the full results, waiting for the database to unwrap<br>

> the response into Python structures, and finally back to the Python<br>

> space, where we can send another database message and block the entire<br>

> application and all greenlets while this single message proceeds.<br>

><br>

> To share a link I've already shared about a dozen times here, here's<br>

> some tests under similar conditions which illustrate what that<br>

> concurrency looks like:<br>

> <a href="http://www.diamondtin.com/2014/sqlalchemy-gevent-mysql-python-drivers-comparison/" target="_blank">http://www.diamondtin.com/2014/sqlalchemy-gevent-mysql-python-drivers-comparison/</a>.<br>

> MySQLdb takes *20 times longer* to handle the work of 100 sessions than<br>

> PyMySQL when it's inappropriately run under gevent, when there is<br>

> modestly high concurrency happening.   When I talk about moving to<br>

> threads, this is not a "won't help or hurt" kind of issue, at the moment<br>

> it's a change that will immediately allow massive improvement to the<br>

> performance of all OpenStack applications instantly.  We need to change<br>

> the DB library or dump eventlet.<br>

><br>

> As far as if we should dump eventlet or use a pure-Python DB library, my<br>

> contention is that a thread based + C database library will outperform<br>

> an eventlet + Python-based database library. Additionally, if we make<br>

> either change, when we do so we may very well see all kinds of new<br>

> database-concurrency related bugs in our apps too, because we will be<br>

> talking to the database much more intensively all of a sudden; it is my<br>

> opinion that a traditional threading model will be an easier environment<br>

> to handle working out the approach to these issues; we have to assume<br>

> "concurrency at any time" in any case because we run multiple instances<br>

> of Nova etc. at the same time.  At the end of the day, we aren't going<br>

> to see wildly better performance with one approach over the other in any<br>

> case, so we should pick the one that is easier to develop, maintain, and<br>

> keep stable.<br>

><br>

<br>

</div></div>Mike, I agree with the entire paragraph above, and I've been surprised to<br>

see the way this thread has gone with so much speculation.  Optimization<br>

can be such a divisive thing, I think we need to be mindful of that.<br>

<br>

Anyway, there is an additional thought that might change the decision<br>

a bit. There is one "pro" to changing to use pymysql vs. changing to<br>

use threads, and that is that it isolates the change to only database<br>

access. Switching to threading means introducing threads to every piece<br>

of code we might touch while multiple threads are active.<br>

<br>

It really seems worth it to see if I/O bound portions of OpenStack<br>

become more responsive with pymysql before embarking on a change to the<br>

concurrency model. If it doesn't, not much harm done, and if it does,<br>

but makes us CPU bound, well then we have even more of a reason to set<br>

out on such a large task.<br>

<div><div><br>

__________________________________________________________________________<br>

OpenStack Development Mailing List (not for usage questions)<br>

Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>

<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

</div></div></blockquote></div><br></div>

</div></div><br>

<br></blockquote></div><br></div>
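<div>For reference, the utilisation arithmetic in Robert's quoted message can be checked with a short script. This is only a minimal sketch: the three formulas are taken from the thread, while the helper <code>max_concurrency</code> and its <code>target</code> parameter are illustrative names assumed here, not something proposed by anyone in the discussion.</div>

```python
import math


def request_cpu_utilisation(time_in_cpu, time_blocked):
    """Fraction of one request's duration spent executing bytecode."""
    request_duration = time_in_cpu + time_blocked
    return time_in_cpu / request_duration


def cpu_utilisation(concurrency, time_in_cpu, time_blocked):
    """Utilisation of the single GIL-bound CPU at a given concurrency."""
    return concurrency * request_cpu_utilisation(time_in_cpu, time_blocked)


def max_concurrency(time_in_cpu, time_blocked, target=0.8):
    """Hypothetical helper: largest concurrency that stays at or under target.

    The small epsilon guards against floating-point results that land
    just below a whole number.
    """
    per_request = request_cpu_utilisation(time_in_cpu, time_blocked)
    return math.floor(target / per_request + 1e-9)


if __name__ == "__main__":
    # (time_in_cpu, time_blocked) in arbitrary units:
    # 50% CPU / 50% IO, then 25% CPU / 75% IO, as in the quoted examples.
    for cpu, blocked in ((1, 1), (1, 3)):
        n = max_concurrency(cpu, blocked)
        print(cpu, blocked, n, cpu_utilisation(n, cpu, blocked))
```

<div>With an 80% target this reproduces the two cases in the quoted text: one concurrent request for the 50/50 workload, and three for the 25/75 workload (3*0.25=0.75).</div>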