Open Stack

Tue May 12 01:17:16 UTC 2015

On 12 May 2015 at 10:44, Mike Bayer <mbayer at redhat.com> wrote:

>> What we have today in our standard architecture for OpenStack is
>> optimised for IO bound workloads: waiting on the
>> network/subprocesses/disk/libvirt etc. Running high numbers of
>> eventlet handlers in a single process only works when the majority of
>> the work being done by a handler is IO.
>
>
> Everything stated here is great, however in our situation there is one
> unfortunate fact which renders it completely incorrect at the moment.   I'm
> still puzzled why we are getting into deep think sessions about the vagaries
> of the GIL and async when there is essentially a full-on red-alert
> performance blocker rendering all of this discussion useless, so I must
> again remind us: what we have *today* in Openstack is *as completely
> un-optimized as you can possibly be*.

Sorry if it seems like I went on a tangent, but choosing a concurrency
model in Python, which a lot of this discussion has been about, is
inextricably linked to the workload being tackled. The point of my
tl;dr was that using threads - which gets us out of the pit below - is
fine for most of our workloads and irrelevant to the actual issues in
the other ones. Clearly that didn't come across, sorry.

> The most GIL-heavy nightmare CPU bound task you can imagine running on 25
> threads on a ten year old Pentium will run better than the Openstack we have
> today, because we are running a C-based, non-eventlet patched DB library
> within a single OS thread that happens to use eventlet, but the use of
> eventlet is totally pointless because right now it blocks completely on all
> database IO.

To confirm my understanding: this library releases the GIL, but
because we only have one thread, we don't get more work done.

Yes, that sucks. And your tl;dr is that we need to either use an
eventlet-ready library or not use eventlet's greenthreads, either of
which I support as a short-term rectification.
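
To make the single-thread point concrete, here is a minimal sketch. It
is not OpenStack code and does not use eventlet or a real DB driver:
time.sleep() stands in for a blocking C driver call (it blocks the
calling OS thread but releases the GIL), and the names fake_db_call,
DELAY and CALLS are made up for the illustration.

```python
import threading
import time

DELAY = 0.05  # assumed cost of one blocking database round trip, in seconds
CALLS = 4     # number of pretend requests that each need one DB call


def fake_db_call():
    # Stand-in for a C driver call: blocks this OS thread, releases the GIL.
    time.sleep(DELAY)


def run_in_one_thread():
    # One OS thread: every blocking call stalls the whole process, so the
    # calls run back to back even though the GIL is free while they wait.
    start = time.monotonic()
    for _ in range(CALLS):
        fake_db_call()
    return time.monotonic() - start


def run_in_threads():
    # One OS thread per call: the waits overlap because the GIL is released.
    start = time.monotonic()
    workers = [threading.Thread(target=fake_db_call) for _ in range(CALLS)]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return time.monotonic() - start


if __name__ == "__main__":
    print("one thread: %.3fs" % run_in_one_thread())
    print("%d threads:  %.3fs" % (CALLS, run_in_threads()))
```

In the single-thread run the total is roughly CALLS * DELAY; with real
threads it is roughly one DELAY. An eventlet-friendly driver would aim
for the same overlap inside one OS thread by yielding to other
greenthreads while it waits.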
...
> talking to the database much more intensively all the sudden; it is my
> opinion that a traditional threading model will be an easier environment to
> handle working out the approach to these issues; we have to assume
> "concurrency at any time" in any case because we run multiple instances of
> Nova etc. at the same time.  At the end of the day, we aren't going to see
> wildly better performance with one approach over the other in any case, so
> we should pick the one that is easier to develop, maintain, and keep stable.

I agree. I'd actually be quite interested in exploring a CSP model for
even clearer code and diagnosis of issues, but simple sequential code
within threads would be a win itself.

> Robert's analysis talks about various "at the limit" issues,  but I was

They tend to turn up at scale. You get 100 requests a day out of 5
million that are inexplicably slow, and eventually you have enough
data around the situation to try an experiment, and lo and behold the
problem goes away. They don't disagree with the argument you're making
though - this is just the bigger context, when folk go to deploy our
(real threads || eventlet friendly DB library) code, how many
processes will they need?

FWIW, I think moving to an eventlet-friendly library should be the
first step because it can be done much more rapidly and with arguably
less risk.

I don't think the discussion ends there though :)

-Rob

-- 
Robert Collins <rbtcollins at hp.com>
Distinguished Technologist
HP Converged Cloud

[openstack-dev] [all] Replace mysql-python with mysqlclient