[openstack-dev] [cinder] [oslo] MySQL connection shared between green threads concurrently

Doug Hellmann doug at doughellmann.com
Tue Feb 17 17:24:50 UTC 2015



On Tue, Feb 17, 2015, at 11:17 AM, Mike Bayer wrote:
> 
> 
> Doug Hellmann <doug at doughellmann.com> wrote:
> 
> >> 
> >> So I’m not really sure what’s going on here. Cinder seems to have some
> >> OpenStack greenlet code of its own in
> >> cinder/openstack/common/threadgroup.py; I don’t know the purpose of this
> >> code. SQLAlchemy’s connection pool has been tested a lot with eventlet
> >> / gevent and this has never been reported before. This is a very
> >> severe race and I’d think that this would be happening all over the
> >> place.
> > 
> > The threadgroup module is from the Oslo incubator, so if you need to
> > review the git history you'll want to look at that copy.
> 
> 
> I haven’t confirmed this yet today, but based on some greenlet research as
> well as things I observed with PDB yesterday, my suspicion is that Cinder’s
> startup code runs in a traditional thread at the same time the service is
> allowing connections to come in via green threads, which are running in a
> separate greenlet event loop (how else did my PDB sessions have continued
> echo output stepping on my typing?). greenlet performs stack-slicing, where
> it is memoizing the state of the interpreter to some extent, but importantly
> it does not provide this in conjunction with traditional threads. So Python
> code can’t even tell that it’s being shared, because all of the state is
> completely swapped out (but of course this doesn’t count when you’re a file
> descriptor). I’ve been observing this by watching the identical objects
> (same ID) magically have different state as a stale greenlet suddenly wakes
> up in the middle of the presumably thread-bound initialization code.
> 
> My question is then how is it that such an architecture would be possible,
> that Cinder’s service starts up without greenlets yet allows greenlet-based
> requests to come in before this critical task is complete? Shouldn’t the
> various oslo systems be providing patterns to prevent this disastrous
> combination?

I would have thought so, but they are (mostly) libraries, not frameworks,
so they are often combined in unexpected ways. Let's see where the issue
is before deciding on where the fix should be.
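
To help pin down where the sharing happens, something like the following
could be wrapped around a connection while debugging. This is only a minimal
sketch, not code from Cinder, Oslo, or SQLAlchemy: GuardedConnection,
ConcurrentUseError, and the get_ident parameter are hypothetical names. The
default identifies OS threads; under greenlet one could presumably pass
greenlet.getcurrent instead, which is an assumption and is not exercised
here. The demo at the bottom fakes the context switch with a plain variable.

```python
import threading


class ConcurrentUseError(RuntimeError):
    """Raised when a second execution context enters a busy connection."""


class GuardedConnection:
    """Hypothetical debugging wrapper around a single connection object.

    get_ident must return a value identifying the current execution
    context (thread, greenlet, ...). Nested use by the same context is
    allowed and tracked with a depth counter.
    """

    def __init__(self, raw, get_ident=threading.get_ident):
        self._raw = raw
        self._get_ident = get_ident
        self._owner = None
        self._depth = 0
        self._lock = threading.Lock()

    def __enter__(self):
        me = self._get_ident()
        with self._lock:
            if self._owner is not None and self._owner != me:
                raise ConcurrentUseError(
                    "connection held by %r, also requested by %r"
                    % (self._owner, me))
            self._owner = me
            self._depth += 1
        return self._raw

    def __exit__(self, exc_type, exc, tb):
        with self._lock:
            self._depth -= 1
            if self._depth == 0:
                self._owner = None
        return False  # never swallow exceptions from the with-body


# Demo: simulate a different context waking up mid-operation.
current = ["startup-thread"]
conn = GuardedConnection(object(), get_ident=lambda: current[0])

detected = False
with conn:
    current[0] = "request-greenlet"  # pretend a greenlet took over
    try:
        with conn:
            pass
    except ConcurrentUseError:
        detected = True
    current[0] = "startup-thread"    # switch back before releasing
```

In the demo the inner "with" raises because the identifier changed while the
outer block still held the connection, which is the symptom described above.
A guard like this only reports the overlap; it does not say which piece of
code should have prevented it.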

Doug

> 
> 
> > Doug
> > 
> >> Current status is that I’m continuing to try to determine why this is
> >> happening here, and seemingly nowhere else.
> >> 
> >> 
> >> 
> >> 
> >> __________________________________________________________________________
> >> OpenStack Development Mailing List (not for usage questions)
> >> Unsubscribe:
> >> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> > 
> 
