[openstack-dev] [cinder] [oslo] MySQL connection shared between green threads concurrently

Mike Bayer mbayer at redhat.com
Tue Feb 17 18:12:31 UTC 2015



Mike Bayer <mbayer at redhat.com> wrote:

> 
> 
> Doug Hellmann <doug at doughellmann.com> wrote:
> 
>>> My question is then how is it that such an architecture would be
>>> possible,
>>> that Cinder’s service starts up without greenlets yet allows
>>> greenlet-based
>>> requests to come in before this critical task is complete? Shouldn’t the
>>> various oslo systems be providing patterns to prevent this disastrous
>>> combination?   
>> 
>> I would have thought so, but they are (mostly) libraries not frameworks
>> so they are often combined in unexpected ways. Let's see where the issue
>> is before deciding on where the fix should be.
> 
> 
> my next suspicion is that this is actually a spawned subprocess, though
> oslo.concurrency, and it is failing to create a new connection pool and thus
> is sharing the same file descriptor between processes. That is a much more
> ordinary issue and would explain everything I’m seeing in a more familiar
> way. Let me see if i can confirm *that*.


So, that was it.

Here’s some debugging I added to oslo.db inside of its own create_engine() that catches this easily:

   engine = sqlalchemy.create_engine(url, **engine_args)

    import os
    from sqlalchemy import event

    @event.listens_for(engine, "connect")
    def connect(dbapi_connection, connection_record):
        connection_record.info['pid'] = os.getpid()

    @event.listens_for(engine, "checkout")
    def checkout(dbapi_connection, connection_record, connection_proxy):
        pid = os.getpid()
        if connection_record.info['pid'] != pid:
            raise Exception(
                    "Connection record belongs to pid %s, "
                    "attempting to check out in pid %s" % (connection_record.info['pid'], pid))


sure enough, the database errors go away and Cinders logs when I do a plain
startup are filled with:

 ERROR cinder.openstack.common.threadgroup [-] Connection record belongs to pid 21200, attempting to check out in pid 21408
 ERROR cinder.openstack.common.threadgroup [-] Connection record belongs to pid 21200, attempting to check out in pid 21409

etc.

The subprocesses here are logged as:

2015-02-17 13:05:12.583 DEBUG oslo_concurrency.processutils [req-a06464d1-4785-4a29-8e58-239d5a674451 None None] CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C LVM_SYSTEM_DIR=/etc/cinder vgs --noheadings --unit=g -o name,size,free,lv_count,uuid --separator : --nosuffix stack-volumes-lvm-1" returned: 0 in 0.828s from (pid=21408) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:221

a trace is as follows:

Traceback (most recent call last):
 File "/opt/stack/cinder/cinder/openstack/common/threadgroup.py", line 143, in wait
   x.wait()
 File "/opt/stack/cinder/cinder/openstack/common/threadgroup.py", line 47, in wait
   return self.thread.wait()
 File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 175, in wait
   return self._exit_event.wait()
 File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait
   return hubs.get_hub().switch()
 File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
   return self.greenlet.switch()
 File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 214, in main
   result = function(*args, **kwargs)
 File "/opt/stack/cinder/cinder/openstack/common/service.py", line 492, in run_service
   service.start()
 File "/opt/stack/cinder/cinder/service.py", line 142, in start
   self.manager.init_host()
 File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper
   return f(*args, **kwargs)
 File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper
   return f(*args, **kwargs)
 File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper
   return f(*args, **kwargs)
 File "/opt/stack/cinder/cinder/volume/manager.py", line 294, in init_host
   volumes = self.db.volume_get_all_by_host(ctxt, self.host)
 File "/opt/stack/cinder/cinder/db/api.py", line 191, in volume_get_all_by_host
   return IMPL.volume_get_all_by_host(context, host)
 File "/opt/stack/cinder/cinder/db/sqlalchemy/api.py", line 160, in wrapper
   return f(*args, **kwargs)
 File "/opt/stack/cinder/cinder/db/sqlalchemy/api.py", line 1229, in volume_get_all_by_host
   result = _volume_get_query(context).filter(or_(*conditions)).all()
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2320, in all
   return list(self)
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2438, in __iter__
   return self._execute_and_instances(context)
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2451, in _execute_and_instances
   close_with_result=True)
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2442, in _connection_from_session
   **kw)
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 854, in connection
   close_with_result=close_with_result)
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 860, in _connection_for_bind
   return engine.contextual_connect(**kwargs)
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1800, in contextual_connect
   self.pool.connect(),
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 338, in connect
   return _ConnectionFairy._checkout(self)
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 671, in _checkout
   fairy)
 File "/usr/lib64/python2.7/site-packages/sqlalchemy/event/attr.py", line 260, in __call__
   fn(*args, **kw)
 File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/session.py", line 408, in checkout
   "attempting to check out in pid %s" % (connection_record.info['pid'], pid))
2015-02-17 13:05:14.626 TRACE cinder.openstack.common.threadgroup Exception: Connection record belongs to pid 21200, attempting to check out in pid 21558







> 
> 
> 
>> Doug
>> 
>>>> Doug
>>>> 
>>>>> Current status is that I’m continuing to try to determine why this is
>>>>> happening here, and seemingly nowhere else.
>>>>> 
>>>>> 
>>>>> 
>>>>> 
>>>>> __________________________________________________________________________
>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>> Unsubscribe:
>>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>> 
>>>> __________________________________________________________________________
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>> 
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>> 
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> 
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



More information about the OpenStack-dev mailing list