[openstack-dev] [cinder] [oslo] MySQL connection shared between green threads concurrently
Mike Bayer
mbayer at redhat.com
Tue Feb 17 18:12:31 UTC 2015
Mike Bayer <mbayer at redhat.com> wrote:
>
>
> Doug Hellmann <doug at doughellmann.com> wrote:
>
>>> My question is then how is it that such an architecture would be
>>> possible,
>>> that Cinder’s service starts up without greenlets yet allows
>>> greenlet-based
>>> requests to come in before this critical task is complete? Shouldn’t the
>>> various oslo systems be providing patterns to prevent this disastrous
>>> combination?
>>
>> I would have thought so, but they are (mostly) libraries not frameworks
>> so they are often combined in unexpected ways. Let's see where the issue
>> is before deciding on where the fix should be.
>
>
> my next suspicion is that this is actually a spawned subprocess, though
> oslo.concurrency, and it is failing to create a new connection pool and thus
> is sharing the same file descriptor between processes. That is a much more
> ordinary issue and would explain everything I’m seeing in a more familiar
> way. Let me see if i can confirm *that*.
So, that was it.
Here’s some debugging I added to oslo.db inside of its own create_engine() that catches this easily:
engine = sqlalchemy.create_engine(url, **engine_args)
import os
from sqlalchemy import event
@event.listens_for(engine, "connect")
def connect(dbapi_connection, connection_record):
connection_record.info['pid'] = os.getpid()
@event.listens_for(engine, "checkout")
def checkout(dbapi_connection, connection_record, connection_proxy):
pid = os.getpid()
if connection_record.info['pid'] != pid:
raise Exception(
"Connection record belongs to pid %s, "
"attempting to check out in pid %s" % (connection_record.info['pid'], pid))
sure enough, the database errors go away and Cinders logs when I do a plain
startup are filled with:
ERROR cinder.openstack.common.threadgroup [-] Connection record belongs to pid 21200, attempting to check out in pid 21408
ERROR cinder.openstack.common.threadgroup [-] Connection record belongs to pid 21200, attempting to check out in pid 21409
etc.
The subprocesses here are logged as:
2015-02-17 13:05:12.583 DEBUG oslo_concurrency.processutils [req-a06464d1-4785-4a29-8e58-239d5a674451 None None] CMD "sudo cinder-rootwrap /etc/cinder/rootwrap.conf env LC_ALL=C LVM_SYSTEM_DIR=/etc/cinder vgs --noheadings --unit=g -o name,size,free,lv_count,uuid --separator : --nosuffix stack-volumes-lvm-1" returned: 0 in 0.828s from (pid=21408) execute /usr/lib/python2.7/site-packages/oslo_concurrency/processutils.py:221
a trace is as follows:
Traceback (most recent call last):
File "/opt/stack/cinder/cinder/openstack/common/threadgroup.py", line 143, in wait
x.wait()
File "/opt/stack/cinder/cinder/openstack/common/threadgroup.py", line 47, in wait
return self.thread.wait()
File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 175, in wait
return self._exit_event.wait()
File "/usr/lib/python2.7/site-packages/eventlet/event.py", line 121, in wait
return hubs.get_hub().switch()
File "/usr/lib/python2.7/site-packages/eventlet/hubs/hub.py", line 294, in switch
return self.greenlet.switch()
File "/usr/lib/python2.7/site-packages/eventlet/greenthread.py", line 214, in main
result = function(*args, **kwargs)
File "/opt/stack/cinder/cinder/openstack/common/service.py", line 492, in run_service
service.start()
File "/opt/stack/cinder/cinder/service.py", line 142, in start
self.manager.init_host()
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper
return f(*args, **kwargs)
File "/usr/lib/python2.7/site-packages/osprofiler/profiler.py", line 105, in wrapper
return f(*args, **kwargs)
File "/opt/stack/cinder/cinder/volume/manager.py", line 294, in init_host
volumes = self.db.volume_get_all_by_host(ctxt, self.host)
File "/opt/stack/cinder/cinder/db/api.py", line 191, in volume_get_all_by_host
return IMPL.volume_get_all_by_host(context, host)
File "/opt/stack/cinder/cinder/db/sqlalchemy/api.py", line 160, in wrapper
return f(*args, **kwargs)
File "/opt/stack/cinder/cinder/db/sqlalchemy/api.py", line 1229, in volume_get_all_by_host
result = _volume_get_query(context).filter(or_(*conditions)).all()
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2320, in all
return list(self)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2438, in __iter__
return self._execute_and_instances(context)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2451, in _execute_and_instances
close_with_result=True)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/query.py", line 2442, in _connection_from_session
**kw)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 854, in connection
close_with_result=close_with_result)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/orm/session.py", line 860, in _connection_for_bind
return engine.contextual_connect(**kwargs)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/engine/base.py", line 1800, in contextual_connect
self.pool.connect(),
File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 338, in connect
return _ConnectionFairy._checkout(self)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/pool.py", line 671, in _checkout
fairy)
File "/usr/lib64/python2.7/site-packages/sqlalchemy/event/attr.py", line 260, in __call__
fn(*args, **kw)
File "/usr/lib/python2.7/site-packages/oslo_db/sqlalchemy/session.py", line 408, in checkout
"attempting to check out in pid %s" % (connection_record.info['pid'], pid))
2015-02-17 13:05:14.626 TRACE cinder.openstack.common.threadgroup Exception: Connection record belongs to pid 21200, attempting to check out in pid 21558
>
>
>
>> Doug
>>
>>>> Doug
>>>>
>>>>> Current status is that I’m continuing to try to determine why this is
>>>>> happening here, and seemingly nowhere else.
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> __________________________________________________________________________
>>>>> OpenStack Development Mailing List (not for usage questions)
>>>>> Unsubscribe:
>>>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>
>>>> __________________________________________________________________________
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe:
>>> OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
More information about the OpenStack-dev
mailing list