[Openstack] Is there a reason Nova doesn't use scoped sessions in sqlalchemy ?

Vishvananda Ishaya vishvananda at gmail.com
Mon Oct 31 19:50:03 UTC 2011


All of the workers are single-threaded, so I'm not sure that scoped sessions are really necessary.

We did, however, decide that objects returned from the db layer are supposed to be simple dictionaries.  We currently allow nested dictionaries to optimize joined objects. Unfortunately, we never switched to sanitizing the data coming from sqlalchemy; instead, we make the sqlalchemy objects provide a dictionary-like interface and pass the objects themselves.

The issue that you're seeing is because network wasn't properly 'joinedload'ed in the initial query. Because the data is not sanitized, sqlalchemy tries to lazy-load network when it is accessed, but by then the session has been terminated.  If we had sanitized data, we would get a more useful error, like a key error, when network is accessed. The current solution is to add the proper joinedload.
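
As a rough illustration of the failure and the fix, here is a minimal sketch against an in-memory SQLite database. It assumes SQLAlchemy 1.4 or newer. The simplified FixedIp and Network models and the get_fixed_ip helper are hypothetical stand-ins, not Nova's actual code.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine
from sqlalchemy.orm import (declarative_base, joinedload, relationship,
                            sessionmaker)
from sqlalchemy.orm.exc import DetachedInstanceError

Base = declarative_base()


class Network(Base):
    # Hypothetical, heavily simplified model (not Nova's schema).
    __tablename__ = "networks"
    id = Column(Integer, primary_key=True)
    label = Column(String(32))


class FixedIp(Base):
    # Hypothetical, heavily simplified model (not Nova's schema).
    __tablename__ = "fixed_ips"
    id = Column(Integer, primary_key=True)
    address = Column(String(39))
    network_id = Column(Integer, ForeignKey("networks.id"))
    # Relationships are lazy-loaded by default: the related row is
    # fetched on first attribute access, which needs a live session.
    network = relationship(Network)


engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

# Seed one fixed IP that belongs to one network.
session = Session()
session.add(FixedIp(address="10.0.0.2", network=Network(label="private")))
session.commit()
session.close()


def get_fixed_ip(eager):
    """Return the first FixedIp, closing the session before returning."""
    session = Session()
    try:
        query = session.query(FixedIp)
        if eager:
            # Fetch the related network in the same SELECT via a JOIN.
            query = query.options(joinedload(FixedIp.network))
        return query.first()
    finally:
        session.close()


# Without joinedload: the object is detached, so the lazy load fails.
lazy_ip = get_fixed_ip(eager=False)
try:
    lazy_ip.network
    lazy_failed = False
except DetachedInstanceError:
    lazy_failed = True

# With joinedload: network was loaded up front and is still readable.
eager_ip = get_fixed_ip(eager=True)
eager_label = eager_ip.network.label
```

The eager variant works after the session is closed only because the related row was already loaded while the session was open.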

One of the goals of the nova-database team is to do the necessary data sanitization and to remove as many of the joinedloads as possible (hopefully all of them).
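
To make the sanitization idea concrete, here is a minimal sketch in plain Python. FakeModel and sanitize are hypothetical names used only for illustration; they are not part of Nova or sqlalchemy. The point is that only data that was actually loaded gets copied, so a missing relation shows up as an ordinary KeyError instead of a lazy-load failure.

```python
class FakeModel(object):
    """Hypothetical stand-in for a db object with some loaded relations."""

    def __init__(self, columns, loaded_relations=None):
        self.columns = columns
        self.loaded_relations = loaded_relations or {}


def sanitize(model):
    """Copy loaded columns and relations into a plain nested dictionary."""
    result = dict(model.columns)
    for name, related in model.loaded_relations.items():
        result[name] = sanitize(related)
    return result


# Relation was loaded by the query: it appears as a nested dictionary.
with_network = sanitize(
    FakeModel({"address": "10.0.0.2"},
              {"network": FakeModel({"label": "private"})}))

# Relation was not loaded: the key is simply absent.
without_network = sanitize(FakeModel({"address": "10.0.0.2"}))

try:
    without_network["network"]
    missing_key_error = False
except KeyError:
    missing_key_error = True
```

The caller gets a failure that points directly at the missing join, and nothing returned from the db layer holds a reference to a session.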

Vish

On Oct 31, 2011, at 12:25 PM, Day, Phil wrote:

> Hi Folks,
>  
> We’ve been looking into a problem which looks a lot like:
>  
> https://bugs.launchpad.net/nova/+bug/855660
>  
>  
>  
> 2011-10-21 14:13:31,035 ERROR nova.api [5bd52130-d46f-4702-b06b-9ca5045473d7 smokeuser smokeproject] Unexpected error raised: Parent instance <FixedIp at 0x4e74490> is not bound to a Session; lazy load operation of attribute 'network' cannot proceed 
> (nova.api): TRACE: Traceback (most recent call last): 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/nova/api/ec2/__init__.py", line 363, in __call__ 
> (nova.api): TRACE: result = api_request.invoke(context) 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/nova/api/ec2/apirequest.py", line 90, in invoke 
> (nova.api): TRACE: result = method(context, **args) 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/nova/api/ec2/cloud.py", line 1195, in describe_instances 
> (nova.api): TRACE: instance_id=instance_id) 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/nova/api/ec2/cloud.py", line 1204, in _format_describe_instances 
> (nova.api): TRACE: return {'reservationSet': self._format_instances(context, **kwargs)} 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/nova/api/ec2/cloud.py", line 1309, in _format_instances 
> (nova.api): TRACE: if fixed['network'] and use_v6: 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/nova/db/sqlalchemy/models.py", line 76, in __getitem__ 
> (nova.api): TRACE: return getattr(self, key) 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 163, in __get__ 
> (nova.api): TRACE: instance_dict(instance)) 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/attributes.py", line 383, in get 
> (nova.api): TRACE: value = callable_(passive=passive) 
> (nova.api): TRACE: File "/usr/lib/python2.7/dist-packages/sqlalchemy/orm/strategies.py", line 595, in __call__ 
> (nova.api): TRACE: (mapperutil.state_str(state), self.key) 
> (nova.api): TRACE: DetachedInstanceError: Parent instance <FixedIp at 0x4e74490> is not bound to a Session; lazy load operation of attribute 'network' cannot proceed 
> (nova.api): TRACE:
>  
>  
> As far as we can see the problem seems to be related to some conflict between multiple threads in the same API server instance and lazy loading of some part of the object.
>  
> Looking at the sqlalchemy documentation, it seems to strongly suggest that scoped_sessions should be used in multi-threaded WSGI applications (I’m not clear on the details, but it seems that this effectively makes lazy load operations thread safe).    However, whilst this fixes the problem, it has a bad effect on the unit tests – in particular, it seems to upset all of the DB migration code used in the unit tests.
>  
> So does anyone know if there was an explicit decision / reason not to use scoped_sessions in Nova ?
>  
> Thanks,
> Phil
>  
> PS:  The other possible fix we’ve found is to change sqlalchemy/models.py so that the associations are explicitly set to use eager loading, which also seems to fix the problem but feels like a more clumsy way to go about it.   Any thoughts on that would also be appreciated.
