[Openstack-operators] Race in FixedIP.associate_pool
Arun SAG
sagarun at gmail.com
Tue Jan 30 00:45:46 UTC 2018
Hello,
On Tue, Dec 12, 2017 at 12:22 PM, Arun SAG <sagarun at gmail.com> wrote:
> Hello,
>
> We are running nova-network in Ocata. We use MySQL in a master-slave
> configuration. The master is read/write, and all reads go to the slave
> (slave_connection is set). When we tried to boot multiple VMs in
> parallel (let's say 15), we saw a race in allocate_for_instance's
> FixedIP.associate_pool. FixedIP.associate_pool associates an
> IP, but later in the code we try to read the allocated FixedIP using
> objects.FixedIPList.get_by_instance_uuid and it throws
> FixedIPNotFoundException. We also checked the slave replication status
> and it showed Seconds_Behind_Master: 0
>
[snip]
>
> This is roughly what the logs look like:
> 2017-12-08 22:33:37,124 DEBUG
> [yahoo.contrib.ocata_openstack_yahoo_plugins.nova.network.manager]
> /opt/openstack/venv/nova/lib/python2.7/site-packages/yahoo/contrib/ocata_openstack_yahoo_plugins/nova/network/manager.py:get_instance_nw_info:894
> Fixed IP NOT found for instance
> 2017-12-08 22:33:37,125 DEBUG
> [yahoo.contrib.ocata_openstack_yahoo_plugins.nova.network.manager]
> /opt/openstack/venv/nova/lib/python2.7/site-packages/yahoo/contrib/ocata_openstack_yahoo_plugins/nova/network/manager.py:get_instance_nw_info:965
> Built network info: |[]|
> 2017-12-08 22:33:37,126 INFO [nova.network.manager]
> /opt/openstack/venv/nova/lib/python2.7/site-packages/nova/network/manager.py:allocate_for_instance:428
> Allocated network: '[]' for instance
> 2017-12-08 22:33:37,126 ERROR [oslo_messaging.rpc.server]
> /opt/openstack/venv/nova/lib/python2.7/site-packages/oslo_messaging/rpc/server.py:_process_incoming:164
> Exception during message handling
> Traceback (most recent call last):
> File "/opt/openstack/venv/nova/lib/python2.7/site-packages/oslo_messaging/rpc/server.py",
> line 155, in _process_incoming
> res = self.dispatcher.dispatch(message)
> File "/opt/openstack/venv/nova/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py",
> line 222, in dispatch
> return self._do_dispatch(endpoint, method, ctxt, args)
> File "/opt/openstack/venv/nova/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py",
> line 192, in _do_dispatch
> result = func(ctxt, **new_args)
> File "/opt/openstack/venv/nova/lib/python2.7/site-packages/yahoo/contrib/ocata_openstack_yahoo_plugins/nova/network/manager.py",
> line 347, in allocate_for_instance
> vif = nw_info[0]
> IndexError: list index out of range
>
>
> This problem goes away when we get rid of the slave_connection setting
> and just use a single master. Has anyone else seen this? Any
> recommendations to fix this issue?
>
> This issue is kind of similar to https://bugs.launchpad.net/nova/+bug/1249065
>
If anyone is running into a DB race while running the database in
master-slave mode with async replication, the bug has been identified
and is being fixed here:
https://bugs.launchpad.net/oslo.db/+bug/1746116
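
For anyone who wants to see the shape of the race, here is a toy sketch in
Python. It is not nova or oslo.db code: the Primary and AsyncReplica classes,
the allocate_and_read helper, the "vm-1"/"vm-2" IDs and the 10.0.0.x addresses
are all made up for illustration. It only models the general pattern described
above: a write is committed on the master, the follow-up read goes to a slave
that has not applied it yet, and the read comes back empty.

```python
# Toy model of read-after-write staleness with asynchronous replication.
# All names here are hypothetical; nothing below is nova or oslo.db API.


class Primary:
    """Stands in for the read/write master."""

    def __init__(self):
        self.rows = []

    def write(self, row):
        # The commit returns as soon as the master has the row.
        self.rows.append(row)

    def read(self, instance_uuid):
        return [r for r in self.rows if r["instance_uuid"] == instance_uuid]


class AsyncReplica:
    """Stands in for the slave; it only sees rows after catch_up()."""

    def __init__(self, primary):
        self.primary = primary
        self.applied = 0  # number of master rows applied so far

    def catch_up(self):
        self.applied = len(self.primary.rows)

    def read(self, instance_uuid):
        visible = self.primary.rows[:self.applied]
        return [r for r in visible if r["instance_uuid"] == instance_uuid]


def allocate_and_read(primary, reader, instance_uuid, address):
    """Write the fixed IP on the master, then read it back via `reader`."""
    primary.write({"instance_uuid": instance_uuid, "address": address})
    return reader.read(instance_uuid)


primary = Primary()
replica = AsyncReplica(primary)

# Read routed to the slave before it has applied the write: nothing found.
stale = allocate_and_read(primary, replica, "vm-1", "10.0.0.5")

# Same sequence with the read routed to the master: the row is there.
fresh = allocate_and_read(primary, primary, "vm-2", "10.0.0.6")

# Once the slave catches up, the earlier row becomes visible there too.
replica.catch_up()
later = replica.read("vm-1")

print(stale, fresh, later)
```

In this model the empty result in `stale` corresponds to the empty network
info in the logs above, where indexing nw_info[0] then fails. Note also that
Seconds_Behind_Master is reported in whole seconds, so a value of 0 does not
by itself rule out a lag of a few milliseconds.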
--
Arun S A G
http://zer0c00l.in/