[openstack-dev] Race in FixedIP.associate_pool
Arun SAG
sagarun at gmail.com
Sat Dec 16 02:38:00 UTC 2017
Hi Jay,
On Fri, Dec 15, 2017 at 1:56 PM, Jay Pipes <jaypipes at gmail.com> wrote:
> Can you point us to the code that is generating the above? It seems that
> get_instance_nw_info() in the Yahoo! manager.py contrib module line 965 is
> trying to build network information for an empty list of vNICs... where is
> that list of vNICs coming from?
The vNICs are empty because objects.FixedIPList.get_by_instance_uuid
comes back empty: https://github.com/openstack/nova/blob/master/nova/network/manager.py#L527
The Yahoo! manager.py's get_by_instance_uuid is essentially the same as
the upstream code, except that we change the VIF_TYPE in
get_instance_nw_info:
    @messaging.expected_exceptions(exception.InstanceNotFound)
    def get_instance_nw_info(self, context, instance_id, rxtx_factor,
                             host, instance_uuid=None, **kwargs):
        """Creates network info list for instance.

        called by allocate_for_instance and network_api
        context needs to be elevated
        :returns: network info list [(network,info),(network,info)...]
        where network = dict containing pertinent data from a network db object
        and info = dict containing pertinent networking data
        """
        if not uuidutils.is_uuid_like(instance_id):
            instance_id = instance_uuid
        instance_uuid = instance_id
        LOG.debug('Get instance network info', instance_uuid=instance_uuid)

        try:
            fixed_ips = objects.FixedIPList.get_by_instance_uuid(
                context, instance_uuid)
        except exception.FixedIpNotFoundForInstance:
            fixed_ips = []

        LOG.debug('Found %d fixed IPs associated to the instance in the '
                  'database.',
                  len(fixed_ips), instance_uuid=instance_uuid)

        nw_info = network_model.NetworkInfo()

        # (saga): The default VIF_TYPE is bridge. We need to use OVS
        # This is the only reason we copied this method from the base class
        if (not CONF.network_driver or
                CONF.network_driver == 'nova.network.linux_net'):
            if (CONF.linuxnet_interface_driver ==
                    'nova.network.linux_net.LinuxOVSInterfaceDriver'):
                vif_type = network_model.VIF_TYPE_OVS
Here is the sequence of actions that happens in nova-network:
1. allocate_for_instance calls allocate_fixed_ips.
2. The fixed IPs are successfully associated (we can see this in the log).
3. allocate_for_instance calls get_instance_nw_info, which in turn
   looks up the fixed IPs associated in step 2 using
   objects.FixedIPList.get_by_instance_uuid. This raises a FixedIPNotFound
   exception.
When we removed the slave and ran with just a single master, the errors
went away. We also switched to semi-synchronous replication between
master and slave, and the errors went away too. All of this points to a
race between the write to the DB and the read that follows it.
Does OpenStack expect synchronous replication to read-only slaves?
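The sequence and the two experiments above can be illustrated with a small toy model. This is only a sketch and not nova or oslo.db code: ToyDatabase, associate, replicate and the use_slave/semi_sync flags are made-up names, and the local FixedIpNotFoundForInstance class is just a stand-in for the nova exception. It also assumes that writes go to the master and reads are served by the slave, and it treats semi-synchronous replication as "applied on the slave before the commit returns", which is stronger than what real semi-synchronous replication typically guarantees (acknowledged receipt).

```python
# Toy model of the suspected race; none of these names exist in nova or oslo.db.


class FixedIpNotFoundForInstance(Exception):
    """Stand-in for the nova exception of the same name."""


class ToyDatabase:
    """One master plus an optional read-only slave (hypothetical model)."""

    def __init__(self, use_slave=True, semi_sync=False):
        self.master = {}    # instance_uuid -> list of fixed IP addresses
        self.slave = {}     # copy of master that may lag behind
        self.pending = []   # writes not yet applied on the slave
        self.use_slave = use_slave
        self.semi_sync = semi_sync

    def associate(self, instance_uuid, address):
        """Write path: always goes to the master."""
        self.master.setdefault(instance_uuid, []).append(address)
        self.pending.append((instance_uuid, address))
        if self.semi_sync:
            # Simplification: the commit returns only once the slave has
            # applied the write.
            self.replicate()

    def replicate(self):
        """Apply all outstanding writes on the slave."""
        for instance_uuid, address in self.pending:
            self.slave.setdefault(instance_uuid, []).append(address)
        self.pending.clear()

    def get_by_instance_uuid(self, instance_uuid):
        """Read path: served by the slave when one is configured."""
        source = self.slave if self.use_slave else self.master
        if instance_uuid not in source:
            raise FixedIpNotFoundForInstance(instance_uuid)
        return list(source[instance_uuid])


def allocate_for_instance(db, instance_uuid, address):
    """Steps 1-3 from the list above: associate, then read back at once."""
    db.associate(instance_uuid, address)
    try:
        return db.get_by_instance_uuid(instance_uuid)
    except FixedIpNotFoundForInstance:
        return []


if __name__ == "__main__":
    for label, db in [
        ("async slave", ToyDatabase(use_slave=True, semi_sync=False)),
        ("master only", ToyDatabase(use_slave=False)),
        ("semi-sync slave", ToyDatabase(use_slave=True, semi_sync=True)),
    ]:
        print(label, allocate_for_instance(db, "uuid-1", "10.0.0.5"))
```

In this model, the read-back with an asynchronous slave finds nothing until replicate() has run, which matches the empty vNIC list. With a single master, or with the simplified semi-synchronous mode, the same call sees the address that was just associated.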
--
Arun S A G
http://zer0c00l.in/