[openstack-dev] Race in FixedIP.associate_pool

Arun SAG sagarun at gmail.com
Tue Dec 12 20:22:29 UTC 2017


Hello,

We are running nova-network in ocata. We use mysql in a master-slave
configuration, The master is read/write, and all reads go to the slave
(slave_connection is set). When we tried to boot multiple VMs in
parallel (lets say 15), we see a race in allocate_for_instance's
FixedIP.associate_pool. We see FixedIP.associate_pool associates an
IP, but later in the code we try to read the allocated FixedIP using
objects.FixedIPList.get_by_instance_uuid and it throws
FixedIPNotFoundException. We also checked the slave replication status
and Seconds_Behind_Master: 0

mysql> show slave status \G
*************************** 1. row ***************************
               Slave_IO_State: Waiting for master to send event
                  Master_Host: master.hostname
                  Master_User: repl
                  Master_Port: 3306
                Connect_Retry: 60
              Master_Log_File: mysqld-bin.000003
          Read_Master_Log_Pos: 142605121
               Relay_Log_File: mysqld-relay-bin.000006
                Relay_Log_Pos: 142602813
        Relay_Master_Log_File: mysqld-bin.000003
             Slave_IO_Running: Yes
            Slave_SQL_Running: Yes
              Replicate_Do_DB:
          Replicate_Ignore_DB: mysql,mysql
           Replicate_Do_Table:
       Replicate_Ignore_Table:
      Replicate_Wild_Do_Table:
  Replicate_Wild_Ignore_Table:
                   Last_Errno: 0
                   Last_Error:
                 Skip_Counter: 0
          Exec_Master_Log_Pos: 142605121
              Relay_Log_Space: 142603151
              Until_Condition: None
               Until_Log_File:
                Until_Log_Pos: 0
           Master_SSL_Allowed: No
           Master_SSL_CA_File:
           Master_SSL_CA_Path:
              Master_SSL_Cert:
            Master_SSL_Cipher:
               Master_SSL_Key:
        Seconds_Behind_Master: 0
Master_SSL_Verify_Server_Cert: No
                Last_IO_Errno: 0
                Last_IO_Error:
               Last_SQL_Errno: 0
               Last_SQL_Error:
  Replicate_Ignore_Server_Ids:
             Master_Server_Id: 172687130
                  Master_UUID: 8112eeef-de1e-11e7-8098-ac162da46cbc
             Master_Info_File: mysql.slave_master_info
                    SQL_Delay: 0
          SQL_Remaining_Delay: NULL
      Slave_SQL_Running_State: Slave has read all relay log; waiting
for the slave I/O thread to update it
           Master_Retry_Count: 86400
                  Master_Bind:
      Last_IO_Error_Timestamp:
     Last_SQL_Error_Timestamp:
               Master_SSL_Crl:
           Master_SSL_Crlpath:
           Retrieved_Gtid_Set:
            Executed_Gtid_Set:
                Auto_Position: 0

This kind of how the logs look like
2017-12-08 22:33:37,124 DEBUG
[yahoo.contrib.ocata_openstack_yahoo_plugins.nova.network.manager]
/opt/openstack/venv/nova/lib/python2.7/site-packages/yahoo/contrib/ocata_openstack_yahoo_plugins/nova/network/manager.py:get_instance_nw_info:894
Fixed IP NOT found for instance
2017-12-08 22:33:37,125 DEBUG
[yahoo.contrib.ocata_openstack_yahoo_plugins.nova.network.manager]
/opt/openstack/venv/nova/lib/python2.7/site-packages/yahoo/contrib/ocata_openstack_yahoo_plugins/nova/network/manager.py:get_instance_nw_info:965
Built network info: |[]|
2017-12-08 22:33:37,126 INFO [nova.network.manager]
/opt/openstack/venv/nova/lib/python2.7/site-packages/nova/network/manager.py:allocate_for_instance:428
Allocated network: '[]' for instance
2017-12-08 22:33:37,126 ERROR [oslo_messaging.rpc.server]
/opt/openstack/venv/nova/lib/python2.7/site-packages/oslo_messaging/rpc/server.py:_process_incoming:164
Exception during message handling
Traceback (most recent call last):
  File "/opt/openstack/venv/nova/lib/python2.7/site-packages/oslo_messaging/rpc/server.py",
line 155, in _process_incoming
    res = self.dispatcher.dispatch(message)
  File "/opt/openstack/venv/nova/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py",
line 222, in dispatch
    return self._do_dispatch(endpoint, method, ctxt, args)
  File "/opt/openstack/venv/nova/lib/python2.7/site-packages/oslo_messaging/rpc/dispatcher.py",
line 192, in _do_dispatch
    result = func(ctxt, **new_args)
  File "/opt/openstack/venv/nova/lib/python2.7/site-packages/yahoo/contrib/ocata_openstack_yahoo_plugins/nova/network/manager.py",
line 347, in allocate_for_instance
    vif = nw_info[0]
IndexError: list index out of range


This problem goes way when we get rid of the slave_connection setting
and just use single master. Has any one else seen this? Any
recommendation to fix this issue?

This issue is kind of  similar to https://bugs.launchpad.net/nova/+bug/1249065

-- 
Arun S A G
http://zer0c00l.in/



More information about the OpenStack-dev mailing list