[Openstack] Quantum dhcp-agent getting out of sync with multiple instance launch / termination

Markku Tavasti markku.tavasti at cybercom.com
Wed Oct 9 13:34:29 UTC 2013


I found out that quantum dhcp agent gets out of sync when many instances 
are launched or deleted same time. As a result, dnsmasq host-file have 
lines missing (all launched are not added), or some lines aren't removed 
when they should. When same ip is re-used, remaining extra lines aren't 
removed, but same ip has two lines, and dhcp discoveries get no response 
(no address available error on dnsmasq logs).

Restarting quantum-dhcp-agent will get everything back to proper state, 
but running restart on cron every minute does not sound proper fix :-(

On my test, creating 16 instances, 2-3 of them did not get line in 
host-file. When removing all those 16, 3-5 false lines were left behind.

On instance creation I see following message on quantum-server.log:

For every instance:
WARNING [quantum.db.agentschedulers_db] Fail scheduling network 
{'status': u'ACTIVE', 'subnets': 
[u'e9299278-bd49-4dc8-8df1-25b034f3ecea'], 'name': u'pk_tunk2', 
'provider:physical_network': u'vlans-osprv', 'admin_state_up': True, 
'tenant_id': u'06b9c423e10741ef83877b56d7608d7f', 
'provider:network_type': u'vlan', 'router:external': False, 'shared': 
False, 'id': u'cbce52ef-58ba-4f3c-96fb-9d51bdbf32fb', 
'provider:segmentation_id': 427L}
However, network ports get created, and seems to work?

And most likely related to missing lines on host file on creation:
WARNING [quantum.scheduler.dhcp_agent_scheduler] No active DHCP agents

For termination of instances there is no warning or error messages.

Any ideas for fixing the situation?

--Tavasti





More information about the Openstack mailing list