[openstack-dev] dhcp 'Address already in use' errors when trying to start a dnsmasq

Ihar Hrachyshka ihrachys at redhat.com
Tue Sep 27 19:54:58 UTC 2016


Kevin Benton <kevin at benton.pub> wrote:

> There is no side effect other than log noise and a delayed reload? I  
> don't see why a revert would be appropriate.
>
> I looked at the logs and the issue seems to be that the process isn't  
> tracked correctly the first time it starts.
>
> grep for the following:
>
> ea141299-ce07-4ff7-9a03-7a1b7a75a371', 'dnsmasq'
>
> in
> http://logs.openstack.org/26/377626/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/b6953d4/logs/screen-q-dhcp.txt.gz
>
> The first time dnsmasq is called it gives a 0 return code but the agent  
> doesn't seem to get a pid for it. So the next time it is called it  
> conflicts with the running proc.

If you mean these log messages:

2016-09-27 12:21:24.760 13751 DEBUG neutron.agent.linux.utils [req-128c3e79-151a-4f57-9dbc-053ff0999679 - -] Unable to access /opt/stack/data/neutron/external/pids/ea141299-ce07-4ff7-9a03-7a1b7a75a371.pid get_value_from_file /opt/stack/new/neutron/neutron/agent/linux/utils.py:204

2016-09-27 12:21:24.760 13751 DEBUG neutron.agent.linux.utils [req-128c3e79-151a-4f57-9dbc-053ff0999679 - -] Unable to access /opt/stack/data/neutron/external/pids/ea141299-ce07-4ff7-9a03-7a1b7a75a371.pid get_value_from_file /opt/stack/new/neutron/neutron/agent/linux/utils.py:204

2016-09-27 12:21:24.761 13751 DEBUG neutron.agent.linux.external_process [req-128c3e79-151a-4f57-9dbc-053ff0999679 - -] No process started for ea141299-ce07-4ff7-9a03-7a1b7a75a371 disable /opt/stack/new/neutron/neutron/agent/linux/external_process.py:123

then I don’t think that’s a correct interpretation of the log messages.
Notice that the pid file names there are not in the dnsmasq network dir, but
in external/pids/<netid>.pid. Those pid files are not dnsmasq ones but
potentially belong to metadata proxies managed by the agent. The agent
attempts to disable the proxy because it’s not needed (as per the logic in
configure_dhcp_for_network). Since the network does not have a proxy
process running, the agent can’t find the pid file and hence has no proxy
process to disable. It then completes the configuration process.

It should not influence the flow of the program.
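
To illustrate why a missing proxy pid file is harmless, here is a minimal
sketch of the pattern. It is not the actual neutron code: read_pid and
disable_proxy are hypothetical names, and the directory layout is only
assumed to mirror the external/pids/<netid>.pid path seen in the log.

```python
import os


def read_pid(pid_file):
    """Return the PID stored in pid_file, or None if it cannot be read."""
    try:
        with open(pid_file) as f:
            return int(f.read().strip())
    except (OSError, ValueError):
        # Comparable to the "Unable to access <file>" debug message.
        return None


def disable_proxy(pids_dir, net_id):
    """Hypothetical stand-in for disabling a metadata proxy.

    Returns True if a tracked process was found, False if there was
    nothing to disable.
    """
    pid = read_pid(os.path.join(pids_dir, net_id + ".pid"))
    if pid is None:
        # Comparable to the "No process started for <netid>" debug message.
        return False
    # A real agent would signal the process here; this sketch only reports it.
    return True
```

In this sketch the "nothing to disable" branch just returns, so the caller
carries on with the rest of the network configuration.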

To show that dnsmasq is properly tracked, note that later, when we
restart the process for the network, we correctly extract the PID from the
pid file and use it for the kill -9 call:

http://logs.openstack.org/26/377626/1/check/gate-tempest-dsvm-neutron-full-ubuntu-xenial/b6953d4/logs/screen-q-dhcp.txt.gz#_2016-09-27_12_21_24_878

You can check for yourself that the same PID was actually used by the  
dnsmasq process started the first time. It’s logged in syslog.
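
For the gate run above, syslog is the place to compare PIDs. On a live
Linux host, a comparable check could look at /proc instead. The sketch
below is an assumption-laden illustration, not something the agent does:
pid_runs_command is a hypothetical helper, and proc_root is a parameter
only so that the function can be exercised without a real process.

```python
import os


def pid_runs_command(pid, name, proc_root="/proc"):
    """Return True if the process with this PID was started as `name`.

    Reads <proc_root>/<pid>/cmdline, where arguments are NUL-separated,
    and compares the basename of argv[0] with `name`.
    """
    try:
        with open(os.path.join(proc_root, str(pid), "cmdline"), "rb") as f:
            argv = f.read().split(b"\0")
    except OSError:
        # No such process (or no permission to inspect it).
        return False
    executable = os.path.basename(argv[0]).decode(errors="replace")
    return executable == name
```

With the PID read from the network's dnsmasq pid file, a call such as
pid_runs_command(pid, "dnsmasq") would indicate whether that PID still
belongs to a dnsmasq process.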

Ihar


