[neutron]: Latest Queens release: dhcp-client takes too long processing messages and falls behind.

fsbiz at yahoo.com fsbiz at yahoo.com
Mon Oct 7 22:45:15 UTC 2019


 
  Hi neutron team,
We've been troubleshooting an issue with neutron's dhcp-client for sometime now.

We were previously on neutron 12.0.5 and observed that upon reloading 5-8 baremetals simultaneously almost always led to a few baremetals failing PXE boot during provisioning and/or cleaning.
This was traced to afixed bug.
https://bugs.launchpad.net/neutron/+bug/1760047
which was applied to Queens in April 2019.
https://review.opendev.org/#/c/649580/


We patched the above fix but found out the problem was not resolved.
The fix gets rid of the semaphores by serializing the multiple messages into aPriority Queue.
The Priority Queue then drains the messages serially one by one making sure notto yield during the processing
of each message.  All in all this just seems like a more elegant way ofgetting rid of the semaphores but does not really 
fix the issue at hand.


Below are the logs from dhcp-agent in neutron release 12.0.5.  As can be seen the semaphore locks all threads for almost 6 seconds.While the below has been fixed using https://review.opendev.org/#/c/649580/
the underlying problem has not been fixed.  The semaphore has been removed but instead the message is being serialized and does not yieldresulting in PXE boot failures on the baremetal nodes.
Any pointers would be appreciated.
thanks,Fred


2019-10-03 18:07:37.454 318956DEBUG oslo_concurrency.lockutils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Acquired semaphore"dhcp-agent-network-lock-077aa2d1-605c-48ec-842d-7dd6767bfd01" lock/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:212

2019-10-03 18:07:37.455 318956DEBUG neutron.agent.dhcp.agent [req-eac79995-3846-46e6-b946-c5b5ccdb7aa5 8941137e383548bda725e74a93b2f86519f6fb7446dc47dd88c63cf03c1cce94 - - -] Calling driver for network:077aa2d1-605c-48ec-842d-7dd6767bfd01 action: reload_allocations call_driver/usr/lib/python2.7/site-packages/neutron/agent/dhcp/agent.py:135

2019-10-03 18:07:37.456 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.77', '9c:71:3a:cb:7c:43', '01:9c:71:3a:cb:7c:43']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108

2019-10-03 18:07:38.101 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.75', '9c:71:3a:cb:7b:fb'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108

2019-10-03 18:07:38.717 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'dhcp_release', 'ns-8387b854-d1','10.33.27.75', '9c:71:3a:cb:7b:fb', 'ff:3a:cb:7b:fb:00:04:8a:ef:2f:58:b4:20:45:03:80:27:0f:15:84:a4:70:7b']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108

2019-10-03 18:07:39.631 318956DEBUG neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Building host file: /var/lib/neutron/dhcp/077aa2d1-605c-48ec-842d-7dd6767bfd01/host_output_hosts_file/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:695

2019-10-03 18:07:39.632 318956DEBUG neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Donebuilding host file/var/lib/neutron/dhcp/077aa2d1-605c-48ec-842d-7dd6767bfd01/host_output_hosts_file/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:734

2019-10-03 18:07:39.633 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip', 'addr', 'show','ns-8387b854-d1'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108

2019-10-03 18:07:40.263 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['kill', '-HUP', '319109']execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108

2019-10-03 18:07:40.843 318956DEBUG neutron.agent.linux.dhcp [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Reloading allocations for network: 077aa2d1-605c-48ec-842d-7dd6767bfd01reload_allocations/usr/lib/python2.7/site-packages/neutron/agent/linux/dhcp.py:524

2019-10-03 18:07:40.843 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -]Running command (rootwrap daemon): ['ip', 'netns', 'exec','qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip', '-4', 'route', 'list','dev', 'ns-8387b854-d1'] execute_rootwrap_daemon/usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108

2019-10-03 18:07:41.462 318956DEBUG neutron.agent.linux.utils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa5 8941137e383548bda725e74a93b2f86519f6fb7446dc47dd88c63cf03c1cce94 - - -] Running command (rootwrap daemon):['ip', 'netns', 'exec', 'qdhcp-077aa2d1-605c-48ec-842d-7dd6767bfd01', 'ip','-6', 'route', 'list', 'dev', 'ns-8387b854-d1'] execute_rootwrap_daemon /usr/lib/python2.7/site-packages/neutron/agent/linux/utils.py:108

2019-10-03 18:07:42.101 318956DEBUG oslo_concurrency.lockutils [req-eac79995-3846-46e6-b946-c5b5ccdb7aa58941137e383548bda725e74a93b2f865 19f6fb7446dc47dd88c63cf03c1cce94 - - -] Releasing semaphore"dhcp-agent-network-lock-077aa2d1-605c-48ec-842d-7dd6767bfd01" lock/usr/lib/python2.7/site-packages/oslo_concurrency/lockutils.py:228
 



  
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.openstack.org/pipermail/openstack-discuss/attachments/20191007/dd8b9f05/attachment-0001.html>


More information about the openstack-discuss mailing list