We have a relatively stable 140 node Openstack deployment consisting of hypervisors and bare-metal.
Periodically, we have observed that the baremetals timeout during the PXE.
We have been able to reproduce this relatively easily by rebooting several baremetals simultaneously.
I believe I have traced the issue to the dnsmasq reloading every time an entry is added / deleted.
Can anyone who has experienced similar issue provide feedback on what you did to work around this ?
We are running a stable Queens release.
thanks,
Fred.