<html xmlns="http://www.w3.org/1999/xhtml" xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office"><head></head><body><div class="ydp40925e82yahoo-style-wrap" style="font-family: times new roman, new york, times, serif; font-size: 16px;"><div></div>
        <div dir="ltr" data-setdir="false">Thanks. </div><div dir="ltr" data-setdir="false">Yes, it helps breathe some CPU cycles.</div><div dir="ltr" data-setdir="false"><br></div><div dir="ltr" data-setdir="false"><br></div><div dir="ltr" data-setdir="false"><div><span style="font-size:11.0pt;font-family:Calibri,sans-serif;mso-fareast-font-family:Calibri;mso-fareast-theme-font:minor-latin;mso-ansi-language:EN-US;mso-fareast-language:EN-US;mso-bidi-language:AR-SA">This was traced to a
fixed bug:<br>
<a href="https://bugs.launchpad.net/neutron/+bug/1760047" rel="nofollow" target="_blank">https://bugs.launchpad.net/neutron/+bug/1760047</a><br>
The fix was applied to Queens in April 2019:<br>
<a href="https://review.opendev.org/#/c/649580/" rel="nofollow" target="_blank">https://review.opendev.org/#/c/649580/</a><br>
<br>
</span></div>Unfortunately, the patch only makes the code more elegant by removing the semaphores.</div><div dir="ltr" data-setdir="false">It does not fix the underlying issue: the dhcp-client serializes all the port update messages, and each</div><div dir="ltr" data-setdir="false">message is processed too slowly, resulting in PXE boot timeouts.</div><div dir="ltr" data-setdir="false"><br></div><div dir="ltr" data-setdir="false">The issue remains open.</div><div dir="ltr" data-setdir="false"><br></div><div dir="ltr" data-setdir="false">Thanks,</div><div dir="ltr" data-setdir="false">Fred.</div><div dir="ltr" data-setdir="false"><br></div><div dir="ltr" data-setdir="false"><br></div></div><div id="ydpf2d0e03dyahoo_quoted_1154183869" class="ydpf2d0e03dyahoo_quoted"><div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;"><div><div id="ydpf2d0e03dyiv4535606017"><div><div class="ydpf2d0e03dyiv4535606017ydpb5ed3697yahoo-style-wrap" style="font-family:times new roman, new york, times, serif;font-size:16px;"><div dir="ltr"><br clear="none"></div>
        
        </div><div class="ydpf2d0e03dyiv4535606017yqt5706253922" id="ydpf2d0e03dyiv4535606017yqt93247"><div class="ydpf2d0e03dyiv4535606017ydpa787e11byahoo_quoted" id="ydpf2d0e03dyiv4535606017ydpa787e11byahoo_quoted_0666179143">
            <div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;">
                
                <div>
                    On Wednesday, October 2, 2019, 11:34:39 AM PDT, Chris Apsey &lt;bitskrieg@bitskrieg.net&gt; wrote:
                </div>
                <div><br clear="none"></div>
                <div><br clear="none"></div>
                <div><div id="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617"><div>Is that still spitting out a vif plug failure, or are your instances spawning but not getting addresses?  I've found that adding the no-ping option to dnsmasq lowers load significantly, but it can be dangerous if you've got potentially conflicting sources of address allocation.  While it doesn't address the bug report below specifically, it may free up some CPU cycles for dnsmasq so it can handle other items better.<br clear="none"><br clear="none"><br clear="none">R<br clear="none"><br clear="none">CA<br clear="none"><br clear="none"><br clear="none"><br clear="none">-------- Original Message --------<br clear="none">On Oct 2, 2019, 12:41, fsbiz@yahoo.com &lt;fsbiz@yahoo.com&gt; wrote:<blockquote class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617protonmail_quote"><br clear="none"></blockquote></div><div class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617yqt4687389434" id="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617yqt43274"><div><div class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp95e21200yahoo-style-wrap" style="font-family:times new roman, new york, times, serif;font-size:16px;"><div></div>
        <div dir="ltr">Thanks. This definitely helps.</div><div dir="ltr"><br clear="none"></div><div dir="ltr">I am running a stable release of Queens.</div><div dir="ltr">Even after this change, I still see 10-15 failures when I create 100 VMs in our cluster.</div><div dir="ltr"><br clear="none"></div><div dir="ltr">I have tracked this down (to a reasonable degree of certainty) to the SIGHUPs caused by dnsmasq reloads</div><div dir="ltr">every time a MAC entry is added, deleted, or updated. </div><div dir="ltr"><br clear="none"></div><div dir="ltr">It seems to be related to</div><div dir="ltr"><span><a shape="rect" href="https://bugs.launchpad.net/neutron/+bug/1598078" rel="nofollow" target="_blank">https://bugs.launchpad.net/neutron/+bug/1598078</a></span><br clear="none"></div><div dir="ltr"><br clear="none"></div><div dir="ltr">The fix for the above bug was abandoned:</div><div dir="ltr"><span><a shape="rect" class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617enhancr_card_0667079528" href="https://review.opendev.org/#/c/336462/" rel="nofollow" target="_blank">https://review.opendev.org/#/c/336462/</a></span><div><br clear="none"></div><div dir="ltr">Is there any further fine-tuning that can be done?
</div><div dir="ltr"><br clear="none"></div><div dir="ltr">Thanks,</div><div dir="ltr">Fred.</div><div><br clear="none"></div><br clear="none"></div><div><br clear="none"></div>
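<div dir="ltr">One way to quantify the 10-15 failures per 100-VM batch is to tally instance statuses after each run. The sketch below is an illustration only and was not posted in the thread: it assumes the text printed by "openstack server list -f json" is a JSON list of objects with "Name" and "Status" keys, and the helper name tally_batch and the sample data are hypothetical.</div>
<pre>
```python
import json
from collections import Counter


def tally_batch(servers_json, name_prefix):
    """Count instance statuses for servers whose name starts with name_prefix.

    servers_json is assumed to be the text printed by
    `openstack server list -f json` (a JSON list of objects
    with "Name" and "Status" keys).
    """
    servers = json.loads(servers_json)
    return Counter(
        server.get("Status", "UNKNOWN")
        for server in servers
        if server.get("Name", "").startswith(name_prefix)
    )


# Made-up sample data for illustration only.
SAMPLE = json.dumps([
    {"Name": "alberttest-1", "Status": "ACTIVE"},
    {"Name": "alberttest-2", "Status": "ERROR"},
    {"Name": "alberttest-3", "Status": "ACTIVE"},
    {"Name": "unrelated-vm", "Status": "ACTIVE"},
])

if __name__ == "__main__":
    for status, count in sorted(tally_batch(SAMPLE, "alberttest").items()):
        print(status, count)
```
</pre>
<div dir="ltr">Running it against the real command output after each batch gives a per-run failure count that can be compared before and after a tuning change.</div>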

        </div><div class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yahoo_quoted" id="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yahoo_quoted_0516677938">
            <div style="font-family:'Helvetica Neue', Helvetica, Arial, sans-serif;font-size:13px;color:#26282a;">

                <div>
                    On Friday, September 27, 2019, 09:37:51 AM PDT, Chris Apsey &lt;bitskrieg@bitskrieg.net&gt; wrote:
                </div>
                <div><br clear="none"></div>
                <div><br clear="none"></div>
                <div><div id="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yiv2290141030"><div><div>Albert,<br clear="none"></div><div><br clear="none"></div><div>Do this: <a shape="rect" href="https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/" rel="nofollow" target="_blank">https://cloudblog.switch.ch/2017/08/28/starting-1000-instances-on-switchengines/</a></div><div><br clear="none"></div><div>The problem will go away.  I'm of the opinion that daemon mode for rootwrap should be the default, since the performance improvement is an order of magnitude, but privsep may obviate that concern once it's fully implemented.<br clear="none"></div><div><br clear="none"></div><div>Either way, that should solve your problem.<br clear="none"></div><div><br clear="none"></div><div class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yiv2290141030protonmail_signature_block"><div class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yiv2290141030protonmail_signature_block-user"><div>r<br clear="none"></div><div><br clear="none"></div><div>Chris Apsey<br clear="none"></div></div><div class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yiv2290141030protonmail_signature_block-proton ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yiv2290141030protonmail_signature_block-empty"><br clear="none"></div></div><div><br clear="none"></div><div>‐‐‐‐‐‐‐ Original Message ‐‐‐‐‐‐‐<br clear="none"></div><div class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yiv2290141030yqt1194612915" id="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yiv2290141030yqtfd67219"><div> On Friday, September 27, 2019 12:17 PM, Albert Braden &lt;Albert.Braden@synopsys.com&gt; wrote:<br clear="none"></div><div> <br clear="none"></div><blockquote class="ydpf2d0e03dyiv4535606017ydpa787e11byiv0116234617ydp3a4527a8yiv2290141030protonmail_quote" type="cite"><div><p>When I create 100 VMs in our prod cluster:<br clear="none"></p><p> <br clear="none"></p><p>openstack server create --flavor s1.tiny --network it-network --image cirros-0.4.0-x86_64 --min 100 --max 100 alberttest<br clear="none"></p><p> <br clear="none"></p><p>Most of them build successfully in about a minute. 5 or 10 stay in BUILD status for 5 minutes and then fail with “BuildAbortException: Build of instance &lt;UUID&gt; aborted: Failed to allocate the network(s), not rescheduling.”<br clear="none"></p><p> <br clear="none"></p><p>If I build smaller batches, I see fewer failures, and no failures if I build one at a time. This does not happen in dev or QA; it appears that we are exhausting a resource in prod. I tried reducing various config values in dev but am not
 able to duplicate the issue. The neutron servers don’t appear to be overloaded during the failure.<br clear="none"></p><p> <br clear="none"></p><p>What config variables should I be looking at?<br clear="none"></p><p> <br clear="none"></p><p>Here are the relevant log entries from the HV:<br clear="none"></p><p> <br clear="none"></p><p>2019-09-26 10:10:43.001 57008 INFO os_vif [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] Successfully plugged vif VIFBridge(active=False,address=fa:16:3e:8b:45:07,bridge_name='brq49cbe55d-51',has_traffic_filtering=True,id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,network=Network(49cbe55d-5188-4183-b5ad-e65f9b46f8f2),plugin='linux_bridge',port_profile=&lt;?&gt;,preserve_on_delete=False,vif_name='tap18f4e419-b1')<br clear="none"></p><p>2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver [req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8 d28c3871f61e4c8c8f8c7600417f7b14 e9621e3b105245ba8660f434ab21016c - default 4fb72165eee4468e8033cdc7d506ddf0] [instance: dc58f154-00f9-4c45-8986-94b10821cbc9]
 Timeout waiting for [('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] for instance with vm_state building and task_state spawning.: Timeout: 300 seconds<br clear="none"></p><p> <br clear="none"></p><p>More logs and data:<br clear="none"></p><p> <br clear="none"></p><p>http://paste.openstack.org/show/779524/<br clear="none"></p><p> <br clear="none"></p></div></blockquote><div><br clear="none"></div></div></div></div></div>
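<div>The two log entries quoted above can be paired by port ID to measure how long nova waited for the network-vif-plugged event. The following is a minimal sketch and was not posted in the thread: the regular expressions are assumptions based only on those two log lines, the sample lines are trimmed copies of them, and vif_plug_waits is a hypothetical helper name.</div>
<pre>
```python
import re
from datetime import datetime

# Timestamp at the start of a nova-compute log line.
TS = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")
# Port ID (the VIF id= field) in the os_vif "Successfully plugged vif" line.
PLUGGED = re.compile(r"Successfully plugged vif .*?\bid=([0-9a-f-]{36})")
# Port ID in the libvirt driver timeout warning.
TIMEOUT = re.compile(
    r"Timeout waiting for \[\('network-vif-plugged', u?'([0-9a-f-]{36})'\)\]"
)


def vif_plug_waits(lines):
    """Map port ID to seconds between the 'plugged' line and the timeout warning."""
    plugged_at = {}
    waits = {}
    for line in lines:
        ts = TS.match(line)
        if not ts:
            continue
        when = datetime.strptime(ts.group(1), "%Y-%m-%d %H:%M:%S.%f")
        plugged = PLUGGED.search(line)
        if plugged:
            plugged_at[plugged.group(1)] = when
            continue
        timed_out = TIMEOUT.search(line)
        if timed_out and timed_out.group(1) in plugged_at:
            port_id = timed_out.group(1)
            waits[port_id] = (when - plugged_at[port_id]).total_seconds()
    return waits


# Trimmed copies of the two log lines quoted in the message.
SAMPLE = [
    "2019-09-26 10:10:43.001 57008 INFO os_vif "
    "[req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8] Successfully plugged vif "
    "VIFBridge(active=False,address=fa:16:3e:8b:45:07,"
    "id=18f4e419-b19c-4b62-b6e4-152ec78e72bc,vif_name='tap18f4e419-b1')",
    "2019-09-26 10:15:44.029 57008 WARNING nova.virt.libvirt.driver "
    "[req-dea54d9a-3f3e-4d47-b901-a4f41b1947a8] "
    "[instance: dc58f154-00f9-4c45-8986-94b10821cbc9] Timeout waiting for "
    "[('network-vif-plugged', u'18f4e419-b19c-4b62-b6e4-152ec78e72bc')] "
    "for instance with vm_state building and task_state spawning.: "
    "Timeout: 300 seconds",
]

if __name__ == "__main__":
    for port_id, seconds in vif_plug_waits(SAMPLE).items():
        print(port_id, seconds)
```
</pre>
<div>For the sample lines this reports a wait of about 301 seconds, in line with the 300-second timeout in the warning. Run over a full nova-compute log, it would list every port whose plug event never arrived.</div>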
            </div>
        </div></div></div></div></div>
            </div>
        </div></div></div></div></div>
            </div>
        </div></body></html>