<div dir="ltr">><span style="font-size:12.8000001907349px">Just to be sure, I assume we're focussing here on the issue that Daniel reported</span><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">Yes.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">></span><span style="font-size:12.8000001907349px">To be clear, though, what code are you trying to reproduce on?  Current master?</span></div><span class="im" style="font-size:12.8000001907349px"><br></span><div>I was trying on 2014.1.3, which is the version I understand to be on Fuel 5.1.1.</div><div><br></div><div>><span style="font-size:12.8000001907349px">I'm not clear whether that would qualify as 'concurrent', in the sense that you have in mind.</span></div><br style="font-size:12.8000001907349px"><div>It doesn't look like it based on the pseudocode. I was thinking of a condition where a port is deleted very quickly after it was created. Is that possible with your test? If not, then my theory about out-of-order notifications might not be any good.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 9, 2015 at 3:34 AM, Neil Jerram <span dir="ltr"><<a href="mailto:Neil.Jerram@metaswitch.com" target="_blank">Neil.Jerram@metaswitch.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 09/06/15 01:15, Kevin Benton wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I'm having difficulty reproducing the issue. The fix for the bug that Neil<br>
referenced (<a href="https://bugs.launchpad.net/neutron/+bug/1192381" target="_blank">https://bugs.launchpad.net/neutron/+bug/1192381</a>) looks like<br>
it was in Icehouse well before the 2014.1.3 release that Fuel 5.1.1<br>
appears to be using.<br>
</blockquote>
<br></span>
Just to be sure, I assume we're focussing here on the issue that Daniel reported (IP appears twice in Dnsmasq config), and for which I described a possible corollary (Dnsmasq config size keeps growing), and NOT on the "Another DHCP agent problem" that I mentioned below. :-)<br>
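<br>
One way to check for the 'IP appears twice' symptom is to scan a network's addn_hosts file for addresses listed on more than one line. The following is a minimal sketch, assuming the file uses the usual hosts-file layout (address first, then names); the function name duplicate_ips and the sample entries are made up for illustration and are not taken from Daniel's system.<br>

```python
from collections import defaultdict


def duplicate_ips(hosts_text):
    """Return {ip: [first name on each line]} for IPs on more than one line."""
    names_by_ip = defaultdict(list)
    for line in hosts_text.splitlines():
        # Drop trailing comments, then split on any whitespace (tabs or spaces).
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2:
            names_by_ip[fields[0]].append(fields[1])
    return {ip: names for ip, names in names_by_ip.items() if len(names) > 1}


# Hypothetical sample contents; a real check would read the addn_hosts file.
sample = """\
192.168.110.5 host-192-168-110-5.openstacklocal host-192-168-110-5
192.168.110.6 host-192-168-110-6.openstacklocal host-192-168-110-6
192.168.110.5 host-192-168-110-5.openstacklocal host-192-168-110-5
"""
print(duplicate_ips(sample))
```

Running this periodically during a churn test would show when the first duplicate appears, which may help correlate it with agent log entries.<br>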
<br>
BTW, now that I've reviewed the history of when my team saw this, I can say that it was actually first reported to us with the 'IP appears twice in Dnsmasq config' symptom - i.e. exactly the same as Daniel's case. The fact of the Dnsmasq config increasing in size was noticed later.<span class=""><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I tried setting the agent report interval to something higher than the<br>
downtime to make it seem like the agent is failing sporadically to the<br>
server, but it's not impacting the notifications.<br>
</blockquote>
<br></span>
Makes sense - that's the effect of the fix for 1192381.<br>
<br>
To be clear, though, what code are you trying to reproduce on?  Current master?<span class=""><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Neil, does your testing where you saw something similar have a lot of<br>
concurrent creation/deletion?<br>
</blockquote>
<br></span>
It was a test of continuously deleting and creating VMs, with this pseudocode:<br>
<br>
thread_pool = new_thread_pool(size=30)<br>
for x in range(0,30):<br>
    thread_pool.submit(create_vm)<br>
thread_pool.wait_for_all_threads_to_complete()<br>
while True:<br>
    time.sleep(5)<br>
    for x in range(0,int(random.random()*5)):<br>
        thread_pool.submit(randomly_delete_a_vm_and_create_a_new_one)<br>
<br>
I'm not clear whether that would qualify as 'concurrent', in the sense that you have in mind.<br>
<br>
Regards,<br>
        Neil<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
On Mon, Jun 8, 2015 at 12:21 PM, Andrew Woodward <<a href="mailto:awoodward@mirantis.com" target="_blank">awoodward@mirantis.com</a><br></span><span class="">
<mailto:<a href="mailto:awoodward@mirantis.com" target="_blank">awoodward@mirantis.com</a>>> wrote:<br>
<br>
    Daniel,<br>
<br>
    This sounds familiar; see if it matches [1]. IIRC, there was<br>
    another issue like this that might already be addressed by the<br>
    updates in the Fuel 5.1.2 packages repo [2]. You can either update the<br>
    neutron packages from [2] or try one of the community builds for 5.1.2<br>
    [3]. If this doesn't resolve the issue, open a bug against MOS dev [4].<br>
<br>
    [1] <a href="https://bugs.launchpad.net/bugs/1295715" target="_blank">https://bugs.launchpad.net/bugs/1295715</a><br>
    [2] <a href="http://fuel-repository.mirantis.com/fwm/5.1.2/ubuntu/pool/main/" target="_blank">http://fuel-repository.mirantis.com/fwm/5.1.2/ubuntu/pool/main/</a><br>
    [3] <a href="https://ci.fuel-infra.org/" target="_blank">https://ci.fuel-infra.org/</a><br>
    [4] <a href="https://bugs.launchpad.net/mos/+filebug" target="_blank">https://bugs.launchpad.net/mos/+filebug</a><br>
<br>
    On Mon, Jun 8, 2015 at 10:15 AM Neil Jerram<br></span><div><div class="h5">
    <<a href="mailto:Neil.Jerram@metaswitch.com" target="_blank">Neil.Jerram@metaswitch.com</a> <mailto:<a href="mailto:Neil.Jerram@metaswitch.com" target="_blank">Neil.Jerram@metaswitch.com</a>>> wrote:<br>
<br>
        Two further thoughts on this:<br>
<br>
        1. Another DHCP agent problem that my team noticed is that<br>
        call_driver('reload_allocations') takes a bit of time (to<br>
        regenerate the<br>
        Dnsmasq config files, and to spawn a shell that sends a HUP<br>
        signal) -<br>
        enough so that if there is a fast steady rate of port-create and<br>
        port-delete notifications coming from the Neutron server, these can<br>
        build up in DHCPAgent's RPC queue, and then they still only get<br>
        dispatched one at a time.  So the queue and the time delay<br>
        become longer<br>
        and longer.<br>
<br>
        I have a fix pending for this, which uses an extra thread to<br>
        read those<br>
        notifications off the RPC queue onto an internal queue, and then<br>
        batches<br>
        the call_driver('reload_allocations') processing when there is a<br>
        contiguous sequence of such notifications - i.e. only does the<br>
        config<br>
        regeneration and HUP once, instead of lots of times.<br>
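<br>
        A minimal sketch of that batching idea, in Python. It is not the pending fix itself: drain, process_pending, the (event, network_id) tuple shape and the reload_allocations callback are all hypothetical stand-ins, and it assumes every queued item is a port-create or port-delete notification.<br>

```python
import queue


def drain(pending):
    """Remove and return everything currently waiting in the queue."""
    items = []
    while True:
        try:
            items.append(pending.get_nowait())
        except queue.Empty:
            return items


def process_pending(pending, reload_allocations):
    """Reload each affected network once, however many events queued up.

    `pending` holds (event, network_id) tuples. `reload_allocations` is a
    stand-in for the expensive step (config regeneration plus HUP).
    """
    networks = []
    for _event, network_id in drain(pending):
        if network_id not in networks:
            networks.append(network_id)
    for network_id in networks:
        reload_allocations(network_id)
    return networks


# Three queued notifications touch two networks, so only two reloads happen.
pending = queue.Queue()
pending.put(("port-create", "net-a"))
pending.put(("port-delete", "net-a"))
pending.put(("port-create", "net-b"))
reloaded = []
process_pending(pending, reloaded.append)
print(reloaded)
```

        In a real agent a separate reader thread would feed the internal queue, as described above; the sketch leaves that out so the coalescing step can be read on its own.<br>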
<br>
        I don't think this is directly related to what you are seeing - but<br>
        perhaps there actually is some link that I am missing.<br>
<br>
        2. There is an interesting and vaguely similar thread currently<br>
        being<br>
        discussed about the L3 agent (subject "L3 agent rescheduling<br>
        issue") -<br>
        about possible RPC/threading issues between the agent and the<br>
        Neutron<br>
        server.  You might like to review that thread and see if it<br>
        describes<br>
        any problems analogous to your DHCP one.<br>
<br>
        Regards,<br>
                 Neil<br>
<br>
<br>
        On 08/06/15 17:53, Neil Jerram wrote:<br>
         > My team has seen a problem that could be related: in a churn<br>
        test where<br>
         > VMs are created and terminated at a constant rate - but so<br>
        that the<br>
         > number of active VMs should remain roughly constant - the<br>
        size of the<br>
         > host and addn_hosts files keeps increasing.<br>
         ><br>
         > In other words, it appears that the config for VMs that have<br>
        actually<br>
         > been terminated is not being removed from the config file.<br>
        Clearly, if<br>
         > you have a limited pool of IP addresses, this can eventually<br>
        lead to the<br>
         > problem that you have described.<br>
         ><br>
         > For your case - i.e. with Icehouse - the problem might be<br>
         > <a href="https://bugs.launchpad.net/neutron/+bug/1192381" target="_blank">https://bugs.launchpad.net/neutron/+bug/1192381</a>.  I'm not<br>
        sure if the<br>
         > fix for that problem - i.e. sending port-create and port-delete<br>
         > notifications to DHCP agents even when the server thinks they<br>
        are down -<br>
         > was merged before the Icehouse release, or not.<br>
         ><br>
         > But there must be at least one other cause as well, because<br>
        my team was<br>
         > seeing this with Juno-level code.<br>
         ><br>
         > Therefore I, too, would be interested in any other insights<br>
        about this<br>
         > problem.<br>
         ><br>
         > Regards,<br>
         >      Neil<br>
         ><br>
         ><br>
         ><br>
         > On 08/06/15 16:26, Daniel Comnea wrote:<br>
         >> Any help, ideas please?<br>
         >><br>
         >> Thx,<br>
         >> Dani<br>
         >><br>
         >> On Mon, Jun 8, 2015 at 9:25 AM, Daniel Comnea<br>
        <<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a> <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a>><br></div></div>
         >> <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a><span class=""><br>
        <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a>>>> wrote:<br>
         >><br>
         >>     + Operators<br>
         >><br>
         >>     Much thanks in advance,<br>
         >>     Dani<br>
         >><br>
         >><br>
         >><br>
         >><br>
         >>     On Sun, Jun 7, 2015 at 6:31 PM, Daniel Comnea<br>
        <<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a> <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a>><br></span>
         >>     <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a><div><div class="h5"><br>
        <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a>>>> wrote:<br>
         >><br>
         >>         Hi all,<br>
         >><br>
 >>         I'm running Icehouse (built using Fuel 5.1.1) on Ubuntu, where the<br>
 >>         dnsmasq version is 2.59-4.<br>
 >>         I have a very basic network layout where I have a private net<br>
 >>         which has 2 subnets<br>
         >><br>
 >>           2fb7de9d-d6df-481f-acca-2f7860cffa60 | private-net | e79c3477-d3e5-471c-a728-8d881cf31bee 192.168.110.0/24 |<br>
 >>                                                |             | f48c3223-8507-455c-9c13-8b727ea5f441 192.168.111.0/24 |<br>
         >><br>
 >>         and I'm creating VMs via Heat.<br>
 >>         What is happening is that sometimes I get duplicated entries in<br>
 >>         [1], and because of that the VM which was spun up doesn't get<br>
 >>         an IP.<br>
 >>         The Dnsmasq processes are running okay [2] and I can't see<br>
 >>         anything special or wrong in them.<br>
 >><br>
 >>         Any idea why this is happening? Or are you aware of any bugs<br>
 >>         around this area? Do you see any problems with having 2 subnets<br>
 >>         mapped to 1 private-net?<br>
         >><br>
         >><br>
         >><br>
         >>         Thanks,<br>
         >>         Dani<br>
         >><br>
         >>         [1]<br>
         >><br>
         >><br>
        /var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/addn_hosts<br>
         >><br>
         >>         [2]<br>
         >><br>
         >>         nobody    5664     1  0 Jun02 ?        00:00:08 dnsmasq<br>
         >>         --no-hosts --no-resolv --strict-order --bind-interfaces<br>
         >>         --interface=tapc9164734-0c --except-interface=lo<br>
         >><br>
         >><br>
        --pid-file=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/pid<br>
         >><br>
         >><br>
        --dhcp-hostsfile=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/host<br>
         >><br>
         >><br>
         >><br>
        --addn-hosts=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/addn_hosts<br>
         >><br>
         >><br>
         >><br>
        --dhcp-optsfile=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/opts<br>
         >><br>
         >>         --leasefile-ro --dhcp-authoritative<br>
         >>         --dhcp-range=set:tag0,192.168.110.0,static,86400s<br>
         >>         --dhcp-range=set:tag1,192.168.111.0,static,86400s<br>
         >>         --dhcp-lease-max=512 --conf-file= --server=10.0.0.31<br>
         >>         --server=10.0.0.32 --domain=openstacklocal<br>
         >><br>
         >><br>
         >><br>
         >><br>
         >><br>
         >> _______________________________________________<br>
         >> OpenStack-operators mailing list<br>
         >> <a href="mailto:OpenStack-operators@lists.openstack.org" target="_blank">OpenStack-operators@lists.openstack.org</a><br></div></div>
        <mailto:<a href="mailto:OpenStack-operators@lists.openstack.org" target="_blank">OpenStack-operators@lists.openstack.org</a>><span class=""><br>
         >><br>
        <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><br>
         >><br>
         ><br>
</span><span class=""><br>
<br>
        __________________________________________________________________________<br>
        OpenStack Development Mailing List (not for usage questions)<br>
        Unsubscribe:<br>
        <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br></span>
        <<a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a>><span class=""><br>
        <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br>
    --<br>
    --<br>
    Andrew Woodward<br>
    Mirantis<br>
    Fuel Community Ambassador<br>
    Ceph Community<br>
<br>
</span><span class=""><br>
<br>
<br>
<br>
<br>
--<br>
Kevin Benton<br>
<br>
<br></span><span class="">
</span></blockquote>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div>Kevin Benton</div></div>
</div>