<div dir="ltr">><span style="font-size:12.8000001907349px">Just to be sure, I assume we're focussing here on the issue that Daniel reported</span><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">Yes.</span></div><div><span style="font-size:12.8000001907349px"><br></span></div><div><span style="font-size:12.8000001907349px">></span><span style="font-size:12.8000001907349px">To be clear, though, what code are you trying to reproduce on? Current master?</span></div><span class="im" style="font-size:12.8000001907349px"><br></span><div>I was trying on 2014.1.3, which is the version I understand to be on Fuel 5.1.1.</div><div><br></div><div>><span style="font-size:12.8000001907349px">I'm not clear whether that would qualify as 'concurrent', in the sense that you have in mind.</span></div><br style="font-size:12.8000001907349px"><div>It doesn't look like it based on the pseudocode. I was thinking of a condition where a port is deleted very quickly after it was created. Is that possible with your test? If not, then my theory about out-of-order notifications might not be any good.</div></div><div class="gmail_extra"><br><div class="gmail_quote">On Tue, Jun 9, 2015 at 3:34 AM, Neil Jerram <span dir="ltr"><<a href="mailto:Neil.Jerram@metaswitch.com" target="_blank">Neil.Jerram@metaswitch.com</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">On 09/06/15 01:15, Kevin Benton wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I'm having difficulty reproducing the issue. The bug that Neil<br>
referenced (<a href="https://bugs.launchpad.net/neutron/+bug/1192381" target="_blank">https://bugs.launchpad.net/neutron/+bug/1192381</a>) looks like<br>
it was in Icehouse well before the 2014.1.3 release that looks like Fuel<br>
5.1.1 is using.<br>
</blockquote>
<br></span>
Just to be sure, I assume we're focussing here on the issue that Daniel reported (IP appears twice in Dnsmasq config), and for which I described a possible corollary (Dnsmasq config size keeps growing), and NOT on the "Another DHCP agent problem" that I mentioned below. :-)<br>
<br>
BTW, now that I've reviewed the history of when my team saw this, I can say that it was actually first reported to us with the 'IP appears twice in Dnsmasq config' symptom - i.e. exactly the same as Daniel's case. The fact of the Dnsmasq config increasing in size was noticed later.<span class=""><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
I tried setting the agent report interval to something higher than the<br>
downtime to make it seem like the agent is failing sporadically to the<br>
server, but it's not impacting the notifications.<br>
</blockquote>
<br></span>
Makes sense - that's the effect of the fix for 1192381.<br>
<br>
To be clear, though, what code are you trying to reproduce on? Current master?<span class=""><br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Neil, does your testing where you saw something similar have a lot of<br>
concurrent creation/deletion?<br>
</blockquote>
<br></span>
It was a test of continuously deleting and creating VMs, with this pseudocode:<br>
<br>
thread_pool = new_thread_pool(size=30)<br>
for x in range(0,30):<br>
&nbsp;&nbsp;&nbsp;&nbsp;thread_pool.submit(create_vm)<br>
thread_pool.wait_for_all_threads_to_complete()<br>
while True:<br>
&nbsp;&nbsp;&nbsp;&nbsp;time.sleep(5)<br>
&nbsp;&nbsp;&nbsp;&nbsp;for x in range(0,int(random.random()*5)):<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;thread_pool.submit(randomly_delete_a_vm_and_create_a_new_one)<br>
<br>
I'm not clear whether that would qualify as 'concurrent', in the sense that you have in mind.<br>
<br>
Regards,<br>
Neil<br>
<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><span class="">
On Mon, Jun 8, 2015 at 12:21 PM, Andrew Woodward <<a href="mailto:awoodward@mirantis.com" target="_blank">awoodward@mirantis.com</a><br></span><span class="">
<mailto:<a href="mailto:awoodward@mirantis.com" target="_blank">awoodward@mirantis.com</a>>> wrote:<br>
<br>
Daniel,<br>
<br>
This sounds familiar; see if this matches [1]. IIRC, there was<br>
another issue like this that might already be addressed by the<br>
updates in the Fuel 5.1.2 packages repo [2]. You can either update the<br>
neutron packages from [2] or try one of the community builds for 5.1.2<br>
[3]. If this doesn't resolve the issue, open a bug against MOS dev [4].<br>
<br>
[1] <a href="https://bugs.launchpad.net/bugs/1295715" target="_blank">https://bugs.launchpad.net/bugs/1295715</a><br>
[2] <a href="http://fuel-repository.mirantis.com/fwm/5.1.2/ubuntu/pool/main/" target="_blank">http://fuel-repository.mirantis.com/fwm/5.1.2/ubuntu/pool/main/</a><br>
[3] <a href="https://ci.fuel-infra.org/" target="_blank">https://ci.fuel-infra.org/</a><br>
[4] <a href="https://bugs.launchpad.net/mos/+filebug" target="_blank">https://bugs.launchpad.net/mos/+filebug</a><br>
<br>
On Mon, Jun 8, 2015 at 10:15 AM Neil Jerram<br></span><div><div class="h5">
<<a href="mailto:Neil.Jerram@metaswitch.com" target="_blank">Neil.Jerram@metaswitch.com</a> <mailto:<a href="mailto:Neil.Jerram@metaswitch.com" target="_blank">Neil.Jerram@metaswitch.com</a>>> wrote:<br>
<br>
Two further thoughts on this:<br>
<br>
1. Another DHCP agent problem that my team noticed is that<br>
call_driver('reload_allocations') takes a bit of time (to regenerate the<br>
Dnsmasq config files, and to spawn a shell that sends a HUP signal) -<br>
enough so that if there is a fast steady rate of port-create and<br>
port-delete notifications coming from the Neutron server, these can<br>
build up in DHCPAgent's RPC queue, and then they still only get<br>
dispatched one at a time. So the queue and the time delay become longer<br>
and longer.<br>
<br>
I have a fix pending for this, which uses an extra thread to read those<br>
notifications off the RPC queue onto an internal queue, and then batches<br>
the call_driver('reload_allocations') processing when there is a<br>
contiguous sequence of such notifications - i.e. only does the config<br>
regeneration and HUP once, instead of lots of times.<br>
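<br>
The batching described above could look roughly like the following. This is only a minimal sketch of the idea, not the pending fix itself: the names forward, drain_batch and process_batch, the (action, network_id) tuple shape, and the None stop sentinel are all assumptions made for illustration.<br>
<pre>
```python
import queue
import threading

RELOAD = "reload_allocations"


def forward(rpc_queue, internal_queue):
    """Reader thread body: move notifications onto the internal queue."""
    while True:
        item = rpc_queue.get()
        if item is None:  # sentinel used only by this sketch to stop
            return
        internal_queue.put(item)


def drain_batch(internal_queue):
    """Block for one notification, then take any others already queued."""
    batch = [internal_queue.get()]
    while True:
        try:
            batch.append(internal_queue.get_nowait())
        except queue.Empty:
            return batch


def process_batch(batch, call_driver):
    """Call the driver once per contiguous run of identical reloads."""
    calls = 0
    previous = None
    for action, network_id in batch:
        key = (action, network_id)
        if action == RELOAD and key == previous:
            continue  # same network, same action: one reload covers it
        call_driver(action, network_id)
        calls += 1
        previous = key
    return calls


if __name__ == "__main__":
    rpc_queue, internal_queue = queue.Queue(), queue.Queue()
    reader = threading.Thread(target=forward, args=(rpc_queue, internal_queue))
    reader.start()
    for net in ["net-a", "net-a", "net-a", "net-b"]:
        rpc_queue.put((RELOAD, net))
    rpc_queue.put(None)
    reader.join()
    calls = process_batch(
        drain_batch(internal_queue),
        lambda action, net: print(action, net),
    )
    print("driver calls:", calls)
```
</pre>
Here three back-to-back notifications for the same network collapse into a single driver call, while a notification for a different network still gets its own call.<br>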
<br>
I don't think this is directly related to what you are seeing - but<br>
perhaps there actually is some link that I am missing.<br>
<br>
2. There is an interesting and vaguely similar thread currently being<br>
discussed about the L3 agent (subject "L3 agent rescheduling issue") -<br>
about possible RPC/threading issues between the agent and the Neutron<br>
server. You might like to review that thread and see if it describes<br>
any problems analogous to your DHCP one.<br>
<br>
Regards,<br>
Neil<br>
<br>
<br>
On 08/06/15 17:53, Neil Jerram wrote:<br>
> My team has seen a problem that could be related: in a churn test where<br>
> VMs are created and terminated at a constant rate - but so that the<br>
> number of active VMs should remain roughly constant - the size of the<br>
> host and addn_hosts files keeps increasing.<br>
><br>
> In other words, it appears that the config for VMs that have actually<br>
> been terminated is not being removed from the config file. Clearly, if<br>
> you have a limited pool of IP addresses, this can eventually lead to the<br>
> problem that you have described.<br>
><br>
> For your case - i.e. with Icehouse - the problem might be<br>
> <a href="https://bugs.launchpad.net/neutron/+bug/1192381" target="_blank">https://bugs.launchpad.net/neutron/+bug/1192381</a>. I'm not sure if the<br>
> fix for that problem - i.e. sending port-create and port-delete<br>
> notifications to DHCP agents even when the server thinks they are down -<br>
> was merged before the Icehouse release, or not.<br>
><br>
> But there must be at least one other cause as well, because my team was<br>
> seeing this with Juno-level code.<br>
><br>
> Therefore I, too, would be interested in any other insights about this<br>
> problem.<br>
><br>
> Regards,<br>
> Neil<br>
><br>
><br>
><br>
> On 08/06/15 16:26, Daniel Comnea wrote:<br>
>> Any help, ideas please?<br>
>><br>
>> Thx,<br>
>> Dani<br>
>><br>
>> On Mon, Jun 8, 2015 at 9:25 AM, Daniel Comnea<br>
<<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a> <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a>><br></div></div>
>> <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a><span class=""><br>
<mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a>>>> wrote:<br>
>><br>
>> + Operators<br>
>><br>
>> Much thanks in advance,<br>
>> Dani<br>
>><br>
>><br>
>><br>
>><br>
>> On Sun, Jun 7, 2015 at 6:31 PM, Daniel Comnea<br>
<<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a> <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a>><br></span>
>> <mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a><div><div class="h5"><br>
<mailto:<a href="mailto:comnea.dani@gmail.com" target="_blank">comnea.dani@gmail.com</a>>>> wrote:<br>
>><br>
>> Hi all,<br>
>><br>
>> I'm running Icehouse (built using Fuel 5.1.1) on Ubuntu, with<br>
>> dnsmasq version 2.59-4.<br>
>> I have a very basic network layout where I have a private net<br>
>> which has 2 subnets<br>
>><br>
>> 2fb7de9d-d6df-481f-acca-2f7860cffa60 | private-net | e79c3477-d3e5-471c-a728-8d881cf31bee 192.168.110.0/24 |<br>
>> | | f48c3223-8507-455c-9c13-8b727ea5f441 192.168.111.0/24 |<br>
>><br>
>> and I'm creating VMs via Heat.<br>
>> What is happening is that sometimes I get duplicated entries in<br>
>> [1], and because of that the VM which was spun up doesn't get<br>
>> an IP.<br>
>> The Dnsmasq processes are running okay [2] and I can't see<br>
>> anything special/wrong in them.<br>
>><br>
>> Any idea why this is happening? Or are you aware of any bugs<br>
>> around this area? Do you see any problems with having 2 subnets<br>
>> mapped to 1 private-net?<br>
>><br>
>><br>
>><br>
>> Thanks,<br>
>> Dani<br>
>><br>
>> [1] /var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/addn_hosts<br>
>><br>
>> [2]<br>
>><br>
>> nobody 5664 1 0 Jun02 ? 00:00:08 dnsmasq<br>
>> --no-hosts --no-resolv --strict-order --bind-interfaces<br>
>> --interface=tapc9164734-0c --except-interface=lo<br>
>> --pid-file=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/pid<br>
>> --dhcp-hostsfile=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/host<br>
>> --addn-hosts=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/addn_hosts<br>
>> --dhcp-optsfile=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/opts<br>
>> --leasefile-ro --dhcp-authoritative<br>
>> --dhcp-range=set:tag0,192.168.110.0,static,86400s<br>
>> --dhcp-range=set:tag1,192.168.111.0,static,86400s<br>
>> --dhcp-lease-max=512 --conf-file= --server=10.0.0.31<br>
>> --server=10.0.0.32 --domain=openstacklocal<br>
>><br>
>><br>
>><br>
>><br>
>><br>
>> _______________________________________________<br>
>> OpenStack-operators mailing list<br>
>> <a href="mailto:OpenStack-operators@lists.openstack.org" target="_blank">OpenStack-operators@lists.openstack.org</a><br></div></div>
<mailto:<a href="mailto:OpenStack-operators@lists.openstack.org" target="_blank">OpenStack-operators@lists.openstack.org</a>><span class=""><br>
>><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators</a><br>
>><br>
><br>
</span><span class="">
<br>
__________________________________________________________________________<br>
OpenStack Development Mailing List (not for usage questions)<br>
Unsubscribe:<br>
<a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br></span>
<<a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" target="_blank">http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a>><span class=""><br>
<a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>
<br>
--<br>
--<br>
Andrew Woodward<br>
Mirantis<br>
Fuel Community Ambassador<br>
Ceph Community<br>
<br>
</span><span class="">
<br>
<br>
<br>
<br>
--<br>
Kevin Benton<br>
<br>
<br></span><span class="">
<br>
</span></blockquote>
</blockquote></div><br><br clear="all"><div><br></div>-- <br><div class="gmail_signature"><div>Kevin Benton</div></div>
</div>