[Openstack-operators] [openstack-dev] [openstack-operators][neutron[dhcp][dnsmask]: duplicate entries in addn_hosts causing no IP allocation
Kevin Benton
blak111 at gmail.com
Tue Jun 9 11:36:45 UTC 2015
>Just to be sure, I assume we're focussing here on the issue that Daniel
reported
Yes.
>To be clear, though, what code are you trying to reproduce on? Current
master?
I was trying on 2014.1.3, which is the version I understand to be on Fuel
5.1.1.
>I'm not clear whether that would qualify as 'concurrent', in the sense
that you have in mind.
It doesn't look like it, based on the pseudocode. I was thinking of a
condition where a port is deleted very quickly after it is created.
Is that possible with your test? If not, then my theory about out-of-order
notifications might not be any good.
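
To make the theory concrete, here is a toy model of what I have in mind. It is
not Neutron code: the cache, the event tuples, the port names and the IP
address are all made up for illustration. It only shows that if a port-delete
is handled before the matching port-create, the stale port stays in the cache,
and once its IP is reused by a new port the generated hosts data would list
that IP twice.

```python
def apply_notifications(events):
    """Apply (action, port_id, ip) events in arrival order to a toy port cache."""
    cache = {}  # port_id -> ip; hypothetical stand-in for the agent's port cache
    for action, port_id, ip in events:
        if action == "create":
            cache[port_id] = ip
        elif action == "delete":
            cache.pop(port_id, None)  # deleting an unknown port is a no-op
    return cache


def duplicate_ips(cache):
    """Return the IPs that would appear more than once in a generated hosts file."""
    seen, dupes = set(), set()
    for ip in cache.values():
        if ip in seen:
            dupes.add(ip)
        seen.add(ip)
    return sorted(dupes)


in_order = [
    ("create", "port-a", "192.168.110.5"),
    ("delete", "port-a", "192.168.110.5"),
    ("create", "port-b", "192.168.110.5"),  # the IP is reused by a new port
]
# Same events, but the delete for port-a arrives before its create.
out_of_order = [in_order[1], in_order[0], in_order[2]]

print(duplicate_ips(apply_notifications(in_order)))      # no duplicates
print(duplicate_ips(apply_notifications(out_of_order)))  # the reused IP shows up twice
```

If the test never deletes a port right after creating it, this ordering cannot
occur and the model does not apply.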
On Tue, Jun 9, 2015 at 3:34 AM, Neil Jerram <Neil.Jerram at metaswitch.com>
wrote:
> On 09/06/15 01:15, Kevin Benton wrote:
>
>> I'm having difficulty reproducing the issue. The fix for the bug that Neil
>> referenced (https://bugs.launchpad.net/neutron/+bug/1192381) looks like
>> it was in Icehouse well before the 2014.1.3 release that Fuel 5.1.1
>> appears to be using.
>>
>
> Just to be sure, I assume we're focussing here on the issue that Daniel
> reported (IP appears twice in Dnsmasq config), and for which I described a
> possible corollary (Dnsmasq config size keeps growing), and NOT on the
> "Another DHCP agent problem" that I mentioned below. :-)
>
> BTW, now that I've reviewed the history of when my team saw this, I can
> say that it was actually first reported to us with the 'IP appears twice in
> Dnsmasq config' symptom - i.e. exactly the same as Daniel's case. The fact
> of the Dnsmasq config increasing in size was noticed later.
>
>> I tried setting the agent report interval to something higher than the
>> downtime to make it seem like the agent is failing sporadically to the
>> server, but it's not impacting the notifications.
>>
>
> Makes sense - that's the effect of the fix for 1192381.
>
> To be clear, though, what code are you trying to reproduce on? Current
> master?
>
>> Neil, does your testing where you saw something similar have a lot of
>> concurrent creation/deletion?
>>
>
> It was a test of continuously deleting and creating VMs, with this
> pseudocode:
>
> thread_pool = new_thread_pool(size=30)
> for x in range(0,30):
>     thread_pool.submit(create_vm)
> thread_pool.wait_for_all_threads_to_complete()
> while True:
>     time.sleep(5)
>     for x in range(0,int(random.random()*5)):
>         thread_pool.submit(randomly_delete_a_vm_and_create_a_new_one)
>
> I'm not clear whether that would qualify as 'concurrent', in the sense
> that you have in mind.
>
> Regards,
> Neil
>
>> On Mon, Jun 8, 2015 at 12:21 PM, Andrew Woodward <awoodward at mirantis.com> wrote:
>>
>> Daniel,
>>
>> This sounds familiar; see if this matches [1]. IIRC, there was
>> another issue like this that might already be addressed by the
>> updates in the Fuel 5.1.2 packages repo [2]. You can either update the
>> neutron packages from [2] or try one of the community builds for 5.1.2
>> [3]. If this doesn't resolve the issue, open a bug against MOS dev
>> [4].
>>
>> [1] https://bugs.launchpad.net/bugs/1295715
>> [2] http://fuel-repository.mirantis.com/fwm/5.1.2/ubuntu/pool/main/
>> [3] https://ci.fuel-infra.org/
>> [4] https://bugs.launchpad.net/mos/+filebug
>>
>> On Mon, Jun 8, 2015 at 10:15 AM Neil Jerram <Neil.Jerram at metaswitch.com> wrote:
>>
>> Two further thoughts on this:
>>
>> 1. Another DHCP agent problem that my team noticed is that
>> call_driver('reload_allocations') takes a bit of time (to regenerate the
>> Dnsmasq config files, and to spawn a shell that sends a HUP signal) -
>> enough so that if there is a fast steady rate of port-create and
>> port-delete notifications coming from the Neutron server, these can
>> build up in DHCPAgent's RPC queue, and then they still only get
>> dispatched one at a time. So the queue and the time delay become longer
>> and longer.
>>
>> I have a fix pending for this, which uses an extra thread to read those
>> notifications off the RPC queue onto an internal queue, and then batches
>> the call_driver('reload_allocations') processing when there is a
>> contiguous sequence of such notifications - i.e. only does the config
>> regeneration and HUP once, instead of lots of times.
>>
>> I don't think this is directly related to what you are seeing - but
>> perhaps there actually is some link that I am missing.
>>
>> 2. There is an interesting and vaguely similar thread currently being
>> discussed about the L3 agent (subject "L3 agent rescheduling issue") -
>> about possible RPC/threading issues between the agent and the Neutron
>> server. You might like to review that thread and see if it describes
>> any problems analogous to your DHCP one.
>>
>> Regards,
>> Neil
>>
>>
>> On 08/06/15 17:53, Neil Jerram wrote:
>> > My team has seen a problem that could be related: in a churn test where
>> > VMs are created and terminated at a constant rate - but so that the
>> > number of active VMs should remain roughly constant - the size of the
>> > host and addn_hosts files keeps increasing.
>> >
>> > In other words, it appears that the config for VMs that have actually
>> > been terminated is not being removed from the config file. Clearly, if
>> > you have a limited pool of IP addresses, this can eventually lead to the
>> > problem that you have described.
>> >
>> > For your case - i.e. with Icehouse - the problem might be
>> > https://bugs.launchpad.net/neutron/+bug/1192381. I'm not sure if the
>> > fix for that problem - i.e. sending port-create and port-delete
>> > notifications to DHCP agents even when the server thinks they are down -
>> > was merged before the Icehouse release, or not.
>> >
>> > But there must be at least one other cause as well, because my team was
>> > seeing this with Juno-level code.
>> >
>> > Therefore I, too, would be interested in any other insights about this
>> > problem.
>> >
>> > Regards,
>> > Neil
>> >
>> >
>> >
>> > On 08/06/15 16:26, Daniel Comnea wrote:
>> >> Any help, ideas please?
>> >>
>> >> Thx,
>> >> Dani
>> >>
>> >> On Mon, Jun 8, 2015 at 9:25 AM, Daniel Comnea <comnea.dani at gmail.com> wrote:
>> >>
>> >> + Operators
>> >>
>> >> Much thanks in advance,
>> >> Dani
>> >>
>> >>
>> >>
>> >>
>> >> On Sun, Jun 7, 2015 at 6:31 PM, Daniel Comnea <comnea.dani at gmail.com> wrote:
>> >>
>> >> Hi all,
>> >>
>> >> I'm running Icehouse (built using Fuel 5.1.1) on Ubuntu, where the
>> >> dnsmasq version is 2.59-4.
>> >> I have a very basic network layout where I have a private net
>> >> which has 2 subnets:
>> >> 2fb7de9d-d6df-481f-acca-2f7860cffa60 | private-net | e79c3477-d3e5-471c-a728-8d881cf31bee 192.168.110.0/24 |
>> >>                                      |             | f48c3223-8507-455c-9c13-8b727ea5f441 192.168.111.0/24 |
>> >>
>> >> and I'm creating VMs via Heat.
>> >> What is happening is that sometimes I get duplicated entries in
>> >> [1], and because of that the VM which was spun up doesn't get
>> >> an IP.
>> >> The dnsmasq processes are running okay [2] and I can't see
>> >> anything special or wrong in them.
>> >>
>> >> Any idea why this is happening? Or are you aware of any bugs
>> >> around this area? Do you see a problem with having 2 subnets
>> >> mapped to 1 private-net?
>> >>
>> >>
>> >>
>> >> Thanks,
>> >> Dani
>> >>
>> >> [1] /var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/addn_hosts
>> >>
>> >> [2]
>> >> nobody 5664 1 0 Jun02 ? 00:00:08 dnsmasq
>> >>   --no-hosts --no-resolv --strict-order --bind-interfaces
>> >>   --interface=tapc9164734-0c --except-interface=lo
>> >>   --pid-file=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/pid
>> >>   --dhcp-hostsfile=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/host
>> >>   --addn-hosts=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/addn_hosts
>> >>   --dhcp-optsfile=/var/lib/neutron/dhcp/2fb7de9d-d6df-481f-acca-2f7860cffa60/opts
>> >>   --leasefile-ro --dhcp-authoritative
>> >>   --dhcp-range=set:tag0,192.168.110.0,static,86400s
>> >>   --dhcp-range=set:tag1,192.168.111.0,static,86400s
>> >>   --dhcp-lease-max=512 --conf-file= --server=10.0.0.31
>> >>   --server=10.0.0.32 --domain=openstacklocal
>> >>
>> >>
>> >>
>> >>
>> >>
>> >> _______________________________________________
>> >> OpenStack-operators mailing list
>> >> OpenStack-operators at lists.openstack.org
>> >> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>> >>
>> >
>>
>>
>> __________________________________________________________________________
>> OpenStack Development Mailing List (not for usage questions)
>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>> --
>> --
>> Andrew Woodward
>> Mirantis
>> Fuel Community Ambassador
>> Ceph Community
>>
>>
>>
>>
>>
>>
>> --
>> Kevin Benton
>>
>>
>>
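
On the batching fix Neil describes in the quoted thread (an internal queue,
with one reload per contiguous run of notifications): a minimal sketch of that
idea, assuming a plain Python queue and a callback standing in for
call_driver('reload_allocations'). The function names, the event tuples and
the "net-1" ID are hypothetical, and this is not Neil's pending patch.

```python
import queue


def drain_batch(q):
    """Block for one notification, then take any others already queued."""
    batch = [q.get()]
    while True:
        try:
            batch.append(q.get_nowait())
        except queue.Empty:
            return batch


def process(q, reload_allocations):
    """Handle everything queued, reloading each network once per batch."""
    reloads = 0
    while not q.empty():
        batch = drain_batch(q)
        # Collapse the batch to the set of affected networks.
        networks = {net_id for _event, net_id in batch}
        for net_id in sorted(networks):
            reload_allocations(net_id)  # stand-in for config regeneration + HUP
            reloads += 1
    return reloads


internal = queue.Queue()
for event in ["port-create", "port-delete", "port-create", "port-create", "port-delete"]:
    internal.put((event, "net-1"))

calls = []
reload_count = process(internal, calls.append)
print(reload_count, calls)  # five notifications, one reload for net-1
```

In a real agent the extra thread would keep feeding the internal queue while
the reload runs, so each pass picks up whatever accumulated in the meantime.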
--
Kevin Benton