[Openstack-operators] neutron + OVS 1.11 (or 2.0.1)

Jacob Godin jacobgodin at gmail.com
Fri Jan 24 12:50:29 UTC 2014


I think it's supposed to be called on the network/compute nodes while
neutron services are off (aka before they start on boot).
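
A minimal sketch of that ordering, in case it helps: run neutron-ovs-cleanup once, then start the agents. The service names and the use of `service ... start` are assumptions based on Ubuntu packaging, not something confirmed in this thread, so adjust them for your distribution.

```python
import subprocess

# Assumed agent service names (Ubuntu packaging); not confirmed in this thread.
AGENTS = (
    "neutron-plugin-openvswitch-agent",
    "neutron-dhcp-agent",
    "neutron-l3-agent",
)


def boot_commands(agents=AGENTS):
    """Return the boot-time commands in the order they should run."""
    # Cleanup goes first, while no neutron agent is running yet.
    cmds = [["neutron-ovs-cleanup"]]
    # Only then start the agents, which re-create the ports they need.
    cmds += [["service", name, "start"] for name in agents]
    return cmds


def run_boot_sequence(run=subprocess.check_call, agents=AGENTS):
    """Run each command in order; check_call raises if a command fails."""
    for cmd in boot_commands(agents):
        run(cmd)
```

Calling `run_boot_sequence()` from an init script that runs before the agents' own init jobs would give the intended order; passing a different `run` callable lets you dry-run it.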


On Fri, Jan 24, 2014 at 8:33 AM, George Shuklin <george.shuklin at gmail.com> wrote:

> Thanks. That looks good, but it is not called during host startup. Is this a
> bug or a feature?
>
>
>
> On 24.01.2014 10:58, Robert Collins wrote:
>
>> neutron-ovs-cleanup should clean that right up for you.
>>
>> On 24 January 2014 21:54, George Shuklin <george.shuklin at gmail.com>
>> wrote:
>>
>>> OVS 2.0.1 is still in the plans.
>>>
>>> But I found the source of networking delays!
>>>
>>> If someone touches the OVS configuration (e.g. adds an interface to a
>>> bridge), OVS saves the state to its internal database and restores it on
>>> the next boot.
>>>
>>> But if br-int has tun/tap interfaces, they are saved too, along with all
>>> the GRE rules, and restored on the next boot, before nova starts any
>>> instances. That causes a huge delay and sometimes unresponsive floating
>>> IPs; DHCP clients may request a lease several times, or even fail
>>> completely to get one because of a timeout.
>>>
>>> I don't know a proper solution so far, so for now it is just a rule for
>>> how to change the OVS config:
>>>
>>> * schedule downtime
>>> * shut off or migrate every instance
>>> * disable routers/dhcp agents (for networking node)
>>> * change configuration
>>> * clean bridges of all strange interfaces
>>> * move instances back or start them
>>>
>>> The situation is a bit simpler for netboot nodes, where there is no local
>>> 'configuration database' on the node.
>>>
>>>
>>>
>>> On 24.01.2014 00:51, Jacob Godin wrote:
>>>
>>> Hi George,
>>>
>>> Thanks for the detailed response. How has OVS 2 been working for you so
>>> far?
>>>
>>> I'm running the 3.5 kernel from Precise (generic-lts-quantal), so hopefully
>>> the kernel module won't be an issue. Not sure why that delay would occur
>>> after boot.
>>>
>>>
>>>
>>> On Sun, Jan 19, 2014 at 3:45 PM, George Shuklin <george.shuklin at gmail.com>
>>> wrote:
>>>
>>>> Yes and no.
>>>>
>>>> Yes: I was able to upgrade a laboratory cluster from OVS 1.10 to OVS 1.11,
>>>> and it performs a few orders of magnitude better under a --rand-source DoS
>>>> attack than the OVS 1.10-based installation.
>>>> No: there are issues.
>>>>
>>>> Issue #1:
>>>> OVS 1.11 (vanilla version) has a datapath (kernel module) which does not
>>>> compile against linux-3.11 (the default for the Ubuntu Cloud Archive).
>>>> Because Canonical was able to build OVS 1.10 against 3.11, I think this is
>>>> possible. Research is pending, but right now I am stuck with linux-3.8.
>>>> Issue #2:
>>>> Delayed recovery after reboot. For an unknown reason (research pending),
>>>> systems under OVS 1.11 behave a bit strangely after a whole-system reboot
>>>> (all hosts). There is a long delay (about 6-10 minutes) before networking
>>>> is restored after every server has booted successfully and the instances
>>>> have started. At first I even thought it was broken (uptime 3 minutes, no
>>>> DHCP for instances).
>>>>
>>>> I'll continue to play around with OVS 2.0.1 and other networking questions,
>>>> because deploying OVS 1.10 to a production environment is a kind of slow
>>>> suicide. Any script kiddie with hping and just a 15 Mbit/s channel will be
>>>> able to completely shut off a networking node (>90% packet loss), and just
>>>> about 5 Mbit/s of --rand-source flood is enough to cripple it (>5% packet
>>>> loss).
>>>>
>>>>
>>>> On 01/18/2014 04:24 PM, Jacob Godin wrote:
>>>>
>>>> Hi George,
>>>>
>>>> To clarify, you were able to upgrade from 1.10 or install 1.11 fresh
>>>> without any issues?
>>>>
>>>> On Jan 17, 2014 4:56 PM, "George Shuklin" <george.shuklin at gmail.com>
>>>> wrote:
>>>>
>>>>> For 1.11 I was wrong, it is working fine.
>>>>>
>>>>> For 2.0.1 something is broken, but I still can't figure out where. VMs can
>>>>> ping each other within a host (if configured manually), but traffic is not
>>>>> getting out of br-tun (no GRE, no DHCP from the network node).
>>>>>
>>>>> On 01/16/14 18:11, Aaron Rosen wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> Can you give more details on how it breaks? Did you restart the agents so
>>>>> it reprograms the flows back down?
>>>>>
>>>>> On Jan 16, 2014 2:06 AM, "George Shuklin" <george.shuklin at gmail.com>
>>>>> wrote:
>>>>>
>>>>>> Good day.
>>>>>>
>>>>>> Did anyone successfully combine Havana and OVS > 1.10? OVS 1.10 really
>>>>>> sucks under specific types of load (this was fixed in OVS 1.11 and later).
>>>>>> But a plain upgrade of OVS breaks neutron (under research).
>>>>>>
>>>>>> Did anyone walk that path?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>> _______________________________________________
>>>>>> OpenStack-operators mailing list
>>>>>> OpenStack-operators at lists.openstack.org
>>>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/
>>>>>> openstack-operators
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
>>
>>
>
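
For the bridge-cleaning step in the quoted procedure, here is a rough sketch of how leftover ports might be picked out of `ovs-vsctl list-ports br-int` output and turned into `ovs-vsctl del-port` commands. The prefix list is an assumption: "tap" comes from the tun/tap interfaces mentioned in the thread, and "qvo" is the usual Neutron hybrid-plug veth name. The helper names are hypothetical, and the commands are only built, not executed.

```python
# Assumed name prefixes of leftover instance ports; adjust for your deployment.
STALE_PREFIXES = ("tap", "qvo")


def stale_ports(list_ports_output, prefixes=STALE_PREFIXES):
    """Pick port names that look like leftover instance ports.

    `list_ports_output` is the text printed by `ovs-vsctl list-ports <bridge>`,
    which is one port name per line.
    """
    names = [line.strip() for line in list_ports_output.splitlines()]
    return [n for n in names if n and n.startswith(prefixes)]


def del_port_commands(bridge, ports):
    """Build one `ovs-vsctl del-port` command per stale port."""
    return [["ovs-vsctl", "--if-exists", "del-port", bridge, p] for p in ports]
```

Printing the result of `del_port_commands("br-int", stale_ports(output))` before running anything is a cheap way to check that patch ports such as patch-tun are left alone.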