[Openstack-operators] neutron + OVS 1.11 (or 2.0.1)

George Shuklin george.shuklin at gmail.com
Fri Jan 24 08:54:31 UTC 2014


OVS 2.0.1 is still in the plans.

But I found the source of the networking delays!

If someone touches the OVS configuration (e.g. adds an interface to a
bridge), OVS saves the state to its internal database and restores it on
the next boot.

But if br-int has any tun/tap interfaces, they get saved too, with all
the GRE rules, and restored on the next boot, before nova starts any
instances. That causes a huge delay and sometimes unresponsive floating
IPs, and instances may even have to ask for a DHCP lease several times,
or fail completely to get a lease due to a timeout.

I don't know a proper solution so far, so for now it is just a rule for
'how to change the OVS config':

* schedule downtime
* shut off/migrate every instance
* disable routers/DHCP agents (for the networking node)
* change the configuration
* clean the bridges of all leftover tun/tap interfaces
* move the instances back or start them
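
The 'clean bridges' step could be sketched roughly as below. This is only
an illustration, not something tested on the cluster described here: it
assumes the leftover interfaces can be recognised by a name prefix (the
default "tap" is a guess; adjust it to your setup), and the helper names
are made up for this example. It only uses `ovs-vsctl list-ports` and
`ovs-vsctl del-port`, and by default it just prints what it would delete.

```python
import subprocess


def stale_ports(ports, prefixes=("tap",)):
    """Return the port names that start with one of the given prefixes.

    The prefix list is an assumption: check which names your leftover
    interfaces really have before deleting anything.
    """
    return [p for p in ports if p.startswith(tuple(prefixes))]


def del_port_commands(bridge, ports):
    """Build one 'ovs-vsctl del-port' command line per port."""
    return [["ovs-vsctl", "--", "--if-exists", "del-port", bridge, p]
            for p in ports]


def list_ports(bridge):
    """Ask OVS for the ports currently attached to the bridge."""
    out = subprocess.check_output(["ovs-vsctl", "list-ports", bridge])
    return out.decode().split()


def clean_bridge(bridge, prefixes=("tap",), dry_run=True):
    """Print (dry_run=True) or run the del-port commands for stale ports."""
    cmds = del_port_commands(bridge, stale_ports(list_ports(bridge), prefixes))
    for cmd in cmds:
        if dry_run:
            print(" ".join(cmd))
        else:
            subprocess.check_call(cmd)
    return cmds
```

Calling clean_bridge("br-int") on a node with all instances shut off would
print the candidate deletions; passing dry_run=False would actually run
them. Ports such as patch-tun do not match the default prefix and are
left alone.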

The situation is a bit simpler for netboot nodes, where there is no
local 'configuration database' on the node.


On 24.01.2014 00:51, Jacob Godin wrote:
> Hi George,
>
> Thanks for the detailed response. How has OVS 2 been working for you 
> so far?
>
> I'm running the 3.5 kernel from Precise (generic-lts-quantal), so 
> hopefully the kernel module won't be an issue. Not sure why that delay 
> would occur after boot.
>
>
>
> On Sun, Jan 19, 2014 at 3:45 PM, George Shuklin 
> <george.shuklin at gmail.com> wrote:
>
>     Yes and no.
>
>     Yes: I was able to upgrade a laboratory cluster from OVS 1.10 to
>     OVS 1.11, and it performs a few orders of magnitude better under a
>     --rand-source DoS attack than the OVS 1.10-based installation.
>     No: there are issues.
>
>     Issue #1:
>     OVS 1.11 (vanilla version) has a datapath (kernel module) which
>     does not compile against linux-3.11 (which is the default for the
>     Ubuntu Cloud Archive). Because Canonical was able to build OVS 1.10
>     against 3.11, I think this is possible. Research pending, but right
>     now I am stuck with linux-3.8.
>
>     Issue #2:
>     Delayed recovery after reboot. For an unknown reason (research
>     pending), systems under OVS 1.11 behave a bit strangely after a
>     reboot of the whole system (all hosts). There is a long delay
>     (about 6-10 minutes) before networking is restored after every
>     server has booted successfully and the instances have started. At
>     first I even thought it was broken (uptime 3 minutes, no DHCP for
>     instances).
>
>     I'll continue to play around with OVS 2.0.1 and other networking
>     questions, because deploying OVS 1.10 to a production environment
>     is a kind of slow suicide. Any script kiddie with hping and just a
>     15 Mbit/s channel will be able to completely shut off a networking
>     node (>90% packet loss), and just about 5 Mbit/s of --rand-source
>     flood is enough to cripple it (>5% packet loss).
>
>
>     On 01/18/2014 04:24 PM, Jacob Godin wrote:
>>
>>     Hi George,
>>
>>     To clarify, you were able to upgrade from 1.10 or install 1.11
>>     fresh without any issues?
>>
>>     Sent from my mobile device
>>
>>     On Jan 17, 2014 4:56 PM, "George Shuklin"
>>     <george.shuklin at gmail.com> wrote:
>>
>>         For 1.11 I was wrong, it is working fine.
>>
>>         For 2.0.1 something is broken, but I still can't figure out
>>         where. VMs can ping each other within a host (if configured
>>         manually), but traffic is not getting out through br-tun (no
>>         GRE, no DHCP from the network node).
>>
>>         On 01/16/14 18:11, Aaron Rosen wrote:
>>>
>>>         Hi,
>>>
>>>         Can you give more details on how it breaks? Did you restart
>>>         the agents so it reprograms the flows back down?
>>>
>>>         On Jan 16, 2014 2:06 AM, "George Shuklin"
>>>         <george.shuklin at gmail.com> wrote:
>>>
>>>             Good day.
>>>
>>>             Has anyone successfully combined Havana and OVS > 1.10?
>>>             OVS 1.10 really sucks under specific types of load
>>>             (this was fixed in OVS 1.11 and later). But a plain
>>>             upgrade of OVS breaks neutron (under research).
>>>
>>>             Has anyone walked that path?
>>>
>>>             Thanks.
>>>
>>>             _______________________________________________
>>>             OpenStack-operators mailing list
>>>             OpenStack-operators at lists.openstack.org
>>>             <mailto:OpenStack-operators at lists.openstack.org>
>>>             http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-operators
>>>
>>
>>
>
>
