[Openstack-operators] Liberty and OVS Agent restarts
mkassawara at gmail.com
Fri Feb 12 14:27:58 UTC 2016
Out of curiosity, what do you have for the "external_network_bridge" option
in the L3 agent config?
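For anyone who wants to check this quickly, here is a minimal sketch that reads the option from an INI-style config. It assumes the option lives in the [DEFAULT] section and that the file is at /etc/neutron/l3_agent.ini; the helper name read_external_bridge is made up for illustration, and the path may differ on your distribution.

```python
import configparser


def read_external_bridge(config_text):
    """Return the external_network_bridge value from L3 agent config text.

    Returns None when the option is not set at all, and an empty string
    when it is set to nothing. Assumes the option is in [DEFAULT].
    """
    # interpolation=None so '%' characters elsewhere in the file are left alone
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    return parser.defaults().get("external_network_bridge")


if __name__ == "__main__":
    # Assumed default location; adjust for your deployment.
    path = "/etc/neutron/l3_agent.ini"
    try:
        with open(path) as handle:
            print(repr(read_external_bridge(handle.read())))
    except OSError as exc:
        print(f"could not read {path}: {exc}")
```

Printing the repr makes it easy to tell "not set" (None) apart from "set to empty" ('').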
On Wed, Feb 10, 2016 at 2:42 PM, Bajin, Joseph <jbajin at verisign.com> wrote:
> This is really good information.
> I’m wondering how we can help support you and get the necessary dev
> support to get this resolved sooner rather than later. I totally agree with you
> that this should be backported to at least Liberty.
> Please let me know how I and others can help!
> On 2/10/16, 8:55 AM, "Clayton O'Neill" <clayton at oneill.net> wrote:
> >Summary: Liberty OVS agent restarts are better, but still need work.
> >See: https://bugs.launchpad.net/neutron/+bug/1514056
> >As many of you know, Liberty has a fix for OVS agent restarts such
> >that it doesn’t dump all flows when starting, resulting in a loss of
> >traffic. Unfortunately, Liberty neutron still has issues with OVS
> >agent restarts. The fix that went into Liberty prevents it from
> >dropping flows on the br-tun and br-int bridges and that helps
> >greatly, but the br-ex bridge still has its flows cleared on startup.
> >You may be thinking: Wait, br-ex only has like 3 flows on it, how can
> >that be a problem? The issue appears to be that the br-ex flows are
> >cleared early and not set up again until late in the process. This
> >means that routers on the node where the OVS agent is restarted lose
> >network connectivity for the majority of the restart time.
> >I did some testing with this yesterday, comparing a few scenarios with
> >100 FIPs, 100 instances and various router layouts. You can
> >find the complete data here:
> >The summary looks like this:
> >100 routers, 100 networks, 100 floating IPs, 100 instances, single node test:
> >Kilo average outage time: 47 seconds
> >Liberty average outage time: 37 seconds
> >1 router, 1 network, 100 floating IPs, 100 instances, single node test:
> >Kilo average outage time: 46 seconds
> >Liberty average outage time: 13 seconds
> >1 router, 1 network, 100 floating IPs, 100 instances, router on a
> >separate node, all instances on a single node, OVS restart on compute:
> >Kilo average outage time: 25 seconds
> >Liberty average outage time: 0 to 1 seconds
> >I did my testing by sending one-second pings with fping to all of the
> >floating IPs. In the last test it frequently lost no packets, so
> >I could not measure that scenario any more precisely than to
> >qualify it as good.
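To make the measurement concrete, here is a rough sketch of how an average outage time could be derived from one-second ping results. It is not the script used for the numbers above. It assumes the ping output has already been parsed into a list of True/False replies per floating IP, and it assumes "outage" means the longest run of consecutive lost pings. The function names and the sample addresses are made up for illustration.

```python
from statistics import mean


def outage_seconds(replies, interval=1.0):
    """Longest run of consecutive lost pings, converted to seconds.

    replies is a sequence of booleans, one per ping sent, where True
    means a reply was received.
    """
    longest = current = 0
    for ok in replies:
        # A received reply ends the current run of losses.
        current = 0 if ok else current + 1
        longest = max(longest, current)
    return longest * interval


def average_outage(results, interval=1.0):
    """Mean outage across all floating IPs in a {ip: replies} mapping."""
    return mean(outage_seconds(r, interval) for r in results.values())


if __name__ == "__main__":
    # Hypothetical data: two floating IPs pinged five times each.
    sample = {
        "203.0.113.10": [True, False, False, False, True],
        "203.0.113.11": [True, True, False, True, True],
    }
    print(average_outage(sample))
```

With one-second pings the resolution is one second, so an outage shorter than that can show up as zero lost packets, which fits the "0 to 1 seconds" result for the last scenario.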
> >This is a huge operational issue for us and I suspect for many of the
> >rest of you using OVS. I’d encourage everyone that is using OVS to
> >register interest in having this fixed in the LP bug
> >(https://bugs.launchpad.net/neutron/+bug/1514056). Right now this bug
> >is marked as low priority.
> >OpenStack-operators mailing list
> >OpenStack-operators at lists.openstack.org