[openstack-dev] [neutron][TripleO] Clear all flows when ovs agent start? why and how avoid?

Salvatore Orlando sorlando at nicira.com
Wed Nov 5 07:36:28 UTC 2014


From what I gather from this thread and the related bug report, the change
introduced in the OVS agent is causing a data plane outage upon agent
restart, which is not desirable in most cases.

The rationale for the change that introduced this bug was, I believe,
cleaning up stale flows on the OVS agent, which also makes some sense.

Unless I'm missing something, I reckon the best way forward is actually
quite straightforward: we could add a startup flag that resets all flows,
and leave existing flows untouched by default.
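
A minimal sketch of that idea, for illustration only. The flag name
`--reset-flows` and the functions `parse_args` and `startup` are
hypothetical, not existing Neutron options or APIs, and the bridge is
modelled as a plain list of flow strings:

```python
import argparse


def parse_args(argv):
    """Parse the (hypothetical) agent startup options."""
    parser = argparse.ArgumentParser(description="toy OVS agent startup")
    # Hypothetical flag: off by default, so a plain restart keeps flows.
    parser.add_argument(
        "--reset-flows",
        action="store_true",
        default=False,
        help="delete all existing flows before reprogramming",
    )
    return parser.parse_args(argv)


def startup(argv, bridge_flows):
    """Return the flows left on the bridge once startup has run."""
    args = parse_args(argv)
    if args.reset_flows:
        # Explicit opt-in: wipe everything, accepting a data plane outage.
        return []
    # Default: leave the existing flows alone so traffic keeps flowing.
    return list(bridge_flows)
```

With this shape, restarting the agent for something like a log setting
change would not touch the flows unless the operator asks for it.
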
While I agree the "flow synchronisation" process proposed in the previous
post is valuable too, I hope we might be able to fix this with a simpler
approach.
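
For comparison, the synchronisation approach from the previous post boils
down to a set difference between what is installed and what the agent
wants. A rough sketch, assuming flows can be compared as plain strings
(real OVS flows would need normalising first); `diff_flows` is a
hypothetical helper, not agent code:

```python
def diff_flows(installed, desired):
    """Split flows into those to keep, add and delete.

    Both arguments are iterables of flow strings. Flows present on both
    sides are kept untouched, so traffic using them is not interrupted.
    """
    installed_set = set(installed)
    desired_set = set(desired)
    return {
        "keep": sorted(installed_set & desired_set),
        "add": sorted(desired_set - installed_set),
        "delete": sorted(installed_set - desired_set),
    }
```

An empty bridge (for example after a host reboot) simply yields
everything under "add", and a fully correct bridge yields nothing to add
or delete.
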

Salvatore

On 5 November 2014 04:43, Germy Lure <germy.lure at gmail.com> wrote:

> Hi,
>
> Considering what triggers an agent restart, I think there are only two cases:
> 1). only the agent is restarted
> 2). the host that the agent is deployed on is rebooted
>
> When the agent starts, OVS may:
> a. have all the correct flows
> b. have no flows at all
> c. have partly correct flows, while others need to be reprogrammed,
> deleted or added
>
> In any case, I think both users and developers would be happy to see the
> system recover ASAP after an agent restart. Ideally the agent pushes only
> the incorrect flows and keeps the correct ones. This ensures that
> services relying on correct flows keep working while the agent starts.
>
> So, I suggest two solutions:
> 1. After restarting, the agent gets all flows from OVS and compares them
> with its local flows, then corrects only the ones that differ.
> 2. Adapt both OVS and the agent. The agent just pushes all flows (without
> removing any) every time, and OVS keeps two flow tables and switches
> between them (like an RCU lock).
>
> Option 1 is recommended because of third-party vendors.
>
> BR,
> Germy
>
>
> On Fri, Oct 31, 2014 at 10:28 PM, Ben Nemec <openstack at nemebean.com>
> wrote:
>
>> On 10/29/2014 10:17 AM, Kyle Mestery wrote:
>> > On Wed, Oct 29, 2014 at 7:25 AM, Hly <henry4hly at gmail.com> wrote:
>> >>
>> >> On 2014-10-29, at 8:01 PM, Robert van Leeuwen <Robert.vanLeeuwen at spilgames.com> wrote:
>> >>
>> >>>>> Our current design removes all flows and then adds them back entry by
>> >>>>> entry. This causes every network node to drop all of its tunnels to the
>> >>>>> other network nodes and to all compute nodes.
>> >>>> Perhaps a way around this would be to add a flag on agent startup
>> >>>> which would have it skip reprogramming flows. This could be used for
>> >>>> the upgrade case.
>> >>>
>> >>> I hit the same issue last week and filed a bug here:
>> >>> https://bugs.launchpad.net/neutron/+bug/1383674
>> >>>
>> >>> From an operator's perspective this is VERY annoying, since you also
>> >>> cannot push any config change that requires or triggers a restart of the
>> >>> agent.
>> >>> e.g. something simple like changing a log setting becomes a hassle.
>> >>> I would prefer the default behaviour to be not to clear the flows, or
>> >>> at least a config option to disable it.
>> >>>
>> >>
>> >> +1, we also suffered from this, even when applying a very small patch.
>> >>
>> > I'd really like to get some input from the tripleo folks, because they
>> > were the ones who filed the original bug here and were hit by the
>> > agent NOT reprogramming flows on agent restart. It does seem fairly
>> > obvious that adding an option around this would be a good way forward,
>> > however.
>>
>> Since nobody else has commented, I'll put in my two cents (though I
>> might be overcharging you ;-).  I've also added the TripleO tag to the
>> subject, although with Summit coming up I don't know if that will help.
>>
>> Anyway, if the bug you're referring to is the one I think, then our
>> issue was just with the flows not existing.  I don't think we care
>> whether they get reprogrammed on agent restart or not as long as they
>> somehow come into existence at some point.
>>
>> It's possible I'm wrong about that, and probably the best person to talk
>> to would be Robert Collins since I think he's the one who actually
>> tracked down the problem in the first place.
>>
>> -Ben
>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>