[openstack-dev] [Neutron] l2pop problems
Mathieu Rohon
mathieu.rohon at gmail.com
Wed Aug 6 07:28:33 UTC 2014
Hi Zang,
On Tue, Aug 5, 2014 at 1:18 PM, Zang MingJie <zealot0630 at gmail.com> wrote:
> Hi Mathieu:
>
> We have deployed the new l2pop described in the previous mail in our
> environment, and it works pretty well. It solved the timing problem and
> also greatly reduces the number of l2pop rpc calls. I'm going to file a
> blueprint to propose the changes.
Great, I would be pleased to review this BP.
> On Fri, Jul 18, 2014 at 10:26 PM, Mathieu Rohon <mathieu.rohon at gmail.com> wrote:
>> Hi Zang,
>>
>> On Wed, Jul 16, 2014 at 4:43 PM, Zang MingJie <zealot0630 at gmail.com> wrote:
>>> Hi, all:
>>>
>>> While working on rebuilding the br-tun flows after an ovs restart [1],
>>> we found several l2pop problems:
>>>
>>> 1. L2pop depends on agent_boot_time to decide whether to send all
>>> port information or not, but agent_boot_time is unreliable. For
>>> example, if the service receives the port up message before the agent
>>> status report, the agent will never receive any ports on other agents.
>>
>> you're right, there is a race condition here: if the agent has more than
>> 1 port on the same network and sends its update_device_up() for every
>> port before it sends its report_state(), it won't receive the fdb
>> entries concerning this network. Is this the race you are mentioning
>> above?
>> Since the report_state is done in a dedicated greenthread, and is
>> launched before the greenthread that manages ovsdb_monitor, the state
>> of the agent should be updated before the agent gets aware of its
>> ports and sends get_device_details()/update_device_up(), am I wrong?
>> So, after a restart of an agent, the agent_uptime() should be less
>> than the agent_boot_time configured by default in the conf when the
>> agent sends its first update_device_up(), so the l2pop MD will be aware
>> of this restart and trigger the cast of all fdb entries to the restarted
>> agent.
>>
>> But I agree that it relies on eventlet thread management and on
>> agent_boot_time, which can be misconfigured by the provider.
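To make the race discussed above concrete, here is a minimal sketch of the
kind of decision the service has to take. It is not the actual l2pop code:
AgentState, should_send_all_fdb() and the value 180 are illustrative names
and numbers I am assuming, and the "first port on the network" condition
is inferred from the description above.

```python
from dataclasses import dataclass

# Illustrative value standing in for the agent_boot_time option (assumption).
AGENT_BOOT_TIME = 180


@dataclass
class AgentState:
    # Start time of the agent as last reported through report_state().
    started_at: float


def agent_uptime(agent: AgentState, now: float) -> float:
    """Uptime as seen by the service, based on the last state report."""
    return now - agent.started_at


def should_send_all_fdb(agent: AgentState, now: float,
                        active_ports_on_network: int) -> bool:
    """Decide whether the service casts every fdb entry of the network."""
    first_port = active_ports_on_network == 1
    recently_booted = agent_uptime(agent, now) < AGENT_BOOT_TIME
    return first_port or recently_booted


# The agent restarted at t=1000, but its report_state() has not arrived
# yet, so the service still holds the old start time (t=0).
stale = AgentState(started_at=0.0)
fresh = AgentState(started_at=1000.0)

# update_device_up() at t=1005 with 2 active ports on the network:
print(should_send_all_fdb(stale, 1005.0, 2))  # restart not detected
print(should_send_all_fdb(fresh, 1005.0, 2))  # restart detected
```

With the stale state the check fails for every port, and nothing later
triggers a full send, which is the "forever" part of the problem.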
>>
>>> 2. If openvswitch is restarted, all flows are lost, including all
>>> l2pop flows, and the agent is unable to fetch or recreate them.
>>>
>>> To resolve the problems, I'm suggesting some changes:
>>>
>>> 1. Because agent_boot_time is unreliable, the service can't decide
>>> whether to send the flooding entry or not. But the agent can build the
>>> flooding entries from the unicast entries; this has already been
>>> implemented [2].
>>>
>>> 2. Create an rpc from agent to service which fetches all fdb entries;
>>> the agent calls the rpc in `provision_local_vlan`, before setting up
>>> any port [3].
>>>
>>> After these changes, the l2pop service part becomes simpler and more
>>> robust, with mainly 2 functions: first, return all fdb entries at once
>>> when requested; second, broadcast a single fdb entry when a port goes
>>> up/down.
>>
>> That's a design that we considered during the l2pop implementation.
>> Our purpose was to minimize RPC calls. But if this implementation is
>> buggy due to uncontrolled thread order and/or bad usage of the
>> agent_boot_time parameter, it's worth investigating your proposal [3].
>> However, I don't get why [3] depends on [2]. Couldn't we have a
>> network_sync() sent by the agent during provision_local_vlan() which
>> would reconfigure ovs when the agent and/or ovs restarts?
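The pull model discussed above can be sketched roughly as follows. This is
only an illustration under stated assumptions: FakeL2popService, FakeAgent
and get_fdb_entries() are hypothetical names, not the interface proposed
in [3], and a dict stands in for the flows programmed in br-tun.

```python
from collections import defaultdict


class FakeL2popService:
    """Hypothetical service side: keeps the fdb entries per network."""

    def __init__(self):
        # network_id -> {(mac, ip): tunnel endpoint ip of the remote agent}
        self._fdb = defaultdict(dict)

    def port_up(self, network_id, mac, ip, tunnel_ip):
        self._fdb[network_id][(mac, ip)] = tunnel_ip

    def get_fdb_entries(self, network_id):
        # Hypothetical rpc: return every entry of the network at once.
        return dict(self._fdb[network_id])


class FakeAgent:
    """Hypothetical agent side: a dict stands in for the br-tun flows."""

    def __init__(self, service):
        self.service = service
        self.flows = {}  # network_id -> entries currently programmed

    def provision_local_vlan(self, network_id):
        # Pull the complete state before any port is set up, so the
        # result depends neither on agent uptime nor on message ordering.
        self.flows[network_id] = self.service.get_fdb_entries(network_id)

    def ovs_restarted(self):
        # All flows are lost; provisioning each network again rebuilds them.
        networks = list(self.flows)
        self.flows.clear()
        for network_id in networks:
            self.provision_local_vlan(network_id)


service = FakeL2popService()
service.port_up("net1", "fa:16:3e:00:00:01", "10.0.0.3", "192.168.0.2")
agent = FakeAgent(service)
agent.provision_local_vlan("net1")
agent.ovs_restarted()
print(agent.flows["net1"])
```

The same call covers both the agent restart and the ovs restart, since in
both cases the agent goes through provision_local_vlan() again.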
>
> Actually, [3] doesn't strictly depend on [2]. We have encountered l2pop
> problems several times where the unicast is correct but the broadcast
> fails, so we decided to completely ignore the broadcast entries in the
> rpc, deal only with unicast entries, and use the unicast entries to
> build the broadcast rules.
Understood, but it could be interesting to understand why the MD sends
wrong broadcast entries. Do you have any clue?
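For reference, the idea of building the broadcast rules from the unicast
entries can be sketched as below. This is not the code from [2];
flood_endpoints() and the (mac, ip, tunnel_ip) tuple layout are
assumptions made for the example.

```python
def flood_endpoints(unicast_entries):
    """Derive the flooding targets of one network from its unicast entries.

    unicast_entries is an iterable of (mac, ip, tunnel_ip) tuples; the
    tuple layout is an assumption made for this sketch.
    """
    # Every tunnel endpoint hosting at least one known port must receive
    # broadcast traffic; duplicates collapse in the set.
    return sorted({tunnel_ip for _mac, _ip, tunnel_ip in unicast_entries})


entries = [
    ("fa:16:3e:00:00:01", "10.0.0.3", "192.168.0.2"),
    ("fa:16:3e:00:00:02", "10.0.0.4", "192.168.0.2"),
    ("fa:16:3e:00:00:03", "10.0.0.5", "192.168.0.3"),
]
print(flood_endpoints(entries))
# Once the last port behind 192.168.0.3 goes down, that endpoint drops
# out of the flooding set without any dedicated broadcast message.
print(flood_endpoints(entries[:2]))
```

Because the flooding set is recomputed from the unicast state, it cannot
diverge from it, whatever the MD sends as broadcast entries.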
>
>>
>>
>>> [1] https://bugs.launchpad.net/neutron/+bug/1332450
>>> [2] https://review.openstack.org/#/c/101581/
>>> [3] https://review.openstack.org/#/c/107409/
>>>
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev at lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev