[openstack-dev] [neutron] Backup port info to restore the flow rules

Jian Wen wenjianhn at gmail.com
Mon Feb 22 10:40:27 UTC 2016


   I don't think that's enough for a large-scale cloud.

   When the neutron server is not available and the flow rules are gone,
   we need the backup to restore the flow rules.

   We have more than a thousand physical servers in our production
   environment. Rare events will occur where combined or unanticipated
   failures require human intervention. For example, a cron job
   accidentally kills the OvS service (so the flows are gone) while one
   of RabbitMQ, MySQL, or the neutron server is down or unavailable.
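
   To make the idea concrete, here is a rough sketch of the local backup
   part only. It is not ovs-agent code: the function names, the JSON
   layout and the port fields in the usage note are all hypothetical, and
   turning the loaded ports back into flow rules is not shown.

```python
import json
import os
import tempfile


def save_ports(ports, path):
    """Atomically write the port list to a local JSON file.

    Writing to a temporary file and renaming it means a crash in the
    middle of a write never leaves a half-written backup behind.
    """
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, prefix=".ports-",
                                    suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as f:
            json.dump(ports, f)
            f.flush()
            os.fsync(f.fileno())
        # Atomic rename onto the final path.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temporary file if anything went wrong.
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise


def load_ports(path):
    """Return the backed-up port list, or [] if no usable backup exists."""
    try:
        with open(path) as f:
            return json.load(f)
    except (OSError, ValueError):
        # Missing, unreadable or corrupt file: treat as "no backup".
        return []
```

   The agent would call something like save_ports() whenever it
   successfully fetches port details, for example with entries such as
   {"port_id": "p1", "ofport": 5, "vlan": 1}, and fall back to
   load_ports() only when the neutron server cannot be reached.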


On Mon, Feb 22, 2016 at 5:44 PM, Ihar Hrachyshka <ihrachys at redhat.com>
wrote:

> Jian Wen <wenjianhn at gmail.com> wrote:
>
>> Hello,
>>
>> If we restart OvS/ovs-agent when one or more of Neutron, MySQL and
>> RabbitMQ is not available, the flow rules in OvS will be gone. If
>> Neutron/MySQL/RabbitMQ doesn't become available in time, the VMs
>> will lose their network connections. It's not easy for an
>> operations engineer to manually restore the flow rules. An
>> operations engineer working under pressure at 2 a.m. will make
>> mistakes.
>>
>> We can back up the port info to a local file. In case of emergency
>> the ovs-agent can use it to restore the flow rules. What do you
>> think of this feature?
>>
>> Related bugs:
>>     Restarting neutron openvswitch agent causes network hiccup by
>> throwing away all flows
>>     https://bugs.launchpad.net/neutron/+bug/1383674
>>
>>     Restarting OVS agent drops VMs traffic when using VLAN provider
>> bridges
>>     https://bugs.launchpad.net/neutron/+bug/1514056
>>
>>     After restarting an ovs agent, it still drops useful flows if the
>> neutron server is busy/down
>>     https://bugs.launchpad.net/neutron/+bug/1515075
>>
>>     Ovs agent loses OpenFlow rules if OVS gets restarted while Neutron is
>> disconnected from SQL
>>     https://bugs.launchpad.net/neutron/+bug/1531210
>>
>>
> Most of those bugs are fixed (at least for stable/liberty+). Isn’t it
> enough to avoid data plane reset when the agent fails to fetch new port
> data from its controller? Why do we need another mechanism here?
>
> Ihar



-- 
Best,

Jian