[openstack-dev] [Neutron][networking-ovn][networking-odl] Syncing neutron DB and OVN DB

Russell Bryant rbryant at redhat.com
Wed Jul 27 14:15:41 UTC 2016


On Wed, Jul 27, 2016 at 5:58 AM, Kevin Benton <kevin at benton.pub> wrote:

> > I'd like to see if we can solve the problems more generally.
>
> We've tried before but we very quickly run into competing requirements
> with regard to eventual consistency. For example, asynchronous background
> sync doesn't work if someone wants their backend to confirm that port
> details are acceptable (e.g. the MAC isn't in use by some other system outside
> of OpenStack). Then each backend has different methods for detecting what
> is out of sync (e.g. config numbers, hashes, or just full syncs on startup)
> that each come with their own requirements for how much data needs to be
> resent when an inconsistency is detected.
>
> If we can come to some common ground of what is required by all of them,
> then I would love to get some of this built into the ML2 framework.
> However, we've discussed this at meetups/mid-cycles/summits and it
> inevitably ends up with two people drawing furiously on a whiteboard,
> someone crying in the corner, and everyone else arguing about the lack of
> parametric polymorphism in Go.
>

Ha, yes, makes sense that this is really hard to solve in a way that works
for everyone ...


> Even between OVN and ODL in this thread, it sounds like the only thing in
> common is a background worker that consumes from a queue of tasks in the
> db. Maybe realistically the only common thing we can come up with is a
> taskflow queue stored in the DB to solve the multiple workers issue...
>

To clarify, ODL has this background worker and the discussion was whether
OVN should try to follow a similar approach.

So far, my gut feeling is that it's far too complicated for the problems it
would solve.  There's one identified multiple-worker related race condition
on updates, but I think we can solve that another way.
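
For reference, the "queue of tasks in the db" idea quoted above mostly comes
down to letting several workers claim rows without stepping on each other.
Below is a minimal sketch of one way to do that. It assumes a hypothetical
`journal` table with `state` and `owner` columns and uses sqlite3 purely for
illustration; none of these names come from networking-odl or networking-ovn.

```python
import sqlite3


def make_queue(n):
    """Create an in-memory, hypothetical journal table with n pending rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journal ("
        " id INTEGER PRIMARY KEY, state TEXT NOT NULL, owner TEXT)"
    )
    conn.executemany(
        "INSERT INTO journal (state) VALUES (?)", [("pending",)] * n
    )
    conn.commit()
    return conn


def claim_next(conn, worker):
    """Claim the oldest pending row for `worker`; return its id or None."""
    while True:
        row = conn.execute(
            "SELECT id FROM journal WHERE state = 'pending' "
            "ORDER BY id LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        # The WHERE clause re-checks the state, so only one worker's UPDATE
        # can match a given row. A worker that loses the race sees
        # rowcount == 0 and loops to try the next pending row.
        cur = conn.execute(
            "UPDATE journal SET state = 'processing', owner = ? "
            "WHERE id = ? AND state = 'pending'",
            (worker, row[0]),
        )
        conn.commit()
        if cur.rowcount == 1:
            return row[0]
```

The conditional UPDATE is what keeps two workers from processing the same row.
It says nothing about ordering dependencies between rows, which is the harder
problem raised elsewhere in this thread.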



> On Tue, Jul 26, 2016 at 11:31 AM, Russell Bryant <rbryant at redhat.com>
> wrote:
>
>>
>>
>> On Fri, Jul 22, 2016 at 7:51 AM, Numan Siddique <nusiddiq at redhat.com>
>> wrote:
>>
>>> Thanks for the comments, Amitabha.
>>> Please see my comments inline.
>>>
>>> On Fri, Jul 22, 2016 at 5:50 AM, Amitabha Biswas <azbiswas at gmail.com>
>>> wrote:
>>>
>>>> Hi Numan,
>>>>
>>>> Thanks for the proposal. We have also been thinking about this use-case.
>>>>
>>>> If I’m reading this accurately (and I may not be), it seems that the
>>>> proposal is to not have any OVN NB (CUD) operations (R operations outside
>>>> the scope) done by the api_worker threads but rather by a new journal
>>>> thread.
>>>>
>>>>
>>> Correct.
>>>
>>>> If this is indeed the case, I’d like to consider the scenario where
>>>> there are N neutron nodes, each node with M worker threads. The journal
>>>> thread at each node contains a list of pending operations. Could there be
>>>> a (sequence) dependency among the pending operations across the journal
>>>> threads on different nodes that prevents them from being applied (e.g. the
>>>> Logical_Router_Port and Logical_Switch_Port inter-dependency), given that
>>>> we are returning success on neutron operations that have not yet been
>>>> committed to the NB DB?
>>>>
>>>>
>>> It's a valid scenario, and the design should handle such scenarios
>>> properly if we take this approach.
>>>
>>
>> I believe a new table in the Neutron DB is used to synchronize all of
>> the journal threads.
>>
>> Also note that OVN currently has no custom tables in the Neutron database
>> and it would be *very* good to keep it that way if we can.
>>
>>
>>>
>>>> A couple of clarifications and thoughts below.
>>>>
>>>> Thanks
>>>> Amitabha <abiswas at us.ibm.com>
>>>>
>>>> On Jul 13, 2016, at 1:20 AM, Numan Siddique <nusiddiq at redhat.com>
>>>> wrote:
>>>>
>>>> Adding the proper tags in subject
>>>>
>>>> On Wed, Jul 13, 2016 at 1:22 PM, Numan Siddique <nusiddiq at redhat.com>
>>>> wrote:
>>>>
>>>>> Hi Neutrinos,
>>>>>
>>>>> Presently, in the OVN ML2 driver we have 2 ways to sync the neutron DB and
>>>>> the OVN DB:
>>>>>  - At neutron-server startup, the OVN ML2 driver syncs the neutron DB and
>>>>> the OVN DB if the sync mode is set to repair.
>>>>>  - An admin can run "neutron-ovn-db-sync-util" to sync the DBs.
>>>>>
>>>>> Recently, v2 of the networking-odl ML2 driver has taken the following
>>>>> approach (please see (1) below for more details; ODL folks, please correct
>>>>> me if I am wrong here):
>>>>>
>>>>>   - a journal thread is created which does the CRUD operations of
>>>>> neutron resources asynchronously (i.e. it sends the REST API requests to
>>>>> the ODL controller).
>>>>>
>>>>
>>>> Would this be the equivalent of making OVSDB transactions to the OVN NB
>>>> DB?
>>>>
>>>
>>> Correct.
>>>
>>>>
>>>>>   - a maintenance thread is created which does some cleanup
>>>>> periodically and, at startup, does a full sync if it detects an ODL
>>>>> controller cold reboot.
>>>>>
>>>>>
>>>>> A few questions I have:
>>>>>  - Can the OVN ML2 driver take the same or a similar approach? Are there
>>>>> any advantages in taking this approach? One advantage is that neutron
>>>>> resources can be created/updated/deleted even if the OVN ML2 driver has
>>>>> lost its connection to the ovsdb-server. The journal thread would
>>>>> eventually sync these resources to the OVN DB. I would like to know the
>>>>> community's thoughts on this.
>>>>>
>>>>
>>>>
>> I question whether making operations appear to be successful even when
>> ovsdb-server is unreachable is a useful thing.  API calls fail today if the
>> Neutron db is unreachable.  Why would we bend over backwards for the OVN
>> database?
>>
>> If this was easy to do, sure, but this solution seems *incredibly*
>> complex to me, so I see it as an absolute last resort.
>>
>>
>>
>>>> If we can make it work, it would indeed be a huge plus for system-wide
>>>> upgrades and some corner cases in the code (ACL specifically), where the
>>>> post_commit relies on all transactions to be successful and doesn’t revert
>>>> the neutron db if something fails.
>>>>
>>>
>>>
>> Can we just improve the ML2 framework to make this problem easier to deal
>> with?  This problem would affect several drivers.  Driver-specific partial
>> solutions just keep getting replicated.  I'd like to see if we can solve
>> the problems more generally.
>>
>>
>>
>>>
>>>
>>>
>>>>
>>>>
>>>>>  - Are there other ML2 drivers which might have to handle DB syncs
>>>>> (cases where the other controllers also maintain their own DBs), and
>>>>> how are they handling it?
>>>>>
>>>>>  - Can a common approach be taken to sync the neutron DB and
>>>>> controller DBs?
>>>>>
>>>>>
>>>>>
>>>>> -----------------------------------------------------------------------------------------------------------
>>>>>
>>>>> (1)
>>>>> Sync threads created by networking-odl ML2 driver
>>>>> --------------------------------------------------
>>>>> The ODL ML2 driver creates 2 threads (threading.Thread objects) at init:
>>>>>  - Journal thread
>>>>>  - Maintenance thread
>>>>>
>>>>> Journal thread
>>>>> ----------------
>>>>> The journal module creates a new journal table named
>>>>> “opendaylightjournal”:
>>>>> https://github.com/openstack/networking-odl/blob/master/networking_odl/db/models.py#L23
>>>>>
>>>>> The journal thread runs in a loop, waiting for the sync event from the ODL
>>>>> ML2 driver.
>>>>>
>>>>>  - The ODL ML2 driver's resource (network, subnet, port) precommit
>>>>> functions, when called by the ML2 plugin, add an entry to the
>>>>> “opendaylightjournal” table with the resource data and set the journal
>>>>> operation state for this entry to “PENDING”.
>>>>>  - The corresponding resource postcommit function of the ODL ML2
>>>>> driver, when called, sets the sync event flag.
>>>>>  - A timer is also created which sets the sync event flag when it
>>>>> expires (the default value is 10 seconds).
>>>>>  - The journal thread wakes up, looks in the “opendaylightjournal” table
>>>>> for entries in state “pending”, and runs the CRUD operation on those
>>>>> resources in the ODL DB. Once done, it sets the state to “completed”.
>>>>>
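
The journal flow quoted above can be sketched roughly as follows. This is only
an illustration under stated assumptions, not the networking-odl code: the
`Journal` class, its in-memory row list (standing in for the
“opendaylightjournal” table), and the `send` callback are all hypothetical.
The timeout on `Event.wait()` stands in for the 10-second timer.

```python
import threading

PENDING, COMPLETED = "pending", "completed"


class Journal:
    """In-memory stand-in for the journal table (hypothetical names)."""

    def __init__(self, send, interval=10.0):
        self._rows = []                  # each row: {"op", "data", "state"}
        self._lock = threading.Lock()
        self._send = send                # callable that talks to the backend
        self._interval = interval        # fallback wake-up period in seconds
        self.sync_event = threading.Event()
        self._stop = threading.Event()

    def precommit(self, op, data):
        # Record the operation as pending; nothing is sent to the backend yet.
        with self._lock:
            self._rows.append({"op": op, "data": data, "state": PENDING})

    def postcommit(self):
        # Wake the journal thread so it picks up the new row promptly.
        self.sync_event.set()

    def sync_once(self):
        # Apply every pending row in insertion order, then mark it completed.
        with self._lock:
            pending = [r for r in self._rows if r["state"] == PENDING]
        for row in pending:
            self._send(row["op"], row["data"])
            row["state"] = COMPLETED
        return len(pending)

    def run(self):
        while not self._stop.is_set():
            # wait() returns when the event is set or when the timeout
            # expires, so the timeout plays the role of the timer.
            self.sync_event.wait(self._interval)
            self.sync_event.clear()
            self.sync_once()

    def stop(self):
        self._stop.set()
        self.sync_event.set()
```

A driver would call `precommit()` and `postcommit()` from the matching ML2
hooks and run `Journal.run` in a `threading.Thread`. The sketch deliberately
ignores failures in `send` and coordination across multiple neutron-server
processes.
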
>>>>> Maintenance thread
>>>>> ------------------
>>>>> The maintenance thread performs 3 operations:
>>>>>  - JournalCleanup - Delete completed rows from the journal table
>>>>> “opendaylightjournal”.
>>>>>  - CleanupProcessing - Mark orphaned processing rows as pending.
>>>>>  - Full sync - Re-sync when detecting an ODL “cold reboot”.
>>>>>
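
The first two maintenance operations listed above might look something like
the sketch below. It rests on several assumptions: a hypothetical `journal`
table with `state` and `updated_at` columns, sqlite3 as a stand-in database,
and "orphaned" meaning "stuck in processing for longer than `max_age`" (the
thread does not say how orphans are detected). Full sync is left out because
detecting a cold reboot is specific to the backend.

```python
import sqlite3


def make_db():
    """Create an in-memory, hypothetical journal table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE journal ("
        " id INTEGER PRIMARY KEY,"
        " state TEXT NOT NULL,"
        " updated_at REAL NOT NULL)"
    )
    return conn


def journal_cleanup(conn):
    """Delete rows that have already been applied to the backend."""
    cur = conn.execute("DELETE FROM journal WHERE state = 'completed'")
    conn.commit()
    return cur.rowcount


def cleanup_processing(conn, now, max_age):
    """Move rows stuck in 'processing' for over max_age back to 'pending'."""
    cur = conn.execute(
        "UPDATE journal SET state = 'pending' "
        "WHERE state = 'processing' AND updated_at < ?",
        (now - max_age,),
    )
    conn.commit()
    return cur.rowcount
```

Both functions return the number of affected rows, which is handy for logging
from a periodic maintenance loop.
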
>>>>>
>>>>>
>>>>> Thanks
>>>>> Numan
>>>>>
>>>>>
>>>>
>>>> __________________________________________________________________________
>>>> OpenStack Development Mailing List (not for usage questions)
>>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>
>>>>
>>>>
>>
>>
>> --
>> Russell Bryant
>>
>


-- 
Russell Bryant