[openstack-dev] [Neutron][networking-ovn][networking-odl] Syncing neutron DB and OVN DB

Kevin Benton kevin at benton.pub
Wed Jul 27 09:58:18 UTC 2016


> I'd like to see if we can solve the problems more generally.

We've tried before, but we very quickly run into competing requirements with
regard to eventual consistency. For example, asynchronous background sync
doesn't work if someone wants their backend to confirm that port details
are acceptable (e.g. the MAC isn't in use by some other system outside of
OpenStack). Then each backend has a different method for detecting what is
out of sync (e.g. config numbers, hashes, or just full syncs on startup),
and each comes with its own requirements for how much data needs to be
resent when an inconsistency is detected.
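
As a rough illustration of the hash-based variant only, here is a minimal
sketch in Python. The function names and the dict-shaped resources are
hypothetical and are not taken from any existing driver; it assumes the
backend can report one digest per resource ID.

```python
import hashlib
import json


def resource_digest(resource):
    """Return a stable SHA-256 digest of a JSON-serializable resource dict."""
    # sort_keys makes the digest independent of dict insertion order.
    canonical = json.dumps(resource, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def find_out_of_sync(neutron_resources, backend_digests):
    """Compare Neutron-side resources against digests reported by a backend.

    neutron_resources: {resource_id: resource_dict}   (hypothetical shape)
    backend_digests:   {resource_id: hex_digest}      (hypothetical shape)
    Returns (to_resend, to_delete) as sorted lists of resource IDs.
    """
    to_resend = sorted(
        rid for rid, res in neutron_resources.items()
        if backend_digests.get(rid) != resource_digest(res)
    )
    # Anything the backend knows about that Neutron does not is stale.
    to_delete = sorted(set(backend_digests) - set(neutron_resources))
    return to_resend, to_delete
```

With this approach only the mismatching resources are resent, whereas a full
sync on startup resends everything; that difference in resend volume is the
kind of per-backend requirement mentioned above.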

If we can come to some common ground on what is required by all of them,
then I would love to get some of this built into the ML2 framework.
However, we've discussed this at meetups/mid-cycles/summits and it
inevitably ends up with two people drawing furiously on a whiteboard,
someone crying in the corner, and everyone else arguing about the lack of
parametric polymorphism in Go.

Even between OVN and ODL in this thread, it sounds like the only thing in
common is a background worker that consumes from a queue of tasks in the
DB. Maybe realistically the only common thing we can come up with is a
taskflow queue stored in the DB to solve the multiple workers issue...
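
A minimal sketch of what such a DB-backed queue could look like, under
stated assumptions: the table and column names are made up for this example
(this is not the networking-odl schema), and sqlite3 is used only so the
snippet is self-contained. The point it shows is how several workers can
share one queue: a row is claimed with a conditional UPDATE, so only one
worker wins.

```python
import sqlite3

PENDING, PROCESSING, COMPLETED = "pending", "processing", "completed"


def create_journal(conn):
    """Create a hypothetical journal table (not any driver's real schema)."""
    conn.execute(
        "CREATE TABLE journal ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " object_type TEXT NOT NULL,"
        " object_id TEXT NOT NULL,"
        " operation TEXT NOT NULL,"
        " state TEXT NOT NULL)"
    )


def enqueue(conn, object_type, object_id, operation):
    """Record an operation; in ML2 terms this would happen in precommit."""
    with conn:
        conn.execute(
            "INSERT INTO journal (object_type, object_id, operation, state)"
            " VALUES (?, ?, ?, ?)",
            (object_type, object_id, operation, PENDING),
        )


def claim_next(conn):
    """Claim the oldest pending row, or return None if there is none.

    The UPDATE is conditional on the row still being pending, so if another
    worker claimed it first, rowcount is 0 and we look for the next one.
    """
    while True:
        row = conn.execute(
            "SELECT seq, object_type, object_id, operation FROM journal"
            " WHERE state = ? ORDER BY seq LIMIT 1",
            (PENDING,),
        ).fetchone()
        if row is None:
            return None
        with conn:
            cur = conn.execute(
                "UPDATE journal SET state = ? WHERE seq = ? AND state = ?",
                (PROCESSING, row[0], PENDING),
            )
        if cur.rowcount == 1:
            return row


def complete(conn, seq):
    """Mark a claimed row as done after the backend call succeeded."""
    with conn:
        conn.execute(
            "UPDATE journal SET state = ? WHERE seq = ?", (COMPLETED, seq)
        )
```

This does nothing about ordering dependencies between rows (the
Logical_Router_Port / Logical_Switch_Port case raised in the quoted text
below); a real implementation would have to hold back a row until the rows
it depends on are completed.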

On Tue, Jul 26, 2016 at 11:31 AM, Russell Bryant <rbryant at redhat.com> wrote:

>
>
> On Fri, Jul 22, 2016 at 7:51 AM, Numan Siddique <nusiddiq at redhat.com>
> wrote:
>
>> Thanks for the comments Amitabha.
>> Please see comments inline
>>
>> On Fri, Jul 22, 2016 at 5:50 AM, Amitabha Biswas <azbiswas at gmail.com>
>> wrote:
>>
>>> Hi Numan,
>>>
>>> Thanks for the proposal. We have also been thinking about this use-case.
>>>
>>> If I’m reading this accurately (and I may not be), it seems that the
>>> proposal is to have no OVN NB create/update/delete (CUD) operations done
>>> by the api_worker threads (read operations are out of scope), but rather
>>> by a new journal thread.
>>>
>>>
>> Correct.
>>
>>> If this is indeed the case, I’d like to consider the scenario where there
>>> are N neutron nodes, each with M worker threads. The journal thread on
>>> each node holds a list of pending operations. Could there be a (sequence)
>>> dependency among the pending operations of the journal threads on
>>> different nodes that prevents them from being applied (e.g. the
>>> Logical_Router_Port and Logical_Switch_Port inter-dependency), given that
>>> we are returning success on neutron operations that have not yet been
>>> committed to the NB DB?
>>>
>>>
>> It's a valid scenario, and the design should handle such scenarios
>> properly if we take this approach.
>>
>
> I believe a new table in the Neutron DB is used to synchronize all of the
> journal threads.
>
> Also note that OVN currently has no custom tables in the Neutron database
> and it would be *very* good to keep it that way if we can.
>
>
>>
>>>
>>> A couple of clarifications and thoughts below.
>>>
>>> Thanks
>>> Amitabha <abiswas at us.ibm.com>
>>>
>>> On Jul 13, 2016, at 1:20 AM, Numan Siddique <nusiddiq at redhat.com> wrote:
>>>
>>> Adding the proper tags in subject
>>>
>>> On Wed, Jul 13, 2016 at 1:22 PM, Numan Siddique <nusiddiq at redhat.com>
>>> wrote:
>>>
>>>> Hi Neutrinos,
>>>>
>>>> Presently, in the OVN ML2 driver, we have 2 ways to sync the neutron DB
>>>> and the OVN DB:
>>>>  - At neutron-server startup, the OVN ML2 driver syncs the neutron DB and
>>>> the OVN DB if the sync mode is set to repair.
>>>>  - An admin can run "neutron-ovn-db-sync-util" to sync the DBs.
>>>>
>>>> Recently, v2 of the networking-odl ML2 driver took a different approach
>>>> (please see (1) below for more details; ODL folks, please correct me if
>>>> I am wrong here):
>>>>
>>>>   - a journal thread is created which does the CRUD operations on
>>>> neutron resources asynchronously (i.e. it sends the REST API calls to
>>>> the ODL controller).
>>>>
>>>
>>> Would this be the equivalent of making OVSDB transactions to the OVN NB
>>> DB?
>>>
>>
>> Correct.
>>
>>>
>>>>   - a maintenance thread is created which does some cleanup periodically
>>>> and, at startup, does a full sync if it detects an ODL controller cold
>>>> reboot.
>>>>
>>>>
>>>> A few questions I have:
>>>>  - Can the OVN ML2 driver take the same or a similar approach? Are there
>>>> any advantages in taking this approach? One advantage is that neutron
>>>> resources can be created/updated/deleted even if the OVN ML2 driver has
>>>> lost its connection to the ovsdb-server. The journal thread would
>>>> eventually sync these resources to the OVN DB. I would like to know the
>>>> community's thoughts on this.
>>>>
>>>
>>>
> I question whether making operations appear to be successful even when
> ovsdb-server is unreachable is a useful thing.  API calls fail today if the
> Neutron db is unreachable.  Why would we bend over backwards for the OVN
> database?
>
> If this was easy to do, sure, but this solution seems *incredibly* complex
> to me, so I see it as an absolute last resort.
>
>
>
>>> If we can make it work, it would indeed be a huge plus for system-wide
>>> upgrades and some corner cases in the code (ACL specifically), where the
>>> post_commit relies on all transactions to be successful and doesn’t revert
>>> the neutron db if something fails.
>>>
>>
>>
> Can we just improve the ML2 framework to make this problem easier to deal
> with?  This problem would affect several drivers.  Driver-specific partial
> solutions just keep getting replicated.  I'd like to see if we can solve
> the problems more generally.
>
>
>
>>
>>
>>
>>>
>>>
>>>>  - Are there other ML2 drivers which might have to handle DB syncs
>>>> (cases where the other controllers also maintain their own DBs), and
>>>> how are they handling it?
>>>>
>>>>  - Can a common approach be taken to sync the neutron DB and controller
>>>> DBs?
>>>>
>>>>
>>>>
>>>> -----------------------------------------------------------------------------------------------------------
>>>>
>>>> (1)
>>>> Sync threads created by networking-odl ML2 driver
>>>> --------------------------------------------------
>>>> The ODL ML2 driver creates 2 threads (threading.Thread) at init:
>>>>  - Journal thread
>>>>  - Maintenance thread
>>>>
>>>> Journal thread
>>>> ----------------
>>>> The journal module creates a new journal table named
>>>> “opendaylightjournal”  -
>>>> https://github.com/openstack/networking-odl/blob/master/networking_odl/db/models.py#L23
>>>>
>>>> The journal thread runs in a loop, waiting for the sync event from the
>>>> ODL ML2 driver.
>>>>
>>>>  - The ODL ML2 driver's resource (network, subnet, port) precommit
>>>> functions, when called by the ML2 plugin, add an entry to the
>>>> “opendaylightjournal” table with the resource data and set the journal
>>>> operation state for this entry to “pending”.
>>>>  - The corresponding resource postcommit function of the ODL ML2 driver,
>>>> when called, sets the sync event flag.
>>>>  - A timer is also created which sets the sync event flag when it
>>>> expires (the default value is 10 seconds).
>>>>  - The journal thread wakes up, looks in the “opendaylightjournal” table
>>>> for entries with state “pending” and runs the CRUD operation for those
>>>> resources in the ODL DB. Once done, it sets the state to “completed”.
>>>>
>>>> Maintenance thread
>>>> ------------------
>>>> The maintenance thread does 3 operations:
>>>>  - JournalCleanup - Delete completed rows from the journal table
>>>> “opendaylightjournal”.
>>>>  - CleanupProcessing - Mark orphaned processing rows as pending.
>>>>  - Full sync - Re-sync when detecting an ODL "cold reboot".
>>>>
>>>>
>>>>
>>>> Thanks
>>>> Numan
>>>>
>>>>
>>>
>>> __________________________________________________________________________
>>> OpenStack Development Mailing List (not for usage questions)
>>> Unsubscribe: OpenStack-dev-request at lists.openstack.org
>>> ?subject:unsubscribe
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
> --
> Russell Bryant
>