[openstack-dev] [Neutron][networking-ovn][networking-odl] Syncing neutron DB and OVN DB
Zhou, Han
hzhou8 at ebay.com
Sat Jul 23 01:37:32 UTC 2016
Thanks Numan & Amitabha, this may be the right direction to solve the bug [1].
It basically implements the Neutron API as an asynchronous call, queuing each request within the DB transaction. Ordering is preserved by the journal thread "lock", which is implemented with the PROCESSING state plus the DB transaction's "with_for_update", together with validation functions for dependency checking (e.g. the same object cannot be updated by 2 journal threads at the same time).
However, I couldn't figure out how errors are handled with this approach. For example, a port is created in Neutron, but the ODL controller fails to create it even though the journal thread successfully sent the request to ODL. I also didn't see how the port states (UP & DOWN) are handled (I didn't see any call to ProvisioningBlock, so does that mean the port will just be UP from the beginning?). It would be great if anyone could help answer these questions.
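To make the dependency check concrete, here is a minimal Python sketch of the kind of validation described above. It is only an illustration under stated assumptions: the journal is a plain in-memory list, and the names is_safe_to_process, next_processable, seqnum and object_uuid are hypothetical, not taken from networking-odl. In the real driver the selection would run inside a DB transaction with "with_for_update" so that two journal threads cannot pick the same row.

```python
PENDING, PROCESSING, COMPLETED = "pending", "processing", "completed"


def is_safe_to_process(row, journal):
    """Return True if no older, unfinished entry touches the same object.

    Hypothetical validation: an entry must wait while an earlier entry for
    the same object is still pending or processing.
    """
    for other in journal:
        if other["seqnum"] >= row["seqnum"]:
            continue
        if (other["object_uuid"] == row["object_uuid"]
                and other["state"] != COMPLETED):
            return False
    return True


def next_processable(journal):
    """Pick the oldest pending entry whose dependencies are satisfied."""
    for row in sorted(journal, key=lambda r: r["seqnum"]):
        if row["state"] == PENDING and is_safe_to_process(row, journal):
            return row
    return None


journal = [
    {"seqnum": 1, "object_uuid": "port-1", "state": PROCESSING},
    {"seqnum": 2, "object_uuid": "port-1", "state": PENDING},
    {"seqnum": 3, "object_uuid": "port-2", "state": PENDING},
]
# Entry 2 is blocked by entry 1 (same port), so entry 3 is picked first.
picked = next_processable(journal)
```

Once entry 1 reaches "completed", entry 2 becomes eligible, so per-object ordering is kept while unrelated objects can proceed.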
[1] https://bugs.launchpad.net/networking-ovn/+bug/1605089
Thanks,
Han Zhou
From: Numan Siddique <nusiddiq at redhat.com>
Reply-To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Date: Friday, July 22, 2016 at 4:51 AM
To: "OpenStack Development Mailing List (not for usage questions)" <openstack-dev at lists.openstack.org>
Subject: Re: [openstack-dev] [Neutron][networking-ovn][networking-odl] Syncing neutron DB and OVN DB
Thanks for the comments Amitabha.
Please see comments inline
On Fri, Jul 22, 2016 at 5:50 AM, Amitabha Biswas <azbiswas at gmail.com> wrote:
Hi Numan,
Thanks for the proposal. We have also been thinking about this use-case.
If I’m reading this accurately (and I may not be), it seems that the proposal is that OVN NB create/update/delete (CUD) operations would no longer be done by the api_worker threads but by a new journal thread (read operations are outside the scope).
Correct.
If this is indeed the case, I’d like to consider the scenario where there are N Neutron nodes, each with M worker threads. The journal thread on each node holds a list of pending operations. Could there be a (sequence) dependency between the pending operations of the journal threads on different nodes that prevents them from being applied (e.g. the Logical_Router_Port and Logical_Switch_Port inter-dependency), given that we are returning success on Neutron operations that have not yet been committed to the NB DB?
It's a valid scenario, and the design should handle such cases properly if we take this approach.
A couple of clarifications and thoughts below.
Thanks
Amitabha <abiswas at us.ibm.com>
On Jul 13, 2016, at 1:20 AM, Numan Siddique <nusiddiq at redhat.com> wrote:
Adding the proper tags in subject
On Wed, Jul 13, 2016 at 1:22 PM, Numan Siddique <nusiddiq at redhat.com> wrote:
Hi Neutrinos,
Presently, the OVN ML2 driver has 2 ways to sync the Neutron DB and the OVN DB:
- At neutron-server startup, OVN ML2 driver syncs the neutron DB and OVN DB if sync mode is set to repair.
- Admin can run the "neutron-ovn-db-sync-util" to sync the DBs.
Recently, v2 of the networking-odl ML2 driver introduced the following (please see (1) below for more details; ODL folks, please correct me if I am wrong here):
- a journal thread is created which performs the CRUD operations on Neutron resources asynchronously (i.e. it sends the REST API requests to the ODL controller).
Would this be the equivalent of making OVSDB transactions to the OVN NB DB?
Correct.
- a maintenance thread is created which does some cleanup periodically and, at startup, does a full sync if it detects an ODL controller cold reboot.
A few questions I have:
- Can the OVN ML2 driver take the same or a similar approach? Are there any advantages in taking this approach? One advantage is that Neutron resources can be created/updated/deleted even if the OVN ML2 driver has lost its connection to the ovsdb-server. The journal thread would eventually sync these resources to the OVN DB. I would like to know the community's thoughts on this.
If we can make it work, it would indeed be a huge plus for system wide upgrades and some corner cases in the code (ACL specifically), where the post_commit relies on all transactions to be successful and doesn’t revert the neutron db if something fails.
- Are there other ML2 drivers which might have to handle DB syncs (cases where the other controllers also maintain their own DBs), and how are they handling it?
- Can a common approach be taken to sync the Neutron DB and the controller DBs?
-----------------------------------------------------------------------------------------------------------
(1)
Sync threads created by networking-odl ML2 driver
--------------------------------------------------
The ODL ML2 driver creates 2 threads (using threading.Thread) at init:
- Journal thread
- Maintenance thread
Journal thread
----------------
The journal module creates a new journal table named “opendaylightjournal” - https://github.com/openstack/networking-odl/blob/master/networking_odl/db/models.py#L23
The journal thread runs in a loop, waiting for the sync event from the ODL ML2 driver.
- The ODL ML2 driver's resource (network, subnet, port) precommit functions, when called by the ML2 plugin, add an entry to the “opendaylightjournal” table with the resource data and set the journal operation state for this entry to “pending”.
- The corresponding resource postcommit function of the ODL ML2 driver, when called, sets the sync event flag.
- A timer is also created which sets the sync event flag when it expires (the default value is 10 seconds).
- The journal thread wakes up, looks in the “opendaylightjournal” table for entries with state “pending” and runs the CRUD operation for those resources in the ODL DB. Once done, it sets the state to “completed”.
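The journal flow above can be sketched in Python roughly as follows. This is a simplified illustration, not the networking-odl code: the Journal class is an in-memory stand-in for the “opendaylightjournal” table, and the names Journal, sync_pending, journal_loop and the send callback are hypothetical. The wait timeout plays the role of the 10 second timer. Error handling is an assumption (a failed entry simply returns to “pending”).

```python
import threading

PENDING, PROCESSING, COMPLETED = "pending", "processing", "completed"


class Journal:
    """In-memory stand-in for the journal table (hypothetical)."""

    def __init__(self):
        self._rows = []
        self._lock = threading.Lock()

    def add(self, object_type, operation, data):
        # Roughly what a precommit hook does: record the change as pending.
        with self._lock:
            row = {"seqnum": len(self._rows) + 1, "object_type": object_type,
                   "operation": operation, "data": data, "state": PENDING}
            self._rows.append(row)
            return row

    def claim_oldest_pending(self):
        # Move the oldest pending row to "processing" under the lock.
        with self._lock:
            for row in self._rows:
                if row["state"] == PENDING:
                    row["state"] = PROCESSING
                    return row
        return None

    def set_state(self, row, state):
        with self._lock:
            row["state"] = state

    def states(self):
        with self._lock:
            return [row["state"] for row in self._rows]


def sync_pending(journal, send):
    """Push every pending row to the backend through send(row)."""
    while True:
        row = journal.claim_oldest_pending()
        if row is None:
            return
        try:
            send(row)
        except Exception:
            # Assumption: a failed entry goes back to pending and is
            # retried on the next wake-up.
            journal.set_state(row, PENDING)
            return
        journal.set_state(row, COMPLETED)


def journal_loop(journal, send, sync_event, stop_event, interval=10):
    """Wake up when postcommit sets sync_event or after `interval` seconds."""
    while not stop_event.is_set():
        sync_event.wait(timeout=interval)
        sync_event.clear()
        sync_pending(journal, send)
```

In this sketch a postcommit hook would only call sync_event.set(); the thread running journal_loop does the actual work, so the API call returns without waiting for the backend.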
Maintenance thread
------------------
The maintenance thread does 3 operations:
- JournalCleanup - Delete completed rows from journal table “opendaylightjournal”.
- CleanupProcessing - Reset orphaned processing rows to pending.
- Full sync - Re-sync when detecting an ODL "cold reboot".
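As a rough illustration of the first two maintenance operations, here is a small Python sketch over an in-memory list of journal rows. The function names, the last_touched field and the timeout used to decide that a processing row is orphaned are all assumptions made for the example, not the networking-odl implementation. Full sync is left out because it depends on how a cold reboot is detected.

```python
import time

PENDING, PROCESSING, COMPLETED = "pending", "processing", "completed"


def journal_cleanup(rows):
    """JournalCleanup: drop completed rows, keep everything else."""
    return [row for row in rows if row["state"] != COMPLETED]


def cleanup_processing(rows, now=None, timeout=100):
    """CleanupProcessing: reset rows stuck in "processing" back to pending.

    Assumed criterion: a row is orphaned when it has been in "processing"
    for more than `timeout` seconds. Returns the number of rows reset.
    """
    now = time.time() if now is None else now
    reset = 0
    for row in rows:
        if row["state"] == PROCESSING and now - row["last_touched"] > timeout:
            row["state"] = PENDING
            row["last_touched"] = now
            reset += 1
    return reset


rows = [
    {"seqnum": 1, "state": COMPLETED, "last_touched": 0},
    {"seqnum": 2, "state": PROCESSING, "last_touched": 0},
    {"seqnum": 3, "state": PROCESSING, "last_touched": 950},
    {"seqnum": 4, "state": PENDING, "last_touched": 0},
]
rows = journal_cleanup(rows)
reset_count = cleanup_processing(rows, now=1000, timeout=100)
```

Resetting an orphaned row to pending lets the journal thread pick it up again, for example after a neutron-server process died while the row was being processed.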
Thanks
Numan
__________________________________________________________________________
OpenStack Development Mailing List (not for usage questions)
Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev