[openstack-dev] [Neutron][networking-ovn][networking-odl] Syncing neutron DB and OVN DB
Russell Bryant
rbryant at redhat.com
Wed Jul 27 20:43:22 UTC 2016
On Wed, Jul 27, 2016 at 4:15 PM, Zhou, Han <hzhou8 at ebay.com> wrote:
>
>
> On Wed, Jul 27, 2016 at 7:15 AM, Russell Bryant <rbryant at redhat.com>
> wrote:
>
>>
>>
>> On Wed, Jul 27, 2016 at 5:58 AM, Kevin Benton <kevin at benton.pub> wrote:
>>
>>> > I'd like to see if we can solve the problems more generally.
>>>
>>> We've tried before but we very quickly run into competing requirements
>>> with regards to eventual consistency. For example, asynchronous background
>>> sync doesn't work if someone wants their backend to confirm that port
>>> details are acceptable (e.g. mac isn't in use by some other system outside
>>> of openstack). Then each backend has different methods for detecting what
>>> is out of sync (e.g. config numbers, hashes, or just full syncs on startup)
>>> that each come with their own requirements for how much data needs to be
>>> resent when an inconsistency is detected.
>>>
>>> If we can come to some common ground of what is required by all of them,
>>> then I would love to get some of this built into the ML2 framework.
>>> However, we've discussed this at meetups/mid-cycles/summits and it
>>> inevitably ends up with two people drawing furiously on a whiteboard,
>>> someone crying in the corner, and everyone else arguing about the lack of
>>> parametric polymorphism in Go.
>>>
>>
>> Ha, yes, makes sense that this is really hard to solve in a way that
>> works for everyone ...
>>
>>
>>
>>> Even between OVN and ODL in this thread, it sounds like the only thing
>>> in common is a background worker that consumes from a queue of tasks in the
>>> db. Maybe realistically the only common thing we can come up with is a
>>> taskflow queue stored in the DB to solve the multiple workers issue...
>>>
>>
>> To clarify, ODL has this background worker and the discussion was
>> whether OVN should try to follow a similar approach.
>>
>> So far, my gut feeling is that it's far too complicated for the problems
>> it would solve. There's one identified multiple-worker related race
>> condition on updates, but I think we can solve that another way.
>>
>>
> Russell, in fact I think this background worker is a good way to solve
> both problems:
>
> Problem 1. When an OVN DB update fails in post-commit: with the help of a
> background worker, the update can be retried and the job state tracked, and
> with that information proper action can be taken on failed jobs, e.g.
> cleanup. It is basically a declarative implementation, which IMHO fits the
> ML2 context particularly well, because we cannot simply roll back Neutron DB
> changes on failure: the DB is shared by all mech-drivers. (Even in a
> monolithic plugin, handling partial failures and doing rollback is a big
> headache.)
>
> Problem 2. Race condition caused by the lack of a critical section between
> the Neutron DB transaction and post-commit: with a journal, the ordering
> is guaranteed to match the order of DB transaction commits. Journal
> processing can be protected against concurrent background workers by
> using DB transactions.
>
> I think ODL and OVN are not the only ones facing these problems. They are
> common to most drivers, if not all. It would be great to have a
> common task flow mechanism in ML2, but I'd like to try it in OVN first (if
> there is no better solution to the problems above).
>
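For reference, the journal-plus-background-worker approach described in the quoted text could look roughly like the sketch below. It is only an illustration under stated assumptions, not the networking-odl or networking-ovn implementation: the table layout, the names `record`, `claim_next`, `process_one`, and `MAX_RETRIES`, and the use of SQLite are all hypothetical. An autoincrementing sequence number stands in for commit ordering, a guarded UPDATE inside a transaction stands in for the protection between workers, and a retry counter tracks job state.

```python
import sqlite3

MAX_RETRIES = 3  # hypothetical limit before an entry is marked failed


def init_journal(conn):
    # seq increases with every insert, so it records the order of changes.
    conn.execute(
        "CREATE TABLE IF NOT EXISTS journal ("
        " seq INTEGER PRIMARY KEY AUTOINCREMENT,"
        " operation TEXT NOT NULL,"
        " resource_id TEXT NOT NULL,"
        " state TEXT NOT NULL DEFAULT 'pending',"
        " retries INTEGER NOT NULL DEFAULT 0)"
    )


def record(conn, operation, resource_id):
    # Meant to be called inside the same transaction as the resource change,
    # so the journal entry commits (or rolls back) together with it.
    conn.execute(
        "INSERT INTO journal (operation, resource_id) VALUES (?, ?)",
        (operation, resource_id),
    )


def claim_next(conn):
    # Claim the oldest pending entry. The guarded UPDATE only succeeds for
    # one worker, so two workers cannot process the same entry.
    with conn:
        row = conn.execute(
            "SELECT seq, operation, resource_id FROM journal"
            " WHERE state = 'pending' ORDER BY seq LIMIT 1"
        ).fetchone()
        if row is None:
            return None
        updated = conn.execute(
            "UPDATE journal SET state = 'processing'"
            " WHERE seq = ? AND state = 'pending'",
            (row[0],),
        )
        return row if updated.rowcount == 1 else None


def process_one(conn, apply_to_backend):
    # Returns False when there is nothing left to process.
    entry = claim_next(conn)
    if entry is None:
        return False
    seq, operation, resource_id = entry
    try:
        apply_to_backend(operation, resource_id)
    except Exception:
        # Put the entry back for retry, or give up after MAX_RETRIES.
        with conn:
            conn.execute(
                "UPDATE journal SET retries = retries + 1,"
                " state = CASE WHEN retries + 1 >= ? THEN 'failed'"
                " ELSE 'pending' END WHERE seq = ?",
                (MAX_RETRIES, seq),
            )
    else:
        with conn:
            conn.execute(
                "UPDATE journal SET state = 'completed' WHERE seq = ?", (seq,)
            )
    return True
```

A worker would loop on `process_one(conn, backend_call)`, where `backend_call` is whatever pushes the change to the backend. Note that this sketch does not stop a second worker from picking up a later entry for the same resource while an earlier one is still being processed; a real implementation would need a dependency check for that.
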
I had another idea for problem 2, at least. I posted a more detailed
description of the idea on the bug you posted [1].
This is unrelated to problem 1, though. I guess I was hoping we could just
come up with a better way of doing rollbacks when necessary.
I also had a long-term dream of not using the Neutron DB at all, and only
relying on the OVN database. That seems much less practical now that we've
moved back to ML2. Maybe it was a crazy idea, anyway. :-)
[1] https://bugs.launchpad.net/networking-ovn/+bug/1605089
--
Russell Bryant