[openstack-dev] [all] Does OpenStack need a common solution for DLM? (was: Re: [Cinder] A possible solution for HA Active-Active)

Mark Voelker mvoelker at vmware.com
Tue Aug 4 13:36:38 UTC 2015


On Aug 3, 2015, at 6:09 PM, Flavio Percoco <flavio at redhat.com> wrote:
> 
> On 03/08/15 19:48 +0200, Gorka Eguileor wrote:
>> On Mon, Aug 03, 2015 at 03:42:48PM +0000, Fox, Kevin M wrote:
>>> I'm usually for abstraction layers, but they don't always pay off very well due to catering to the lowest common denominator.
>>> 
>>> Let's clearly define the problem space first. IFF the problem space can be fully implemented using Tooz, then let's do that. Then the operator can choose. If Tooz can't and won't handle the problem space, then we're trying to fit a square peg in a round hole.
>> 
>> What do you mean by clearly define the problem space?  We know what we
>> want, we just need to agree on the compromises we are willing to make,
>> use a DLM and make admins' life a little harder (only for those that
>> deploy A-A) but have an A-A solution earlier, or postpone A-A
>> functionality but make their life easier.
>> 
>> And we already know that Tooz is not the Holy Grail and will not perform
>> the miracle of giving Cinder HA A-A.  It is only a piece of the problem,
>> so there's nothing to discuss there, and it's not a square peg in a
>> round hole, because it fits perfectly for what it is intended. But once
>> you have filled that square hole you need another peg, the round one for
>> the round hole.
>> 
>> If people are expecting to find one thing that fixes everything and
>> gives us HA A-A on its own, then I believe they are a little bit lost.
> 
> As confusing as it seems, we've now moved from talking about just
> Cinder to understanding whether this is a problem many projects have
> and whether we can find a solution that will work for most of them.
> Therefore, I've renamed this thread to make this more evident.
> 
> Now, so far we have:
> 
> - Ironic has an internal distributed lock and it uses a hash-ring
> - Ceilometer uses tooz
> - Several projects use a file lock or some other fashion of
> distributed lock.
> - *Add yours here*



/me adds a couple more here and fixes formatting

From an operator’s point of view, it may be worth noting other parts of various projects that could use, or already use, the same systems that provide DLM capabilities. *If* those systems solve well for multiple use cases, it may make operators’ lives easier to converge on them where possible: fewer moving parts, less to debug, more shared logic. From that perspective, Neutron has also had some recent discussion around tooz for agent monitoring and state awareness. This thread captures some of people’s thinking:

http://lists.openstack.org/pipermail/openstack-dev/2015-April/061268.html

And there’s a related spec for agent monitoring using tooz as an alternative to heartbeat/DB mechanisms currently in review here:

https://review.openstack.org/#/c/174438/

It may also be worth noting that some of the original thinking behind this came from Nova’s adoption of Zookeeper for ServiceGroups several years ago:

https://blueprints.launchpad.net/nova/+spec/zk-service-heartbeat

However, when the Neutron community began discussing this, the idea of using tooz (which didn’t exist back when Nova implemented the ServiceGroup APIs) rather than using Zookeeper directly came up in review and seemed to make a lot more sense to everyone.

And as I’ve noted, there are individual plugins that have already adopted tooz specifically for distributed locking, such as the NSXv plugin for Neutron:

https://review.openstack.org/#/c/188015/
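
To illustrate why the file locks mentioned in the list above do not carry over to Active-Active deployments, here is a minimal sketch of the single-host approach in Python. It is not taken from any OpenStack project: the helper names `file_lock` and `try_lock` and the lock file name are hypothetical, and it assumes a POSIX host where `fcntl.flock` is available.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def file_lock(path, blocking=True):
    """Hold an exclusive advisory lock on `path` for the duration of the block.

    Hypothetical helper for illustration only. The lock is scoped to the
    local kernel, so it coordinates processes on ONE host.
    """
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o600)
    try:
        flags = fcntl.LOCK_EX
        if not blocking:
            flags |= fcntl.LOCK_NB  # fail immediately instead of waiting
        fcntl.flock(fd, flags)      # raises BlockingIOError if NB and held
        try:
            yield
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


def try_lock(path):
    """Return True if the lock can be taken right now, False otherwise."""
    try:
        with file_lock(path, blocking=False):
            return True
    except BlockingIOError:
        return False


if __name__ == "__main__":
    # Hypothetical lock file name, e.g. one lock per resource being modified.
    lock_path = os.path.join(tempfile.mkdtemp(), "volume-1234.lock")
    with file_lock(lock_path):
        # A second opener on the same host is refused while we hold the lock.
        print("second holder allowed:", try_lock(lock_path))
```

The lock lives in one host's kernel, so two services running on different hosts would not see each other's locks unless they share a filesystem with dependable lock semantics, which is a deployment assumption of its own. That gap is what a DLM, or a library such as tooz in front of one, is meant to fill.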

At Your Service,

Mark T. Voelker



> 
> Each one of these projects has a specific use-case that doesn't
> necessarily overlap. I'd like to see those cases listed somewhere.
> We've done this in the past already and I believe we can do it now as
> well. As I've mentioned in another thread, Gorka has done this for
> Cinder already; now we need to do it for other services too. Even if
> your project has a DLM in place, it'd be good to know what problem you
> solved with it as it may be a problem that other projects have as
> well.
> 
> As a community, we've been able to get by without adding a new service
> for DLMs thus far. I'm not saying we don't need one but, as mentioned
> in other threads, let's give this some more thought before we add a new
> service that'll make deploying and maintaining OpenStack harder.
> 
> Flavio
> 
>>> From: Gorka Eguileor [geguileo at redhat.com]
>>> Sent: Monday, August 03, 2015 1:43 AM
>>> To: OpenStack Development Mailing List (not for usage questions)
>>> Subject: Re: [openstack-dev] [Cinder] A possible solution for HA Active-Active
>>> 
>>> On Mon, Aug 03, 2015 at 10:22:42AM +0200, Thierry Carrez wrote:
>>> > Flavio Percoco wrote:
>>> > > [...]
>>> > > So, to summarize, I love the effort behind this. But, as others have
>>> > > mentioned, I'd like us to take a step back, run this across teams and
>>> > > come up with an opinionated solution that would work for everyone.
>>> > >
>>> > > Starting this discussion now would allow us to prepare enough material
>>> > > to reach an agreement in Tokyo and work on a single solution for
>>> > > Mitaka. This sounds like a good topic for a cross-project session.
>>> >
>>> > +1
>>> >
>>> > The last thing we want is to rush a solution that would only solve a
>>> > particular project use case. Personally I'd like us to pick the simplest
>>> > solution that can solve most of the use cases. Each of the solutions
>>> > brings something to the table -- Zookeeper is mature, Consul is
>>> > featureful, etcd is lean and simple... Let's not dive into the best
>>> > solution but clearly define the problem space first.
>>> >
>>> > --
>>> > Thierry Carrez (ttx)
>>> >
>>> 
>>> I don't see those as different solutions from the point of view of
>>> Cinder, they are different implementations to the same solution case,
>>> using a DLM to lock resources.
>>> 
>>> We keep circling back to the fancy names like moths to a flame, when we
>>> are still discussing whether we need or want a DLM for the solution.  I
>>> think we should stop doing that, we need to decide on the solution from
>>> an abstract point of view (like you say, define the problem space) and
>>> not get caught up on discussions of which one of those is best.  If we
>>> end up deciding to use a DLM, which is unlikely, then we can look into
>>> available drivers in Tooz and if we are not convinced with the ones we
>>> have (Redis, ZooKeeper, etc.) then we discuss which one we should be
>>> using instead and just add it to Tooz.
> 
> -- 
> @flaper87
> Flavio Percoco
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev


