[openstack-dev] [nova] Distributed locking

Matthew Booth mbooth at redhat.com
Tue Jun 17 08:36:11 UTC 2014


On 17/06/14 00:28, Joshua Harlow wrote:
> So this is a reader/write lock then?
> 
> I have seen https://github.com/python-zk/kazoo/pull/141 come up in the
> kazoo (zookeeper python library) but there was a lack of a maintainer for
> that 'recipe', perhaps if we really find this needed we can help get that
> pull request 'sponsored' so that it can be used for this purpose?
> 
> 
> As far as resiliency, the thing I was thinking about was how correct you
> want this lock to be.
> 
> If you go with memcached and a locking mechanism built on it, this will
> not be correct, but it might work well enough under normal usage. That's
> why I was wondering what level of correctness you want and what you want
> to happen if a server that is maintaining the lock record dies. In
> memcached's case this will literally be 1 server, even if sharding is
> being used, since a key hashes to one server. So if that one server goes
> down (or a network split happens) then it is possible for two entities to
> believe they own the same lock (and if the network split recovers this
> gets even weirder); that's what I was wondering about when mentioning
> resiliency and how much incorrectness you are willing to tolerate.

From my POV, the most important things are:

* 2 nodes must never believe they hold the same lock
* A node must eventually get the lock

I was expecting to implement locking on all three backends, as long as
they support it. I haven't looked closely at memcached, but if it can
detect a split it should be able to race to fence the possible lock
holder before continuing. This is obviously undesirable, as you will
probably be fencing an otherwise correctly functioning node, but it will
be correct.
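
For concreteness, here is a minimal sketch of what the zookeeper flavour
might look like, using kazoo's standard Lock recipe. The lock path and the
do_protected_work() stand-in are illustrative only, not an existing Nova
API, and fencing is deliberately left out at this point:

    from kazoo.client import KazooClient

    zk = KazooClient(hosts='zk1:2181,zk2:2181,zk3:2181')
    zk.start()

    # One lock per protected name.  The identifier is reported by
    # lock.contenders(), which is useful for debugging and for working
    # out who would need fencing.
    lock = zk.Lock('/nova/locks/image-cache/some-image-id',
                   identifier='node-a')

    lock.acquire()            # blocks until the lock is ours
    try:
        do_protected_work()   # illustrative stand-in
    finally:
        lock.release()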

Matt

> 
> -----Original Message-----
> From: Matthew Booth <mbooth at redhat.com>
> Organization: Red Hat
> Date: Friday, June 13, 2014 at 1:40 AM
> To: Joshua Harlow <harlowja at yahoo-inc.com>, "OpenStack Development Mailing
> List (not for usage questions)" <openstack-dev at lists.openstack.org>
> Subject: Re: [openstack-dev] [nova] Distributed locking
> 
>> On 12/06/14 21:38, Joshua Harlow wrote:
>>> So just a few thoughts before going too far down this path,
>>>
>>> Can we make sure we really really understand the use-case where we think
>>> this is needed. I think it's fine that this use-case exists, but I just
>>> want to make it very clear to others why it's needed and why distributed
>>> locking is the only *correct* way.
>>
>> An example use of this would be side-loading an image from another
>> node's image cache rather than fetching it from glance, which would have
>> very significant performance benefits in the VMware driver, and possibly
>> other places. The copier must take a read lock on the image to prevent
>> the owner from ageing it during the copy. Holding a read lock would also
>> assure the copier that the image it is copying is complete.
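
As an illustrative aside, the copier and the cache ager might look roughly
like this. lock_api stands in for whatever shared read/write lock interface
we end up with (e.g. the kazoo recipe in the pull request above), and the
helper names are made up:

    def sideload_image(lock_api, image_id, src_cache, dst_datastore):
        name = 'image-cache/%s' % image_id
        # The read lock stops the owner ageing the image out mid-copy,
        # and guarantees the cached image is complete before we start.
        lock_api.lock_read(name)
        try:
            return copy_image(src_cache, dst_datastore, image_id)
        finally:
            lock_api.unlock(name)

    def age_cached_image(lock_api, image_id, cache):
        name = 'image-cache/%s' % image_id
        # The owner takes the write lock before deleting, so it can
        # never remove an image while a copier holds the read lock.
        lock_api.lock_write(name)
        try:
            cache.delete(image_id)
        finally:
            lock_api.unlock(name)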
>>
>>> This helps set a good precedent for others that may follow down this
>>> path: they should also clearly explain the situation, how distributed
>>> locking fixes it, and all the corner cases that pop up with distributed
>>> locking.
>>>
>>> Some of the questions that I can think of at the current moment:
>>>
>>> * What happens when a node goes down that owns the lock, how does the
>>> software react to this?
>>
>> This can be well defined according to the behaviour of the backend. For
>> example, it is well defined in zookeeper when a node's session expires.
>> If the lock holder is no longer a valid node, it would be fenced before
>> deleting its lock, allowing other nodes to continue.
>>
>> Without fencing it would not be possible to safely continue in this case.
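
To illustrate the holder's side of that on zookeeper: kazoo delivers a
session state change, which we have to treat as losing every lock we hold.
A rough sketch (stop_protected_work() is a made-up hook):

    from kazoo.client import KazooClient, KazooState

    def session_listener(state):
        # If our session is lost, zookeeper may already have expired our
        # ephemeral lock nodes, and another node may be fencing us.
        # Stop touching the protected resources immediately.
        if state == KazooState.LOST:
            stop_protected_work()

    zk = KazooClient(hosts='zk1:2181,zk2:2181,zk3:2181')
    zk.add_listener(session_listener)
    zk.start()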
>>
>>> * What resources are being locked; what is the lock target, what is its
>>> lifetime?
>>
>> These are not questions for a locking implementation. A lock would be
>> held on a name, and it would be up to the api user to ensure that the
>> protected resource is only used while correctly locked, and that the
>> lock is not held longer than necessary.
>>
>>> * What resiliency do you want this lock to provide (this becomes a
>>> critical question when considering memcached, since memcached is not
>>> really the best choice for a resilient distributing locking backend)?
>>
>> What does resiliency mean in this context? We really just need the lock
>> to be correct.
>>
>>> * What do entities that try to acquire a lock do when they can't acquire
>>> it?
>>
>> Typically block, but if a use case emerged for trylock() it would be
>> simple to implement. For example, in the image side-loading case we may
>> decide that if it isn't possible to immediately acquire the lock it
>> isn't worth waiting, and we just fetch it from glance anyway.
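
With kazoo, for example, that is just a non-blocking acquire. A rough
sketch, reusing the client from the earlier snippet and with made-up
helper names:

    lock = zk.Lock('/nova/locks/image-cache/some-image-id',
                   identifier='node-a')
    if lock.acquire(blocking=False):
        try:
            copy_from_peer_cache(image_id)    # made-up helper
        finally:
            lock.release()
    else:
        # Not worth waiting: fall back to fetching from glance.
        fetch_from_glance(image_id)           # made-up helper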
>>
>>> Something I wrote up a while ago that might still be useful:
>>>
>>> https://wiki.openstack.org/wiki/StructuredWorkflowLocks
>>>
>>> Feel free to move that wiki if you find it useful (it's sort of a
>>> high-level doc on the different strategies and such).
>>
>> Nice list of implementation pros/cons.
>>
>> Matt
>>
>>>
>>> -Josh
>>>
>>> -----Original Message-----
>>> From: Matthew Booth <mbooth at redhat.com>
>>> Organization: Red Hat
>>> Reply-To: "OpenStack Development Mailing List (not for usage questions)"
>>> <openstack-dev at lists.openstack.org>
>>> Date: Thursday, June 12, 2014 at 7:30 AM
>>> To: "OpenStack Development Mailing List (not for usage questions)"
>>> <openstack-dev at lists.openstack.org>
>>> Subject: [openstack-dev] [nova] Distributed locking
>>>
>>>> We have a need for a distributed lock in the VMware driver, which I
>>>> suspect isn't unique. Specifically it is possible for a VMware
>>>> datastore
>>>> to be accessed via multiple nova nodes if it is shared between
>>>> clusters[1]. Unfortunately the vSphere API doesn't provide us with the
>>>> primitives to implement robust locking using the storage layer itself,
>>>> so we're looking elsewhere.
>>>>
>>>> The closest we seem to have in Nova currently are service groups, which
>>>> have 3 implementations: DB, Zookeeper and Memcached. The service group
>>>> api provides simple membership, but for locking we'd be looking for
>>>> something more.
>>>>
>>>> I think the api we'd be looking for would be something along the lines
>>>> of:
>>>>
>>>> Foo.lock(name, fence_info)
>>>> Foo.unlock(name)
>>>>
>>>> Bar.fence(fence_info)
>>>>
>>>> Note that fencing would be required in this case. We believe we can
>>>> fence by terminating the other Nova's vSphere session, but other
>>>> options
>>>> might include killing a Nova process, or STONITH. These would be
>>>> implemented as fencing drivers.
>>>>
>>>> Although I haven't worked through the detail, I believe lock and unlock
>>>> would be implementable in all 3 of the current service group drivers.
>>>> Fencing would be implemented separately.
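
Spelled out a little more concretely, the Foo/Bar placeholders above might
become something like the following. This is illustrative only; none of it
exists yet:

    class DistributedLockDriver(object):
        """Backend-specific lock driver (DB, zookeeper or memcached)."""

        def lock(self, name, fence_info):
            """Block until the named lock is held.  fence_info describes
            how to fence this holder if it later has to be evicted."""
            raise NotImplementedError()

        def unlock(self, name):
            """Release the named lock."""
            raise NotImplementedError()

    class FencingDriver(object):
        """Forcibly stop a (possibly dead) lock holder, e.g. by
        terminating its vSphere session, killing the nova process, or
        STONITH."""

        def fence(self, fence_info):
            """Ensure the holder described by fence_info can no longer
            touch the protected resource."""
            raise NotImplementedError()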
>>>>
>>>> My questions:
>>>>
>>>> * Does this already exist, or does anybody have patches pending to do
>>>> something like this?
>>>> * Are there other users for this?
>>>> * Would service groups be an appropriate place, or a new distributed
>>>> locking class?
>>>> * How about if we just used zookeeper directly in the driver?
>>>>
>>>> Matt
>>>>
>>>> [1] Cluster ~= hypervisor
>>>> -- 
>>>> Matthew Booth
>>>> Red Hat Engineering, Virtualisation Team
>>>>
>>>> Phone: +442070094448 (UK)
>>>> GPG ID:  D33C3490
>>>> GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
>>>>
>>>> _______________________________________________
>>>> OpenStack-dev mailing list
>>>> OpenStack-dev at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>
>>
>>
>> -- 
>> Matthew Booth
>> Red Hat Engineering, Virtualisation Team
>>
>> Phone: +442070094448 (UK)
>> GPG ID:  D33C3490
>> GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
> 


-- 
Matthew Booth
Red Hat Engineering, Virtualisation Team

Phone: +442070094448 (UK)
GPG ID:  D33C3490
GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490


