[openstack-dev] [oslo][all] The lock files saga (and where we can go from here)

Joshua Harlow harlowja at fastmail.com
Tue Dec 1 17:28:18 UTC 2015

Sean Dague wrote:
> On 12/01/2015 08:08 AM, Duncan Thomas wrote:
>> On 1 December 2015 at 13:40, Sean Dague<sean at dague.net
>> <mailto:sean at dague.net>>  wrote:
>>      The current approach means locks block on their own, are processed in
>>      the order they come in, but deletes aren't possible. The busy lock would
>>      mean deletes were normal. Some extra cpu spent on waiting, and lock
>>      order processing would be non deterministic. It's trade offs, but I
>>      don't know anywhere that we are using locks as queues, so order
>>      shouldn't matter. The cpu cost on the busy wait versus the lock file
>>      cleanliness might be worth making. It would also let you actually see
>>      what's locked from the outside pretty easily.
>> The cinder locks are very much used as queues in places, e.g. making
>> delete wait until after an image operation finishes. Given that cinder
>> can already bring a node into resource issues while doing lots of image
>> operations concurrently (such as creating lots of bootable volumes at
>> once) I'd be resistant to anything that makes it worse to solve a
>> cosmetic issue.
> Is that really a queue? Don't do X while Y is a lock. Do X, Y, Z, in
> order after W is done is a queue. And what you've explained above about
> Don't DELETE while DOING OTHER ACTION, is really just the queue model.
> What I mean by treating locks as queues was depending on X, Y, Z
> happening in that order after W. With a busy wait approach they might
> happen as Y, Z, X or X, Z, B, Y. They will all happen after W is done.
> But relative to each other, or to new ops coming in, no real order is
> enforced.

So, just so people know: the fasteners lock code (and the file lock code
that existed in oslo.concurrency and, prior to that, oslo-incubator...)
has never guaranteed the above sequencing.

How it works (and has always worked) is the following:

1. A lock object is created.
2. Acquire is called on that lock object.
3. At that point do_open is called to ensure the file exists (if it
exists already it is opened in append mode, so no overwrite happens) and
the lock object holds a reference to the file descriptor of that file.
4. A retry loop starts and repeats until either a provided timeout has
elapsed or the lock is acquired. You can skip over the retry logic, but
the code that the retry loop calls is

The retry loop (really this loop @ 
will idle for a given delay before each new attempt to lock the file,
so there is no queue-like sequencing. For example, if entity A (who
created its lock object at t0) sleeps for 50 seconds between attempts
and entity B (who created its lock object at t1) sleeps for 5 seconds
between attempts, the timing favors entity B getting the lock (since
entity B has the smaller retry delay).
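
To make the A versus B example concrete, here is a rough simulation of
the timing only. It is not the fasteners code: the function name, the
release time of 20 seconds and the creation times 0 and 1 are all made
up for illustration, and it assumes each entity tries once when its lock
object is created and then once per delay.

```python
def first_successful_attempt(created_at, delay, released_at):
    """Time of the first retry attempt at or after the lock is released.

    Attempts are assumed to happen at created_at, created_at + delay,
    created_at + 2 * delay, and so on (an assumption of this sketch).
    """
    if created_at >= released_at:
        return created_at
    wait = released_at - created_at
    attempts_needed = -(-wait // delay)  # integer ceiling division
    return created_at + attempts_needed * delay


# Hypothetical timeline in seconds: the current holder releases at t=20.
released_at = 20
waiters = {
    "A": {"created_at": 0, "delay": 50},  # arrived first, retries every 50s
    "B": {"created_at": 1, "delay": 5},   # arrived second, retries every 5s
}

wake_times = {
    name: first_successful_attempt(w["created_at"], w["delay"], released_at)
    for name, w in waiters.items()
}
winner = min(wake_times, key=wake_times.get)

print(wake_times)  # {'A': 50, 'B': 21}
print(winner)      # B
```

With these numbers A does not look at the file again until t=50, while
B's attempt at t=21 succeeds, so the later arrival wins. The sketch
ignores how long B then holds the lock, which would only push A back
further.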

So just fyi, I wouldn't be depending on these for queuing/ordering as is...


> 	-Sean
