[openstack-dev] [oslo][all] The lock files saga (and where we can go from here)
Joshua Harlow
harlowja at fastmail.com
Tue Dec 1 17:28:18 UTC 2015
Sean Dague wrote:
> On 12/01/2015 08:08 AM, Duncan Thomas wrote:
>>
>> On 1 December 2015 at 13:40, Sean Dague <sean at dague.net> wrote:
>>
>>
>> The current approach means locks block on their own, are processed in
>> the order they come in, but deletes aren't possible. The busy lock would
>> mean deletes were normal. Some extra cpu spent on waiting, and lock
>> order processing would be non deterministic. It's trade offs, but I
>> don't know anywhere that we are using locks as queues, so order
>> shouldn't matter. The cpu cost on the busy wait versus the lock file
>> cleanliness might be worth making. It would also let you actually see
>> what's locked from the outside pretty easily.
>>
>>
>> The cinder locks are very much used as queues in places, e.g. making
>> delete wait until after an image operation finishes. Given that cinder
>> can already bring a node into resource issues while doing lots of image
>> operations concurrently (such as creating lots of bootable volumes at
>> once) I'd be resistant to anything that makes it worse to solve a
>> cosmetic issue.
>
> Is that really a queue? Don't do X while Y is a lock. Do X, Y, Z, in
> order after W is done is a queue. And what you've explained above about
> Don't DELETE while DOING OTHER ACTION, is really just the queue model.
>
> What I mean by treating locks as queues was depending on X, Y, Z
> happening in that order after W. With a busy wait approach they might
> happen as Y, Z, X or X, Z, B, Y. They will all happen after W is done.
> But relative to each other, or to new ops coming in, no real order is
> enforced.
>
So ummm, just so people know, the fasteners lock code (and the file lock
code that existed in oslo.concurrency, and prior to that in
oslo-incubator...) has never guaranteed the above sequencing.
How it works (and has always worked) is the following:
1. A lock object is created
(https://github.com/harlowja/fasteners/blob/master/fasteners/process_lock.py#L85)
2. acquire is called on that lock object
(https://github.com/harlowja/fasteners/blob/master/fasteners/process_lock.py#L125)
3. At that point do_open is called to ensure the file exists (if it
exists already it is opened in append mode, so no overwrite happens) and
the lock object holds a reference to the file descriptor of that file
(https://github.com/harlowja/fasteners/blob/master/fasteners/process_lock.py#L112)
4. A retry loop starts, which repeats until either the provided timeout
has elapsed or the lock is acquired. You can skip over the retry logic,
but the code that the retry loop calls is
https://github.com/harlowja/fasteners/blob/master/fasteners/process_lock.py#L92
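For anyone who doesn't want to click through, here is a rough stdlib-only
sketch of the four steps above. It is not the fasteners code: the class name
SketchFileLock, its method names, and the delay/timeout parameters are made up
for illustration, and it assumes a POSIX system where fcntl is available.

```python
import fcntl
import time


class SketchFileLock:
    """Illustrative stand-in for a polling file lock (not fasteners itself)."""

    def __init__(self, path):
        # Step 1: creating the lock object only records the path.
        self.path = path
        self.lockfile = None

    def _try_lock(self):
        # One non-blocking attempt; OSError means someone else holds the lock.
        try:
            fcntl.lockf(self.lockfile, fcntl.LOCK_EX | fcntl.LOCK_NB)
            return True
        except OSError:
            return False

    def acquire(self, delay=0.01, timeout=None):
        # Step 3: append mode creates the file if missing and never truncates it.
        self.lockfile = open(self.path, "a")
        deadline = None if timeout is None else time.monotonic() + timeout
        # Step 4: poll until the lock is taken or the timeout has elapsed.
        while True:
            if self._try_lock():
                return True
            if deadline is not None and time.monotonic() >= deadline:
                self.lockfile.close()
                self.lockfile = None
                return False
            # Idle between attempts; nothing records who started waiting first.
            time.sleep(delay)

    def release(self):
        fcntl.lockf(self.lockfile, fcntl.LOCK_UN)
        self.lockfile.close()
        self.lockfile = None
```

Note that nothing in the loop knows about other waiters, which is the point:
each waiter just tries again whenever its own sleep finishes.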
The retry loop (really this loop @
https://github.com/harlowja/fasteners/blob/master/fasteners/_utils.py#L87)
idles for a given delay between attempts to lock the file, so there is no
queue-like sequencing. For example, if entity A (who created its lock
object at t0) sleeps for 50 seconds between attempts and entity B (who
created its lock object at t1) sleeps for 5 seconds between attempts,
the outcome favors entity B getting the lock (since entity B has the
smaller retry delay).
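To put numbers on the A versus B example, here is a small back-of-the-envelope
model. It assumes each waiter polls at a fixed delay starting from when it
began waiting (a simplification: the function names and the fixed-delay
assumption are mine, and the real retry code may vary its delay), and that the
current holder releases the lock at some time released_at.

```python
import math


def first_successful_attempt(start, delay, released_at):
    """Time of the first polling attempt at or after the lock is released."""
    if start >= released_at:
        return start
    # Number of sleeps needed before an attempt lands at/after the release.
    attempts_needed = math.ceil((released_at - start) / delay)
    return start + attempts_needed * delay


def winner(waiters, released_at):
    """waiters maps a name to (start_time, delay); returns who locks first."""
    return min(
        waiters,
        key=lambda name: first_successful_attempt(*waiters[name], released_at),
    )


# A starts waiting first (t=0) but polls every 50s; B starts at t=1, polls every 5s.
waiters = {"A": (0, 50), "B": (1, 5)}
print(winner(waiters, released_at=12))
```

With a release at t=12, A's next attempt is at t=50 while B's is at t=16, so B
gets the lock even though A was waiting first.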
So just fyi, I wouldn't be depending on these for queuing/ordering as is...
-Josh
> -Sean
>