[openstack-dev] [oslo][concurrency] lockutils lock fairness / starvation

Ben Nemec openstack at nemebean.com
Mon May 15 20:48:33 UTC 2017



On 05/15/2017 03:24 PM, Doug Hellmann wrote:
> Excerpts from Legacy, Allain's message of 2017-05-15 19:20:46 +0000:
>>> -----Original Message-----
>>> From: Doug Hellmann [mailto:doug at doughellmann.com]
>>> Sent: Monday, May 15, 2017 2:55 PM
>> <...>
>>>
>>> Excerpts from Legacy, Allain's message of 2017-05-15 18:35:58 +0000:
>>>> import eventlet
>>>> eventlet.monkey_patch
>>>
>>> That's not calling monkey_patch -- there are no '()'. Is that a typo?
>>
>> Yes, sorry, that was a typo when I put it in to the email.  It did have ()
>> at the end.
>>
>>>
>>> lock() claims to work differently when monkey_patch() has been called.
>>> Without doing the monkey patching, I would expect the thread to have to
>>> explicitly yield control.
>>>
>>> Did you see the problem you describe in production code, or just in this
>>> sample program?
>>
>> We see this in production code.   I included the example to boil this down to
>> a simple enough scenario to be understood in this forum without the
>> distraction of superfluous code.
>>
>
> OK. I think from the Oslo team's perspective, this is likely to be
> considered a bug in the application. The concurrency library is not
> aware that it is running in an eventlet thread, so it relies on the
> application to call the monkey patching function to inject the right
> sort of lock class.  If that was done in the wrong order, or not
> at all, that would cause this issue.
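
(For reference, the ordering Doug describes would look roughly like the
sketch below.  This is hypothetical code, not the reporter's; the lock
name and function are made up.)

    import eventlet
    eventlet.monkey_patch()  # must actually be called, and called before
                             # importing anything that creates locks

    from oslo_concurrency import lockutils  # imported only after patching

    @lockutils.synchronized('my-resource')
    def update_resource():
        # critical section, mutually exclusive with any other caller
        # holding the 'my-resource' lock
        pass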

Does oslo.concurrency make any fairness promises?  I don't recall that 
it does, so it's not clear to me that this is a bug.  I thought fair 
locking was one of the motivations behind the DLM discussion.  My 
understanding is that oslo.concurrency locking is solely concerned with 
preventing concurrent access to a resource.  There's no queuing or 
other fairness mechanism that would keep a given consumer from grabbing 
the same lock over and over.
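
To illustrate what I mean, here's a toy example (not the reporter's 
code; the lock name and worker function are made up).  Nothing hands 
the lock to a waiting greenthread, so the holder can release it and 
immediately win it back:

    import eventlet
    eventlet.monkey_patch()

    from oslo_concurrency import lockutils

    def greedy(name):
        for _ in range(5):
            with lockutils.lock('shared-resource'):
                print('%s holds the lock' % name)
            # no yield here, so under eventlet this greenthread is likely
            # to re-acquire the lock before any waiter gets scheduled

    pool = eventlet.GreenPool()
    pool.spawn(greedy, 'worker-a')
    pool.spawn(greedy, 'worker-b')
    pool.waitall()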

I'm also not really surprised that this example serializes all the 
workers.  The operation being performed in each thread is simple enough 
that it probably completes before a context switch could reasonably 
occur, greenthreads or not.  Unfortunately, one of the hard parts of 
concurrency is that the "extraneous" details of a use case can end up 
being important.
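
For what it's worth, in a toy loop like the one above an explicit 
cooperative yield outside the critical section is usually enough to let 
the other greenthreads get a turn (again just a sketch, reusing the 
same setup as the previous snippet):

    def worker(name):
        for _ in range(5):
            with lockutils.lock('shared-resource'):
                print('%s holds the lock' % name)
            eventlet.sleep(0)  # yield to the hub outside the lock so
                               # another greenthread can grab it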

>
> The next step is to look at which application had the problem, and under
> what circumstances. Can you provide more detail there?

+1, although as I noted above I'm not sure this is actually a "bug".  It 
would be interesting to know what real-world use case is causing such a 
pathologically bad lock pattern, though.

-Ben


