[openstack-dev] [oslo][concurrency] lockutils lock fairness / starvation
clint at fewbar.com
Mon May 15 21:42:42 UTC 2017
Excerpts from Ben Nemec's message of 2017-05-15 15:48:33 -0500:
> On 05/15/2017 03:24 PM, Doug Hellmann wrote:
> > Excerpts from Legacy, Allain's message of 2017-05-15 19:20:46 +0000:
> >>> -----Original Message-----
> >>> From: Doug Hellmann [mailto:doug at doughellmann.com]
> >>> Sent: Monday, May 15, 2017 2:55 PM
> >> <...>
> >>> Excerpts from Legacy, Allain's message of 2017-05-15 18:35:58 +0000:
> >>>> import eventlet
> >>>> eventlet.monkey_patch
> >>> That's not calling monkey_patch -- there are no '()'. Is that a typo?
> >> Yes, sorry, that was a typo when I put it in to the email. It did have ()
> >> at the end.
> >>> lock() claims to work differently when monkey_patch() has been called.
> >>> Without doing the monkey patching, I would expect the thread to have to
> >>> explicitly yield control.
> >>> Did you see the problem you describe in production code, or just in this
> >>> sample program?
> >> We see this in production code. I included the example to boil this down to
> >> a simple enough scenario to be understood in this forum without the
> >> distraction of superfluous code.
> > OK. I think from the Oslo team's perspective, this is likely to be
> > considered a bug in the application. The concurrency library is not
> > aware that it is running in an eventlet thread, so it relies on the
> > application to call the monkey patching function to inject the right
> > sort of lock class. If that was done in the wrong order, or not
> > at all, that would cause this issue.
> Does oslo.concurrency make any fairness promises? I don't recall that
> it does, so it's not clear to me that this is a bug. I thought fair
> locking was one of the motivations behind the DLM discussion. My view
> of the oslo.concurrency locking was that it is solely concerned with
> preventing concurrent access to a resource. There's no queuing or
> anything that would ensure a given consumer can't grab the same lock
DLM is more about fairness between machines, not threads.
However, I'd agree that oslo.concurrency isn't making fairness
guarantees. It does claim to return a threading.Semaphore or
semaphore.Semaphore, neither of which facilitates fairness (nor would a
full-fledged mutex).
To implement fairness, every lock request has to go through a FIFO
queue. This is often implemented with a mutex-protected queue of
condition variables. Since the mutex for the queue is only held while
you append to the queue, you will always get the items from the queue
in the order they were written to it.
So you have lockers add themselves to the queue and wait on their
condition variable, and then a thread running all the time that reads
the queue and signals each condition in turn, so that only one thread
is activated at a time (or that one thread can just always do all the
work itself if the arguments are simple enough to put in a queue).
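
As a rough illustration of the queue idea above, here is a minimal
sketch using only the Python standard library. It is not
oslo.concurrency code, and every name in it (FairLock, run_demo) is
made up for this example. Two simplifications are assumed: each waiter
gets a threading.Event rather than a condition variable, and the
releasing thread wakes the oldest waiter itself instead of a dedicated
thread reading the queue.

```python
import collections
import threading
import time


class FairLock:
    """Hypothetical FIFO lock: waiters get the lock in arrival order."""

    def __init__(self):
        self._mutex = threading.Lock()       # protects _waiters and _held
        self._waiters = collections.deque()  # one Event per blocked caller
        self._held = False

    def acquire(self):
        with self._mutex:
            if not self._held:
                self._held = True
                return
            ticket = threading.Event()
            self._waiters.append(ticket)
        # Block outside the mutex so other callers can still enqueue.
        ticket.wait()

    def release(self):
        with self._mutex:
            if self._waiters:
                # Hand ownership straight to the oldest waiter. _held stays
                # True, so a newly arriving caller cannot jump the queue.
                self._waiters.popleft().set()
            else:
                self._held = False

    def __enter__(self):
        self.acquire()
        return self

    def __exit__(self, *exc):
        self.release()


def run_demo(n=5):
    """Queue n workers behind a held lock and record who runs when."""
    lock = FairLock()
    order = []

    def worker(i):
        with lock:
            order.append(i)

    lock.acquire()
    threads = []
    for i in range(n):
        t = threading.Thread(target=worker, args=(i,))
        t.start()
        threads.append(t)
        # Wait until worker i is queued so the arrival order is known.
        while len(lock._waiters) < i + 1:
            time.sleep(0.001)
    lock.release()
    for t in threads:
        t.join()
    return order
```

Calling run_demo() should return the worker ids in the order they
queued up. The direct hand-off in release() is what keeps a thread that
releases and immediately re-acquires from starving the others.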
> I'm also not really surprised that this example serializes all the
> workers. The operation being performed in each thread is simple enough
> that it probably completes before a context switch could reasonably
> occur, greenthreads or not. Unfortunately one of the hard parts of
> concurrency is that the "extraneous" details of a use case can end up
> being important.
It also gets hardware-sensitive when you have true multi-threading,
since a user on a 2-core box will see different results than one on a
4-core box.
> > The next step is to look at which application had the problem, and under
> > what circumstances. Can you provide more detail there?
> +1, although as I noted above I'm not sure this is actually a "bug". It
> would be interesting to know what real world use case is causing a
> pathologically bad lock pattern though.
I think it makes sense, especially in the greenthread example where
you're immediately seeing activity on the recently active socket and
thus just stay in that greenthread.