[openstack-dev] [Neutron] DB: transaction isolation and related questions

Jay Pipes jaypipes at gmail.com
Thu Nov 20 01:34:16 UTC 2014


On 11/19/2014 04:27 PM, Eugene Nikanorov wrote:
> Wow, lots of feedback in a matter of hours.
>
> First of all, reading postgres docs I see that READ COMMITTED is the
> same as for mysql, so it should address the issue we're discussing:
>
> "/Read Committed/ is the default isolation level in PostgreSQL. When a
> transaction uses this isolation level, a SELECT query (without a FOR
> UPDATE/SHARE clause) *sees only data committed before the query began
> (not before TX began - Eugene)*; it never sees either uncommitted data
> or changes committed during query execution by concurrent transactions.
> In effect, a SELECT query sees a snapshot of the database as of the
> instant the query begins to run. However, SELECT does see the effects of
> previous updates executed within its own transaction, even though they
> are not yet committed. *Also note that two successive **SELECT commands
> can see different data, even though they are within a single
> transaction, if other transactions commit changes during execution of
> the first SELECT. "*
> http://www.postgresql.org/docs/8.4/static/transaction-iso.html

So while the SELECTs may return different data on successive calls when 
you use the READ COMMITTED isolation level, the UPDATE statements will 
continue to return 0 rows affected **if they attempt to change rows that 
have been changed since the start of the transaction**

The reason that changing the isolation level to READ COMMITTED appears 
to work for the code in question:

https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/helpers.py#L98

is likely because the SELECT ... LIMIT 1 query is returning a different 
row on successive attempts (though since there is no ORDER BY on the 
query, the returned row of the query is entirely unpredictable (line 
112)). Since data from that returned row is used in the UPDATE statement 
(line 118 and 124), *different* rows are actually being changed by 
successive UPDATE statements.

What this means is that for this *very particular* case, setting the 
transaction isolation level to READ COMMITTTED will work presumably most 
of the time on MySQL, but it's not an appropriate solution for the 
generalized problem domain of the SELECT FOR UPDATE. If you need to 
issue a SELECT and an UPDATE in a retry loop, and you are attempting to 
update the same row or rows (for instance, in the quota reservation or 
resource allocation scenarios), this solution will not work, even with 
READ COMMITTED. This is why I say it's not really appropriate, and a 
better general solution is to use separate transactions for each loop in 
the retry mechanic.

> So, in my opinion, unless neutron code has parts that rely on
> 'repeatable read' transaction isolation level (and I believe such code
> is possible, didn't inspected closely yet), switching to READ COMMITTED
> is fine for mysql.

This will introduce more problems than you think, I believe. A better 
strategy is to simply use separate transactions for each loop 
iteration's queries.

> On multi-master scenario: it is not really an advanced use case. It is
> basic, we need to consider it as a basic and build architecture with
> respect to this fact.
> "Retry" approach fits well here, however it either requires proper
> isolation level, or redesign of whole DB access layer.

It's not about the retry approach. I don't think anyone is saying that a 
retry approach is not a good idea. I've been a proponent of the retry 
approach to get around issues with SELECT FOR UPDATE ever since I 
brought up the issue to the mailing list about 7 months ago. :)

The issue is about doing the retry within a single transaction. That's 
not what I recommend doing. I recommend instead doing short separate 
transactions instead of long-lived, multi-statement transactions and 
relying on the behaviour of the DB's isolation level (default or 
otherwise) to "solve" the problem of reading changes to a record that 
you intend to update.

Cheers,
-jay

> Also, thanks Clint for clarification about example scenario described by
> Mike Bayer.
> Initially the issue was discovered with concurrent tests on multi master
> environment with galera as a DB backend.
>
> Thanks,
> Eugene
>
> On Thu, Nov 20, 2014 at 12:20 AM, Mike Bayer <mbayer at redhat.com
> <mailto:mbayer at redhat.com>> wrote:
>
>
>>     On Nov 19, 2014, at 3:47 PM, Ryan Moats <rmoats at us.ibm.com
>>     <mailto:rmoats at us.ibm.com>> wrote:
>>
>>     >
>>     BTW, I view your examples from oslo as helping make my argument for
>>     me (and I don't think that was your intent :) )
>>
>
>     I disagree with that as IMHO the differences in producing MM in the
>     app layer against arbitrary backends (Postgresql vs. DB2 vs. MariaDB
>     vs. ???)  will incur a lot more “bifurcation” than a system that
>     targets only a handful of existing MM solutions.  The example I
>     referred to in oslo.db is dealing with distinct, non MM backends.
>     That level of DB-specific code and more is a given if we are
>     building a MM system against multiple backends generically.
>
>     It’s not possible to say which approach would be better or worse at
>     the level of “how much database specific application logic do we
>     need”, though in my experience, no matter what one is trying to do,
>     the answer is always, “tons”; we’re dealing not just with databases
>     but also Python drivers that have a vast amount of differences in
>     behaviors, at every level.    On top of all of that, hand-rolled MM
>     adds just that much more application code to be developed and
>     maintained, which also claims it will do a better job than mature
>     (ish?) database systems designed to do the same job against a
>     specific backend.
>
>
>
>>
>>     > > My reason for asking this question here is that if the community
>>     > > wants to consider #2, then these problems are the place to start
>>     > > crafting that solution - if we solve the conflicts inherent
>>     with the
>>     > > two conncurrent thread scenarios, then I think we will find that
>>     > > we've solved the multi-master problem essentially "for free”.
>>     >
>>     > Maybe I’m missing something, if we learn how to write out a row such
>>     > that a concurrent transaction against the same row doesn’t throw us
>>     > off, where is the part where that data is replicated to databases
>>     > running concurrently on other IP numbers in a way that is atomic
>>     > come out of that effort “for free” ?   A home-rolled “multi master”
>>     > scenario would have to start with a system that has multiple
>>     > create_engine() calls, since we need to communicate directly to
>>     > multiple database servers. From there it gets really crazy.
>>     Where’sall that ?
>>
>>     Boiled down, what you are talking about here w.r.t. concurrent
>>     transactions is really conflict resolution, which is the hardest
>>     part of implementing multi-master (as a side note, using locking in
>>     this case is the equivalent of option #1).
>>
>>     All I wished to point out is that there are other ways to solve the
>>     conflict resolution that could then be leveraged into a multi-master
>>     scenario.
>>
>>     As for the parts that I glossed over, once conflict resolution is
>>     separated out, replication turns into a much simpler problem with
>>     well understood patterns and so I view that part as coming
>>     "for free."
>>
>>     Ryan
>>
>>     _______________________________________________
>>     OpenStack-dev mailing list
>>     OpenStack-dev at lists.openstack.org
>>     <mailto:OpenStack-dev at lists.openstack.org>
>>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>     _______________________________________________
>     OpenStack-dev mailing list
>     OpenStack-dev at lists.openstack.org
>     <mailto:OpenStack-dev at lists.openstack.org>
>     http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>



More information about the OpenStack-dev mailing list