[openstack-dev] [Neutron] DB: transaction isolation and related questions

Jay Pipes jaypipes at gmail.com
Fri Nov 21 19:35:03 UTC 2014


Eugene, I just spent about an hour playing around with an example 
scenario in both MySQL and PostgreSQL using READ COMMITTED and 
REPEATABLE READ isolation levels. The scenario I used had a single row 
being updated, with a loop and a check on rows affected.

*You are 100% correct that setting the transaction isolation level to 
READ COMMITTED works in the retry loop*.

I stand corrected, and humbled :) Please accept my apologies.
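For anyone following along, here's a minimal sketch of the retry loop I was testing: SELECT a free row, then attempt a compare-and-swap UPDATE and check the rows-affected count. The table and column names (`segments`, `gre_id`, `allocated`) are made up for illustration, and sqlite3 stands in for MySQL, so this shows the shape of the loop only, not isolation-level behaviour:

```python
# Hypothetical retry loop: claim an unallocated segment via a
# compare-and-swap UPDATE, retrying when another writer wins the race.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE segments "
    "(gre_id INTEGER PRIMARY KEY, allocated INTEGER NOT NULL DEFAULT 0)"
)
conn.executemany("INSERT INTO segments (gre_id) VALUES (?)", [(1,), (2,), (3,)])
conn.commit()

def allocate_segment(conn, max_retries=5):
    """Try to claim an unallocated segment, retrying on lost races."""
    for _ in range(max_retries):
        row = conn.execute(
            "SELECT gre_id FROM segments WHERE allocated = 0 LIMIT 1"
        ).fetchone()
        if row is None:
            return None  # nothing free
        gre_id = row[0]
        # Compare-and-swap: only matches if the row is still unallocated.
        cur = conn.execute(
            "UPDATE segments SET allocated = 1 "
            "WHERE gre_id = ? AND allocated = 0",
            (gre_id,),
        )
        conn.commit()
        if cur.rowcount == 1:
            return gre_id  # rows affected == 1: we won the race
        # rowcount == 0: another session claimed it; loop and re-SELECT
    return None

print(allocate_segment(conn))  # e.g. 1
```

The rows-affected check is the crux: under READ COMMITTED each re-SELECT sees the other session's committed claim and picks a different free row, which is what makes the loop terminate.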

One thing I did note, though, is that changing the isolation level of an 
*already-started transaction* does not change the current transaction's 
isolation level -- the new isolation level only takes effect once the 
previously started transaction is committed or rolled back. So, on line 
107 in your proposed patch here:

https://review.openstack.org/#/c/129288/5/neutron/plugins/ml2/drivers/helpers.py

 From what I could find out in my research, the setting of the isolation 
level needs to be done *outside* of the session.begin() call, otherwise 
the isolation level will not take effect until that transaction is 
committed or rolled back. Of course, if SQLAlchemy is doing some 
auto-commit or something in the session, then you may not see this 
effect, but I certainly was able to see this in my testing in mysql 
client sessions... so I'm a little perplexed as to how your code works 
on already-started transactions. The documentation on the MySQL site 
backs up what I observed:

http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html

"...the statement sets the default transaction level for all subsequent 
transactions performed within the current session."
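For what it's worth, a minimal sketch of the "set it outside the transaction" pattern in SQLAlchemy, using its documented execution_options() hook. The engine URL, the table name, and the SQLite/SERIALIZABLE combination are placeholders so the sketch runs anywhere; against MySQL you would pass isolation_level="READ COMMITTED" instead:

```python
# Sketch: apply the isolation level before a transaction begins, not
# inside one. execution_options() takes effect at connection checkout,
# i.e. before any transaction on that connection starts.
from sqlalchemy import create_engine, text

engine = create_engine("sqlite://")
with engine.connect() as conn:
    with conn.begin():
        conn.execute(text("CREATE TABLE t (x INTEGER)"))  # 't' is made up

# Derive an engine whose connections run at the desired isolation level.
# (SQLite only accepts SERIALIZABLE / READ UNCOMMITTED; for MySQL use
# isolation_level="READ COMMITTED".)
iso_engine = engine.execution_options(isolation_level="SERIALIZABLE")
with iso_engine.connect() as conn:
    with conn.begin():  # this transaction starts under the new level
        conn.execute(text("INSERT INTO t (x) VALUES (1)"))
```

The point being: the level is bound to the connection before begin() is called, which sidesteps the "takes effect only on the next transaction" behaviour entirely.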

All the best, and thanks for the informative lesson of the week!
-jay

On 11/21/2014 03:24 AM, Eugene Nikanorov wrote:
> Comments inline:
>
> On Thu, Nov 20, 2014 at 4:34 AM, Jay Pipes <jaypipes at gmail.com> wrote:
>
>
>
>     So while the SELECTs may return different data on successive calls
>     when you use the READ COMMITTED isolation level, the UPDATE
>     statements will continue to return 0 rows affected **if they attempt
>     to change rows that have been changed since the start of the
>     transaction**
>
>     The reason that changing the isolation level to READ COMMITTED
>     appears to work for the code in question:
>
>     https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/helpers.py#L98
>
>     is likely because the SELECT ... LIMIT 1 query is returning a
>     different row on successive attempts (though since there is no ORDER
>     BY on the query, the returned row of the query is entirely
>     unpredictable (line 112)). Since data from that returned row is used
>     in the UPDATE statement (line 118 and 124), *different* rows are
>     actually being changed by successive UPDATE statements.
>
>
> Not really: we update the same row we've selected. That's ensured by
> 'raw_segment', which actually contains the 'gre_id' (or similar)
> attribute. So within each iteration we work with the row we selected,
> but across iterations READ COMMITTED lets us see different data and
> hence pick a different row.
>
>
>     What this means is that for this *very particular* case, setting the
>     transaction isolation level to READ COMMITTED will work presumably
>     most of the time on MySQL, but it's not an appropriate solution for
>     the generalized problem domain of the SELECT FOR UPDATE. If you need
>     to issue a SELECT and an UPDATE in a retry loop, and you are
>     attempting to update the same row or rows (for instance, in the
>     quota reservation or resource allocation scenarios), this solution
>     will not work, even with READ COMMITTED. This is why I say it's not
>     really appropriate, and a better general solution is to use separate
>     transactions for each loop in the retry mechanic.
>
> By saying 'this solution will not work', what issues exactly do you
> mean?
> Btw, I agree on using a separate transaction for each loop iteration;
> the problem is that the transaction is usually not 'local' to the
> method where the retry loop resides.
>
>
>
>     The issue is about doing the retry within a single transaction.
>     That's not what I recommend doing. I recommend instead doing short
>     separate transactions instead of long-lived, multi-statement
>     transactions and relying on the behaviour of the DB's isolation
>     level (default or otherwise) to "solve" the problem of reading
>     changes to a record that you intend to update.
>
> " instead of long-lived, multi-statement transactions" - that's really
> what would require quite large code redesign.
> So far finding a way to bring retry logic upper to the stack of nesting
> transactions seems more appropriate.
>
> Thanks,
> Eugene.
>
>
>     Cheers,
>     -jay
>
>         Also, thanks Clint for the clarification about the example
>         scenario described by Mike Bayer.
>         Initially the issue was discovered with concurrent tests on a
>         multi-master environment with Galera as the DB backend.
>
>         Thanks,
>         Eugene
>
>         On Thu, Nov 20, 2014 at 12:20 AM, Mike Bayer <mbayer at redhat.com> wrote:
>
>
>                  On Nov 19, 2014, at 3:47 PM, Ryan Moats <rmoats at us.ibm.com> wrote:
>
>                  >
>                  BTW, I view your examples from oslo as helping make my
>             argument for
>                  me (and I don't think that was your intent :) )
>
>
>              I disagree with that as IMHO the differences in producing
>         MM in the
>              app layer against arbitrary backends (Postgresql vs. DB2
>         vs. MariaDB
>              vs. ???)  will incur a lot more “bifurcation” than a system
>         that
>              targets only a handful of existing MM solutions.  The example I
>              referred to in oslo.db is dealing with distinct, non MM
>         backends.
>              That level of DB-specific code and more is a given if we are
>              building a MM system against multiple backends generically.
>
>              It’s not possible to say which approach would be better or
>         worse at
>              the level of “how much database specific application logic
>         do we
>              need”, though in my experience, no matter what one is
>         trying to do,
>              the answer is always, “tons”; we’re dealing not just with
>         databases
>              but also Python drivers that have a vast amount of
>         differences in
>              behaviors, at every level.    On top of all of that,
>         hand-rolled MM
>              adds just that much more application code to be developed and
>              maintained, which also claims it will do a better job than
>         mature
>              (ish?) database systems designed to do the same job against a
>              specific backend.
>
>
>
>
>                  > > My reason for asking this question here is that if
>             the community
>                  > > wants to consider #2, then these problems are the
>             place to start
>                  > > crafting that solution - if we solve the conflicts
>             inherent
>                  with the
>                  > > two concurrent thread scenarios, then I think we
>             will find that
>                  > > we've solved the multi-master problem essentially
>             "for free”.
>                  >
>                  > Maybe I’m missing something, if we learn how to write
>             out a row such
>                  > that a concurrent transaction against the same row
>             doesn’t throw us
>                  > off, where is the part where that data is replicated
>             to databases
>                  > running concurrently on other IP numbers in a way
>             that is atomic
>                  > come out of that effort “for free” ?   A home-rolled
>             “multi master”
>                  > scenario would have to start with a system that has
>             multiple
>                  > create_engine() calls, since we need to communicate
>             directly to
>                  > multiple database servers. From there it gets really
>             crazy. Where’s all that?
>
>                  Boiled down, what you are talking about here w.r.t.
>             concurrent
>                  transactions is really conflict resolution, which is
>             the hardest
>                  part of implementing multi-master (as a side note,
>             using locking in
>                  this case is the equivalent of option #1).
>
>                  All I wished to point out is that there are other ways
>             to solve the
>                  conflict resolution that could then be leveraged into a
>             multi-master
>                  scenario.
>
>                  As for the parts that I glossed over, once conflict
>             resolution is
>                  separated out, replication turns into a much simpler
>             problem with
>                  well understood patterns and so I view that part as coming
>                  "for free."
>
>                  Ryan
>
>                  _______________________________________________
>                  OpenStack-dev mailing list
>                  OpenStack-dev at lists.openstack.org
>                  http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>
>
>
>
>
>
>
>
>
>
>
>
>
>


