[openstack-dev] [Neutron] DB: transaction isolation and related questions
Jay Pipes
jaypipes at gmail.com
Fri Nov 21 19:35:03 UTC 2014
Eugene, I just spent about an hour playing around with an example
scenario in both MySQL and PostgreSQL using READ COMMITTED and
REPEATABLE READ isolation levels. The scenario I used had a single row
being updated, with a loop and a check on rows affected.
*You are 100% correct that setting the transaction isolation level to
READ COMMITTED works in the retry loop*.
I stand corrected, and humbled :) Please accept my apologies.
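Roughly, the pattern being discussed looks like the sketch below (SQLAlchemy
0.9-era style, with an invented "segments" table and column names; this is
not the actual Neutron helpers.py code): select a candidate row, attempt a
compare-and-swap style UPDATE, check the rows-affected count, and retry when
another session wins the race.

    # Sketch only -- not the actual Neutron helpers.py code. Table and
    # column names are invented for illustration.
    import sqlalchemy as sa

    engine = sa.create_engine(
        "mysql://user:pass@localhost/test",
        isolation_level="READ COMMITTED",   # default for new transactions
    )
    segments = sa.Table(
        "segments", sa.MetaData(),
        sa.Column("id", sa.Integer, primary_key=True),
        sa.Column("allocated", sa.Boolean, nullable=False, default=False),
    )

    def allocate_segment(max_retries=10):
        for _ in range(max_retries):
            # One short transaction per attempt.
            with engine.begin() as conn:
                # Pick any unallocated candidate; like the code in question,
                # there is no ORDER BY, so the row returned is unpredictable.
                row = conn.execute(
                    sa.select([segments.c.id])
                    .where(segments.c.allocated == sa.false())
                    .limit(1)
                ).fetchone()
                if row is None:
                    return None              # pool exhausted
                # Compare-and-swap: succeeds only if nobody else took the row.
                result = conn.execute(
                    segments.update()
                    .where(segments.c.id == row.id)
                    .where(segments.c.allocated == sa.false())
                    .values(allocated=True)
                )
                if result.rowcount == 1:
                    return row.id
            # rowcount == 0: another session won the race; retry.
        raise RuntimeError("could not allocate a segment")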
One thing I did note, though, is that changing the isolation level of an
*already-started transaction* does not change the current transaction's
isolation level -- the new isolation level only takes effect once the
previously started transaction is committed or rolled back. So, on line
107 in your proposed patch here:
https://review.openstack.org/#/c/129288/5/neutron/plugins/ml2/drivers/helpers.py
From what I could find out in my research, the setting of the isolation
level needs to be done *outside* of the session.begin() call, otherwise
the isolation level will not take effect until that transaction is
committed or rolled back. Of course, if SQLAlchemy is doing some
auto-commit or something in the session, then you may not see this
effect, but I certainly was able to see it in my testing in mysql
client sessions... so I'm a little perplexed as to how your code works
on already-started transactions. The documentation on the MySQL site
backs up what I observed:
http://dev.mysql.com/doc/refman/5.0/en/set-transaction.html
"...the statement sets the default transaction level for all subsequent
transactions performed within the current session."
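If the intent is for a given transaction to run under READ COMMITTED, one
way to do it with plain SQLAlchemy (a sketch, outside of Neutron's session
plumbing) is to set the isolation level on the engine or on the connection
before the transaction is begun:

    # Sketch: plain SQLAlchemy, not Neutron's session helpers.
    import sqlalchemy as sa

    engine = sa.create_engine("mysql://user:pass@localhost/test")

    # Per-connection: set the isolation level BEFORE beginning the
    # transaction that should run under it.
    conn = engine.connect().execution_options(
        isolation_level="READ COMMITTED")
    with conn.begin():
        conn.execute(sa.text("SELECT 1"))   # runs under READ COMMITTED
    conn.close()

    # Or engine-wide, as a default for every transaction:
    engine_rc = sa.create_engine(
        "mysql://user:pass@localhost/test",
        isolation_level="READ COMMITTED",
    )

    # By contrast -- per the MySQL documentation quoted above -- changing
    # the isolation level from inside an already-started transaction only
    # affects transactions begun after the current one commits or rolls back.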
All the best, and thanks for the informative lesson of the week!
-jay
On 11/21/2014 03:24 AM, Eugene Nikanorov wrote:
> Comments inline:
>
> On Thu, Nov 20, 2014 at 4:34 AM, Jay Pipes <jaypipes at gmail.com> wrote:
>
>
>
> So while the SELECTs may return different data on successive calls
> when you use the READ COMMITTED isolation level, the UPDATE
> statements will continue to return 0 rows affected **if they attempt
> to change rows that have been changed since the start of the
> transaction**
>
> The reason that changing the isolation level to READ COMMITTED
> appears to work for the code in question:
>
> https://github.com/openstack/neutron/blob/master/neutron/plugins/ml2/drivers/helpers.py#L98
>
> is likely because the SELECT ... LIMIT 1 query is returning a
> different row on successive attempts (though since there is no ORDER
> BY on the query, the returned row of the query is entirely
> unpredictable (line 112)). Since data from that returned row is used
> in the UPDATE statement (line 118 and 124), *different* rows are
> actually being changed by successive UPDATE statements.
>
>
> Not really: we're updating the same row we've selected; that's ensured by
> 'raw_segment', which actually contains the 'gre_id' (or similar) attribute.
> So within a single iteration we select and update the same row, but across
> iterations READ COMMITTED allows us to see different data and hence to
> work with a different row.
>
>
> What this means is that for this *very particular* case, setting the
> transaction isolation level to READ COMMITTED will work presumably
> most of the time on MySQL, but it's not an appropriate solution for
> the generalized problem domain of the SELECT FOR UPDATE. If you need
> to issue a SELECT and an UPDATE in a retry loop, and you are
> attempting to update the same row or rows (for instance, in the
> quota reservation or resource allocation scenarios), this solution
> will not work, even with READ COMMITTED. This is why I say it's not
> really appropriate, and a better general solution is to use separate
> transactions for each loop in the retry mechanic.
>
> By saying 'this solution will not work' what issues do you mean what
> exactly?
> Btw, I agree on using separate transaction for each loop, the problem is
> that transaction is usually not 'local' to the method where the retry
> loop resides.
>
>
>
> The issue is about doing the retry within a single transaction.
> That's not what I recommend doing. I recommend instead using short,
> separate transactions rather than long-lived, multi-statement
> transactions that rely on the behaviour of the DB's isolation
> level (default or otherwise) to "solve" the problem of reading
> changes to a record that you intend to update.
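To make that distinction concrete, here is a hypothetical quota-reservation
sketch (SQLAlchemy 0.9-era style; the "quotas" table and reserve() helper
are invented, not Nova or Neutron code) in which every attempt must touch
the same row: each retry opens a brand-new short transaction, so the re-read
sees the latest committed value, and the compare-and-swap UPDATE detects
lost races through its rowcount.

    # Hypothetical sketch, not code from any OpenStack project.
    import sqlalchemy as sa

    engine = sa.create_engine("mysql://user:pass@localhost/test")
    quotas = sa.Table(
        "quotas", sa.MetaData(),
        sa.Column("project_id", sa.String(64), primary_key=True),
        sa.Column("in_use", sa.Integer, nullable=False),
        sa.Column("hard_limit", sa.Integer, nullable=False),
    )

    def reserve(project_id, amount, max_retries=10):
        for _ in range(max_retries):
            # A fresh, short transaction for every attempt: the SELECT
            # always sees the latest committed row, whatever the default
            # isolation level happens to be.
            with engine.begin() as conn:
                row = conn.execute(
                    sa.select([quotas.c.in_use, quotas.c.hard_limit])
                    .where(quotas.c.project_id == project_id)
                ).fetchone()
                if row is None or row.in_use + amount > row.hard_limit:
                    return False          # over quota (or unknown project)
                result = conn.execute(
                    quotas.update()
                    .where(quotas.c.project_id == project_id)
                    .where(quotas.c.in_use == row.in_use)  # compare-and-swap
                    .values(in_use=row.in_use + amount)
                )
                if result.rowcount == 1:
                    return True
            # Someone else changed in_use between our SELECT and UPDATE;
            # loop again in a new transaction.
        raise RuntimeError("too much contention on the quota row")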
>
> " instead of long-lived, multi-statement transactions" - that's really
> what would require quite large code redesign.
> So far finding a way to bring retry logic upper to the stack of nesting
> transactions seems more appropriate.
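One rough illustration of that idea (purely hypothetical; this decorator
does not exist in Neutron, and a library-provided retry helper such as the
one in oslo.db may be a better fit): the retry wraps the top-level operation
that owns the outermost transaction, so each attempt re-runs the whole code
path with a brand-new transaction rather than retrying inside a nested one.

    import functools
    import time

    from sqlalchemy import exc as sa_exc

    def retry_on_db_conflict(max_retries=5, interval=0.1):
        """Hypothetical decorator: re-run the whole top-level operation
        (and therefore its outermost transaction) when the DB reports a
        conflict such as a deadlock."""
        def decorator(func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                for attempt in range(max_retries):
                    try:
                        return func(*args, **kwargs)
                    except sa_exc.DBAPIError:
                        # Real code would inspect the error and only retry
                        # deadlocks / serialization failures.
                        if attempt == max_retries - 1:
                            raise
                        time.sleep(interval * (attempt + 1))
            return wrapper
        return decorator

    @retry_on_db_conflict()
    def create_port(context, port_data):
        # The outermost session.begin() lives inside here, so every retry
        # gets a completely new transaction.
        pass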
>
> Thanks,
> Eugene.
>
>
> Cheers,
> -jay
>
> Also, thanks, Clint, for the clarification about the example scenario
> described by Mike Bayer.
> Initially the issue was discovered with concurrent tests in a
> multi-master environment with Galera as the DB backend.
>
> Thanks,
> Eugene
>
> On Thu, Nov 20, 2014 at 12:20 AM, Mike Bayer <mbayer at redhat.com> wrote:
>
>
> On Nov 19, 2014, at 3:47 PM, Ryan Moats <rmoats at us.ibm.com> wrote:
>
> >
> BTW, I view your examples from oslo as helping make my argument for me
> (and I don't think that was your intent :) )
>
>
> I disagree with that as IMHO the differences in producing MM in the app
> layer against arbitrary backends (Postgresql vs. DB2 vs. MariaDB vs. ???)
> will incur a lot more “bifurcation” than a system that targets only a
> handful of existing MM solutions. The example I referred to in oslo.db is
> dealing with distinct, non MM backends. That level of DB-specific code
> and more is a given if we are building a MM system against multiple
> backends generically.
>
> It’s not possible to say which approach would be better or worse at the
> level of “how much database specific application logic do we need”,
> though in my experience, no matter what one is trying to do, the answer
> is always, “tons”; we’re dealing not just with databases but also Python
> drivers that have a vast amount of differences in behaviors, at every
> level. On top of all of that, hand-rolled MM adds just that much more
> application code to be developed and maintained, which also claims it
> will do a better job than mature (ish?) database systems designed to do
> the same job against a specific backend.
>
>
>
>
> > > My reason for asking this question here is that if the community
> > > wants to consider #2, then these problems are the place to start
> > > crafting that solution - if we solve the conflicts inherent with the
> > > two concurrent thread scenarios, then I think we will find that
> > > we've solved the multi-master problem essentially “for free”.
> >
> > Maybe I’m missing something, if we learn how to write out a row such
> > that a concurrent transaction against the same row doesn’t throw us
> > off, where is the part where that data is replicated to databases
> > running concurrently on other IP numbers in a way that is atomic come
> > out of that effort “for free”? A home-rolled “multi master” scenario
> > would have to start with a system that has multiple create_engine()
> > calls, since we need to communicate directly to multiple database
> > servers. From there it gets really crazy. Where’s all that?
>
> Boiled down, what you are talking about here w.r.t. concurrent
> transactions is really conflict resolution, which is the hardest part of
> implementing multi-master (as a side note, using locking in this case is
> the equivalent of option #1).
>
> All I wished to point out is that there are other ways to solve the
> conflict resolution that could then be leveraged into a multi-master
> scenario.
>
> As for the parts that I glossed over, once conflict resolution is
> separated out, replication turns into a much simpler problem with well
> understood patterns and so I view that part as coming "for free."
>
> Ryan
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>