[openstack-dev] [Neutron] DB: transaction isolation and related questions
Eugene Nikanorov
enikanorov at mirantis.com
Wed Nov 19 21:27:17 UTC 2014
Wow, lots of feedback in a matter of hours.
First of all, reading the postgres docs I see that READ COMMITTED behaves the
same way as in mysql, so it should address the issue we're discussing:
"*Read Committed* is the default isolation level in PostgreSQL. When a
transaction uses this isolation level, a SELECT query (without a FOR
UPDATE/SHARE clause) *sees only data committed before the query began (not
before TX began - Eugene)*; it never sees either uncommitted data or
changes committed during query execution by concurrent transactions. In
effect, a SELECT query sees a snapshot of the database as of the instant
the query begins to run. However, SELECT does see the effects of previous
updates executed within its own transaction, even though they are not yet
committed. *Also note that two successive **SELECT commands can see
different data, even though they are within a single transaction, if other
transactions commit changes during execution of the first SELECT. "*
http://www.postgresql.org/docs/8.4/static/transaction-iso.html
So, in my opinion, unless neutron code has parts that rely on the 'repeatable
read' transaction isolation level (and I believe such code is possible; I
haven't inspected it closely yet), switching to READ COMMITTED is fine for
mysql.
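For illustration only, here is a rough sketch with plain SQLAlchemy of how the
isolation level could be requested per engine (the connection URL, table and
query below are made up, not actual Neutron code):

from sqlalchemy import create_engine, text

# Hypothetical connection URL; in Neutron this would come from configuration.
engine = create_engine(
    "mysql+pymysql://neutron:secret@127.0.0.1/neutron",
    isolation_level="READ COMMITTED",
)

with engine.begin() as conn:
    # Under READ COMMITTED every statement sees its own snapshot, so two
    # successive SELECTs inside one transaction may return different data
    # if another transaction commits in between.
    first = conn.execute(
        text("SELECT status FROM ports WHERE id = :id"),
        {"id": "some-port-id"}).scalar()
    second = conn.execute(
        text("SELECT status FROM ports WHERE id = :id"),
        {"id": "some-port-id"}).scalar()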
On the multi-master scenario: it is not really an advanced use case. It is a
basic one, and we need to treat it as basic and build the architecture with
that fact in mind.
"Retry" approach fits well here, however it either requires proper
isolation level, or redesign of whole DB access layer.
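As a rough sketch of what I mean by "retry" (DBDeadlock is a real oslo.db
exception, but the helper, its parameters and the backoff below are purely
illustrative, and the import path may differ between oslo.db versions):

import time

from oslo_db import exception as db_exc


def run_with_retries(txn, max_retries=5, delay=0.1):
    # txn must open and commit its own transaction so that a failed
    # attempt can simply be replayed from scratch.
    for attempt in range(max_retries):
        try:
            return txn()
        except db_exc.DBDeadlock:
            # On galera, certification failures on the losing node surface
            # as deadlocks, so the whole transaction has to be re-run.
            if attempt == max_retries - 1:
                raise
            time.sleep(delay * (attempt + 1))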
Also, thanks Clint for the clarification about the example scenario described
by Mike Bayer.
Initially the issue was discovered with concurrent tests on a multi-master
environment with galera as the DB backend.
Thanks,
Eugene
On Thu, Nov 20, 2014 at 12:20 AM, Mike Bayer <mbayer at redhat.com> wrote:
>
> On Nov 19, 2014, at 3:47 PM, Ryan Moats <rmoats at us.ibm.com> wrote:
>
> >
> BTW, I view your examples from oslo as helping make my argument for
> me (and I don't think that was your intent :) )
>
>
> I disagree with that as IMHO the differences in producing MM in the app
> layer against arbitrary backends (Postgresql vs. DB2 vs. MariaDB vs. ???)
> will incur a lot more “bifurcation” than a system that targets only a
> handful of existing MM solutions. The example I referred to in oslo.db is
> dealing with distinct, non MM backends. That level of DB-specific code
> and more is a given if we are building a MM system against multiple
> backends generically.
>
> It’s not possible to say which approach would be better or worse at the
> level of “how much database specific application logic do we need”, though
> in my experience, no matter what one is trying to do, the answer is always,
> “tons”; we’re dealing not just with databases but also Python drivers that
> have a vast amount of differences in behaviors, at every level. On top
> of all of that, hand-rolled MM adds just that much more application code to
> be developed and maintained, while claiming it will do a better job than
> mature (ish?) database systems designed to do the same job against a
> specific backend.
>
>
>
>
> > > My reason for asking this question here is that if the community
> > > wants to consider #2, then these problems are the place to start
> > > crafting that solution - if we solve the conflicts inherent in the
> > > two concurrent thread scenarios, then I think we will find that
> > > we've solved the multi-master problem essentially "for free”.
> >
> > Maybe I’m missing something: if we learn how to write out a row such
> > that a concurrent transaction against the same row doesn’t throw us
> > off, where does the part where that data is replicated to databases
> > running concurrently on other IP numbers, in a way that is atomic,
> > come out of that effort “for free” ? A home-rolled “multi master”
> > scenario would have to start with a system that has multiple
> > create_engine() calls, since we need to communicate directly to
> > multiple database servers. From there it gets really crazy. Where’s all
> > that?
>
> Boiled down, what you are talking about here w.r.t. concurrent
> transactions is really conflict resolution, which is the hardest
> part of implementing multi-master (as a side note, using locking in
> this case is the equivalent of option #1).
>
> All I wished to point out is that there are other ways to solve the
> conflict resolution that could then be leveraged into a multi-master
> scenario.
>
> As for the parts that I glossed over, once conflict resolution is
> separated out, replication turns into a much simpler problem with
> well understood patterns and so I view that part as coming
> "for free."
>
> Ryan