[openstack-dev] [all][oslo.db] Repeatable Read considered harmful

Mike Bayer mbayer at redhat.com
Tue Mar 31 15:22:22 UTC 2015

Eugene Nikanorov <enikanorov at mirantis.com> wrote:

> Hi Matthew,
> I'll add just 2c:
> We've tried to move from repeatable-read to read committed in Neutron project.
> This change actually has caused multiple deadlocks during regular tempest test run.
> That is a known problem (the issue with eventlet and currect mysql client library),
> but anyway, at least one major openstack project is not ready to move to read-committed.
> Also, particular transaction isolation level's performance is highly affected by DB usage pattern.
> Is there any research of how read-committed affects performance of openstack projects?

So I would add that I think the altering of transaction isolation level
should be done on a per-method basis; that is, methods that definitely need
the effects of a certain isolation level should run it locally, so that the
change can be made safely without having to deal with moving the entire
application space over to a new mode of operation. I’ve added methods to
SQLAlchemy specifically to make this achievable at the ORM level in response
to [1], documented at [2].

The addition of specific isolation levels to enginefacade [3] will be
straightforward and very clean. The API as it stands now, assuming decorator
use looks like:

def some_api_method(context):

To specify specific isolation level would be like this:

def some_api_method(context):

[1] https://review.openstack.org/#/c/148339/
[2] http://docs.sqlalchemy.org/en/rel_0_9/orm/session_transaction.html#setting-isolation-for-individual-transactions.
[3] https://review.openstack.org/#/c/138215/

> Thanks,
> Eugene.
> On Fri, Feb 6, 2015 at 7:59 PM, Matthew Booth <mbooth at redhat.com> wrote:
> I was surprised recently to discover that MySQL uses repeatable read for
> transactions by default. Postgres uses read committed by default, and
> SQLite uses serializable. We don't set the isolation level explicitly
> anywhere, so our applications are running under different isolation
> levels depending on backend. This doesn't sound like a good idea to me.
> It's one thing to support multiple sql syntaxes, but different isolation
> levels have different semantics. Supporting that is much harder, and
> currently we're not even trying.
> I'm aware that the same isolation level on different databases will
> still have subtly different semantics, but at least they should agree on
> the big things. I think we should pick one, and it should be read committed.
> Also note that 'repeatable read' on both MySQL and Postgres is actually
> snapshot isolation, which isn't quite the same thing. For example, it
> doesn't get phantom reads.
> The most important reason I think we need read committed is recovery
> from concurrent changes within the scope of a single transaction. To
> date, in Nova at least, this hasn't been an issue as transactions have
> had an extremely small scope. However, we're trying to expand that scope
> with the new enginefacade in oslo.db:
> https://review.openstack.org/#/c/138215/ . With this expanded scope,
> transaction failure in a library function can't simply be replayed
> because the transaction scope is larger than the function.
> So, 3 concrete examples of how repeatable read will make Nova worse:
> * https://review.openstack.org/#/c/140622/
> This was committed to Nova recently. Note how it involves a retry in the
> case of concurrent change. This works fine, because the retry is creates
> a new transaction. However, if the transaction was larger than the scope
> of this function this would not work, because each iteration would
> continue to read the old data. The solution to this is to create a new
> transaction. However, because the transaction is outside of the scope of
> this function, the only thing we can do locally is fail. The caller then
> has to re-execute the whole transaction, or fail itself.
> This is a local concurrency problem which can be very easily handled
> locally, but not if we're using repeatable read.
> *
> https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L4749
> Nova has multiple functions of this type which attempt to update a
> key/value metadata table. I'd expect to find multiple concurrency issues
> with this if I stopped to give it enough thought, but concentrating just
> on what's there, notice how the retry loop starts a new transaction. If
> we want to get to a place where we don't do that, with repeatable read
> we're left failing the whole transaction.
> * https://review.openstack.org/#/c/136409/
> This one isn't upstream, yet. It's broken, and I can't currently think
> of a solution if we're using repeatable read.
> The issue is atomic creation of a shared resource. We want to handle a
> creation race safely. This patch:
> * Attempts to reads the default (it will normally exist)
> * Creates a new one if it doesn't exist
> * Goes back to the start if creation failed due to a duplicate
> Seems fine, but it will fail because the re-read will continue to not
> return the new value under repeatable read (no phantom reads). The only
> way to see the new row is a new transaction. Is this will no longer be
> in the scope of this function, the only solution will be to fail. Read
> committed could continue without failing.
> Incidentally, this currently works by using multiple transactions, which
> we are trying to avoid. It has also been suggested that in this specific
> instance the default security group could be created with the project.
> However, that would both be more complicated, because it would require
> putting a hook into another piece of code, and less robust, because it
> wouldn't recover if somebody deleted the default security group.
> To summarise, with repeatable read we're forced to abort the current
> transaction to deal with certain relatively common classes of
> concurrency issue, whereas with read committed we can safely recover. If
> we want to reduce the number of transactions we're using, which we do,
> the impact of this is going to dramatically increase. We should
> standardise on read committed.
> Matt
> --
> Matthew Booth
> Red Hat Engineering, Virtualisation Team
> Phone: +442070094448 (UK)
> GPG ID:  D33C3490
> GPG FPR: 3733 612D 2D05 5458 8A8A 1600 3441 EA19 D33C 3490
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev

More information about the OpenStack-dev mailing list