[openstack-dev] [Oslo] [Ironic] DB migration woes

Devananda van der Veen devananda.vdv at gmail.com
Mon Jun 9 18:57:02 UTC 2014


On Mon, Jun 9, 2014 at 10:49 AM, Jay Pipes <jaypipes at gmail.com> wrote:
> On 06/09/2014 12:50 PM, Devananda van der Veen wrote:
>>
>> There may be some problems with MySQL when testing parallel writes in
>> different non-committing transactions, even in READ COMMITTED mode,
>> due to InnoDB locking, if the queries use non-unique secondary indexes
>> for UPDATE or SELECT..FOR UPDATE queries. This is done by the
>> "with_lockmode('update')" SQLAlchemy phrase, and is used in ~10 places
>> in Nova. So I would not recommend this approach, even though, in
>> principle, I agree it would be a much more efficient way of testing
>> database reads/writes.
>>
>> More details here:
>> http://dev.mysql.com/doc/refman/5.5/en/innodb-locks-set.html and
>> http://dev.mysql.com/doc/refman/5.5/en/innodb-record-level-locks.html
>
>
> Hi Deva,
>
> MySQL/InnoDB's default isolation mode is REPEATABLE_READ, not
> READ_COMMITTED... are you saying that somewhere in the Ironic codebase we
> are setting the isolation mode manually to READ_COMMITTED for some reason?
>
> Best,
> -jay
>

Jay,

Not saying that at all. I was responding to Mike's suggested approach
for testing DB changes (which was actually off topic from my original
post), in which he suggested using READ_COMMITTED.

-Deva

>
>> On Sun, Jun 8, 2014 at 8:46 AM, Roman Podoliaka <rpodolyaka at mirantis.com>
>> wrote:
>>>
>>> Hi Mike,
>>>
>>>>>> However, when testing an application that uses a fixed set of tables,
>>>>>> as should be the case for the majority if not all Openstack apps, there’s no
>>>>>> reason that these tables need to be recreated for every test.
>>>
>>>
>>> This is a very good point. I tried to use the recipe from SQLAlchemy
>>> docs to run Nova DB API tests (yeah, I know, this might sound
>>> confusing, but these are actually methods that access the database in
>>> Nova) on production backends (MySQL and PostgreSQL). The abandoned
>>> patch is here [1]. Julia Varlamova has been working on rebasing this
>>> on master and should upload a new patch set soon.
>>>
>>> Overall, the approach with executing a test within a transaction and
>>> then emitting ROLLBACK worked quite well. The only problem I ran into
>>> were tests doing ROLLBACK on purpose. But you've updated the recipe
>>> since then and this can probably be solved by using of save points. I
>>> used a separate DB per a test running process to prevent race
>>> conditions, but we should definitely give READ COMMITTED approach a
>>> try. If it works, that will awesome.
>>>
>>> With a few tweaks of PostgreSQL config I was able to run Nova DB API
>>> tests in 13-15 seconds, while SQLite in memory took about 7s.
>>>
>>> Action items for me and Julia probably: [2] needs a spec with [1]
>>> updated accordingly. Using of this 'test in a transaction' approach
>>> seems to be a way to go for running all db related tests except the
>>> ones using DDL statements (as any DDL statement commits the current
>>> transaction implicitly on MySQL and SQLite AFAIK).
>>>
>>> Thanks,
>>> Roman
>>>
>>> [1] https://review.openstack.org/#/c/33236/
>>> [2]
>>> https://blueprints.launchpad.net/nova/+spec/db-api-tests-on-all-backends
>>>
>>> On Sat, Jun 7, 2014 at 10:27 PM, Mike Bayer <mbayer at redhat.com> wrote:
>>>>
>>>>
>>>> On Jun 6, 2014, at 8:12 PM, Devananda van der Veen
>>>> <devananda.vdv at gmail.com>
>>>> wrote:
>>>>
>>>> I think some things are broken in the oslo-incubator db migration code.
>>>>
>>>> Ironic moved to this when Juno opened and things seemed fine, until
>>>> recently
>>>> when Lucas tried to add a DB migration and noticed that it didn't run...
>>>> So
>>>> I looked into it a bit today. Below are my findings.
>>>>
>>>> Firstly, I filed this bug and proposed a fix, because I think that tests
>>>> that don't run any code should not report that they passed -- they
>>>> should
>>>> report that they were skipped.
>>>>    https://bugs.launchpad.net/oslo/+bug/1327397
>>>>    "No notice given when db migrations are not run due to missing
>>>> engine"
>>>>
>>>> Then, I edited the test_migrations.conf file appropriately for my local
>>>> mysql service, ran the tests again, and verified that migration tests
>>>> ran --
>>>> and they passed. Great!
>>>>
>>>> Now, a little background... Ironic's TestMigrations class inherits from
>>>> oslo's BaseMigrationTestCase, then "opportunistically" checks each
>>>> back-end,
>>>> if it's available. This opportunistic checking was inherited from Nova
>>>> so
>>>> that tests could pass on developer workstations where not all backends
>>>> are
>>>> present (eg, I have mysql installed, but not postgres), and still
>>>> transparently run on all backends in the gate. I couldn't find such
>>>> opportunistic testing in the oslo db migration test code, unfortunately
>>>> -
>>>> but maybe it's well hidden.
>>>>
>>>> Anyhow. When I stopped the local mysql service (leaving the
>>>> configuration
>>>> unchanged), I expected the tests to be skipped, but instead I got two
>>>> surprise failures:
>>>> - test_mysql_opportunistically() failed because setUp() raises an
>>>> exception
>>>> before the test code could call calling _have_mysql()
>>>> - test_mysql_connect_fail() actually failed! Again, because setUp()
>>>> raises
>>>> an exception before running the test itself
>>>>
>>>> Unfortunately, there's one more problem... when I run the tests in
>>>> parallel,
>>>> they fail randomly because sometimes two test threads run different
>>>> migration tests, and the setUp() for one thread (remember, it calls
>>>> _reset_databases) blows up the other test.
>>>>
>>>> Out of 10 runs, it failed three times, each with different errors:
>>>>    NoSuchTableError: `chassis`
>>>>    ERROR 1007 (HY000) at line 1: Can't create database
>>>> 'test_migrations';
>>>> database exists
>>>>    ProgrammingError: (ProgrammingError) (1146, "Table
>>>> 'test_migrations.alembic_version' doesn't exist")
>>>>
>>>> As far as I can tell, this is all coming from:
>>>>
>>>>
>>>> https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/test_migrations.py#L86;L111
>>>>
>>>>
>>>> Hello -
>>>>
>>>> Just an introduction, I’m Mike Bayer, the creator of SQLAlchemy and
>>>> Alembic
>>>> migrations.     I’ve just joined on as a full time Openstack
>>>> contributor,
>>>> and trying to help improve processes such as these is my primary
>>>> responsibility.
>>>>
>>>> I’ve had several conversations already about how migrations are run
>>>> within
>>>> test suites in various openstack projects.   I’m kind of surprised by
>>>> this
>>>> approach of dropping and recreating the whole database for individual
>>>> tests.
>>>> Running tests in parallel is obviously made very difficult by this
>>>> style,
>>>> but even beyond that, a lot of databases don’t respond well to lots of
>>>> dropping/rebuilding of tables and/or databases in any case; while SQLite
>>>> and
>>>> MySQL are probably the most forgiving of this, a backend like Postgresql
>>>> is
>>>> going to lock tables from being dropped more aggressively, if any open
>>>> transactions or result cursors against those tables remain, and on a
>>>> backend
>>>> like Oracle, the speed of schema operations starts to become
>>>> prohibitively
>>>> slow.   Dropping and creating tables is in general not a very speedy
>>>> task on
>>>> any backend, and on a test suite that runs many tests against a fixed
>>>> schema, I don’t see why a full drop is necessary.
>>>>
>>>> If you look at SQLAlchemy’s own tests, they do in fact create tables on
>>>> each
>>>> test, or just as often for a specific suite of tests.  However, this is
>>>> due
>>>> to the fact that SQLAlchemy tests are testing SQLAlchemy itself, so the
>>>> database schemas used for these tests are typically built explicitly for
>>>> small groups or individual tests, and there are ultimately thousands of
>>>> small “mini schemas” built up and torn down for these tests.   A lot of
>>>> framework code is involved within the test suite to keep more picky
>>>> databases like Postgresql and Oracle happy when building up and dropping
>>>> tables so frequently.
>>>>
>>>> However, when testing an application that uses a fixed set of tables, as
>>>> should be the case for the majority if not all Openstack apps, there’s
>>>> no
>>>> reason that these tables need to be recreated for every test.
>>>> Typically,
>>>> the way I recommend is that the test suite includes a “per suite”
>>>> activity
>>>> which creates the test schema just once (with or without using CREATE
>>>> DATABASE; I’m not a fan of tests running CREATE DATABASE as this is not
>>>> a
>>>> command so easily available in some environments).   The tests
>>>> themselves
>>>> then run within a transactional container, such that each test performs
>>>> all
>>>> of its work within a context that doesn’t actually commit any data to
>>>> the
>>>> database; a test that actually states “session.commit()” runs within a
>>>> container that doesn’t actually emit the COMMIT, and if support is
>>>> needed
>>>> for tests that also emit “session.rollback()”, the container can be
>>>> written
>>>> to support this paradigm as well (I helped some Dropbox devs with such
>>>> an
>>>> approach recently).
>>>>
>>>> In this way, the database migrations are exercised, but only once at the
>>>> beginning in order to build up the schema; the tests can then run with
>>>> very
>>>> low complexity/performance overhead as far as database-level
>>>> setup/teardown,
>>>> and parallel testing is also much easier.   When the test suite
>>>> completes is
>>>> when a drop of the entire set of tables can proceed.  Because tests are
>>>> run
>>>> within transactions, assuming READ COMMITTED isolation is established,
>>>> the
>>>> tests don’t even see the data being incurred by other tests running in
>>>> parallel.
>>>>
>>>> It remains to be seen what aspects of Openstack I’m going to get
>>>> involved
>>>> with first, though this migration and testing issue seems to be a big
>>>> one.
>>>> I’d love to get comments from the community here as to how this process
>>>> might be improved and if a rearrangement of the fixture system in the
>>>> way I
>>>> describe might be helpful and feasible.
>>>>
>>>> - mike
>>>>
>>>>
>>>>
>>>> _______________________________________________
>>>> OpenStack-dev mailing list
>>>> OpenStack-dev at lists.openstack.org
>>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>>>
>>>
>>> _______________________________________________
>>> OpenStack-dev mailing list
>>> OpenStack-dev at lists.openstack.org
>>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>>
>> _______________________________________________
>> OpenStack-dev mailing list
>> OpenStack-dev at lists.openstack.org
>> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev
>>
>
>
> _______________________________________________
> OpenStack-dev mailing list
> OpenStack-dev at lists.openstack.org
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



More information about the OpenStack-dev mailing list