[openstack-dev] [Oslo] [Ironic] DB migration woes
Devananda van der Veen
devananda.vdv at gmail.com
Mon Jun 9 16:50:15 UTC 2014
There may be some problems with MySQL when testing parallel writes in
different non-committing transactions, even in READ COMMITTED mode,
due to InnoDB locking, if the queries use non-unique secondary indexes
for UPDATE or SELECT .. FOR UPDATE. The latter is what the
"with_lockmode('update')" SQLAlchemy phrase emits, and it is used in
~10 places in Nova. So I would not recommend this approach, even
though, in principle, I agree it would be a much more efficient way of
testing database reads/writes.
More details here:
http://dev.mysql.com/doc/refman/5.5/en/innodb-locks-set.html and
http://dev.mysql.com/doc/refman/5.5/en/innodb-record-level-locks.html
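
For illustration, here is a minimal sketch of the kind of query in
question (the model and filter names are made up, not taken from Nova):

    from sqlalchemy import create_engine
    from sqlalchemy.orm import sessionmaker

    engine = create_engine("mysql://user:pw@localhost/test")  # placeholder DSN
    session = sessionmaker(bind=engine)()

    # with_lockmode('update') makes the Query emit SELECT ... FOR UPDATE.
    # Under InnoDB this locks every index record the scan touches; on a
    # non-unique secondary index that includes gap locks, so two parallel
    # non-committing transactions can block each other.
    node = (session.query(Node)               # Node: any mapped model class
            .filter_by(reservation=hostname)  # filter over a non-unique index
            .with_lockmode('update')
            .first())
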
On Sun, Jun 8, 2014 at 8:46 AM, Roman Podoliaka <rpodolyaka at mirantis.com> wrote:
> Hi Mike,
>
>>>> However, when testing an application that uses a fixed set of tables, as should be the case for the majority if not all Openstack apps, there’s no reason that these tables need to be recreated for every test.
>
> This is a very good point. I tried to use the recipe from the SQLAlchemy
> docs to run the Nova DB API tests (yeah, I know, this might sound
> confusing, but these are actually the methods that access the database in
> Nova) on production backends (MySQL and PostgreSQL). The abandoned
> patch is here [1]. Julia Varlamova has been working on rebasing it
> on master and should upload a new patch set soon.
>
> Overall, the approach of executing a test within a transaction and
> then emitting ROLLBACK worked quite well. The only problem I ran into
> was tests doing ROLLBACK on purpose. But you've updated the recipe
> since then, and this can probably be solved by using savepoints. I
> used a separate DB per test-runner process to prevent race
> conditions, but we should definitely give the READ COMMITTED approach
> a try. If it works, that will be awesome.
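>
> A rough sketch of the pattern (simplified; the fixture here is
> illustrative, not the actual code from [1]):
>
>     import testtools
>     from sqlalchemy import create_engine
>     from sqlalchemy.orm import sessionmaker
>
>     engine = create_engine("postgresql://user:pw@localhost/test")  # placeholder
>     Session = sessionmaker()
>
>     class DBAPITestCase(testtools.TestCase):
>         def setUp(self):
>             super(DBAPITestCase, self).setUp()
>             self.conn = engine.connect()
>             self.trans = self.conn.begin()   # outermost transaction
>             self.session = Session(bind=self.conn)
>
>         def tearDown(self):
>             self.session.close()
>             self.trans.rollback()            # discards all of the test's writes
>             self.conn.close()
>             super(DBAPITestCase, self).tearDown()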
>
> With a few tweaks to the PostgreSQL config I was able to run Nova DB API
> tests in 13-15 seconds, while SQLite in memory took about 7 seconds.
>
> Action items for me and Julia, probably: [2] needs a spec, with [1]
> updated accordingly. This 'test in a transaction' approach seems to
> be the way to go for running all db-related tests except the ones
> using DDL statements (as any DDL statement implicitly commits the
> current transaction on MySQL and SQLite, AFAIK).
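>
> For example (an illustrative snippet; 'chassis' stands in for any table):
>
>     conn = engine.connect()
>     trans = conn.begin()
>     conn.execute("INSERT INTO chassis (uuid) VALUES ('abc')")
>     # on MySQL any DDL statement implicitly commits the transaction first:
>     conn.execute("ALTER TABLE chassis ADD COLUMN extra TEXT")
>     trans.rollback()  # too late: the INSERT is already committed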
>
> Thanks,
> Roman
>
> [1] https://review.openstack.org/#/c/33236/
> [2] https://blueprints.launchpad.net/nova/+spec/db-api-tests-on-all-backends
>
> On Sat, Jun 7, 2014 at 10:27 PM, Mike Bayer <mbayer at redhat.com> wrote:
>>
>> On Jun 6, 2014, at 8:12 PM, Devananda van der Veen <devananda.vdv at gmail.com>
>> wrote:
>>
>> I think some things are broken in the oslo-incubator db migration code.
>>
>> Ironic moved to this when Juno opened and things seemed fine, until recently
>> when Lucas tried to add a DB migration and noticed that it didn't run... So
>> I looked into it a bit today. Below are my findings.
>>
>> Firstly, I filed this bug and proposed a fix, because I think that tests
>> that don't run any code should not report that they passed -- they should
>> report that they were skipped.
>> https://bugs.launchpad.net/oslo/+bug/1327397
>> "No notice given when db migrations are not run due to missing engine"
>>
>> Then, I edited the test_migrations.conf file appropriately for my local
>> mysql service, ran the tests again, and verified that migration tests ran --
>> and they passed. Great!
>>
>> Now, a little background... Ironic's TestMigrations class inherits from
>> oslo's BaseMigrationTestCase, then "opportunistically" checks each back-end
>> if it's available. This opportunistic checking was inherited from Nova so
>> that tests could pass on developer workstations where not all backends are
>> present (e.g., I have mysql installed, but not postgres), and still
>> transparently run on all backends in the gate. I couldn't find such
>> opportunistic testing in the oslo db migration test code, unfortunately -
>> but maybe it's well hidden.
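>>
>> For reference, the opportunistic pattern boils down to something like
>> the following (heavily simplified; the helper names are illustrative):
>>
>>     def _is_backend_avail(self, backend):
>>         # try to connect; report False instead of raising if we can't
>>         try:
>>             engine = sqlalchemy.create_engine(self._get_connect_string(backend))
>>             engine.connect().close()
>>             return True
>>         except Exception:
>>             return False
>>
>>     def test_mysql_opportunistically(self):
>>         if not self._is_backend_avail('mysql'):
>>             self.skipTest("mysql is not available")
>>         self._run_migration_test('mysql')  # real checks run only if we get here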
>>
>> Anyhow. When I stopped the local mysql service (leaving the configuration
>> unchanged), I expected the tests to be skipped, but instead I got two
>> surprise failures:
>> - test_mysql_opportunistically() failed because setUp() raises an exception
>> before the test code could call _have_mysql()
>> - test_mysql_connect_fail() actually failed! Again, because setUp() raises
>> an exception before running the test itself
>>
>> Unfortunately, there's one more problem... when I run the tests in parallel,
>> they fail randomly because sometimes two test threads run different
>> migration tests, and the setUp() for one thread (remember, it calls
>> _reset_databases) blows up the other test.
>>
>> Out of 10 runs, it failed three times, each with different errors:
>> NoSuchTableError: `chassis`
>> ERROR 1007 (HY000) at line 1: Can't create database 'test_migrations';
>> database exists
>> ProgrammingError: (ProgrammingError) (1146, "Table
>> 'test_migrations.alembic_version' doesn't exist")
>>
>> As far as I can tell, this is all coming from:
>>
>> https://github.com/openstack/oslo-incubator/blob/master/openstack/common/db/sqlalchemy/test_migrations.py#L86;L111
>>
>>
>> Hello -
>>
>> Just an introduction, I’m Mike Bayer, the creator of SQLAlchemy and Alembic
>> migrations. I’ve just joined on as a full-time Openstack contributor,
>> and trying to help improve processes such as these is my primary
>> responsibility.
>>
>> I’ve had several conversations already about how migrations are run within
>> test suites in various openstack projects. I’m kind of surprised by this
>> approach of dropping and recreating the whole database for individual tests.
>> Running tests in parallel is obviously made very difficult by this style,
>> but even beyond that, a lot of databases don’t respond well to lots of
>> dropping/rebuilding of tables and/or databases in any case. While SQLite and
>> MySQL are probably the most forgiving of this, a backend like Postgresql
>> will block tables from being dropped much more aggressively if any open
>> transactions or result cursors against those tables remain, and on a backend
>> like Oracle, the speed of schema operations starts to become prohibitively
>> slow. Dropping and creating tables is in general not a very speedy task on
>> any backend, and on a test suite that runs many tests against a fixed
>> schema, I don’t see why a full drop is necessary.
>>
>> If you look at SQLAlchemy’s own tests, they do in fact create tables on each
>> test, or just as often for a specific suite of tests. However, this is due
>> to the fact that SQLAlchemy tests are testing SQLAlchemy itself, so the
>> database schemas used for these tests are typically built explicitly for
>> small groups or individual tests, and there are ultimately thousands of
>> small “mini schemas” built up and torn down for these tests. A lot of
>> framework code is involved within the test suite to keep more picky
>> databases like Postgresql and Oracle happy when building up and dropping
>> tables so frequently.
>>
>> However, when testing an application that uses a fixed set of tables, as
>> should be the case for the majority if not all Openstack apps, there’s no
>> reason that these tables need to be recreated for every test. Typically,
>> what I recommend is that the test suite include a “per suite” step
>> which creates the test schema just once (with or without using CREATE
>> DATABASE; I’m not a fan of tests running CREATE DATABASE, as that
>> command is not so easily available in some environments). The tests themselves
>> then run within a transactional container, such that each test performs all
>> of its work within a context that doesn’t actually commit any data to the
>> database; a test that actually states “session.commit()” runs within a
>> container that doesn’t actually emit the COMMIT, and if support is needed
>> for tests that also emit “session.rollback()”, the container can be written
>> to support this paradigm as well (I helped some Dropbox devs with such an
>> approach recently).
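>>
>> A rough sketch of such a container, following the “join a session into an
>> external transaction” recipe from the SQLAlchemy docs (the event hook is
>> one way to support tests that call session.rollback()):
>>
>>     from sqlalchemy import create_engine, event
>>     from sqlalchemy.orm import Session
>>
>>     engine = create_engine("postgresql://user:pw@localhost/test")  # placeholder
>>     conn = engine.connect()
>>     outer = conn.begin()        # this transaction is never committed
>>     session = Session(bind=conn)
>>     session.begin_nested()      # SAVEPOINT; the test's commit() releases only this
>>
>>     @event.listens_for(session, "after_transaction_end")
>>     def restart_savepoint(sess, trans):
>>         # each time a test's commit()/rollback() ends the SAVEPOINT,
>>         # open a new one so the test can keep going
>>         if trans.nested and not trans._parent.nested:
>>             sess.begin_nested()
>>
>>     # ... the test body runs here ...
>>
>>     session.close()
>>     outer.rollback()            # discard everything the test did
>>     conn.close()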
>>
>> In this way, the database migrations are exercised, but only once at the
>> beginning in order to build up the schema; the tests can then run with very
>> low complexity/performance overhead as far as database-level setup/teardown
>> goes, and parallel testing is also much easier. Only when the test suite
>> completes does the drop of the entire set of tables need to proceed. Because
>> tests are run within transactions, assuming READ COMMITTED isolation is
>> established, the tests don’t even see the data being written by other tests
>> running in parallel.
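>>
>> In outline, the per-suite piece would be something like this (the alembic
>> command API is real; the config name here is assumed):
>>
>>     from alembic import command
>>
>>     # once, before any test runs: build the schema via the migrations
>>     command.upgrade(alembic_config, "head")
>>
>>     # ... all tests then run inside rolled-back transactions ...
>>
>>     # once, after the whole suite finishes:
>>     command.downgrade(alembic_config, "base")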
>>
>> It remains to be seen what aspects of Openstack I’m going to get involved
>> with first, though this migration and testing issue seems to be a big one.
>> I’d love to get comments from the community here as to how this process
>> might be improved and if a rearrangement of the fixture system in the way I
>> describe might be helpful and feasible.
>>
>> - mike
>>
>>
>>