Open Stack

Fri Aug 22 19:35:37 UTC 2014

Hi all -

I’ve spent many weeks on a series of patches for which the primary goal is to provide very efficient patterns for tests that use databases and schemas within those databases, including compatibility with parallel tests, transactional testing, and scenario-driven testing (e.g. a test that runs multiple times against different databases).

To that end, the current two patches that achieve this behavior in a rudimentary fashion are part of oslo.db and are at: https://review.openstack.org/#/c/110486/ and https://review.openstack.org/#/c/113153/. They have been in the queue for about four weeks now. The general theory of operation is that within a particular Python process, a fixed database identifier is established (currently via an environment variable). As tests request the services of databases, such as a PostgreSQL database or a MySQL database, the system will provision a database within that backend under that fixed identifier and return it. The test can then request that it make use of a particular “schema” - for example, Nova’s tests may request that they are using the “nova schema”, which means that the schema for Nova’s model will be created within this database, and will then remain in place across the span of many tests which use this same schema. Only when a test requests a different schema, or no schema, will the tables be dropped. To ensure the schema is “clean” for every test, the provisioning system ensures that each test runs within a transaction, which at test end is rolled back. In order to accommodate tests that themselves need to roll back, the test additionally runs within the context of a SAVEPOINT. This system is entirely working, and for those that are wondering, yes it works with SQLite as well (see https://review.openstack.org/#/c/113152/).
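To make the transaction-plus-SAVEPOINT idea concrete, here is a minimal sketch using only the standard library's sqlite3 and unittest modules. It is not the oslo.db implementation from the patches; the names CONN, TransactionalTestCase, InstanceTest and count_instances are made up for illustration, and a single in-memory SQLite connection stands in for a provisioned database.

```python
import sqlite3
import unittest

# Hypothetical stand-in for a provisioned database: one connection per
# process, opened in autocommit mode so that BEGIN/ROLLBACK are explicit.
CONN = sqlite3.connect(":memory:", isolation_level=None)

# The schema is created once and then reused by every test.
CONN.execute("CREATE TABLE instances (id INTEGER PRIMARY KEY, name TEXT)")


def count_instances():
    return CONN.execute("SELECT COUNT(*) FROM instances").fetchone()[0]


class TransactionalTestCase(unittest.TestCase):
    """Runs each test inside a transaction that is rolled back at the end."""

    def setUp(self):
        CONN.execute("BEGIN")
        # The SAVEPOINT lets a test roll back on its own without ending
        # the enclosing transaction.
        CONN.execute("SAVEPOINT test_savepoint")

    def tearDown(self):
        # Discards everything the test wrote; the tables themselves remain.
        CONN.execute("ROLLBACK")

    def rollback(self):
        CONN.execute("ROLLBACK TO SAVEPOINT test_savepoint")


class InstanceTest(TransactionalTestCase):
    def test_insert(self):
        CONN.execute("INSERT INTO instances (name) VALUES ('a')")
        self.assertEqual(count_instances(), 1)

    def test_rollback_inside_test(self):
        CONN.execute("INSERT INTO instances (name) VALUES ('b')")
        self.rollback()
        self.assertEqual(count_instances(), 0)
```

Each test sees an empty table regardless of ordering, because the outer ROLLBACK in tearDown undoes whatever the previous test inserted, while the CREATE TABLE is paid for only once.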

And as implied earlier, to ensure the operations upon this schema don’t conflict with parallel test runs, the whole thing is running within a database that is specific to the Python process.

So instead of the current behavior of generating the entire nova schema for every test and being hardcoded to SQLite, a particular test will be able to run itself against any specific backend or all available backends in series, without needing to do a CREATE for the whole schema on every test. It will greatly expand database coverage as well as allow database tests to run dramatically faster, using entirely consistent systems for setting up schemas and database connectivity.

The “transactional test” system is one I’ve used extensively in other projects. SQLAlchemy itself now runs tests against a py.test-specific variant which runs under parallel testing and generates ad-hoc schemas per Python process. The patches above achieve these patterns successfully and transparently in the context of OpenStack tests; only the “scenarios” support for a single test to run repeatedly against multiple backends is still a todo.

However, the first patch has just been -1’ed by Robert Collins, the publisher of many of the various “testtools” libraries that are prevalent within OpenStack projects.

Robert suggests that the approach integrate with the testresources library: https://pypi.python.org/pypi/testresources.   I’ve evaluated this system and after some initial resistance I can see that it would in fact work very nicely with the system I have, in that it provides the OptimisingTestSuite - a special unittest test suite that will take tests like the above which are marked needing particular resources, and then sort them such that individual resources are set up and torn down a minimal number of times.    It has heavy algorithmic logic to accomplish this which is certainly far beyond what would be appropriate to home-roll within oslo.db.

I like the idea of integrating this optimization a lot, however it runs into a particular issue which I also hit upon with my more simplistic approach.   

The issue is that being able to use a resource like a database schema across many tests requires that some kind of logic has access to the test run as a whole. At the very least, a hook that indicates “the tests are done, let’s tear down these ad-hoc databases” is needed.

For my first iteration, I observed that OpenStack tests are generally run either via testr, or via a shell script. So to that end I expanded upon an approach that was already present in oslo.db, that is to use scripts which provision the names of databases to create, and then drop them at the end of the whole test run. For testr, I used the “instance_execute”, “instance_dispose”, and “instance_provision” hooks in .testr.conf to call upon these sub-scripts:

    instance_provision=${PYTHON:-python} -m oslo.db.sqlalchemy.provision echo $INSTANCE_COUNT
    instance_dispose=${PYTHON:-python} -m oslo.db.sqlalchemy.provision drop --conditional $INSTANCE_IDS
    instance_execute=OSLO_SCHEMA_TOKEN=$INSTANCE_ID $COMMAND

That is, the provisioning system within tests looks only at OSLO_SCHEMA_TOKEN to determine what “name” it is running on.   The final “teardown” is given by instance_dispose which emits a DROP for the databases that were created.  The “echo” command does *not* create a database, it only generates identifiers - the databases themselves are created lazily on an as-needed basis.
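The lazy, token-keyed provisioning described above could look roughly like the following. This is only a sketch under stated assumptions: it uses SQLite files in place of a real backend, and the function names (database_name, get_database, drop_databases), the "oslo_test_" prefix and the "default" fallback token are all hypothetical. Only the OSLO_SCHEMA_TOKEN variable comes from the patches.

```python
import os
import sqlite3

# Databases created so far in this process: name -> (path, connection).
_provisioned = {}


def database_name(environ=os.environ):
    # The token is the only thing that distinguishes one test process
    # from another. The fallback value is an assumption for the sketch.
    token = environ.get("OSLO_SCHEMA_TOKEN", "default")
    return "oslo_test_%s" % token


def get_database(directory, environ=os.environ):
    """Return the database for this process, creating it on first use."""
    name = database_name(environ)
    if name not in _provisioned:
        path = os.path.join(directory, name + ".db")
        _provisioned[name] = (path, sqlite3.connect(path))
    return _provisioned[name][1]


def drop_databases():
    """The "drop" step: remove every database that was actually created."""
    for path, conn in _provisioned.values():
        conn.close()
        if os.path.exists(path):
            os.remove(path)
    _provisioned.clear()
```

Two calls with the same token return the same connection, while a different token gets its own database, which is what keeps parallel test processes from colliding.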

For systems that use shell scripts, the approach is the same.   Those systems would integrate the above three commands into the shell script directly, because again all that’s needed is that OSLO_SCHEMA_TOKEN environment variable and then the “drop” step at the end.

Lots of people have complained that I put those hooks into .testr.conf. Even though they do not preclude the use of other systems, people don’t like them there, so OK, let’s take them out. Which leaves us with the question: what system *should* we use?

Robert’s suggestion of OptimisingTestSuite sounds great, so let’s see how that works. We have to in fact use the unittest “load_tests()” hook: https://docs.python.org/2/library/unittest.html#load-tests-protocol. It says “new in Python 2.7”; I’m not sure whether testrepository honors this system in Python 2.6 as well (which is my first question). The hook does not seem to be supported by nose, and py.test is totally out the window already, since it doesn’t integrate with testscenarios and probably not with testresources either.

It also means, unless I’m totally misunderstanding (please clarify for me!), that integrating the transactional test provisioning system will require projects to add a load_tests() function to all of their test modules, or at least the ones that include classes which subclass DbTestCase. I grepped around and found the python-keystoneclient project using this method - there are load_tests() functions in many modules, each of which specifies OptimisingTestSuite separately. This already seems less than ideal in that it no longer is enough for a test case to subclass a particular base, like DbTestCase; the whole thing still won’t work unless this magic load_tests() function is also present in the module with explicit callouts to OptimisingTestSuite, or some oslo.db hook to do something similar.
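For readers who haven't used the hook, this is roughly what a per-module load_tests() looks like, sketched with the standard library only. The suite built here is a deliberately simple stand-in, not testresources: it just sorts tests so that those sharing a schema run next to each other, whereas OptimisingTestSuite does far more. The schema attribute, the example test classes and the iter_tests helper are hypothetical.

```python
import unittest


class DbTestCase(unittest.TestCase):
    # Hypothetical attribute naming the schema a test class needs.
    schema = None


class NovaSchemaTest(DbTestCase):
    schema = "nova"

    def test_one(self):
        pass


class NoSchemaTest(DbTestCase):
    def test_two(self):
        pass


class OtherNovaSchemaTest(DbTestCase):
    schema = "nova"

    def test_three(self):
        pass


def iter_tests(suite):
    """Flatten nested suites into individual test cases."""
    for item in suite:
        if isinstance(item, unittest.TestSuite):
            for test in iter_tests(item):
                yield test
        else:
            yield item


def load_tests(loader, standard_tests, pattern):
    # unittest calls this hook once per module that defines it. Grouping
    # by schema means each schema is set up and dropped fewer times.
    ordered = sorted(iter_tests(standard_tests),
                     key=lambda test: getattr(test, "schema", None) or "")
    suite = unittest.TestSuite()
    suite.addTests(ordered)
    return suite
```

The point of the sketch is the boilerplate cost: every module containing DbTestCase subclasses would need a function like this, or a call into some shared helper, before any reordering can happen.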

Additionally, in order to get the optimization to work across multiple test modules, the load_tests() functions would have to coordinate such that the same OptimisingTestSuite is used across all of them - I have not seen an example of this, though again perhaps an oslo.db helper can coordinate. But it does mean that thousands of tests will now be reordered and controlled by OptimisingTestSuite, even those tests within the module that are not actually in need of it, unless the system can be made more specific to only put certain kinds of test cases into that particular suite.

Anyway, this seems like an invasive and verbose change to be made across hundreds of modules, unless there is a better way I’m not familiar with (which is my second question).  My approach of using .testr.conf / shell scripts allowed the system to work transparently for any test case that subclassed DbTestCase, but again, I need an approach that will be approved of by the community or else it is obviously useless effort.   If you all are OK with putting lots of load_tests() functions throughout your test modules, just let me know and I’ll go with that.

So to generalize among questions one and two above, the overarching question I have for the OpenStack developer community: please tell me what system(s) are acceptable here in order to provide either per-test-run fixtures or otherwise be able to instrument the collection of tests! Would you all be amenable to OptimisingTestSuite being injected with explicit code across all of your test modules that include database-backed tests, or should some other system be devised, and if so, what is the nature of that system? Are there more examples I should be looking at? To be clear, the system I am working on here will be the official “oslo.db” system of writing database-enabled tests, which can run very efficiently in parallel using per-process schemas and transaction-rollback tests. Ideally we’re going to want this system to be available anywhere a test decides to subclass the DB test fixture.

I can make this approach work in many ways, but at this point I’m not willing to spend another four weeks on an approach only to have it -1’ed.      Can the community please advise on what methodology is acceptable so that I may spend my efforts on a system that can be approved?  Thanks!

- mike

[openstack-dev] [all] Acceptable methods for establishing per-test-suite behaviors
