[OpenStack-Infra] Using MySQL for unit tests in storyboard

James E. Blair jeblair at openstack.org
Wed Mar 26 21:31:38 UTC 2014


Clint Byrum <clint at fewbar.com> writes:

>>  * The model does not specify an engine for MySQL (this is a difference
>>    between the model and the alembic migrations). So when the unit tests
>>    are run on MySQL, they may end up on MyISAM (they do in the gate).
>> 
>
> Why MyISAM? The default has been InnoDB since MySQL 5.5.

We override the default storage engine on our test nodes to be MyISAM
specifically so that we catch errors like this.  It used to be a big
problem with Nova.  Perhaps enough time has passed that we can say that
it is inconceivable that anyone anywhere would be running on a system
where myisam is the default.  I'm not prepared to vouch for that.  :)

>>  * The two errors that show up in the test results for my patch relate
>>    to how timestamps are stored differently between MySQL and SQLite.
>> 
>
> Isn't this a bug in sqlalchemy? I'd expect any data type that is a common
> name in the two databases (and thus the same in the model definition)
> to produce the same data in the python representation.

After reading the SQLite docs, I know less than I did before.
Apparently DATETIME might be stored as TEXT, REAL, or INTEGER values.
In practice, the sqlite databases created by the current unit tests seem
to end up with TEXT values, because they store values like "2014-03-26
21:11:13.304460".  According to the docs, MySQL accepts but discards
sub-second values.  The error is that the value it gets back from MySQL
lacks the part after the decimal point; when the test is run with
SQLite, it is retained.

>>  * The errors that you can not see in the test report from Jenkins but
>>    would if you ran in an environment with InnoDB as the default engine
>>    for MySQL are several instances of foreign key constraint errors.
>
> Side note: IMHO: Just drop the FK constraints if you want the app to
> be scalable.  FK's just add extra reads for every write, when you could
> be doing that verification / orphan clean up offline later on.

I would like to do that.  That might even be my preferred way of
addressing this problem.  I just haven't dug into it yet figuring we
could probably answer the present question "should we require mysql for
tests" independently; perhaps they will end up being more related than I
thought.

>> I think it is desirable to run the unit tests with MySQL largely because
>> I think we should be testing it in the way we intend to use it in
>> production.  Note especially the last point above -- our unit tests have
>> significant problems with MySQL.  That suggests one of two things to me:
>> either the tests aren't testing real behavior, or our system is actually
>> broken in production.  Neither one seems like a good situation.
>> 
>
> Wouldn't integration tests be the place to find production vs. code path
> issues? Tying unit tests to the creation of mysql databases introduces a
> really big piece of machinery into the test fixture setup, which you
> want to be fast and lean so developers can iterate on it while working.

Then I'm not sure what the database unit tests are testing.  Basically,
Storyboard's short history of CD is that it frequently gets wedged
because it's never tested under a production configuration.  If you are
arguing that an integration test in this context is "integrate
storyboard and mysql and see if they work together", that does make
sense.  However, I would argue that storyboard itself doesn't (and
shouldn't) work _without_ mysql It certainly requires a database.
Instead our "unit tests" are currently "integration" tests where we
integrate storyboard with a component (sqlite) that is never used when
running it.

Regardless of semantics, I see it as a fairly clear dichotomy: we either
test with a component we don't use, or we test with the one we do use.
I would prefer the latter.

>> That's a good point, but I personally find testing as close to
>> production as possible desirable -- MySQL and SQLite are not compatible
>> so we shouldn't pretend that they are.  If we prove this is a viable
>> strategy, then I think we should recommend OpenStack projects adopt it
>> as well.
>
> I just wonder if SQLAlchemy is intended to "smooth out" the differences
> between the two in such a way where you can at least be confident that
> your code is doing what you think it is.

I'm not sure whether it's intended to do that, but I don't believe it
does that very well.  In practice, to write a good application you have
to know a lot about exactly what your database is doing.  If you write
against two databases, then you still need to know twice as much.  For a
while now, I have considered SQLAlchemy (and ORMs in general) as very
useful tools to map between an application model and databases, but I've
ceased considering them to be an abstraction layer.

-Jim



More information about the OpenStack-Infra mailing list