[openstack-dev] [cinder] Volume Drivers unit tests

yang, xing xing.yang at emc.com
Thu Jul 21 20:00:33 UTC 2016


Hi Ivan,

Thanks for sending this out.  Regarding the issue in the EMC VNX driver unit tests, it is tracked by this bug https://bugs.launchpad.net/cinder/+bug/1578986.  The driver was recently refactored so this is probably a new issue introduced by the refactor.  We are investigating this issue.

Thanks,
Xing


________________________________
From: Ivan Kolodyazhny [e0ne at e0ne.info]
Sent: Thursday, July 21, 2016 1:02 PM
To: OpenStack Development Mailing List
Subject: [openstack-dev] [cinder] Volume Drivers unit tests

Hi team,

First of all, I would like to apologize if my mail is too emotional. I spent too much time trying to fix it and failed.

TL;DR;

What I want to say is: "Let's spend some time to make our tests better and fix all issues". Patch [1] is still unstable. Unit tests pass or fail seemingly at random. Also, I've disabled some tests to get the patch through CI.


Long version:

While I was working on patch "Move drivers unit tests to unit.volume.drivers directory" [1] I've found a lot of issues with our unit tests :(. Not all of them are fixed yet, so that patch is still in progress.

What I found and what we have to fix:

1) Execution time [2]. I don't want to argue about what counts as a unit test, but 2-4 seconds per test should be unacceptable, IMO.

2) Execution order. Seriously, do you know that our tests will fail or hang if the execution order changes? If even one test for driver A fails, some tests for driver B will fail too.

3) Lack of mocks. It's the root cause of #2. We don't mock sleeps and event loops properly. We don't mock RPC calls well either [3]. We don't have the 'cinder.openstack.common.rpc.impl_fake' module in the Cinder tree.

In some drivers, we use oslo_service.loopingcall.FixedIntervalLoopingCall [4]. We've got the ZeroIntervalLoopingCall [5] class in Cinder. Do we use it everywhere, or mock FixedIntervalLoopingCall correctly? I don't think so. I hacked oslo_service in my env to raise an exception if interval > 0, and 297 tests failed. It means our tests really sleep. We have to get rid of this. TBH, not only volume driver unit tests failed; e.g. some API unit tests failed too.


4) Due to #3, unit tests sometimes hang even on the master branch with minor changes. If I stop execution of such tests, I usually see something like [6]. In most cases I see that the tests for the following drivers hang: EMC, Huawei, Dell and RBD.
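
To illustrate the "mock sleeps" part of point 3, here is a minimal sketch using only the standard library. The helper wait_until_ready and the test case name are hypothetical and do not exist in Cinder; the only thing shown is that patching time.sleep lets a polling loop run through its retries without any real delay.

```python
import time
import unittest
from unittest import mock


def wait_until_ready(get_status, interval=2, retries=5):
    """Hypothetical driver helper: poll until the backend reports 'ready'."""
    for _ in range(retries):
        if get_status() == "ready":
            return True
        time.sleep(interval)  # a real delay here slows the whole test suite
    return False


class WaitUntilReadyTestCase(unittest.TestCase):
    def test_polls_without_sleeping(self):
        # The fake backend is not ready twice, then ready.
        get_status = mock.Mock(side_effect=["creating", "creating", "ready"])
        # Replace time.sleep only for the duration of this test.
        with mock.patch("time.sleep") as fake_sleep:
            self.assertTrue(wait_until_ready(get_status))
        # Two polls were not ready, so two sleeps were requested, none real.
        self.assertEqual(2, fake_sleep.call_count)
        fake_sleep.assert_called_with(2)
```

Because the patch is scoped with a context manager, time.sleep is restored when the test ends, so nothing leaks into tests that run afterwards.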

It's hard to debug such hangs because of the lack of tooling for eventlet debugging. Eventlet backdoors and gdb-python help a bit. Maybe somebody knows a better solution for it.
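
The "raise an exception if interval > 0" check from point 3 could be done without editing oslo_service itself. What follows is only a sketch of the idea: StandInLoopingCall is a local stand-in written for this example, not the oslo.service class, and forbid_real_intervals is a hypothetical helper name. It assumes the class under guard has a start(interval) method.

```python
from unittest import mock


class StandInLoopingCall:
    """Local stand-in for a fixed-interval looping call.

    NOT the oslo.service implementation; it only mimics a start(interval)
    entry point so the guard below has something to wrap.
    """

    def __init__(self, func):
        self.func = func

    def start(self, interval):
        # A real looping call would wait `interval` seconds between runs.
        return self.func()


def forbid_real_intervals(looping_call_cls):
    """Return a patcher that makes start() raise for any interval > 0."""
    original_start = looping_call_cls.start

    def guarded_start(self, interval, *args, **kwargs):
        if interval > 0:
            raise AssertionError(
                "looping call started with interval=%s; tests must not sleep"
                % interval)
        return original_start(self, interval, *args, **kwargs)

    # mock.patch.object restores the original start() when the patch exits.
    return mock.patch.object(looping_call_cls, "start", guarded_start)
```

Used as "with forbid_real_intervals(StandInLoopingCall):", a start(0) call goes through as before, while a start(5) call fails immediately with an AssertionError that points at the offending test instead of slowing it down.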

[1] https://review.openstack.org/#/c/320148/
[2] http://paste.openstack.org/show/539081/
[3] https://github.com/openstack/cinder/search?utf8=%E2%9C%93&q=impl_fake
[4] https://github.com/openstack/oslo.service/blob/master/oslo_service/loopingcall.py#L162
[5] https://github.com/openstack/cinder/blob/cfbb5bde4d9b37c39f6813fe685f987f8a990483/cinder/tests/unit/utils.py#L289
[6] http://paste.openstack.org/show/539090/


Regards,
Ivan Kolodyazhny,
http://blog.e0ne.info/