<div dir="ltr">FakeUI, which is based on fake threads, is obviously needed for development purposes.<div>Ideally, we need to refactor our integration tests so that we don't run the whole pipeline in every test. To start, I suggest that we switch from threads to synchronous runs of test cases (while keeping threads for FakeUI).</div><div>Please take a look and comment on this draft: <a href="https://review.openstack.org/#/c/294976/">https://review.openstack.org/#/c/294976/</a></div><div><br></div><div>Thanks,</div></div><br><div class="gmail_quote"><div dir="ltr">On Wed, Mar 16, 2016 at 7:30 AM Igor Kalnitsky <<a href="mailto:ikalnitsky@mirantis.com">ikalnitsky@mirantis.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">Hey Vitaly,<br>

<br>

Thanks for your feedback, it's an important point. However, I think<br>

the problem didn't come across clearly, so let me explain it again.<br>

<br>

You see, Nailgun unit tests are failing due to races or deadlocks<br>

caused by two transactions: the test transaction and the fake thread<br>

transaction, and we must face it and fix it. This problem has nothing<br>

to do with the problem you're encountering in UI tests. Besides,<br>

removing fake threads from tests doesn't mean removing them from the<br>

Nailgun code base.<br>

<br>

So your problem must be addressed, but it's kinda another story.<br>

<br>

Thanks,<br>

Igor<br>

<br>

On Wed, Mar 16, 2016 at 4:21 PM, Vitaly Kramskikh<br>

<<a href="mailto:vkramskikh@mirantis.com" target="_blank">vkramskikh@mirantis.com</a>> wrote:<br>

> Igor,<br>

><br>

> We have UI and CLI integration tests which use the fake mode of Nailgun, and we<br>

> can't avoid using fake threads for them. So I think we need to figure out how to<br>

> fix fake threads instead. There is a critical bug which is the main reason<br>

> for randomly failing UI tests. To fix it, we need to fix fake thread<br>

> behaviour.<br>

><br>

> 2016-03-16 17:06 GMT+03:00 Igor Kalnitsky <<a href="mailto:ikalnitsky@mirantis.com" target="_blank">ikalnitsky@mirantis.com</a>>:<br>

>><br>

>> Hey Fuelers,<br>

>><br>

>> As you might know, we have recently been encountering a lot of random test<br>

>> failures on CI, and they are still there (likely with a bit lower probability).<br>

>> These failures are not actually random: they are caused by so-called<br>

>> fake threads.<br>

>><br>

>> Fake threads, actually, ain't fake at all. They are native OS threads<br>

>> that are designed to emulate Astute behaviour (i.e. catch an RPC call and<br>

>> respond with an appropriate message). Since they are native threads and<br>

>> we use SQLAlchemy's scoped_session, fake threads use a separate<br>

>> database session and hence a separate transaction. That leads to the<br>

>> following issues:<br>

>><br>

>> * Races. We don't know when threads are switched, therefore we don't<br>

>> know what's committed and what's not. Some Nailgun tests send<br>

>> something via RPC (caught by fake threads) and immediately check<br>

>> the result. The issue is, we can't guarantee that the fake thread has<br>

>> already committed its result. That could be avoided by waiting for the<br>

>> 'ready' status of the created Nailgun task; however, it's better to simply<br>

>> not use fake threads in that case and call the appropriate<br>

>> Nailgun receiver's method directly in the test.<br>

>><br>

>> * Deadlocks. It's incredibly hard to ensure the same order of database<br>

>> locks in test + business code on the one hand and fake thread code on<br>

>> the other. That's why we can (and we do) encounter deadlocks on CI,<br>

>> when a test case waits for a lock acquired by a fake thread, and the fake thread<br>

>> waits for a lock acquired by the test case.<br>
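
A minimal sketch of the race described above, and of the suggested alternative of calling the receiver directly from the test. It uses only the Python standard library; `FakeTask`, `receiver_deploy_resp`, `run_with_fake_thread` and `run_synchronously` are hypothetical stand-ins invented for illustration, not names from Nailgun, and no database or RPC layer is involved.

```python
import threading
import time


class FakeTask:
    """Hypothetical stand-in for a Nailgun task row."""

    def __init__(self):
        self.status = "running"


def receiver_deploy_resp(task):
    """Hypothetical stand-in for a receiver method that finishes a task."""
    task.status = "ready"


def run_with_fake_thread(task, delay=0.05):
    """Threaded variant: the response is applied later, on another thread."""

    def worker():
        # Emulates the time a fake thread needs before its result is visible.
        time.sleep(delay)
        receiver_deploy_resp(task)

    thread = threading.Thread(target=worker)
    thread.start()
    return thread


def run_synchronously(task):
    """Synchronous variant: the test applies the response itself."""
    receiver_deploy_resp(task)


# Threaded: checking right away depends on thread scheduling, so the
# value read here is not something a test can rely on.
racy_task = FakeTask()
thread = run_with_fake_thread(racy_task)
status_seen_immediately = racy_task.status
thread.join()  # only after waiting is the result guaranteed to be there
status_after_join = racy_task.status

# Synchronous: the result is available as soon as the call returns.
sync_task = FakeTask()
run_synchronously(sync_task)
status_sync = sync_task.status
```

In the threaded variant the test has to wait (here via `join()`) before asserting anything; in the synchronous variant there is nothing to wait for, and only one thread of execution takes part in the test.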

>><br>

>> Fake threads have become a bottleneck for landing patches to master in<br>

>> time, and we can't ignore it anymore. We have ~190 tests that use fake<br>

>> threads, and fixing them all at once is a boring routine. So I kindly<br>

>> ask Nailgun contributors to fix them as soon as we face them. Let's<br>

>> file a bug for each failure in CI, and quickly prepare a separate patch<br>

>> that removes the fake thread from the failed test.<br>

>><br>

>> Thanks in advance,<br>

>> Igor<br>

>><br>

>> __________________________________________________________________________<br>

>> OpenStack Development Mailing List (not for usage questions)<br>

>> Unsubscribe: <a href="http://OpenStack-dev-request@lists.openstack.org?subject:unsubscribe" rel="noreferrer" target="_blank">OpenStack-dev-request@lists.openstack.org?subject:unsubscribe</a><br>

>> <a href="http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev" rel="noreferrer" target="_blank">http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev</a><br>

><br>

><br>

><br>

><br>

> --<br>

> Vitaly Kramskikh,<br>

> Fuel UI Tech Lead,<br>

> Mirantis, Inc.<br>

><br>


><br>

<br>


</blockquote></div><div dir="ltr">-- <br></div><div dir="ltr">Mike Scherbakov<br>#mihgen</div>