[openstack-dev] [oslo.messaging] State wrapping in the MessageHandlingServer

Joshua Harlow harlowja at fastmail.com
Tue Nov 10 18:46:00 UTC 2015


Matthew Booth wrote:
> My patch to MessageHandlingServer is currently being reverted because it
> broke Nova tests:
>
> https://review.openstack.org/#/c/235347/
>
> Specifically it causes a number of tests to take a very long time to
> execute, which ultimately results in the total build time limit being
> exceeded. This is very easy to replicate. The
> test nova.tests.functional.test_server_group.ServerGroupTest.test_boot_servers_with_affinity
> is an example test which will always hit this issue. The problem is
> that ServerGroupTest.setUp() does:
>
>          self.compute2 = self.start_service('compute', host='host2')
>          self.addCleanup(self.compute2.kill)
>
> The problem with this is that start_service() adds a fixture which also
> adds kill as a cleanup method. kill does stop(), wait(). This means that
> the resulting call order is: start, stop, wait, stop, wait. The
> redundant call to kill is obviously a wart, but I feel we should have
> handled it anyway.
>
> The problem is that we decided it should be possible to restart a
> server. There are some unit tests in oslo.messaging that do this. It's
> not clear to me that there are any projects which do this, but after
> this experience I feel like it would be good to check before changing it :)
>
> The implication of that is that after wait() the state wraps, and we're
> now waiting on start() again. Consequently, when the second cleanup call
> hangs.
>
> We could fix Nova (at least the usage we have seen) by removing the
> wrapping. After wait() if you want to start a server again you need to
> create a new one.
>
> So, to be specific, lets consider the following 2 call sequences:
>
> 1. start stop wait stop wait
> 2. start stop wait start stop wait
>
> What should they do? The behaviours with and without wrapping are:
>
> 1. start stop wait stop wait
> WRAP: start stop wait HANG HANG
> NO WRAP: start stop wait NO-OP NO-OP
>
> 2. start stop wait start stop wait
> WRAP: start stop wait start stop wait
> NO WRAP: start stop wait NO-OP NO-OP NO-OP
>
> I'll refresh my memory on what they did before my change in the morning.
> Perhaps it might be simpler to codify the current behaviour, but iirc I
> proposed this because it was previously undefined due to races.

I personally prefer not allowing restarting, its needless code 
complexity imho and a feature that people imho probably aren't using 
anyway (just create a new server object if u are doing this), so I'd be 
fine with doing the above NO WRAP and turning those into NO-OPs (and for 
example raising a runtime error in the case of start stop wait start ... 
to denote that restarting isn't recommended/possible). If we have a 
strong enough reason to really to start stop wait start ...

I might be convinced the code complexity is worth it but for now I'm not 
convinced...

>
> Matt
>
> __________________________________________________________________________
> OpenStack Development Mailing List (not for usage questions)
> Unsubscribe: OpenStack-dev-request at lists.openstack.org?subject:unsubscribe
> http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev



More information about the OpenStack-dev mailing list