[nova] critical bug around reload/upgrades

Mohammed Naser mnaser at vexxhost.com
Fri Mar 29 00:42:54 UTC 2019


On Mon, Mar 25, 2019 at 6:02 AM Mark Goddard <mark at stackhpc.com> wrote:
>
>
>
> On Sun, 24 Mar 2019 at 00:35, Mohammed Naser <mnaser at vexxhost.com> wrote:
>>
>> Hello:
>>
>> I've discussed this for quite some time with Dan over IRC and a bit
>> with Zane as well, but basically, Nova thinks that when it gets a
>> reload (aka SIGHUP), nothing else has occurred.
>>
>> However, oslo.service actually calls stop(), reload() then start()
>> again, which potentially kills all RPC.  This has caused a pretty big
>> issue in our gates and it also means that the whole 'reload
>> nova-compute while upgrading to refresh info' concept is
>> fundamentally broken.
>>
>> I tried to do some work on this here, however, I wasn't really able to
>> get to the bottom of it.  There seems to be a decision that needs to
>> be made: do we change what reload() actually means in
>> oslo_service (it actually is more like a restart, not a reload) or
>> does Nova (and other projects) change their implementation in assuming
>> what reload() does?
>>
>> https://review.openstack.org/#/c/641907/
>>
>> This seems to have been floating around for a really long time, so I'd
>> be happy to work with someone to find the fix (and we can totally test
>> it inside OpenStack Ansible by reloading instead of restarting).
>>
> Thanks for bringing this up, Mohammed. I would also like to see a solution for this. We go with a hard restart of the service in kolla-ansible as a workaround. It would be nice if we could do a more lightweight HUP.

Looks like some progress has been made, but we're increasingly confident
that this is an oslo.service bug.

Matt & Dan have both left ideas around this, with possible solutions for
making a change like this backportable:

https://review.openstack.org/#/c/641907/
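
For anyone following along, here is a minimal sketch of the mismatch described
in the quoted message. It is a toy model only: ToyService and both handler
functions are hypothetical names made up for illustration, and none of this is
oslo.service or Nova code. It just shows why a service that assumes "nothing
else has occurred" on SIGHUP is surprised when the launcher wraps reload() in
stop() and start().

```python
class ToyService:
    """Hypothetical stand-in for a service that owns an RPC server."""

    def __init__(self):
        self.rpc_running = False
        self.events = []

    def start(self):
        self.rpc_running = True
        self.events.append("start")

    def stop(self):
        # Stopping tears down the RPC server, so in-flight work is lost.
        self.rpc_running = False
        self.events.append("stop")

    def reload(self):
        # What the service assumes SIGHUP means: refresh state in place.
        self.events.append("reload (rpc_running=%s)" % self.rpc_running)


def handle_sighup_as_restart(service):
    """The sequence described in the thread: stop, reload, then start."""
    service.stop()
    service.reload()
    service.start()


def handle_sighup_as_reload(service):
    """What the service expects: reload only, RPC left untouched."""
    service.reload()


if __name__ == "__main__":
    svc = ToyService()
    svc.start()
    handle_sighup_as_restart(svc)
    print(svc.events)
```

In the restart-style handler, reload() runs while RPC is already down, which
is the behaviour the service does not expect; in the reload-style handler, RPC
stays up throughout.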

Thanks.

>> Thanks!
>> Mohammed
>>
>> --
>> Mohammed Naser — vexxhost
>> -----------------------------------------------------
>> D. 514-316-8872
>> D. 800-910-1726 ext. 200
>> E. mnaser at vexxhost.com
>> W. http://vexxhost.com
>>





More information about the openstack-discuss mailing list