[nova] critical bug around reload/upgrades
Hello:

I've discussed this for quite some time with Dan over IRC and a bit with Zane as well, but basically, Nova thinks that when it gets a reload (aka SIGHUP), nothing else has occurred. However, oslo.service actually calls stop(), reload() and then start() again, which potentially kills all RPC. This has caused a pretty big issue in our gates, and it also means that the whole 'reload nova-compute while upgrading to refresh info' concept is fundamentally broken.

I tried to do some work on this here, but I wasn't really able to get to the bottom of it. There's a decision that needs to be made: do we change what reload() actually means in oslo.service (it currently behaves more like a restart, not a reload), or do Nova (and other projects) change their assumptions about what reload() does?

https://review.openstack.org/#/c/641907/

This seems to have been floating around for a really long time, so I'd be happy to work with someone to find the fix (and we can totally test it inside OpenStack Ansible by reloading instead of restarting).

Thanks!
Mohammed

--
Mohammed Naser — vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser@vexxhost.com
W. http://vexxhost.com
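To make the mismatch concrete, here is a minimal, self-contained sketch of the two readings of SIGHUP. This is illustrative only, not the actual oslo.service or Nova code; the class and handler names are made up for the example.

    import signal
    import time

    class FakeService:
        # Illustrative stand-in only -- not the real oslo.service code.

        def start(self):
            print("start(): creating RPC server and worker threads ...")

        def stop(self):
            print("stop(): tearing down RPC server and workers ...")

        def reload(self):
            print("reload(): re-reading mutable config options ...")

    svc = FakeService()

    def handle_sighup_as_restart(signum, frame):
        # What the thread says oslo.service effectively does today:
        # a full stop/start cycle around the config refresh, which is
        # why RPC connections can be lost on SIGHUP.
        svc.stop()
        svc.reload()
        svc.start()

    def handle_sighup_as_reload(signum, frame):
        # What Nova's "reload while upgrading" workflow assumes:
        # only refresh config, leave everything else running.
        svc.reload()

    if __name__ == "__main__":
        signal.signal(signal.SIGHUP, handle_sighup_as_restart)
        svc.start()
        while True:   # send SIGHUP to this process to see the difference
            time.sleep(1)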
On Sun, 24 Mar 2019 at 00:35, Mohammed Naser <mnaser@vexxhost.com> wrote:
> Hello:
> I've discussed this for quite some time with Dan over IRC and a bit with Zane as well, but basically, Nova thinks that when it gets a reload (aka SIGHUP), nothing else has occurred.
> However, oslo.service actually calls stop(), reload() and then start() again, which potentially kills all RPC. This has caused a pretty big issue in our gates, and it also means that the whole 'reload nova-compute while upgrading to refresh info' concept is fundamentally broken.
> I tried to do some work on this here, but I wasn't really able to get to the bottom of it. There's a decision that needs to be made: do we change what reload() actually means in oslo.service (it currently behaves more like a restart, not a reload), or do Nova (and other projects) change their assumptions about what reload() does?
> https://review.openstack.org/#/c/641907/
> This seems to have been floating around for a really long time, so I'd be happy to work with someone to find the fix (and we can totally test it inside OpenStack Ansible by reloading instead of restarting).
Thanks for bringing this up, Mohammed; I would also like to see a solution for this. We go with a hard restart of the service in kolla-ansible as a workaround. It would be nice if we could do a more lightweight HUP.
On Mon, Mar 25, 2019 at 6:02 AM Mark Goddard <mark@stackhpc.com> wrote:
> Thanks for bringing this up, Mohammed; I would also like to see a solution for this. We go with a hard restart of the service in kolla-ansible as a workaround. It would be nice if we could do a more lightweight HUP.
Looks like some progress has been made, but we're pretty confident this is more and more an oslo.service bug. Matt & Dan have both left ideas on the review with possible solutions for how to make a change like this backportable: https://review.openstack.org/#/c/641907/

Thanks.
--
Mohammed Naser — vexxhost
-----------------------------------------------------
D. 514-316-8872
D. 800-910-1726 ext. 200
E. mnaser@vexxhost.com
W. http://vexxhost.com
On 3/28/2019 7:42 PM, Mohammed Naser wrote:
> Looks like some progress has been made, but we're pretty confident this is more and more an oslo.service bug.
> Matt & Dan have both left ideas on the review with possible solutions for how to make a change like this backportable.
Another update on this: I was trying to recreate the originally reported issue in the nova bug:

https://bugs.launchpad.net/nova/+bug/1715374

I didn't even get to the point of the libvirt driver waiting for the network-vif-plugged event, because privsep blows up much earlier during server create after SIGHUP'ing the service. Details start at comment 34 in that bug, but the tl;dr is that the privsep-helper child processes are gone after the SIGHUP, so anything that relies on privsep (which I think is now anything using root in the libvirt driver and the os-vif utils code) won't work until you restart the service.

I don't yet know if this is a regression in Stein, but I'm going to create a stable/rocky devstack and try to find out.

--
Thanks,
Matt
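For anyone else trying to reproduce this, a rough sketch of the check follows. The process names and the five-second wait are assumptions based on a typical devstack node, not part of the bug report; run it as a user allowed to signal nova-compute.

    import os
    import signal
    import subprocess
    import time

    def pids(pattern):
        # Return PIDs whose command line matches `pattern`, via pgrep -f.
        result = subprocess.run(["pgrep", "-f", pattern],
                                capture_output=True, text=True)
        return [int(p) for p in result.stdout.split()]

    print("privsep helpers before SIGHUP:", pids("privsep-helper"))

    for pid in pids("nova-compute"):
        os.kill(pid, signal.SIGHUP)

    time.sleep(5)  # let the service run its stop()/reload()/start() cycle

    # If the bug is present, the helpers are gone and the next privileged
    # operation (e.g. plugging a VIF during a server create) fails until
    # nova-compute is fully restarted.
    print("privsep helpers after SIGHUP: ", pids("privsep-helper"))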
On 4/3/19 1:20 PM, Matt Riedemann wrote:
> Another update on this: I was trying to recreate the originally reported issue in the nova bug:
> https://bugs.launchpad.net/nova/+bug/1715374
> I didn't even get to the point of the libvirt driver waiting for the network-vif-plugged event, because privsep blows up much earlier during server create after SIGHUP'ing the service. Details start at comment 34 in that bug, but the tl;dr is that the privsep-helper child processes are gone after the SIGHUP, so anything that relies on privsep (which I think is now anything using root in the libvirt driver and the os-vif utils code) won't work until you restart the service.
> I don't yet know if this is a regression in Stein, but I'm going to create a stable/rocky devstack and try to find out.
With that oslo.service patch [1] in place, I recreated Matt's result as described above. Then I hacked on oslo.privsep a bit [2] and was able to resolve the issue (create instances smoothly after SIGHUPping n-cpu.service).

That fix is going to need UT, but also more thread- and socket- and security-savvy eyeballs to make sure it has legs. But hopefully we can finally put this one to bed.

efried

[1] https://review.opendev.org/#/c/641907/
[2] https://review.opendev.org/#/c/678323/
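Editorial note: without digging into the actual change in [2], the general shape of that kind of fix, sketched very loosely (none of the names below are real oslo.privsep internals), is to stop assuming the helper spawned at service start is still alive, and to respawn it on demand when the channel has gone away:

    import functools

    class PrivContext:
        # Loose illustration only; not the real oslo.privsep PrivContext.

        def __init__(self):
            self.channel = None

        def _spawn_helper(self):
            # The real library forks/execs a privsep-helper process and
            # keeps a socket to it; here we just simulate that with a flag.
            self.channel = "connected"

        def _channel_alive(self):
            # After SIGHUP the old helper has been reaped along with the
            # service's other children, so the channel is unusable.
            return self.channel == "connected"

        def entrypoint(self, func):
            @functools.wraps(func)
            def wrapper(*args, **kwargs):
                if not self._channel_alive():
                    self._spawn_helper()   # respawn instead of failing
                return func(*args, **kwargs)
            return wrapper

    context = PrivContext()

    @context.entrypoint
    def plug_vif(vif_id):
        return "plugged %s" % vif_id

    print(plug_vif("port-1"))    # spawns the helper on first use
    context.channel = None       # simulate the helper dying on SIGHUP
    print(plug_vif("port-2"))    # respawns instead of blowing up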
participants (4)
- Eric Fried
- Mark Goddard
- Matt Riedemann
- Mohammed Naser