[openstack-dev] [Nova] Recent change breaks manual control of service enabled / disabled status - suggest it is backed out and re-worked

Day, Phil philip.day at hp.com
Mon Nov 11 11:34:10 UTC 2013


Hi Folks,

I'd like to get some eyes on a bug I just filed:  https://bugs.launchpad.net/nova/+bug/1250049

A recent change (https://review.openstack.org/#/c/52189/9) introduced automatic disabling and re-enabling of nova-compute when the connection to libvirt is lost and recovered. The problem is that it takes no account of the fact that a cloud administrator may have other reasons for disabling a service, and it always puts nova-compute back into an enabled state.

The impact of this is pretty big for us - at any point in time we have a number of servers disabled for various operational reasons, and there are times when we need to restart libvirt as part of a deployment.  With this change in place all of those hosts are returned to an enabled state, and the reason that they were disabled is lost.

While I like the concept that an error condition like this should disable the host from a scheduling perspective, I think it needs to be implemented as an additional form of disablement (i.e. a separate value kept in the ServiceGroup API), not an override of the current one.
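
To make the idea concrete, here is a minimal sketch of the kind of separation I have in mind. It is not Nova code: the ServiceState class, its field names, and the two handler functions are all hypothetical and only illustrate keeping the administrator's disable flag apart from an automatically managed one, with the host schedulable only when neither is set.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceState:
    """Hypothetical model of a compute service record (not Nova's schema)."""
    # Set and cleared only by the cloud administrator.
    admin_disabled: bool = False
    admin_disabled_reason: Optional[str] = None
    # Set and cleared only by the service itself on error conditions.
    auto_disabled: bool = False

    @property
    def schedulable(self) -> bool:
        # The host is a scheduling candidate only if neither flag is set.
        return not (self.admin_disabled or self.auto_disabled)


def on_libvirt_connection_lost(state: ServiceState) -> None:
    """Hypothetical handler: record the error without touching admin state."""
    state.auto_disabled = True


def on_libvirt_connection_restored(state: ServiceState) -> None:
    """Hypothetical handler: clear only the automatically set flag."""
    state.auto_disabled = False
```

With a layout like this, a libvirt restart on a host that an administrator had already disabled would leave both the disabled status and its reason untouched.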

I'd like to propose that the current change be reverted as a priority, and that a new approach then be submitted as a second step that works alongside the current enable/disable reason.

Sorry for not catching this in the review stage - I didn't notice this one at all.

Phil