[openstack-dev] [nova] Networks are not cleaned up in build failure
brian.haley at hp.com
Thu Jan 15 14:33:39 UTC 2015
On 01/14/2015 02:15 PM, Andrew Laski wrote:
> On 01/14/2015 12:57 PM, Murray, Paul (HP Cloud) wrote:
>> Hi All,
>> I recently experienced failures getting images from Glance while spawning
>> instances. This step comes after building the networks in the build sequence.
>> When the Glance failure occurred the instance was cleaned up and rescheduled
>> as expected, but the networks were not cleaned up. On investigation I found
>> that the cleanup code for the networks is in the compute manager’s
>> _do_build_run_instance() method as follows:
>>     # NOTE(comstud): Deallocate networks if the driver wants
>>     # us to do so.
>>     if self.driver.deallocate_networks_on_reschedule(instance):
>>         self._cleanup_allocated_networks(context, instance,
>> The default behavior for the deallocate_networks_on_reschedule() method
>> defined in ComputeDriver is:
>>     def deallocate_networks_on_reschedule(self, instance):
>>         """Does the driver want networks deallocated on reschedule?"""
>>         return False
>> Only the Ironic driver overrides this method to return True, so I think this
>> means the networks will not be cleaned up for any other virt driver.
>> Is this really the desired behavior?
> Yes. Other than when using Ironic there is nothing specific to a particular
> host in the networking setup. This means it is not necessary to deallocate and
> reallocate networks when an instance is rescheduled, so we can avoid the
> unnecessary work of doing it.
That's either not true any more, or not true when DVR is enabled in Neutron,
since in this case the port['binding:host_id'] value has been initialized to a
compute node, and won't get updated when nova-conductor re-schedules the VM.
This causes the neutron port to stay on the original compute node, and any
neutron operations (like floatingip-associate) happen on the "old" port, leaving
the VM unreachable.
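
To make the failure mode concrete, here is a rough sketch in Python. It is not nova or neutron code. Port, FakeNetworkAPI, DefaultDriver, CleanupDriver and build_with_reschedule are all hypothetical names, and the sketch assumes that an existing port is reused on the retry without its host binding being refreshed, which is the behavior described above.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set, Tuple


@dataclass
class Port:
    """Hypothetical stand-in for a neutron port; only the host binding is modelled."""
    binding_host_id: Optional[str] = None


class DefaultDriver:
    """Hypothetical driver that keeps the default answer (False)."""

    def deallocate_networks_on_reschedule(self, instance: str) -> bool:
        return False


class CleanupDriver(DefaultDriver):
    """Hypothetical driver that asks for networks to be deallocated."""

    def deallocate_networks_on_reschedule(self, instance: str) -> bool:
        return True


class FakeNetworkAPI:
    """Hypothetical network API that tracks one port per instance."""

    def __init__(self) -> None:
        self.ports: Dict[str, Port] = {}

    def allocate(self, instance: str, host: str) -> Port:
        # Assumption: a port that already exists is reused as-is, so its
        # binding still points at the host of the first build attempt.
        if instance not in self.ports:
            self.ports[instance] = Port(binding_host_id=host)
        return self.ports[instance]

    def deallocate(self, instance: str) -> None:
        self.ports.pop(instance, None)


def build_with_reschedule(driver, net, instance, hosts, failing: Set[str]
                          ) -> Tuple[Optional[str], Optional[str]]:
    """Try each host in turn; return (host the VM landed on, port binding)."""
    for host in hosts:
        port = net.allocate(instance, host)
        if host in failing:
            # Build failed on this host (e.g. image download error).
            if driver.deallocate_networks_on_reschedule(instance):
                net.deallocate(instance)
            continue
        return host, port.binding_host_id
    return None, None


if __name__ == "__main__":
    hosts = ["node1", "node2"]
    print(build_with_reschedule(DefaultDriver(), FakeNetworkAPI(), "vm1", hosts, {"node1"}))
    print(build_with_reschedule(CleanupDriver(), FakeNetworkAPI(), "vm1", hosts, {"node1"}))
```

With the default driver the VM ends up on node2 while the port binding still says node1. With the driver that requests deallocation, the port is recreated on the retry and the binding matches the new host.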
> If the instance goes to ERROR then the network will get cleaned up when the
> instance is deleted.
I think we need to clean up even in this case now too.