[openstack-dev] [nova] Networks are not cleaned up in build failure

Brian Haley brian.haley at hp.com
Thu Jan 15 14:33:39 UTC 2015


On 01/14/2015 02:15 PM, Andrew Laski wrote:
> 
> On 01/14/2015 12:57 PM, Murray, Paul (HP Cloud) wrote:
>>
>> Hi All, 
>>
>> I recently experienced failures getting images from Glance while spawning
>> instances. This step comes after building the networks in the build sequence.
>> When the Glance failure occurred the instance was cleaned up and rescheduled
>> as expected, but the networks were not cleaned up. On investigation I found
>> that the cleanup code for the networks is in the compute manager’s
>> _do_build_and_run_instance() method as follows:
>>
>>             # NOTE(comstud): Deallocate networks if the driver wants
>>             # us to do so.
>>             if self.driver.deallocate_networks_on_reschedule(instance):
>>                 self._cleanup_allocated_networks(context, instance,
>>                         requested_networks)
>>
>> The default behavior for the deallocate_networks_on_reschedule() method
>> defined in ComputeDriver is:
>>
>>     def deallocate_networks_on_reschedule(self, instance):
>>         """Does the driver want networks deallocated on reschedule?"""
>>         return False
>>
>> Only the Ironic driver overrides this method to return True, so I think this
>> means the networks will not be cleaned up for any other virt driver.
>>
>>  
>>
>> Is this really the desired behavior?
>>
> 
> Yes.  Other than when using Ironic there is nothing specific to a particular
> host in the networking setup.  This means it is not necessary to deallocate and
> reallocate networks when an instance is rescheduled, so we can avoid the
> unnecessary work of doing it.

That's either not true any more, or not true when DVR is enabled in Neutron,
since in this case the port['binding:host_id'] value has been initialized to a
compute node, and won't get updated when nova-conductor re-schedules the VM
elsewhere.

This causes the neutron port to stay on the original compute node, and any
neutron operations (like floatingip-associate) happen on the "old" port, leaving
the VM unreachable.
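
To make the failure mode concrete, here is a toy sketch of how the stale binding
could arise. It is not Nova or Neutron code: Port, FakeNetworkAPI and
build_with_reschedule are hypothetical stand-ins, and the sketch assumes that a
port left over from a failed build attempt is reused as-is on the next host.

```python
# Toy model only: hypothetical stand-ins, not Nova or Neutron code.
from dataclasses import dataclass, field


@dataclass
class Port:
    port_id: str
    binding_host_id: str = ""  # stands in for port['binding:host_id']


@dataclass
class FakeNetworkAPI:
    ports: dict = field(default_factory=dict)

    def allocate(self, instance_id, host):
        # Assumption: a port left behind by an earlier attempt is reused,
        # and its binding is not refreshed for the new host.
        port = self.ports.get(instance_id)
        if port is None:
            port = Port(port_id="port-" + instance_id, binding_host_id=host)
            self.ports[instance_id] = port
        return port

    def deallocate(self, instance_id):
        self.ports.pop(instance_id, None)


def build_with_reschedule(api, instance_id, hosts, deallocate_on_reschedule):
    """Fail the build on every host but the last; return (final host, port)."""
    for host in hosts[:-1]:
        api.allocate(instance_id, host)
        # Simulated spawn failure here (e.g. the image download fails).
        if deallocate_on_reschedule:
            api.deallocate(instance_id)
    final_host = hosts[-1]
    return final_host, api.allocate(instance_id, final_host)


hosts = ["compute-1", "compute-2"]
host, port = build_with_reschedule(FakeNetworkAPI(), "vm1", hosts, False)
print(host, port.binding_host_id)
host, port = build_with_reschedule(FakeNetworkAPI(), "vm1", hosts, True)
print(host, port.binding_host_id)
```

In this model, skipping deallocation leaves the port bound to compute-1 even
though the instance ends up on compute-2, while deallocating on reschedule lets
the second attempt create a port bound to the right host.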

> If the instance goes to ERROR then the network will get cleaned up when the
> instance is deleted.

I think we need to clean up even in this case now too.

-Brian


