[openstack-dev] [nova] Networks are not cleaned up in build failure
Andrew Laski
andrew.laski at rackspace.com
Thu Jan 15 17:55:57 UTC 2015
On 01/15/2015 09:33 AM, Brian Haley wrote:
> On 01/14/2015 02:15 PM, Andrew Laski wrote:
>> On 01/14/2015 12:57 PM, Murray, Paul (HP Cloud) wrote:
>>> Hi All,
>>>
>>> I recently experienced failures getting images from Glance while spawning
>>> instances. This step comes after building the networks in the build sequence.
>>> When the Glance failure occurred the instance was cleaned up and rescheduled
>>> as expected, but the networks were not cleaned up. On investigation I found
>>> that the cleanup code for the networks is in the compute manager’s
>>> _do_build_run_instance() method as follows:
>>>
>>> # NOTE(comstud): Deallocate networks if the driver wants
>>> # us to do so.
>>> if self.driver.deallocate_networks_on_reschedule(instance):
>>>     self._cleanup_allocated_networks(context, instance,
>>>                                      requested_networks)
>>>
>>> The default behavior for the deallocate_networks_on_reschedule() method
>>> defined in ComputeDriver is:
>>>
>>> def deallocate_networks_on_reschedule(self, instance):
>>>     """Does the driver want networks deallocated on reschedule?"""
>>>     return False
>>>
>>> Only the Ironic driver overrides this method to return True, so I think this
>>> means the networks will not be cleaned up for any other virt driver.
>>>
>>>
>>>
>>> Is this really the desired behavior?
>>>
>> Yes. Other than when using Ironic there is nothing specific to a particular
>> host in the networking setup. This means it is not necessary to deallocate and
>> reallocate networks when an instance is rescheduled, so we can avoid the
>> unnecessary work of doing it.
> That's either not true any more, or not true when DVR is enabled in Neutron,
> since in this case the port['binding:host_id'] value has been initialized to a
> compute node, and won't get updated when nova-conductor re-schedules the VM
> elsewhere.
>
> This causes the neutron port to stay on the original compute node, and any
> neutron operations (like floatingip-associate) happen on the "old" port, leaving
> the VM unreachable.
Gotcha. Then we should either rebind that port on a reschedule or go
back to de/reallocating. I'm assuming there's some way to handle the
port being moved, or resizes would be broken for the same reason.
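
To make the rebinding option concrete, here is a rough sketch of what it
could look like. Everything in it is an assumption for illustration:
FakeNeutron is a hypothetical in-memory stand-in for a Neutron client,
rebind_ports_on_reschedule() is a made-up helper name, and the update body
is only modeled on the shape of a Neutron port update. It is not code from
nova.

```python
class FakeNeutron:
    """Hypothetical in-memory stand-in for a Neutron client."""

    def __init__(self):
        self.ports = {}

    def create_port(self, port_id, host):
        # Record the port as bound to the host it was first allocated on.
        self.ports[port_id] = {"id": port_id, "binding:host_id": host}

    def update_port(self, port_id, body):
        # Apply the requested attribute changes to the stored port.
        self.ports[port_id].update(body["port"])
        return {"port": self.ports[port_id]}


def rebind_ports_on_reschedule(neutron, port_ids, new_host):
    """Point each port's binding:host_id at the host the instance moved to."""
    for port_id in port_ids:
        neutron.update_port(port_id, {"port": {"binding:host_id": new_host}})


neutron = FakeNeutron()
neutron.create_port("port-1", "compute-1")
# The build failed on compute-1 and the instance was rescheduled to compute-2.
rebind_ports_on_reschedule(neutron, ["port-1"], "compute-2")
```

The point of the sketch is only that the port keeps its identity and just
the host binding changes, as opposed to deallocating and allocating again.
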
If we do need to move back to de/reallocation of networks I think it
would be better to remove the conditional nature of it and just do it.
If the deallocate_networks_on_reschedule method defaults to True I don't
see a case where it would be overridden by a driver given the
information above.
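
For comparison, a minimal sketch of the conditional cleanup as it stands
versus doing it unconditionally. The deallocate_networks_on_reschedule()
hook and _cleanup_allocated_networks() names come from the code quoted
above; ToyManager, IronicLikeDriver, handle_reschedule() and the
"unconditional" flag are hypothetical names used only for this toy model.

```python
class ComputeDriver:
    def deallocate_networks_on_reschedule(self, instance):
        """Does the driver want networks deallocated on reschedule?"""
        return False


class IronicLikeDriver(ComputeDriver):
    # Hypothetical driver that overrides the hook, as Ironic does.
    def deallocate_networks_on_reschedule(self, instance):
        return True


class ToyManager:
    """Hypothetical stand-in for the compute manager's reschedule path."""

    def __init__(self, driver, unconditional=False):
        self.driver = driver
        self.unconditional = unconditional
        self.cleaned = []  # instances whose networks were deallocated

    def _cleanup_allocated_networks(self, context, instance,
                                    requested_networks):
        self.cleaned.append(instance)

    def handle_reschedule(self, context, instance, requested_networks=None):
        # Current behavior: clean up only if the driver asks for it.
        # Proposed behavior: drop the condition and always clean up.
        if (self.unconditional or
                self.driver.deallocate_networks_on_reschedule(instance)):
            self._cleanup_allocated_networks(context, instance,
                                             requested_networks)


current = ToyManager(ComputeDriver())
current.handle_reschedule(None, "inst-1")    # networks left in place

proposed = ToyManager(ComputeDriver(), unconditional=True)
proposed.handle_reschedule(None, "inst-1")   # networks deallocated
```

With the condition removed the driver hook has no effect on the outcome,
which is why I'd rather drop it than flip its default.
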
>> If the instance goes to ERROR then the network will get cleaned up when the
>> instance is deleted.
> I think we need to clean up even in this case now too.
>
> -Brian