[nova][neutron][ptg] Summary: Leaking resources when ports are deleted out-of-band
Balázs Gibizer
balazs.gibizer at ericsson.com
Sat May 4 16:57:37 UTC 2019
On Sat, May 4, 2019 at 10:25 AM, Michael Johnson <johnsomor at gmail.com>
wrote:
> I think this will have implications for Octavia, but we can work
> through those.
>
> There are cases during cleanup from an error where we delete ports
> owned by "Octavia" that have not yet been attached to a nova instance.
> My understanding of the above discussion is that this would not be an
> issue under this change.
If the port is owned by Octavia then the resource leak does not happen.
However, the proposed neutron code / policy change affects this case as
well.
>
> However....
>
> We also, currently, manipulate the ports we have hot-plugged
> (attached) to nova instances where the port "device_owner" has become
> "compute:nova", mostly for failover scenarios and cases where nova
> detach fails and we have to revert the action.
>
> Now, if the "proper" new procedure is to first detach before deleting
> the port, we can look at attempting that. But, in the common failure
> scenarios we see nova failing to complete this, if for example the
> compute host has been powered off. In this scenario we still need to
> delete the neutron port for both resource cleanup and quota reasons.
> This is so we can create a new port and attach it to a new instance to
> recover.
If Octavia also deletes the VM then force deleting the port is OK from
a placement resource perspective, as the VM delete will make sure we are
deleting the leaked port resources.
>
> I think this change will impact our current port management flows, so we
> should proceed cautiously, test heavily, and potentially address some
> of the nova failure scenarios at the same time.
After talking to rm_work on #openstack-nova [1] it feels that the
policy based solution would work for Octavia. So Octavia, with the extra
policy, can still delete the bound port in Neutron safely, as Octavia
also deletes the VM that the port was bound to. That VM delete will
reclaim the leaked port resource.
The failure to detach a port via nova while nova-compute is down
could be a bug on the nova side.
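
To make the policy idea a bit more concrete, here is a minimal sketch of
the decision being discussed: anyone may delete an unbound port, but only
a caller holding the "advsvc" role may delete a bound one. This is plain
Python for illustration only, not neutron code. The Port and Caller
classes, the binding_host field and the may_delete_port() helper are all
hypothetical names; the real change would live in neutron's policy rules.

```python
from dataclasses import dataclass, field
from typing import Optional, Set


@dataclass
class Port:
    """Hypothetical stand-in for a neutron port (not the real model)."""
    port_id: str
    device_owner: str = ""
    # Assumption: a port counts as bound once it has a binding host.
    binding_host: Optional[str] = None

    @property
    def is_bound(self) -> bool:
        return bool(self.binding_host)


@dataclass
class Caller:
    """Hypothetical stand-in for the request context of the API caller."""
    name: str
    roles: Set[str] = field(default_factory=set)


def may_delete_port(caller: Caller, port: Port) -> bool:
    """Return True if the caller would be allowed to delete the port.

    Unbound ports stay deletable by anyone who could delete them before.
    Bound ports are only deletable by callers with the "advsvc" role.
    """
    if not port.is_bound:
        return True
    return "advsvc" in caller.roles


if __name__ == "__main__":
    regular_user = Caller("demo", {"member"})
    octavia = Caller("octavia", {"advsvc"})
    bound = Port("p1", device_owner="compute:nova", binding_host="compute-1")
    unbound = Port("p2")

    print(may_delete_port(regular_user, unbound))
    print(may_delete_port(regular_user, bound))
    print(may_delete_port(octavia, bound))
```

Note that the sketch only covers the authorization decision. It assumes
that a service which force deletes a bound port this way also deletes the
VM afterwards, so that the leaked port resources are reclaimed.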
cheers,
gibi
[1]
http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2019-05-04.log.html#t2019-05-04T16:15:52
>
> Michael
>
> On Fri, May 3, 2019 at 5:23 PM Akihiro Motoki <amotoki at gmail.com>
> wrote:
>>
>>
>>
>> On Fri, May 3, 2019 at 4:11 PM Matt Riedemann <mriedemos at gmail.com>
>> wrote:
>>>
>>> On 5/3/2019 3:35 PM, Balázs Gibizer wrote:
>>> > 2) Matt had a point after the session that if Neutron enforces that
>>> > only unbound ports can be deleted then not only does Nova need to be
>>> > changed to unbind a port before deleting it, but possibly other
>>> > Neutron consumers (Octavia?).
>>>
>>> And potentially Zun, there might be others, Magnum, Heat, idk?
>>>
>>> Anyway, this is a thing that has been around forever which admins
>>> shouldn't do, do we need to prioritize making this change in both
>>> neutron and nova to make two requests to delete a bound port? Or is
>>> just logging the ERROR that you've leaked allocations, tsk tsk, enough?
>>> I tend to think the latter is fine until someone comes along saying
>>> this is really hurting them and they have a valid use case for deleting
>>> bound ports out of band from nova.
>>
>>
>> neutron defines a special role called "advsvc" for advanced network
>> services [1].
>> I think we can change neutron to block deletion of bound ports for
>> regular users and
>> allow users with "advsvc" role to delete bound ports.
>> I haven't checked which projects currently use "advsvc".
>>
>> [1]
>> https://opendev.org/openstack/neutron/src/branch/master/neutron/conf/policies/port.py#L53-L59
>>
>>>
>>>
>>> --
>>>
>>> Thanks,
>>>
>>> Matt
>>>
>