On Sat, May 4, 2019 at 10:25 AM, Michael Johnson <johnsomor@gmail.com> wrote:
I think this will have implications for Octavia, but we can work through those.
There are cases during cleanup from an error where we delete ports owned by "Octavia" that have not yet been attached to a nova instance. My understanding of the above discussion is that this would not be an issue under this change.
If the port is owned by Octavia then the resource leak does not happen. However, the proposed neutron code / policy change affects this case as well.
However....
We also, currently, manipulate the ports we have hot-plugged (attached) to nova instances where the port "device_owner" has become "compute:nova", mostly for failover scenarios and cases where nova detach fails and we have to revert the action.
Now, if the "proper" new procedure is to first detach before deleting the port, we can look at attempting that. But in the common failure scenarios we see nova failing to complete this, for example if the compute host has been powered off. In this scenario we still need to delete the neutron port for both resource cleanup and quota reasons. This is so we can create a new port and attach it to a new instance to recover.
If Octavia also deletes the VM then force deleting the port is OK from a placement resource perspective, as the VM delete will make sure we are deleting the leaked port resources.
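For illustration, the "detach first, delete anyway" flow described above could be sketched roughly as follows. This is only a sketch under assumptions: `cleanup_attached_port`, `detach` and `delete_port` are hypothetical names, and the two callables stand in for whatever nova and neutron client calls a consumer actually uses. It is not Octavia's code.

```python
from typing import Callable


def cleanup_attached_port(
    server_id: str,
    port_id: str,
    detach: Callable[[str, str], None],   # hypothetical: detach port from instance via nova
    delete_port: Callable[[str], None],   # hypothetical: delete port via neutron
) -> bool:
    """Try a clean detach first, then delete the port regardless.

    Returns True if the detach succeeded, and False if the port had to be
    deleted while still bound (the case that can leak allocations unless
    the instance is deleted as well).
    """
    detached = True
    try:
        detach(server_id, port_id)
    except Exception as exc:  # e.g. the compute host has been powered off
        detached = False
        print(f"detach of port {port_id} failed ({exc}); deleting it anyway")
    # The port is deleted in both cases so quota is released and a
    # replacement port can be created for the new instance.
    delete_port(port_id)
    return detached
```

A False return is the signal that the caller should also delete the instance, so that the resources tied to the bound port are reclaimed.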
I think this change will impact our current port management flows, so we should proceed cautiously, test heavily, and potentially address some of the nova failure scenarios at the same time.
After talking to rm_work on #openstack-nova [1] it feels that the policy-based solution would work for Octavia. So Octavia with the extra policy can still delete the bound port in Neutron safely, as Octavia also deletes the VM that the port was bound to. That VM delete will reclaim the leaked port resource. The failure to detach a port via nova while nova-compute is down could be a bug on the nova side.

cheers,
gibi

[1] http://eavesdrop.openstack.org/irclogs/%23openstack-nova/%23openstack-nova.2...
Michael
On Fri, May 3, 2019 at 5:23 PM Akihiro Motoki <amotoki@gmail.com> wrote:
On Fri, May 3, 2019 at 4:11 PM Matt Riedemann <mriedemos@gmail.com> wrote:
On 5/3/2019 3:35 PM, Balázs Gibizer wrote:
2) Matt had a point after the session that if Neutron enforces that only unbound ports can be deleted then not only Nova needs to be changed to unbind a port before deleting it, but possibly other Neutron consumers (Octavia?).
And potentially Zun, there might be others, Magnum, Heat, idk?
Anyway, this is a thing that has been around forever which admins shouldn't do. Do we need to prioritize making this change in both neutron and nova to make two requests to delete a bound port? Or is just logging the ERROR that you've leaked allocations, tsk tsk, enough? I tend to think the latter is fine until someone comes along saying this is really hurting them and they have a valid use case for deleting bound ports out of band from nova.
neutron defines a special role called "advsvc" for advanced network services [1]. I think we can change neutron to block deletion of bound ports for regular users and allow users with the "advsvc" role to delete bound ports. I haven't checked which projects currently use "advsvc".
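As a rough illustration of the rule proposed above, the decision could look something like the sketch below. The `Port` class, its fields and `may_delete_port` are made up for this example and are not neutron's data model or policy engine; only the "advsvc" role name comes from the discussion.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class Port:
    """Simplified, hypothetical stand-in for a neutron port."""
    id: str
    device_owner: str = ""
    binding_host_id: str = ""  # assumption: empty string means unbound

    @property
    def is_bound(self) -> bool:
        return bool(self.binding_host_id)


def may_delete_port(port: Port, roles: Iterable[str]) -> bool:
    """Proposed rule: anyone may delete an unbound port, but a bound
    port may only be deleted by a user holding the "advsvc" role."""
    if not port.is_bound:
        return True
    return "advsvc" in set(roles)
```

Under this rule a regular user would have to detach the port through nova first, while a service holding "advsvc" (Octavia, for example) could still delete a bound port directly.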
--
Thanks,
Matt