[neutron] parent ports for trunks being claimed by instances
Nate Johnston
nate.johnston at redhat.com
Tue May 12 14:05:25 UTC 2020
Neutron developers,
I am currently working on an issue with trunk ports that has come up a few times
in my direct experience, and I hope that we can create a long term solution. I
am hoping that developers with experience in trunk ports can validate my
approach here, especially regarding fixing current behavior without introducing
an API regression.
By way of introduction to the specifics of the issue, let me blockquote from the
LP bug I raised for this [1]:
----
When you create a trunk in Neutron you create a parent port for the trunk and
attach the trunk to the parent. Then subports can be created on the trunk. When
instances are created on the trunk, first a port is created and then an instance
is associated with a free port. It looks to me like this is the oversight in
the logic.
From the perspective of the code, the parent port looks like any other port
attached to the trunk bridge. It doesn't have an instance attached to it so it
looks like it's not being used for anything (which is technically correct). So
it becomes an eligible port for an instance to bind to. That is all fine and
dandy until you go to delete the instance and you get the "Port [port-id] is
currently a parent port for trunk [trunk-id]" exception just as happened here.
Anecdotally, it seems rare that an instance will actually bind to it, but that
is what happened for the user in this case and I have had several pings over the
past year about people in a similar state.
I propose that when a port is made the parent port for a trunk, the trunk be
established as the owner of the port. That way the port will be ineligible for
instances seeking a free port to bind to.
----
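To make the proposal concrete, here is a minimal standalone sketch in Python.
It is not Neutron code: the Port class, the pick_free_port helper, and the
"trunk:parent" owner value are all assumptions made up for illustration. The
only point is that once the parent port carries an owner, a "find a free port"
check skips it.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical owner value, chosen for illustration only.
TRUNK_PARENT_OWNER = "trunk:parent"


@dataclass
class Port:
    """Simplified stand-in for a port; only the fields this sketch needs."""
    id: str
    device_owner: str = ""  # empty string: nothing has claimed the port
    device_id: str = ""


def make_trunk_parent(port: Port, trunk_id: str) -> None:
    """Record the trunk as the owner of its parent port."""
    port.device_owner = TRUNK_PARENT_OWNER
    port.device_id = trunk_id


def pick_free_port(ports: List[Port]) -> Optional[Port]:
    """Return the first port with no owner, or None if every port is claimed."""
    for port in ports:
        if not port.device_owner:
            return port
    return None


ports = [Port("parent-port"), Port("plain-port")]
make_trunk_parent(ports[0], "trunk-1")
chosen = pick_free_port(ports)  # skips the parent port because it has an owner
```

Without the make_trunk_parent call, pick_free_port would return the parent port
first, which is the situation described in the bug.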
Clearly the above behavior indicates a bug that should be rectified in master
and stable branches. Nobody wants a VM that can't be fully deleted because its
port can't ever be deleted. This is especially egregious when it causes heat
stack deletion failures.
My main concern is that once the trunk is the owner of the parent port, the
trunk will need to be deleted before the parent port can be deleted; otherwise
a PortInUse error will occur when the port is deleted (e.g. on tempest test
teardown). That to me seems indicative of an inadvertent API change. Do you
think it's all right to say that if you delete a port that is the parent port
of a trunk, and that trunk has no subports, the trunk deletion is implicit? Is
that the lowest impact to the API that we can incur to resolve this issue?
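The implicit-deletion semantics I am asking about could look roughly like the
following. This is a standalone Python sketch with hypothetical names (Trunk,
delete_port, and a local PortInUse stand-in for the real exception), not the
actual Neutron plugin code:

```python
from dataclasses import dataclass, field
from typing import Dict, List


class PortInUse(Exception):
    """Local stand-in for the PortInUse error mentioned above."""


@dataclass
class Trunk:
    """Simplified stand-in for a trunk; only the fields this sketch needs."""
    id: str
    parent_port_id: str
    subport_ids: List[str] = field(default_factory=list)


def delete_port(port_id: str, ports: Dict[str, str],
                trunks: Dict[str, Trunk]) -> None:
    """Delete a port; an empty trunk parented on it is deleted implicitly."""
    trunk = next(
        (t for t in trunks.values() if t.parent_port_id == port_id), None)
    if trunk is not None:
        if trunk.subport_ids:
            # The trunk still carries subports, so refuse the deletion.
            raise PortInUse(
                f"Port {port_id} is the parent port for trunk {trunk.id}")
        # No subports: remove the trunk as a side effect of the port delete.
        del trunks[trunk.id]
    del ports[port_id]


ports = {"parent-port": "parent", "sub-port": "subport"}
trunks = {"trunk-1": Trunk("trunk-1", "parent-port", ["sub-port"])}

try:
    delete_port("parent-port", ports, trunks)
    refused = False
except PortInUse:
    refused = True  # trunk still has a subport, so the delete is refused

trunks["trunk-1"].subport_ids.clear()
delete_port("parent-port", ports, trunks)  # now succeeds and drops the trunk
```

With this shape, a caller that deletes only the port (as a teardown would)
keeps working as long as the trunk is empty, and still gets PortInUse when
subports remain.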
Your wisdom is appreciated,
Nate
[1] https://bugs.launchpad.net/neutron/+bug/1878031