[neutron][ml2/ovn] unintended changes in trunk status ACTIVE
Hi Neutron Team,

A heads up to OVN folks:

We believe we have a problem in Neutron: we move a trunk to ACTIVE status too early (changing the meaning of status=ACTIVE), namely as soon as its parent port is updated to ACTIVE, instead of waiting for all of its ports (parent and sub) to be processed.

We discovered this while working on this bug:
https://bugs.launchpad.net/neutron/+bug/2095152

The problem was introduced by this patch:
https://review.opendev.org/c/openstack/neutron/+/853779

That patch was intended as a fix for ml2/ovn leaving a trunk in DOWN after live migration. However, it affected all backends, and it affected not just live migration but trunk create too.

We have proposed the following patch to limit the effect to ml2/ovn (but have not yet found a way to avoid affecting create):
https://review.opendev.org/c/openstack/neutron/+/949217

However, due to ml2/ovn's handling of trunk subports (it does not cascade the parent's binding:host update to all subports in the neutron api layer), I don't know how this should be fixed for ml2/ovn. Maybe a clever wait in the OVN databases could solve this. People with better OVN knowledge may want to look into this.

You're also welcome to review our proposed patches, including this os-vif change, which prevents a resource leak in ml2/ovs when a trunk has already been deleted on the server side while the agent is still processing its create:
https://review.opendev.org/c/openstack/os-vif/+/949736

Thanks,
Bence (rubasov)
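For illustration only, the two meanings of status=ACTIVE discussed in this message can be sketched as below. This is a simplified model, not Neutron code: the `Port` and `Trunk` classes, the `processed` flag, and both function names are hypothetical, and only the ACTIVE and DOWN statuses are modelled.

```python
from dataclasses import dataclass, field
from typing import List

ACTIVE = "ACTIVE"
DOWN = "DOWN"


@dataclass
class Port:
    # Hypothetical stand-in for a port; True once the backend has processed it.
    processed: bool = False


@dataclass
class Trunk:
    # Hypothetical stand-in for a trunk: one parent port plus its subports.
    parent: Port
    subports: List[Port] = field(default_factory=list)


def status_parent_only(trunk: Trunk) -> str:
    """The behaviour reported as a problem: ACTIVE as soon as the parent is done."""
    return ACTIVE if trunk.parent.processed else DOWN


def status_all_ports(trunk: Trunk) -> str:
    """The intended meaning: ACTIVE only once parent and all subports are done."""
    ports = [trunk.parent, *trunk.subports]
    return ACTIVE if all(p.processed for p in ports) else DOWN
```

With a processed parent and one unprocessed subport, the first function already reports ACTIVE while the second still reports DOWN, which is the difference in meaning described above.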
participants (1): Bence Romsics