We did not backport it due to the DB migration bug, but that has been fixed upstream from Stein on. Given that we have not had any issue downstream backporting https://review.opendev.org/#/c/591607/ without https://review.opendev.org/#/c/614167/20, I think it would be reasonable to do the same upstream.
If it could be backported to Rocky and maybe even Queens, for those who still run Queens, I’m sure it would be strongly appreciated (at least we would appreciate it, since we wouldn’t have to patch manually when we update packages).
Couldn’t it just have a configuration option to enable it? While I’m not convinced it can fix the root cause of our problem, it could at least contribute to the stability of our and other people’s OpenStack clusters.

So this is a subtle thing. It’s not really a nova bug. It’s an issue where invalid data is returned by neutron and that corrupts the nova database. The force refresh will heal nova if and only if the neutron issue that caused the problem in the first place is resolved. If the neutron issue is not fixed, then the force refresh will continue to force-update the nova networking info cache with incomplete data.
So if you never have a neutron issue that returns invalid data, you will never need this patch. If you do, say because you broke the neutron policy file, then this backport will fix the nova database only once the policy issue is corrected. We have had several large customers that have had issues with neutron, due either to misconfiguring the policy file or to a third-party SDN controller that maintained port information in an external DB separate from neutron. In the case of the policy file customer, this self-healing worked once they corrected the issue. In the case of the SDN controller customer, it did not work until the SDN vendor fixed the SDN controller's DB. Once it returned correct data again, the periodic task healed nova.
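To make the "heals only once neutron is fixed" point concrete, here is a toy Python sketch. It is not nova's actual code: `heal_info_cache`, `broken_neutron` and `fixed_neutron` are made-up names, and the cache is just a dict standing in for the nova networking info cache.

```python
from typing import Callable, Dict, List

Port = Dict[str, str]


def heal_info_cache(
    cache: Dict[str, List[Port]],
    instance_id: str,
    list_ports: Callable[[str], List[Port]],
) -> None:
    """Force refresh: overwrite the cached ports with whatever the
    network service returns, whether or not that data is complete."""
    cache[instance_id] = list_ports(instance_id)


def broken_neutron(instance_id: str) -> List[Port]:
    # Hypothetical failure mode: e.g. a bad policy file hides the ports.
    return []


def fixed_neutron(instance_id: str) -> List[Port]:
    # Hypothetical healthy response once the underlying issue is corrected.
    return [{"id": "port-1", "device_id": instance_id}]


# The cache starts out already corrupted (the instance has lost its port).
cache: Dict[str, List[Port]] = {"vm-1": []}

# While the network service still returns incomplete data,
# the refresh just rewrites the cache with that incomplete data.
heal_info_cache(cache, "vm-1", broken_neutron)
after_broken = cache["vm-1"]

# Once the network service returns correct data, the same refresh heals it.
heal_info_cache(cache, "vm-1", fixed_neutron)
after_fixed = cache["vm-1"]
```

The refresh itself never validates anything; it only copies. That is why the periodic task can repair the cache but cannot fix the source of the bad data.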
That’s interesting, because we run a very basic neutron + openvswitch setup with default policies. Additionally, we have tested the nova patch I mentioned earlier for a long while, and it seemed to at least prevent the instances from losing their ports. Doesn’t that imply that neutron has consistently returned correct data in our setup in particular? So could our issue be elsewhere? I could be wrong and it’s not a hill I’m willing to die on; I’m just pointing out my own observations.

Jean-Philippe Méthot
Senior Openstack system administrator
PlanetHoster inc.