Hi,

You can find logs from controller0 and compute0 in the attachment (the other controllers and computes were turned off for this test).

Thank you,
Michal Arbet
Openstack Engineer

Ultimum Technologies a.s.
Na Poříčí 1047/26, 11000 Praha 1
Czech Republic
+420 604 228 897
michal.arbet@ultimum.io
https://ultimum.io

On Thu, 25 Nov 2021 at 08:22, Slawek Kaplonski <skaplons@redhat.com> wrote:
Hi,
Basically, in the ML2/OVS case there are two likely reasons why a port isn't provisioned quickly:
- the neutron-ovs-agent is somehow slow provisioning it, or
- the neutron-dhcp-agent is slow provisioning that port.
To check which of those is really happening, you can enable debug logs in your neutron-server and look there for logs like "Port xxx provisioning completed by entity L2/DHCP" (or something similar, I don't remember it exactly right now).
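As a sketch of that check: grep the neutron-server debug log for the provisioning-complete messages and compare the timestamps of the L2 and DHCP entries. The exact message wording and the log path are assumptions (they vary by release/deployment); the snippet below fabricates two sample log lines so it is self-contained.

```shell
# In a real check, grep /var/log/neutron/neutron-server.log (path is an
# assumption) instead of this simulated sample file.
printf '%s\n' \
  'DEBUG ... Provisioning complete for port 1234 triggered by entity DHCP.' \
  'DEBUG ... Provisioning complete for port 1234 triggered by entity L2.' \
  > /tmp/neutron-server.sample.log

# The time gap between the L2 entry and the DHCP entry for the same port
# shows which side (ovs-agent vs dhcp-agent) is the slow one.
grep -E 'Provisioning complete for port .* entity (L2|DHCP)' \
  /tmp/neutron-server.sample.log
```

Whichever entity's message arrives late (or not at all) for a given port id is the agent to investigate further.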
If it works much faster with the noop firewall driver, then the problem is more likely on the neutron-ovs-agent's side. In that case, a couple of things to check:
- Are you using l2population (it's required with DVR, for example)?
- Are you using SGs with rules that reference "remote_group_id" (like the default SG for each tenant does)? If so, can you try to remove such rules from your SGs and use rules with CIDRs instead? We know that SGs with remote_group_id don't scale well, and if you have many ports using the same SG it can slow down the neutron-ovs-agent a lot.
- Do you maybe have any other errors in the neutron-ovs-agent logs, like RPC communication errors or something else? Such errors trigger a fullsync of all ports on the node, so it can sometimes take a long time to get to actually provisioning your new port.
- Which exact version of Neutron are you using there?
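To illustrate the remote_group_id-to-CIDR swap suggested above, a hedged sketch using the OpenStackClient (the SG name, CIDR, and rule-id placeholder are hypothetical, not from this thread; the commands are printed rather than executed since they need a live cloud):

```shell
# Hypothetical names/values for illustration only.
SG=my-tenant-sg
TENANT_CIDR=10.0.0.0/24

# 1) Delete the rule that matches on remote_group_id (rule id left elided).
echo "openstack security group rule delete <rule-id-with-remote-group>"

# 2) Re-create an equivalent rule matching the tenant subnet by CIDR instead.
CMD="openstack security group rule create --ingress --protocol tcp --remote-ip $TENANT_CIDR $SG"
echo "$CMD"
```

The CIDR variant avoids the per-member flow recalculation that remote_group_id rules cause when many ports share the same SG.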
On Saturday, 20 November 2021 at 11:05:16 CET, Michal Arbet wrote:
Hi,
Has anyone seen the issue I am currently facing?
When launching a Heat stack (but it's the same if I launch several instances), the vif-plugged event times out and I don't know why; sometimes it is OK, sometimes it fails.
Sometimes neutron reports vif plugged in under 10 seconds (test env), sometimes it takes 100 seconds or more. It seems there is some race condition, but I can't find out where the problem is. In the end every instance is spawned OK (the retry mechanism worked).
Another finding is that it has something to do with security groups: if the noop driver is used, everything works fine.
The firewall driver is set to openvswitch.
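For context, this driver is selected in the L2 agent's configuration (a minimal fragment; the file path varies by deployment, commonly /etc/neutron/plugins/ml2/openvswitch_agent.ini):

```ini
[securitygroup]
# "openvswitch" = native OVS flow-based firewall; "noop" disables filtering
firewall_driver = openvswitch
```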
The test env is Wallaby.
I will attach some logs when I am near a PC.
Thank you, Michal Arbet (Kevko)
--
Slawek Kaplonski
Principal Software Engineer
Red Hat