Hi,

You can find logs from controller0 and compute0 in the attachment (the other controllers and computes were turned off for this test).

Thank you,
Michal Arbet
Openstack Engineer

Ultimum Technologies a.s.
Na Poříčí 1047/26, 11000 Praha 1
Czech Republic
+420 604 228 897
michal.arbet@ultimum.io
https://ultimum.io

On Thu, 25 Nov 2021 at 08:22, Slawek Kaplonski <skaplons@redhat.com> wrote:
Hi,
Basically, in the ML2/OVS case there are two likely reasons why a port isn't provisioned quickly:
- the neutron-ovs-agent is somehow slow provisioning it, or
- the neutron-dhcp-agent is slow provisioning that port.
To check which of those is really happening, you can enable debug logs in your neutron-server and look there for logs like "Port xxx provisioning completed by entity L2/DHCP" (or something similar, I don't remember it exactly right now).
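As a sketch of that check: grep the neutron-server debug log for the provisioning-complete messages and compare the timestamps of the L2 and DHCP entries. The exact message wording and the log path are assumptions (they vary by release/deployment); the snippet below fabricates two sample log lines so it is self-contained.

```shell
# In a real check, grep /var/log/neutron/neutron-server.log (path is an
# assumption) instead of this simulated sample file.
printf '%s\n' \
  'DEBUG ... Provisioning complete for port 1234 triggered by entity DHCP.' \
  'DEBUG ... Provisioning complete for port 1234 triggered by entity L2.' \
  > /tmp/neutron-server.sample.log

# The time gap between the L2 entry and the DHCP entry for the same port
# shows which side (ovs-agent vs dhcp-agent) is the slow one.
grep -E 'Provisioning complete for port .* entity (L2|DHCP)' \
  /tmp/neutron-server.sample.log
```

Whichever entity's message arrives late (or not at all) for a given port id is the agent to investigate further.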
If it works much faster with the noop firewall driver, then the problem is more likely on the neutron-ovs-agent's side. In that case, a couple of things to check:
- Are you using l2population (it's required with DVR, for example)?
- Are you using SGs with rules that reference "remote_group_id" (like the default SG for each tenant does)? If so, can you try to remove such rules from your SGs and use rules with CIDRs instead? We know that SGs with remote_group_id don't scale well, and if you have many ports using the same SG it can slow down the neutron-ovs-agent a lot.
- Do you maybe have any other errors in the neutron-ovs-agent logs, like RPC communication errors or something else? Such errors trigger a fullsync of all ports on the node, so it can sometimes take a long time to get to actually provisioning your new port.
- Which exact version of Neutron are you using there?
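To illustrate the remote_group_id-to-CIDR swap suggested above, a hedged sketch using the OpenStackClient (the SG name, CIDR, and rule-id placeholder are hypothetical, not from this thread; the commands are printed rather than executed since they need a live cloud):

```shell
# Hypothetical names/values for illustration only.
SG=my-tenant-sg
TENANT_CIDR=10.0.0.0/24

# 1) Delete the rule that matches on remote_group_id (rule id left elided).
echo "openstack security group rule delete <rule-id-with-remote-group>"

# 2) Re-create an equivalent rule matching the tenant subnet by CIDR instead.
CMD="openstack security group rule create --ingress --protocol tcp --remote-ip $TENANT_CIDR $SG"
echo "$CMD"
```

The CIDR variant avoids the per-member flow recalculation that remote_group_id rules cause when many ports share the same SG.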
On Saturday, 20 November 2021 at 11:05:16 CET, Michal Arbet wrote:
Hi,
Has anyone seen the issue I am currently facing?
When launching a Heat stack (but it's the same if I launch several instances), the vif-plugged event times out and I don't know why; sometimes it is OK, sometimes it fails.
Sometimes neutron reports vif plugged in under 10 seconds (test env), sometimes it takes 100 seconds or more. It seems there is some race condition, but I can't find out where the problem is. In the end every instance is spawned OK (the retry mechanism worked).
Another finding is that it has something to do with security groups: if the noop driver is used, everything works fine.
The firewall driver is set to openvswitch.
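For context, this driver is selected in the L2 agent's configuration (a minimal fragment; the file path varies by deployment, commonly /etc/neutron/plugins/ml2/openvswitch_agent.ini):

```ini
[securitygroup]
# "openvswitch" = native OVS flow-based firewall; "noop" disables filtering
firewall_driver = openvswitch
```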
The test env is Wallaby.
I will attach some logs when I am near a PC.
Thank you, Michal Arbet (Kevko)
--
Slawek Kaplonski
Principal Software Engineer
Red Hat