On Thu, Nov 25 2021 at 11:57:45 AM +0100, Slawek Kaplonski <skaplons@redhat.com> wrote:
Hi,
On czwartek, 25 listopada 2021 10:41:14 CET Rodolfo Alonso Hernandez wrote:
Hello Michal:
I think this thread is related to OVN. In any case, I've analyzed your logs from the Neutron point of view. Those are basically the key points in your logs: [1]. The Neutron server receives the request to create and bind a new port. Both the DHCP agent and the OVS agent provision the port (8a472e87-4a7a-4ad4-9fbb-fd9785136611).
Neutron server sends the "network-vif-plugged" event at 08:51:05.635 and receives the ACK from Nova at 08:51:05.704. Nova server creates the corresponding event for the instance on the compute0: 2021-11-25 08:51:05.692 23 INFO nova.api.openstack.compute.server_external_events [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 7d16babef0524b7dade9a59b0a3569e1 - default default] Creating event network-vif-plugged:8a472e87-4a7a-4ad4-9fbb-fd9785136611 for instance 09eff2ce-356f-430f-ab30-5de58f58d698 on compute0
Nova compute agent receives this server event: 2021-11-25 08:51:05.705 8 DEBUG nova.compute.manager [req-393fcb9a-5c68-4b0a-af1b-661e73d947cb 01a75e3a9a9148218916d3beafae2120 7d16babef0524b7dade9a59b0a3569e1 - default default] [instance: 09eff2ce-356f-430f-ab30-5de58f58d698] Processing event network-vif-plugged-8a472e87-4a7a-4ad4-9fbb-fd9785136611 _process_instance_event
/var/lib/kolla/venv/lib/python3.9/site-packages/nova/compute/manager.py: 10205
Further triagging and log analysis should be done by Nova folks to understand why Nova compute didn't boot this VM. I fail to understand some parts.
Thx Rodolfo. I also took a look at those logs (for server 09eff2ce-356f-430f- ab30-5de58f58d698 which had port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 but I can confirm what You found actually. Port was pretty quickly provisioned and it was reported to Nova. I don't know why Nova didn't unpause that vm.
I took a look at the logs (thanks for providing it!). Nova did not unpause the VM a there was more than one port to be plugged an from one, bf0f9dd2-4fd3-4028-a3dc-e2581ff081d0, there was no vif-plugged event received. See my grepping of the log to show this https://paste.opendev.org/show/811348/ Cheers, gibi
Port creation in Nova: 2021-11-25 08:50:24.250 8 INFO os_vif [req-6eab0f2a- d7f4-4251-96f4-177fc0c62212 647a328aef9b4a8b890f102d8f018576 49231caa42f34da8a9d10229b5c7a5d8 - default default] Successfully plugged vif VIFOpenVSwitch(active=False,address=fa:16:3e:3d:cd:b9,bridge_name='br- int',has_traffic_filtering=True,id=0761ff2f-ff6a-40a5-85b7- c9b40386cbe0,network=Network(cb4e6be4-9bc7-4c34-95dd- dfc3c8e9d93c),plugin='ovs',port_profile=VIFPortProfileOpenVSwitch,preserve_on_delete=True,vif_name='tap0761ff2f- ff')
Neutron finished provisioning of that port and sent notification to nova about 34 seconds later: 2021-11-25 08:51:00.403 30 DEBUG neutron.plugins.ml2.rpc [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Device 0761ff2f- ff6a-40a5-85b7-c9b40386cbe0 up at agent ovs-agent-compute0 update_device_up / var/lib/kolla/venv/lib/python3.9/site-packages/neutron/plugins/ml2/rpc.py:278 2021-11-25 08:51:00.503 29 DEBUG neutron.db.provisioning_blocks [req- a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Provisioning complete for port 67b32ddf-dddd-439e-92be-6a89df49b4af triggered by entity L2. provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ db/provisioning_blocks.py:139 2021-11-25 08:51:00.504 29 DEBUG neutron_lib.callbacks.manager [req- a5711e9f-9b15-4a1a-a58d-07952fb8ac34 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] for port, provisioning_complete _notify_loop /var/lib/kolla/venv/lib/python3.9/site- packages/neutron_lib/callbacks/manager.py:192 2021-11-25 08:51:00.505 30 DEBUG neutron.db.provisioning_blocks [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning for port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 completed by entity L2. provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ db/provisioning_blocks.py:133 2021-11-25 08:51:00.515 30 DEBUG neutron.db.provisioning_blocks [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Provisioning complete for port 0761ff2f-ff6a-40a5-85b7-c9b40386cbe0 triggered by entity L2. provisioning_complete /var/lib/kolla/venv/lib/python3.9/site-packages/neutron/ db/provisioning_blocks.py:139 2021-11-25 08:51:00.516 30 DEBUG neutron_lib.callbacks.manager [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.Ml2Plugin._port_provisioned-8372308'] for port, provisioning_complete _notify_loop /var/lib/kolla/venv/lib/python3.9/site- packages/neutron_lib/callbacks/manager.py:192 2021-11-25 08:51:00.701 30 DEBUG neutron_lib.callbacks.manager [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] Notify callbacks ['neutron.plugins.ml2.plugin.SecurityGroupDbMixin._ensure_default_security_group_handler-7972195'] for port, before_update _notify_loop /var/lib/kolla/venv/lib/python3.9/site- packages/neutron_lib/callbacks/manager.py:192 2021-11-25 08:51:00.746 30 DEBUG neutron.notifiers.nova [-] Sending events: [{'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': 'network-vif- plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7- c9b40386cbe0'}] send_events /var/lib/kolla/venv/lib/python3.9/site-packages/ neutron/notifiers/nova.py:262 2021-11-25 08:51:00.762 30 DEBUG neutron.plugins.ml2.db [req- df864eb8-0c2e-48ca-a3e7-d90dc70e2aa0 - - - - -] For port 0761ff2f- ff6a-40a5-85b7-c9b40386cbe0, host compute0, got binding levels [PortBindingLevel(driver='openvswitch',host='compute0',level=0,port_id=0761ff2f- ff6a-40a5-85b7-c9b40386cbe0,segment=NetworkSegment(ea3a27b6- c3fa-40e3-9937-71b652cba855),segment_id=ea3a27b6-c3fa-40e3-9937-71b652cba855)] get_binding_level_objs /var/lib/kolla/venv/lib/python3.9/site-packages/ neutron/plugins/ml2/db.py:75 2021-11-25 08:51:00.818 30 INFO neutron.notifiers.nova [-] Nova event response: {'server_uuid': '09eff2ce-356f-430f-ab30-5de58f58d698', 'name': 'network-vif-plugged', 'status': 'completed', 'tag': '0761ff2f-ff6a-40a5-85b7- c9b40386cbe0', 'code': 200}
Regards.
[1]https://paste.opendev.org/show/811273/
On Thu, Nov 25, 2021 at 10:15 AM Michal Arbet <michal.arbet@ultimum.io>
Hello,
In attachment you can find logs for compute0 and controller0 (other computes and controllers were turned off for this test). No OVN used, this stack is based on OVS.
Thank you, Michal Arbet Openstack Engineer
Ultimum Technologies a.s. Na Poříčí 1047/26, 11000 Praha 1 Czech Republic
+420 604 228 897 michal.arbet@ultimum.io *https://ultimum.io <https://ultimum.io/>*
LinkedIn <https://www.linkedin.com/company/ultimum-technologies> | Twitter <https://twitter.com/ultimumtech> | Facebook <https://www.facebook.com/ultimumtechnologies/timeline>
st 24. 11. 2021 v 15:59 odesílatel Rodolfo Alonso Hernandez <
ralonsoh@redhat.com> napsal:
Hello Tony:
Do you have the Neutron server logs? Do you see the "PortBindingChassisUpdateEvent"? When a port is bound, a Port_Bound (SB) event is issued and captured in the Neutron server. That will
wrote: trigger the
port binding process and the "vif-plugged" event. This OVN SB event
should
call "set_port_status_up" and that should write "OVN reports status up for port: %s" in the logs.
Of course, this Neutron method will be called only if the logical switch port is UP.
Regards.
On Tue, Nov 23, 2021 at 11:59 PM Tony Liu <tonyliu0592@hotmail.com>
Hi,
I see such problem from time to time. It's not consistently reproduceable. ====================== 2021-11-23 22:16:28.532 7 INFO nova.compute.manager [req-50817b01-e7ae-4991-94fe-e4c5672c481b - - - - -] [instance: 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] During sync_power_state
wrote: the instance
has a pending task (spawning). Skip. 2021-11-23 22:21:28.342 7 WARNING nova.virt.libvirt.driver [req-814c98a4-3fd9-4607-9abb-5fbe5cef8650 3a4c320d64664d9cb6784b7ea52d618a 21384fe21ebb4339b1a5f2a8f9000ea3 - default default] [instance: 7021bf37-348d-4dfa-b0c5-6a251fceb6cd] Timeout waiting for [('network-vif-plugged', '735d97a3-db7c-42ed-b3be-1596d4cc7f4b')] for instance with vm_state building and task_state spawning.: eventlet.timeout.Timeout: 300 seconds ====================== The VIF/port is activated by OVN ovn-controller to ovn-sb. It seems that, either Neutron didn't capture the update or didn't send message back to nova-compute. Is there any known fix for this problem?
Thanks! Tony
-- Slawek Kaplonski Principal Software Engineer Red Hat