<div dir="ltr">Then I suggest looking at the nova-api and nova-compute logs around these ERROR timestamps.<div><br></div><div>-yoctozepto</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Mon, 6 Dec 2021 at 21:11, Rodrigo Lima <<a href="mailto:rodrigo.lima@o2sistemas.com">rodrigo.lima@o2sistemas.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">Hi!<div><br></div><div>It's a lot of information. From the engine:</div><div>2021-12-06 20:02:33.502 7 INFO masakari.engine.manager [req-034e0a9e-1b34-4db3-a2e0-6d61a369ba6c 6c128366b66346d38fb5493adf0cf666 e39a7ea7d17046b5b97b2253bf8195bc - - -] Processing notification 2468563d-70c5-41ef-ad04-a51b4ad3dd4d of type: COMPUTE_HOST<br>2021-12-06 20:02:34.052 7 INFO masakari.compute.nova [req-882d4151-0549-4cb2-acbb-90509073179c nova - - - -] Disable nova-compute on <a href="http://ctl02-hml.amt.net.br" target="_blank">ctl02-hml.amt.net.br</a><br>2021-12-06 20:02:34.113 7 INFO masakari.engine.drivers.taskflow.host_failure [req-882d4151-0549-4cb2-acbb-90509073179c nova - - - -] Sleeping 180 sec before starting recovery thread until nova recognizes the node down.<br>2021-12-06 20:05:34.130 7 INFO masakari.compute.nova [req-c570bdef-0786-47bd-94fc-e6c1396eabb1 nova - - - -] Fetch Server list on <a href="http://ctl02-hml.amt.net.br" target="_blank">ctl02-hml.amt.net.br</a><br>2021-12-06 20:05:35.166 7 INFO masakari.compute.nova [req-adec6e54-15ff-4a28-98c3-01070ba2cf87 nova - - - -] Call get server command for instance 618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:35.949 7 INFO masakari.compute.nova [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Call get server command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:35.955 7 INFO masakari.compute.nova [req-e548525d-74f7-42c4-8045-2f63d645f76f nova - - - -] Call get server command for instance 
618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:36.739 7 INFO masakari.compute.nova [req-9cea2ff2-0d32-42a1-8605-b889d95cadc0 nova - - - -] Call lock server command for instance 618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:36.747 7 INFO masakari.compute.nova [req-bc3bdea3-e923-4c08-bf84-bea711465dce nova - - - -] Call get server command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:37.391 7 INFO masakari.compute.nova [req-9cea2ff2-0d32-42a1-8605-b889d95cadc0 nova - - - -] Call evacuate command for instance 618e44e8-248f-4f50-a760-581972352af8 on host None<br>2021-12-06 20:05:37.478 7 INFO masakari.compute.nova [req-84d38844-ae87-4767-b822-0099bc40aa16 nova - - - -] Call lock server command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:38.059 7 INFO masakari.compute.nova [req-d57bd685-2e42-48e0-bf7d-9978ac516451 nova - - - -] Call get server command for instance 618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:38.102 7 INFO masakari.compute.nova [req-84d38844-ae87-4767-b822-0099bc40aa16 nova - - - -] Call evacuate command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887 on host None<br>2021-12-06 20:05:38.734 7 INFO masakari.compute.nova [req-25b4bb9c-2e05-409a-bae8-fd7c8348ba5b nova - - - -] Call get server command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:39.062 7 INFO masakari.compute.nova [req-3d69363b-aa6c-4752-be52-285bc36f25c2 nova - - - -] Call get server command for instance 618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:39.732 7 INFO masakari.compute.nova [req-425752db-0599-42a8-bf2a-0229148410cc nova - - - -] Call get server command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall [req-3d69363b-aa6c-4752-be52-285bc36f25c2 nova - - - -] Fixed interval looping call 
'masakari.engine.drivers.taskflow.host_failure.EvacuateInstancesTask._evacuate_and_confirm.<locals>._wait_for_evacuation_confirmation' failed: masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance 618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall Traceback (most recent call last):<br>2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall   File "/var/lib/kolla/venv/lib/python3.8/site-packages/oslo_service/loopingcall.py", line 150, in _run_loop<br>2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall     result = func(*self.args, **self.kw)<br>2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall   File "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/engine/drivers/taskflow/host_failure.py", line 207, in _wait_for_evacuation_confirmation<br>2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall     raise exception.InstanceEvacuateFailed(<br>2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance 618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall<br>2021-12-06 20:05:39.790 7 WARNING masakari.engine.drivers.taskflow.host_failure [req-d537ed5e-f5c8-4ba5-b559-0156883ca02b nova - - - -] Failed to evacuate instance 618e44e8-248f-4f50-a760-581972352af8: masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance 618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:39.793 7 INFO masakari.compute.nova [req-c02a47f3-fc23-4d5b-9167-cf21bf8923db nova - - - -] Call unlock server command for instance 618e44e8-248f-4f50-a760-581972352af8<br>2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall [req-425752db-0599-42a8-bf2a-0229148410cc nova - - - -] Fixed interval looping call 'masakari.engine.drivers.taskflow.host_failure.EvacuateInstancesTask._evacuate_and_confirm.<locals>._wait_for_evacuation_confirmation' failed: 
masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall Traceback (most recent call last):<br>2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall   File "/var/lib/kolla/venv/lib/python3.8/site-packages/oslo_service/loopingcall.py", line 150, in _run_loop<br>2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall     result = func(*self.args, **self.kw)<br>2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall   File "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/engine/drivers/taskflow/host_failure.py", line 207, in _wait_for_evacuation_confirmation<br>2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall     raise exception.InstanceEvacuateFailed(<br>2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall<br>2021-12-06 20:05:40.423 7 WARNING masakari.engine.drivers.taskflow.host_failure [req-f8061ba7-353b-487c-8813-30f85216335f nova - - - -] Failed to evacuate instance 746178b2-14ce-4ce2-83f6-2bf9d613a887: masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:40.425 7 INFO masakari.compute.nova [req-7f0c16fe-8eaa-485b-a414-945fcd84a3c6 nova - - - -] Call unlock server command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887<br>2021-12-06 20:05:41.080 7 WARNING masakari.engine.drivers.taskflow.driver [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Task 'EvacuateInstancesTask' (e49cc088-a9d7-4dd8-a6db-75339c6eaa4d) transitioned into state 'FAILURE' from state 'RUNNING'<br>4 predecessors (most recent first):<br>  Flow 'post_tasks'<br>  |__Flow 'main_tasks'<br>     |__Flow 'pre_tasks'<br>        |__Flow 'instance_evacuate_engine': 
masakari.exception.HostRecoveryFailureException: Failed to evacuate instances '618e44e8-248f-4f50-a760-581972352af8,746178b2-14ce-4ce2-83f6-2bf9d613a887' from host '<a href="http://ctl02-hml.amt.net.br" target="_blank">ctl02-hml.amt.net.br</a>'<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver Traceback (most recent call last):<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver   File "/var/lib/kolla/venv/lib/python3.8/site-packages/taskflow/engines/action_engine/executor.py", line 53, in _execute_task<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver     result = task.execute(**arguments)<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver   File "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/engine/drivers/taskflow/host_failure.py", line 396, in execute<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver     _do_evacuate(self.context, host_name, instance_list)<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver   File "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/engine/drivers/taskflow/host_failure.py", line 376, in _do_evacuate<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver     raise exception.HostRecoveryFailureException(<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver masakari.exception.HostRecoveryFailureException: Failed to evacuate instances '618e44e8-248f-4f50-a760-581972352af8,746178b2-14ce-4ce2-83f6-2bf9d613a887' from host '<a href="http://ctl02-hml.amt.net.br" target="_blank">ctl02-hml.amt.net.br</a>'<br>2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver<br>2021-12-06 20:05:41.088 7 WARNING masakari.engine.drivers.taskflow.driver [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Task 'EvacuateInstancesTask' (e49cc088-a9d7-4dd8-a6db-75339c6eaa4d) transitioned into state 'REVERTED' from state 
'REVERTING'<br>2021-12-06 20:05:41.091 7 WARNING masakari.engine.drivers.taskflow.driver [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Task 'PrepareHAEnabledInstancesTask' (21c32e80-7521-44bb-bd28-53f88c3d13da) transitioned into state 'REVERTED' from state 'REVERTING'<br>2021-12-06 20:05:41.093 7 WARNING masakari.engine.drivers.taskflow.driver [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Task 'DisableComputeServiceTask' (b3522750-c7a6-4f71-ad8c-e2a61a40e2b8) transitioned into state 'REVERTED' from state 'REVERTING'<br>2021-12-06 20:05:41.095 7 WARNING masakari.engine.drivers.taskflow.driver [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Flow 'instance_evacuate_engine' (3c8c5def-39b8-4956-8b2c-62a218e612ee) transitioned into state 'REVERTED' from state 'RUNNING'<br>2021-12-06 20:05:41.096 7 ERROR masakari.engine.manager [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Failed to process notification '2468563d-70c5-41ef-ad04-a51b4ad3dd4d'. Reason: Failed to evacuate instances '618e44e8-248f-4f50-a760-581972352af8,746178b2-14ce-4ce2-83f6-2bf9d613a887' from host '<a href="http://ctl02-hml.amt.net.br" target="_blank">ctl02-hml.amt.net.br</a>': masakari.exception.HostRecoveryFailureException: Failed to evacuate instances '618e44e8-248f-4f50-a760-581972352af8,746178b2-14ce-4ce2-83f6-2bf9d613a887' from host '<a href="http://ctl02-hml.amt.net.br" target="_blank">ctl02-hml.amt.net.br</a>'<br>2021-12-06 20:05:41.099 7 INFO masakari.engine.manager [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Notification 2468563d-70c5-41ef-ad04-a51b4ad3dd4d exits with status: error.<br>2021-12-06 20:07:08.803 7 INFO masakari.engine.manager [req-d2c18313-253f-4bb2-8abb-5ca5d8ac248a nova - - - -] Processing notification 2468563d-70c5-41ef-ad04-a51b4ad3dd4d of type: COMPUTE_HOST<br>2021-12-06 20:07:09.398 7 INFO masakari.compute.nova [req-5b595769-209f-422b-a71f-ef4507187a61 nova - - - -] Disable nova-compute on <a href="http://ctl02-hml.amt.net.br" 
target="_blank">ctl02-hml.amt.net.br</a><br>2021-12-06 20:07:09.467 7 INFO masakari.engine.drivers.taskflow.host_failure [req-5b595769-209f-422b-a71f-ef4507187a61 nova - - - -] Sleeping 180 sec before starting recovery thread until nova recognizes the node down.</div><div><br></div><div>I tested forcing a kernel panic on target node</div><div><br 
clear="all"><div><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><br></div></div></div></div></div><br></div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Sat, 4 Dec 2021 at 08:48, Radosław Piliszek <<a href="mailto:radoslaw.piliszek@gmail.com" target="_blank">radoslaw.piliszek@gmail.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">What do the Masakari API and Masakari Engine logs show?<div><br></div><div>-yoctozepto</div></div><br><div class="gmail_quote"><div dir="ltr" class="gmail_attr">On Fri, 3 Dec 2021 at 19:57, Rodrigo Lima <<a href="mailto:rodrigo.lima@o2sistemas.com" target="_blank">rodrigo.lima@o2sistemas.com</a>> wrote:<br></div><blockquote class="gmail_quote" style="margin:0px 0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><div dir="ltr">I'm upgrading an OpenStack farm from Victoria to Wallaby. After a successful upgrade, I would like to enable hacluster and Masakari to test HA for failing compute nodes. Everything seems to be running (Pacemaker with the remote nodes OK, Corosync without errors, masakari-monitors detects the 2 compute nodes online). However, when I simulate a node failure by shutting the node down, the failure and the notification appear in the hostmonitor log, but the instances that were on the failed node are not evacuated. I couldn't find documentation that explains whether this needs any specific configuration, or how to do it. Does anyone have any ideas?<br clear="all"><div><div dir="ltr"><div dir="ltr"><div><div dir="ltr"><br></div></div></div></div></div></div>
</blockquote></div>
</blockquote></div>
</blockquote></div>
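<div dir="ltr">To make the suggestion at the top of the thread concrete, here is a minimal sketch of one way to line up the nova logs with the ERROR timestamps from the Masakari engine log. It assumes both logs start each line with the timestamp layout seen above (for example "2021-12-06 20:05:39.778") and that the hosts' clocks and time zones agree. The function names and the 5-second window are hypothetical choices for illustration, not part of Masakari or nova.</div>

```python
from datetime import datetime, timedelta

# Timestamp layout at the start of each log line quoted in this thread.
TS_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if there is none."""
    try:
        return datetime.strptime(line[:23], TS_FORMAT)
    except ValueError:
        return None


def error_times(engine_lines):
    """Collect the distinct timestamps of ERROR lines, in sorted order."""
    times = set()
    for line in engine_lines:
        ts = parse_ts(line)
        if ts is not None and " ERROR " in line:
            times.add(ts)
    return sorted(times)


def lines_near(nova_lines, times, window_seconds=5):
    """Keep nova log lines stamped within window_seconds of any ERROR time."""
    window = timedelta(seconds=window_seconds)
    selected = []
    for line in nova_lines:
        ts = parse_ts(line)
        if ts is None:
            # Skip lines without a leading timestamp (hypothetical policy).
            continue
        if any(window >= abs(ts - t) for t in times):
            selected.append(line)
    return selected
```

<div dir="ltr">In practice one would read the Masakari engine log into <code>engine_lines</code> and the nova-api and nova-compute logs into <code>nova_lines</code> (the file locations depend on the deployment), then print the result of <code>lines_near(nova_lines, error_times(engine_lines))</code>. Filtering the result further by the two instance UUIDs from the traceback would narrow it down more.</div>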