[kolla-ansible] Default masakari configuration should evacuate instances across nodes in wallaby?

Radosław Piliszek radoslaw.piliszek at gmail.com
Mon Dec 6 20:41:03 UTC 2021


Then I suggest looking at the nova-api and nova-compute logs around
these ERROR timestamps.
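
One way to do that correlation is sketched below. It is a minimal
illustration rather than a Masakari or Nova tool: it assumes both logs use
the timestamp prefix seen in the quoted engine log, that each record sits on
one line in the real log file (not wrapped as in this mail), and that the
clocks on the hosts agree. The function names, the file paths you pass in,
and the 5-second window are all hypothetical choices.

```python
import re
from datetime import datetime, timedelta

# Log lines in the quoted engine output start with "2021-12-06 20:05:39.778 ".
TS_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) ")
TS_FMT = "%Y-%m-%d %H:%M:%S.%f"


def parse_ts(line):
    """Return the leading timestamp of a log line, or None if it has none."""
    match = TS_RE.match(line)
    return datetime.strptime(match.group(1), TS_FMT) if match else None


def error_times(engine_lines):
    """Collect the distinct timestamps of ERROR lines from the engine log."""
    times = set()
    for line in engine_lines:
        ts = parse_ts(line)
        if ts is not None and " ERROR " in line:
            times.add(ts)
    return sorted(times)


def lines_near(nova_lines, times, window_s=5.0):
    """Keep nova log lines stamped within window_s seconds of any error time."""
    window = timedelta(seconds=window_s)
    kept = []
    for line in nova_lines:
        ts = parse_ts(line)
        if ts is not None and any(abs(ts - t) <= window for t in times):
            kept.append(line)
    return kept


def correlate(engine_path, nova_path, window_s=5.0):
    """Read both log files (paths supplied by the caller) and filter nova lines."""
    with open(engine_path, encoding="utf-8", errors="replace") as fh:
        times = error_times(fh)
    with open(nova_path, encoding="utf-8", errors="replace") as fh:
        return lines_near(fh, times, window_s)
```

Calling `correlate()` once with the nova-api log and once with the
nova-compute log of the surviving node should narrow the output to what Nova
logged while the evacuation was failing. Lines without a timestamp prefix are
skipped, so widen the window if the result looks too sparse.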

-yoctozepto

On Mon, 6 Dec 2021 at 21:11, Rodrigo Lima <rodrigo.lima at o2sistemas.com>
wrote:

> Hi!
>
> It's a lot of information. From the engine log:
> 2021-12-06 20:02:33.502 7 INFO masakari.engine.manager
> [req-034e0a9e-1b34-4db3-a2e0-6d61a369ba6c 6c128366b66346d38fb5493adf0cf666
> e39a7ea7d17046b5b97b2253bf8195bc - - -] Processing notification
> 2468563d-70c5-41ef-ad04-a51b4ad3dd4d of type: COMPUTE_HOST
> 2021-12-06 20:02:34.052 7 INFO masakari.compute.nova
> [req-882d4151-0549-4cb2-acbb-90509073179c nova - - - -] Disable
> nova-compute on ctl02-hml.amt.net.br
> 2021-12-06 20:02:34.113 7 INFO
> masakari.engine.drivers.taskflow.host_failure
> [req-882d4151-0549-4cb2-acbb-90509073179c nova - - - -] Sleeping 180 sec
> before starting recovery thread until nova recognizes the node down.
> 2021-12-06 20:05:34.130 7 INFO masakari.compute.nova
> [req-c570bdef-0786-47bd-94fc-e6c1396eabb1 nova - - - -] Fetch Server list
> on ctl02-hml.amt.net.br
> 2021-12-06 20:05:35.166 7 INFO masakari.compute.nova
> [req-adec6e54-15ff-4a28-98c3-01070ba2cf87 nova - - - -] Call get server
> command for instance 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:35.949 7 INFO masakari.compute.nova
> [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Call get server
> command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:35.955 7 INFO masakari.compute.nova
> [req-e548525d-74f7-42c4-8045-2f63d645f76f nova - - - -] Call get server
> command for instance 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:36.739 7 INFO masakari.compute.nova
> [req-9cea2ff2-0d32-42a1-8605-b889d95cadc0 nova - - - -] Call lock server
> command for instance 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:36.747 7 INFO masakari.compute.nova
> [req-bc3bdea3-e923-4c08-bf84-bea711465dce nova - - - -] Call get server
> command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:37.391 7 INFO masakari.compute.nova
> [req-9cea2ff2-0d32-42a1-8605-b889d95cadc0 nova - - - -] Call evacuate
> command for instance 618e44e8-248f-4f50-a760-581972352af8 on host None
> 2021-12-06 20:05:37.478 7 INFO masakari.compute.nova
> [req-84d38844-ae87-4767-b822-0099bc40aa16 nova - - - -] Call lock server
> command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:38.059 7 INFO masakari.compute.nova
> [req-d57bd685-2e42-48e0-bf7d-9978ac516451 nova - - - -] Call get server
> command for instance 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:38.102 7 INFO masakari.compute.nova
> [req-84d38844-ae87-4767-b822-0099bc40aa16 nova - - - -] Call evacuate
> command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887 on host None
> 2021-12-06 20:05:38.734 7 INFO masakari.compute.nova
> [req-25b4bb9c-2e05-409a-bae8-fd7c8348ba5b nova - - - -] Call get server
> command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:39.062 7 INFO masakari.compute.nova
> [req-3d69363b-aa6c-4752-be52-285bc36f25c2 nova - - - -] Call get server
> command for instance 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:39.732 7 INFO masakari.compute.nova
> [req-425752db-0599-42a8-bf2a-0229148410cc nova - - - -] Call get server
> command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall
> [req-3d69363b-aa6c-4752-be52-285bc36f25c2 nova - - - -] Fixed interval
> looping call
> 'masakari.engine.drivers.taskflow.host_failure.EvacuateInstancesTask._evacuate_and_confirm.<locals>._wait_for_evacuation_confirmation'
> failed: masakari.exception.InstanceEvacuateFailed: Failed to evacuate
> instance 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall Traceback (most
> recent call last):
> 2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall   File
> "/var/lib/kolla/venv/lib/python3.8/site-packages/oslo_service/loopingcall.py",
> line 150, in _run_loop
> 2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall     result =
> func(*self.args, **self.kw)
> 2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall   File
> "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/engine/drivers/taskflow/host_failure.py",
> line 207, in _wait_for_evacuation_confirmation
> 2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall     raise
> exception.InstanceEvacuateFailed(
> 2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall
> masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance
> 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:39.778 7 ERROR oslo.service.loopingcall
> 2021-12-06 20:05:39.790 7 WARNING
> masakari.engine.drivers.taskflow.host_failure
> [req-d537ed5e-f5c8-4ba5-b559-0156883ca02b nova - - - -] Failed to evacuate
> instance 618e44e8-248f-4f50-a760-581972352af8:
> masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance
> 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:39.793 7 INFO masakari.compute.nova
> [req-c02a47f3-fc23-4d5b-9167-cf21bf8923db nova - - - -] Call unlock server
> command for instance 618e44e8-248f-4f50-a760-581972352af8
> 2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall
> [req-425752db-0599-42a8-bf2a-0229148410cc nova - - - -] Fixed interval
> looping call
> 'masakari.engine.drivers.taskflow.host_failure.EvacuateInstancesTask._evacuate_and_confirm.<locals>._wait_for_evacuation_confirmation'
> failed: masakari.exception.InstanceEvacuateFailed: Failed to evacuate
> instance 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall Traceback (most
> recent call last):
> 2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall   File
> "/var/lib/kolla/venv/lib/python3.8/site-packages/oslo_service/loopingcall.py",
> line 150, in _run_loop
> 2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall     result =
> func(*self.args, **self.kw)
> 2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall   File
> "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/engine/drivers/taskflow/host_failure.py",
> line 207, in _wait_for_evacuation_confirmation
> 2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall     raise
> exception.InstanceEvacuateFailed(
> 2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall
> masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance
> 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:40.422 7 ERROR oslo.service.loopingcall
> 2021-12-06 20:05:40.423 7 WARNING
> masakari.engine.drivers.taskflow.host_failure
> [req-f8061ba7-353b-487c-8813-30f85216335f nova - - - -] Failed to evacuate
> instance 746178b2-14ce-4ce2-83f6-2bf9d613a887:
> masakari.exception.InstanceEvacuateFailed: Failed to evacuate instance
> 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:40.425 7 INFO masakari.compute.nova
> [req-7f0c16fe-8eaa-485b-a414-945fcd84a3c6 nova - - - -] Call unlock server
> command for instance 746178b2-14ce-4ce2-83f6-2bf9d613a887
> 2021-12-06 20:05:41.080 7 WARNING masakari.engine.drivers.taskflow.driver
> [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Task
> 'EvacuateInstancesTask' (e49cc088-a9d7-4dd8-a6db-75339c6eaa4d) transitioned
> into state 'FAILURE' from state 'RUNNING'
> 4 predecessors (most recent first):
>   Flow 'post_tasks'
>   |__Flow 'main_tasks'
>      |__Flow 'pre_tasks'
>         |__Flow 'instance_evacuate_engine':
> masakari.exception.HostRecoveryFailureException: Failed to evacuate
> instances
> '618e44e8-248f-4f50-a760-581972352af8,746178b2-14ce-4ce2-83f6-2bf9d613a887'
> from host 'ctl02-hml.amt.net.br'
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
> Traceback (most recent call last):
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
> File
> "/var/lib/kolla/venv/lib/python3.8/site-packages/taskflow/engines/action_engine/executor.py",
> line 53, in _execute_task
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
>   result = task.execute(**arguments)
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
> File
> "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/engine/drivers/taskflow/host_failure.py",
> line 396, in execute
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
>   _do_evacuate(self.context, host_name, instance_list)
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
> File
> "/var/lib/kolla/venv/lib/python3.8/site-packages/masakari/engine/drivers/taskflow/host_failure.py",
> line 376, in _do_evacuate
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
>   raise exception.HostRecoveryFailureException(
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
> masakari.exception.HostRecoveryFailureException: Failed to evacuate
> instances
> '618e44e8-248f-4f50-a760-581972352af8,746178b2-14ce-4ce2-83f6-2bf9d613a887'
> from host 'ctl02-hml.amt.net.br'
> 2021-12-06 20:05:41.080 7 ERROR masakari.engine.drivers.taskflow.driver
> 2021-12-06 20:05:41.088 7 WARNING masakari.engine.drivers.taskflow.driver
> [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Task
> 'EvacuateInstancesTask' (e49cc088-a9d7-4dd8-a6db-75339c6eaa4d) transitioned
> into state 'REVERTED' from state 'REVERTING'
> 2021-12-06 20:05:41.091 7 WARNING masakari.engine.drivers.taskflow.driver
> [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Task
> 'PrepareHAEnabledInstancesTask' (21c32e80-7521-44bb-bd28-53f88c3d13da)
> transitioned into state 'REVERTED' from state 'REVERTING'
> 2021-12-06 20:05:41.093 7 WARNING masakari.engine.drivers.taskflow.driver
> [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Task
> 'DisableComputeServiceTask' (b3522750-c7a6-4f71-ad8c-e2a61a40e2b8)
> transitioned into state 'REVERTED' from state 'REVERTING'
> 2021-12-06 20:05:41.095 7 WARNING masakari.engine.drivers.taskflow.driver
> [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Flow
> 'instance_evacuate_engine' (3c8c5def-39b8-4956-8b2c-62a218e612ee)
> transitioned into state 'REVERTED' from state 'RUNNING'
> 2021-12-06 20:05:41.096 7 ERROR masakari.engine.manager
> [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Failed to process
> notification '2468563d-70c5-41ef-ad04-a51b4ad3dd4d'. Reason: Failed to
> evacuate instances
> '618e44e8-248f-4f50-a760-581972352af8,746178b2-14ce-4ce2-83f6-2bf9d613a887'
> from host 'ctl02-hml.amt.net.br':
> masakari.exception.HostRecoveryFailureException: Failed to evacuate
> instances
> '618e44e8-248f-4f50-a760-581972352af8,746178b2-14ce-4ce2-83f6-2bf9d613a887'
> from host 'ctl02-hml.amt.net.br'
> 2021-12-06 20:05:41.099 7 INFO masakari.engine.manager
> [req-49414eed-fdcf-4fd0-b803-e5cc5f2f9227 nova - - - -] Notification
> 2468563d-70c5-41ef-ad04-a51b4ad3dd4d exits with status: error.
> 2021-12-06 20:07:08.803 7 INFO masakari.engine.manager
> [req-d2c18313-253f-4bb2-8abb-5ca5d8ac248a nova - - - -] Processing
> notification 2468563d-70c5-41ef-ad04-a51b4ad3dd4d of type: COMPUTE_HOST
> 2021-12-06 20:07:09.398 7 INFO masakari.compute.nova
> [req-5b595769-209f-422b-a71f-ef4507187a61 nova - - - -] Disable
> nova-compute on ctl02-hml.amt.net.br
> 2021-12-06 20:07:09.467 7 INFO
> masakari.engine.drivers.taskflow.host_failure
> [req-5b595769-209f-422b-a71f-ef4507187a61 nova - - - -] Sleeping 180 sec
> before starting recovery thread until nova recognizes the node
> down.
>
> I tested by forcing a kernel panic on the target node.
>
>
>
>
> On Sat, 4 Dec 2021 at 08:48, Radosław Piliszek <
> radoslaw.piliszek at gmail.com> wrote:
>
>> What do the Masakari API and Masakari Engine logs show?
>>
>> -yoctozepto
>>
>> On Fri, 3 Dec 2021 at 19:57, Rodrigo Lima <rodrigo.lima at o2sistemas.com>
>> wrote:
>>
>>> I'm upgrading an OpenStack farm from Victoria to Wallaby. After a
>>> successful upgrade, I would like to enable hacluster and Masakari to test
>>> HA when a compute node fails. Everything seems to be running (Pacemaker
>>> with the remote nodes OK, Corosync without errors, masakari-monitors
>>> detects the 2 compute nodes online), but when I simulate a node failure
>>> with a shutdown, the failure and the notification appear in the
>>> hostmonitor log, yet the instances that were on the failed node are not
>>> evacuated. I couldn't find documentation that explains how to configure
>>> this, if any specific configuration is necessary. Does anyone have any ideas?
>>>
>>>