[queens][nova] nova host-evacuate error

Ignazio Cassano ignaziocassano at gmail.com
Fri Jul 12 11:18:01 UTC 2019


Ok. But are your virtual machines using a root volume on Cinder, or are
they ephemeral?

In any case, when you try a live migration, look at the nova-compute log on
the KVM node the instance is being migrated from.
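A rough sketch of both checks (the log path below is the usual one for a
package install and may differ on your deployment):

  nova show <instance-uuid> | grep -Ei 'image|volumes_attached'
      # an empty image plus an attached volume usually means the root disk is a Cinder volume;
      # an image and no attached volumes means the instance is ephemeral
  tail -f /var/log/nova/nova-compute.log    # on the KVM node the instance migrates from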

On Fri, 12 Jul 2019 at 12:48, Jay See <jayachander.it at gmail.com>
wrote:

> Yes, cinder is running.
>
> root at h017:~$ service --status-all | grep cinder
> [ + ]  cinder-volume
>
> On Fri, Jul 12, 2019 at 11:53 AM Ignazio Cassano <ignaziocassano at gmail.com>
> wrote:
>
>> Sorry ... the question was: how many compute nodes do you have?
>> instead of "how many compute nodes do gli have"...
>>
>> In any case:
>> did you configure Cinder?
>>
>> On Fri, 12 Jul 2019 at 11:26, Jay See <
>> jayachander.it at gmail.com> wrote:
>>
>>> Ignazio,
>>>
>>> One instance is stuck in error state and I am not able to recover it.
>>> All other instances are running now.
>>>
>>> root at h004:~$ nova reset-state --all-tenants my-instance-1-2
>>> Reset state for server my-instance-1-2 succeeded; new state is error
>>>
>>> I have several compute nodes (14). I am not sure what "gli" is.
>>> Live migration is not working: I have tried it and it does not throw any
>>> errors, but nothing seems to happen.
>>> I am not completely sure; I haven't heard of "gli" before. (This setup
>>> was deployed by someone else.)
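>>>
>>> (A hedged way to see why that one instance is stuck in error, assuming
>>> admin credentials; the grep is only a convenience:
>>>
>>>   nova show my-instance-1-2 | grep -A2 fault   # the fault field carries the last error message
>>> )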
>>>
>>> ~Jay.
>>>
>>> On Fri, Jul 12, 2019 at 6:12 AM Ignazio Cassano <
>>> ignaziocassano at gmail.com> wrote:
>>>
>>>> Jay, to recover the VM state use the command nova reset-state....
>>>>
>>>> Run "nova help reset-state" to check the parameters the command requires.
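>>>>
>>>> (As a hedged sketch: reset-state without options marks the server as
>>>> error, while the --active flag marks it active again so it can be
>>>> rebooted, e.g.
>>>>
>>>>   nova reset-state --active <instance-name-or-uuid>
>>>>   nova reboot --hard <instance-name-or-uuid>
>>>> )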
>>>>
>>>> As far as evacuation is concerned, how many compute nodes do gli have?
>>>> Does instance live migration work?
>>>> Are gli using shared Cinder storage?
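>>>>
>>>> (A quick way to answer the first two questions, sketched with the
>>>> standard nova CLI:
>>>>
>>>>   nova service-list --binary nova-compute   # lists the compute nodes and whether each is up
>>>>   nova hypervisor-list                      # another view of the compute hosts
>>>> )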
>>>> Ignazio
>>>>
>>>> On Thu, 11 Jul 2019 at 20:51, Jay See <jayachander.it at gmail.com> wrote:
>>>>
>>>>> Thanks for the explanation, Ignazio.
>>>>>
>>>>> I have tried the same thing by forcing the compute node into a failure
>>>>> (echo 'c' > /proc/sysrq-trigger). The compute node was stuck and I was
>>>>> not able to connect to it.
>>>>> All the VMs are now in error state.
>>>>>
>>>>> Running nova host-evacuate was successful on the controller node, but
>>>>> now I am not able to use the VMs, because they are all in error state.
>>>>>
>>>>> root at h004:~$ nova host-evacuate h017
>>>>>
>>>>> +--------------------------------------+-------------------+---------------+
>>>>> | Server UUID                          | Evacuate Accepted | Error Message |
>>>>> +--------------------------------------+-------------------+---------------+
>>>>> | f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True              |               |
>>>>> | 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True              |               |
>>>>> | abe7075b-ac22-4168-bf3d-d302ba37d80e | True              |               |
>>>>> | c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True              |               |
>>>>> | ffd983bb-851e-4314-9d1d-375303c278f3 | True              |               |
>>>>> +--------------------------------------+-------------------+---------------+
>>>>>
>>>>> Now I have restarted the compute node manually and I am able to
>>>>> connect to it again, but the VMs are still in error state.
>>>>> 1. Any ideas on how to recover the VMs?
>>>>> 2. Are there any other methods to evacuate, as this method does not
>>>>> seem to work on Mitaka?
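>>>>>
>>>>> (One hedged alternative is the per-instance command instead of the
>>>>> host-wide wrapper; the target host name below is only a placeholder:
>>>>>
>>>>>   nova reset-state --active <server-uuid>            # clear the error state first
>>>>>   nova evacuate <server-uuid> <target-compute-host>  # rebuild the instance on another node
>>>>> )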
>>>>>
>>>>> ~Jay.
>>>>>
>>>>> On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano <
>>>>> ignaziocassano at gmail.com> wrote:
>>>>>
>>>>>> Ok Jay,
>>>>>> let me describe my environment.
>>>>>> I have an OpenStack deployment made up of 3 controller nodes and
>>>>>> several compute nodes.
>>>>>> The controller node services are managed by Pacemaker and the compute
>>>>>> node services are managed by pacemaker_remote.
>>>>>> My hardware is Dell, so I am using an IPMI fencing device.
>>>>>> I wrote a service controlled by Pacemaker: it checks whether a compute
>>>>>> node has failed and, to avoid split brain, if a compute node does not
>>>>>> respond on both the management network and the storage network, STONITH
>>>>>> powers off the node and then a nova host-evacuate is executed.
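>>>>>>
>>>>>> A very rough sketch of that logic (not the actual resource agent; node
>>>>>> names, addresses and the fencing call are placeholders):
>>>>>>
>>>>>>   #!/bin/bash
>>>>>>   NODE=compute-01
>>>>>>   MGMT_IP=10.0.0.11
>>>>>>   STORAGE_IP=10.1.0.11
>>>>>>   # act only if the node answers on neither network, to avoid split brain
>>>>>>   if ! ping -c3 -W2 "$MGMT_IP" >/dev/null && ! ping -c3 -W2 "$STORAGE_IP" >/dev/null; then
>>>>>>       pcs stonith fence "$NODE"                      # power the node off via the IPMI fence device
>>>>>>       nova service-force-down "$NODE" nova-compute   # tell nova the compute service is down
>>>>>>       nova host-evacuate "$NODE"                     # restart its instances on other nodes
>>>>>>   fi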
>>>>>>
>>>>>> In any case, to run a simulation before writing a service like the one
>>>>>> described above, you can do as follows:
>>>>>>
>>>>>> Connect to one compute node where some virtual machines are running
>>>>>> and run: echo 'c' > /proc/sysrq-trigger (it stops the node immediately,
>>>>>> as in a real failure).
>>>>>> On a controller node run: nova host-evacuate "name of failed compute node"
>>>>>> The instances that were running on the failed compute node should be
>>>>>> restarted on another compute node.
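>>>>>>
>>>>>> (To verify the result afterwards, a hedged sketch with admin
>>>>>> credentials:
>>>>>>
>>>>>>   nova list --all-tenants --host <failed-compute-node>   # should empty out as instances move
>>>>>>   nova list --all-tenants --status ACTIVE                # the evacuated instances should reappear here
>>>>>> )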
>>>>>>
>>>>>>
>>>>>> Ignazio
>>>>>>
>>>>>> On Thu, 11 Jul 2019 at 11:57, Jay See <
>>>>>> jayachander.it at gmail.com> wrote:
>>>>>>
>>>>>>> Hi,
>>>>>>>
>>>>>>> I have tried on a failed compute node, which is now in the powered-off
>>>>>>> state.
>>>>>>> I have also tried on a running compute node: no errors, but nothing
>>>>>>> happens.
>>>>>>> On the running compute node I disabled the compute service and tried
>>>>>>> migration as well.
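>>>>>>>
>>>>>>> (A hedged sketch of that disable-then-migrate sequence with the
>>>>>>> standard nova CLI; the host name is a placeholder:
>>>>>>>
>>>>>>>   nova service-disable <compute-host> nova-compute   # stop scheduling new VMs there
>>>>>>>   nova host-evacuate-live <compute-host>             # live-migrate everything off it
>>>>>>> )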
>>>>>>>
>>>>>>> Maybe I have not followed the proper steps; I just wanted to know the
>>>>>>> steps you followed. Otherwise, I was planning on manual migration as
>>>>>>> well, if possible.
>>>>>>> ~Jay.
>>>>>>>
>>>>>>> On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano <
>>>>>>> ignaziocassano at gmail.com> wrote:
>>>>>>>
>>>>>>>> Hi Jay,
>>>>>>>> would you like to evacuate a failed compute node or evacuate a
>>>>>>>> running compute node?
>>>>>>>>
>>>>>>>> Ignazio
>>>>>>>>
>>>>>>>> On Thu, 11 Jul 2019 at 11:48, Jay See <
>>>>>>>> jayachander.it at gmail.com> wrote:
>>>>>>>>
>>>>>>>>> Hi Ignazio,
>>>>>>>>>
>>>>>>>>> I am trying to evacuate a compute host on an older version (Mitaka).
>>>>>>>>> Could you please share the process you followed? I am not able to
>>>>>>>>> succeed: openstack live-migration fails with an error message (this is
>>>>>>>>> a known issue in older versions), and with nova live-migration nothing
>>>>>>>>> happens even after initiating the VM migration. It has been almost 4
>>>>>>>>> days.
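>>>>>>>>>
>>>>>>>>> (Two hedged places to look when a live migration stalls silently,
>>>>>>>>> using the standard nova CLI:
>>>>>>>>>
>>>>>>>>>   nova migration-list                       # shows the migration record and its status
>>>>>>>>>   nova instance-action-list <server-uuid>   # shows the actions and any error for that server
>>>>>>>>> )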
>>>>>>>>>
>>>>>>>>> ~Jay.
>>>>>>>>>
>>>>>>>>> On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano <
>>>>>>>>> ignaziocassano at gmail.com> wrote:
>>>>>>>>>
>>>>>>>>>> I am sorry.
>>>>>>>>>> For simulating a host crash I had used the wrong procedure.
>>>>>>>>>> Using "echo 'c' > /proc/sysrq-trigger", everything works fine.
>>>>>>>>>>
>>>>>>>>>> On Thu, 11 Jul 2019 at 11:01, Ignazio Cassano <
>>>>>>>>>> ignaziocassano at gmail.com> wrote:
>>>>>>>>>>
>>>>>>>>>>> Hello All,
>>>>>>>>>>> on Ocata, when I power off a node with active instances, a
>>>>>>>>>>> nova host-evacuate works fine
>>>>>>>>>>> and the instances are restarted on an active node.
>>>>>>>>>>> On Queens it does not evacuate the instances, and nova-api reports
>>>>>>>>>>> the following for each instance:
>>>>>>>>>>>
>>>>>>>>>>> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi
>>>>>>>>>>> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9
>>>>>>>>>>> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown:
>>>>>>>>>>> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is
>>>>>>>>>>> in task_state powering-off
>>>>>>>>>>>
>>>>>>>>>>> So it powers off all the instances on the failed node but does not
>>>>>>>>>>> start them on the active nodes.
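>>>>>>>>>>>
>>>>>>>>>>> (A hedged note: the evacuate API rejects an instance that still has
>>>>>>>>>>> a task_state set, and it only proceeds once nova considers the
>>>>>>>>>>> source compute service down, which can be checked or forced with:
>>>>>>>>>>>
>>>>>>>>>>>   nova service-list --binary nova-compute              # wait until the failed host shows "down"
>>>>>>>>>>>   nova service-force-down <failed-host> nova-compute   # or mark it down explicitly
>>>>>>>>>>> )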
>>>>>>>>>>>
>>>>>>>>>>> What has changed?
>>>>>>>>>>> Ignazio