Jay, to recover the VM state use the command nova reset-state.
Run nova help reset-state to check the parameters the command requires.
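A minimal sketch of that recovery path, assuming the Mitaka-era nova CLI: reset the instance out of Error state, then hard-reboot it. The `DRY_RUN` wrapper is illustrative and only prints the commands instead of executing them against a real cloud.

```shell
# Hedged sketch: clear the Error state on an instance and hard-reboot
# it. DRY_RUN=1 (the default here) only prints the commands.
DRY_RUN=${DRY_RUN:-1}

run() {
  if [ "$DRY_RUN" = 1 ]; then echo "$@"; else "$@"; fi
}

recover_instance() {
  # $1 = instance UUID
  run nova reset-state --active "$1"
  run nova reboot --hard "$1"
}

# Example with one of the UUIDs from the host-evacuate output below:
recover_instance f3545f7d-b85e-49ee-b407-333a4c5b5ab9
```

Set DRY_RUN=0 only once the printed commands look right for your environment.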
As far as evacuation is concerned, how many compute nodes do you have? Does instance live migration work? Are you using shared Cinder storage? Ignazio
On Thu, 11 Jul 2019 at 20:51, Jay See jayachander.it@gmail.com wrote:
Thanks for the explanation, Ignazio.
I have tried the same thing by putting the compute node into a failure state (echo 'c' > /proc/sysrq-trigger). The compute node was stuck and I was not able to connect to it. All the VMs are now in Error state.
Running nova host-evacuate on the controller node was successful, but now I am not able to use the VMs, because they are all in Error state.
root@h004:~$ nova host-evacuate h017
+--------------------------------------+-------------------+---------------+
| Server UUID                          | Evacuate Accepted | Error Message |
+--------------------------------------+-------------------+---------------+
| f3545f7d-b85e-49ee-b407-333a4c5b5ab9 | True              |               |
| 9094494b-cfa3-459b-8d51-d9aae0ea9636 | True              |               |
| abe7075b-ac22-4168-bf3d-d302ba37d80e | True              |               |
| c9919371-5f2e-4155-a01a-5f41d9c8b0e7 | True              |               |
| ffd983bb-851e-4314-9d1d-375303c278f3 | True              |               |
+--------------------------------------+-------------------+---------------+
Now I have restarted the compute node manually; I am able to connect to the compute node again, but the VMs are still in Error state.
- Any ideas how to recover the VMs?
- Are there any other methods to evacuate, as this one does not seem to work in the Mitaka version?
~Jay.
On Thu, Jul 11, 2019 at 1:33 PM Ignazio Cassano ignaziocassano@gmail.com wrote:
Ok Jay, let me describe my environment. I have an OpenStack made up of 3 controller nodes and several compute nodes. The controller node services are controlled by pacemaker and the compute node services are controlled by remote pacemaker. My hardware is Dell, so I am using an IPMI fencing device. I wrote a service controlled by pacemaker: this service checks whether a compute node has failed and, to avoid split brain, if a compute node does not respond on either the management network or the storage network, the stonith device powers off the node and then executes a nova host-evacuate.
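A hedged sketch of that decision logic (not Ignazio's actual pacemaker agent): fence and evacuate only when the node answers on NEITHER network, to avoid split brain. The IP addresses, node name, and fencing step are all illustrative placeholders.

```shell
# Illustrative only: IPs, node name, and the fencing step are assumptions.
reachable() { ping -c1 -W1 "$1" >/dev/null 2>&1; }

decide() {
  # $1 = node name, $2 = mgmt status, $3 = storage status (up/down)
  if [ "$2" = down ] && [ "$3" = down ]; then
    echo "fence $1 via IPMI, then: nova host-evacuate $1"
  else
    echo "leave $1 alone (still reachable on some network)"
  fi
}

mgmt=$(reachable 10.0.0.17 && echo up || echo down)     # mgmt net (assumed IP)
storage=$(reachable 10.1.0.17 && echo up || echo down)  # storage net (assumed IP)
decide h017 "$mgmt" "$storage"
```

The key design point is that both networks must be down before fencing; a node that is only unreachable on one network may just have a cabling or switch problem.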
In any case, to run a simulation before writing the service I described above, you can do as follows:
1. Connect to a compute node where some virtual machines are running.
2. Run the command: echo 'c' > /proc/sysrq-trigger (it stops the node immediately, as in a real failure).
3. On a controller node, run: nova host-evacuate "name of failed compute node".
The instances running on the failed compute node should be restarted on another compute node.
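The steps above can be sketched as follows. h017 is a placeholder node name; the crash command must be run on the compute node itself, and the function here only prints the controller-side commands rather than executing them.

```shell
# Step 1 (on the compute node, as root) -- hard-crashes the kernel:
#   echo 'c' > /proc/sysrq-trigger
# Step 2 (on a controller) -- check the node is seen as down, then evacuate:
simulate_evacuation_cmds() {
  node=$1
  echo "nova service-list --host $node"  # expect nova-compute state 'down'
  echo "nova host-evacuate $node"
  echo "nova list --all-tenants"         # instances should rebuild elsewhere
}
simulate_evacuation_cmds h017
```

Checking service-list first matters: host-evacuate only makes sense once nova has marked the compute service on that host as down.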
Ignazio
On Thu, 11 Jul 2019 at 11:57, Jay See < jayachander.it@gmail.com> wrote:
Hi,
I have tried it on a failed compute node, which is in power off state now. I have also tried it on a running compute node: no errors, but nothing happens. On the running compute node I also disabled the compute service and tried migration.
Maybe I have not followed the proper steps, so I just wanted to know the steps you followed. Otherwise, I was also planning a manual migration if possible. ~Jay.
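A hedged sketch of the disable-then-migrate approach mentioned here, assuming the Mitaka-era nova CLI. The host name and `<server-uuid>` are placeholders, and the commands are only printed, not executed.

```shell
# Drain a running compute node: stop the scheduler from placing new
# instances there, then cold-migrate each instance off it.
drain_host_cmds() {
  host=$1
  echo "nova service-disable $host nova-compute"
  echo "nova list --host $host --all-tenants"  # find instances to move
  echo "nova migrate <server-uuid>"            # repeat per instance
  echo "nova resize-confirm <server-uuid>"     # confirm once in VERIFY_RESIZE
}
drain_host_cmds h017
```

Note that a cold migration leaves each instance in VERIFY_RESIZE until it is confirmed, which is easy to miss and can look like "nothing happens".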
On Thu, Jul 11, 2019 at 11:52 AM Ignazio Cassano < ignaziocassano@gmail.com> wrote:
Hi Jay, would you like to evacuate a failed compute node or a running compute node?
Ignazio
On Thu, 11 Jul 2019 at 11:48, Jay See < jayachander.it@gmail.com> wrote:
Hi Ignazio,
I am trying to evacuate a compute host on an older version (Mitaka). Could you please share the process you followed? I am not able to succeed: openstack live-migration fails with an error message (this is a known issue in older versions), and with nova live-migration nothing happens even after initiating the VM migration. It has been almost 4 days.
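For reference, these are the live-migration command forms in the legacy nova CLI of that era (a hedged summary; `<server-uuid>` and `<dest-host>` are placeholders, and the commands are only printed here):

```shell
live_migration_cmds() {
  echo "nova live-migration <server-uuid>"                  # scheduler picks target
  echo "nova live-migration <server-uuid> <dest-host>"      # explicit target host
  echo "nova live-migration --block-migrate <server-uuid>"  # no shared storage
  echo "nova migration-list"                                # watch migration status
}
live_migration_cmds
```

When live migration silently stalls, nova migration-list and the nova-compute logs on both source and destination are usually where the actual error shows up.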
~Jay.
On Thu, Jul 11, 2019 at 11:31 AM Ignazio Cassano < ignaziocassano@gmail.com> wrote:
I am sorry. For simulating a host crash I used a wrong procedure. Using "echo 'c' > /proc/sysrq-trigger" everything works fine.
On Thu, 11 Jul 2019 at 11:01, Ignazio Cassano < ignaziocassano@gmail.com> wrote:
> Hello All,
> on Ocata, when I power off a node with active instances, doing a nova
> host-evacuate works fine and the instances are restarted on an active node.
> On Queens it does not evacuate the instances, but nova-api reports the
> following for each instance:
>
> 2019-07-11 10:19:54.745 13811 INFO nova.api.openstack.wsgi
> [req-daad0a7d-87ce-41bf-b096-a70fc306db5c 0c7a2d6006614fe2b3e81e47377dd2a9
> c26f8d35f85547c4add392a221af1aab - default default] HTTP exception thrown:
> Cannot 'evacuate' instance e8485a5e-3623-4184-bcce-cafd56fa60b3 while it is
> in task_state powering-off
>
> So it powers off all instances on the failed node but does not start
> them on active nodes.
>
> What has changed?
> Ignazio
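The 'powering-off' task_state in that log is what blocks the evacuate API call. A sketch of one commonly used workaround, in line with the reset-state suggestion at the top of the thread: clear the stuck task_state first, then retry the evacuation. Node name and UUID are from the example above; the commands are only printed here.

```shell
# Clear each stuck instance with reset-state (which also clears the
# task_state), then retry the host evacuation.
clear_and_evacuate_cmds() {
  node=$1; shift
  for uuid in "$@"; do
    echo "nova reset-state --active $uuid"
  done
  echo "nova host-evacuate $node"
}
clear_and_evacuate_cmds h017 e8485a5e-3623-4184-bcce-cafd56fa60b3
```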
--
*SAVE PAPER – Please do not print this e-mail unless absolutely necessary.*