Hi everyone,
We use Ceph as the block storage backend for Cinder.
First, we create a VM with a volume (the volume is created automatically), write some data in the VM, and take a snapshot after writing the data. Then we write more data in the main VM and take a second snapshot.
But when we try to rescue the main VM's instance with the first snapshot, an error occurs.
Nova Log: 2024-08-22 08:23:25.594 19 WARNING nova.compute.api [None req-eb3b4856-c8be-4926-be51-99aaec4de0e0 41040b4a21e34904b6906c12b228d8e9 67ddd9f771be452a808ba3458af544f8 - - default default] Unable to rescue an instance using a volume snapshot image with img_block_device_mapping image properties set
Horizon Output: Requested rescue image 'ddc2df28-c4f5-434b-9f9a-030e56425fc5' is not supported (HTTP 400) (Request-ID: req-bd8037a1-3cd4-43fe-a945-e809b1f8ce16)
Ceph Block: https://imgtr.ee/images/2024/08/22/2108d8aacee12eaf800377593a229659.png
Instance Snapshot Image: https://imgtr.ee/images/2024/08/22/ca1fa81683acc0518ac161e7f97fda0e.jpeg
Could there be something wrong in our configuration?
We look forward to your answers. Thanks.
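For reference, the CLI equivalent of our workflow is roughly the following (flavor, network, image, and size are placeholders):

  # boot a VM from an automatically created volume
  openstack server create --flavor <flavor> --network <network> \
      --image <image> --boot-from-volume 20 main-vm
  # write data inside the VM, then snapshot the volume-backed instance
  openstack server image create --name first-snapshot main-vm
  # write more data, then take a second snapshot
  openstack server image create --name second-snapshot main-vm
  # the step that fails: rescue the instance with the first snapshot image
  openstack server rescue --image first-snapshot main-vm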
Hello, The comment in the code [1] says the following: # FIXME(lyarwood): There is currently no support for rescuing # instances using a volume snapshot so fail here before we cast to # the compute. So I think what you’re trying to do is not supported yet. [1] https://github.com/openstack/nova/blob/master/nova/compute/api.py#L4770-L477...
On 22 Aug 2024, at 11:13, meteccinar@gmail.com wrote:
Hi everyone,
We use Ceph as the block storage backend for Cinder.
First, we create a VM with a volume (the volume is created automatically), write some data in the VM, and take a snapshot after writing the data. Then we write more data in the main VM and take a second snapshot.
But when we try to rescue the main VM's instance with the first snapshot, an error occurs.
Nova Log: 2024-08-22 08:23:25.594 19 WARNING nova.compute.api [None req-eb3b4856-c8be-4926-be51-99aaec4de0e0 41040b4a21e34904b6906c12b228d8e9 67ddd9f771be452a808ba3458af544f8 - - default default] Unable to rescue an instance using a volume snapshot image with img_block_device_mapping image properties set
Horizon Output: Requested rescue image 'ddc2df28-c4f5-434b-9f9a-030e56425fc5' is not supported (HTTP 400) (Request-ID: req-bd8037a1-3cd4-43fe-a945-e809b1f8ce16)
Ceph Block: https://imgtr.ee/images/2024/08/22/2108d8aacee12eaf800377593a229659.png
Instance Snapshot Image: https://imgtr.ee/images/2024/08/22/ca1fa81683acc0518ac161e7f97fda0e.jpeg
Could there be something wrong in our configuration?
We look forward to your answers. Thanks.
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue? We are currently rolling back by creating a new volume and a new instance from the volume snapshot/backup. How can we roll back without replacing the instance? Do you have any ideas? If so, we would appreciate hearing them. Thanks for your answers.
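For reference, our current rollback is roughly (names and size are placeholders):

  # create a fresh volume from the volume snapshot (or restore from a backup)
  openstack volume create --snapshot <volume-snapshot> --size 20 restored-volume
  # boot a replacement instance from that volume
  openstack server create --flavor <flavor> --network <network> \
      --volume restored-volume restored-vm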
I wonder if this could be a misunderstanding about the word "rescue". Maybe I'm the one who didn't understand it properly, but in the past we used the rescue mode to make instances bootable again if the bootloader hadn't been properly regenerated after an upgrade, or other issues. So we used 'openstack server rescue <UUID>' to get into a chroot environment and "rescue" the machine. Only the instance's root disk (ephemeral, not volume) is attached (+ optional config-drive) to the instance, the docs [1] contain this note:
Rescuing a volume-backed instance is not supported with this mode.
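For context, the workflow I mean is simply (UUID and image name are placeholders):

  # boot the instance from a rescue image; the original root disk is attached as a secondary disk
  openstack server rescue --image <rescue-image> <UUID>
  # chroot in, fix the bootloader or filesystem, then return to normal boot
  openstack server unrescue <UUID>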
I never considered it as a method to restore an instance from backup. I assume OP is asking how to restore an instance from backup, but I'm not really sure. [1] https://docs.openstack.org/nova/latest/user/rescue.html Zitat von Metehan Cinar <meteccinar@gmail.com>:
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue?
We are currently rolling back by creating a new volume and a new instance from the volume snapshot/backup.
How can we roll back without replacing the instance?
Do you have any ideas? If so, we would appreciate hearing them.
Thanks for your answers.
On Thu, 2024-08-22 at 10:38 +0000, Eugen Block wrote:
I wonder if this could be a misunderstanding about the word "rescue". Maybe I'm the one who didn't understand it properly, but in the past we used the rescue mode to make instances bootable again if the bootloader hadn't been properly regenerated after an upgrade, or other issues. So we used 'openstack server rescue <UUID>' to get into a chroot environment and "rescue" the machine. Only the instance's root disk (ephemeral, not volume) is attached (+ optional config-drive) to the instance, the docs [1] contain this note:
Rescuing a volume-backed instance is not supported with this mode.
I never considered it as a method to restore an instance from backup. I assume OP is asking how to restore an instance from backup, but I'm not really sure.
Rescue, as you pointed out, is not intended for restoring from a backup; it is intended to boot from a rescue disk like Knoppix so that you can run fdisk or similar utilities to fix the filesystem. Rescuing volume-backed instances has been supported since Ussuri: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi... I know at least one operator used that functionality to fix Windows guests after the recent CrowdStrike issues, so that functionality does work. For libvirt configured with images_type=rbd, rescue should just work like any other "local" storage, but I'll admit I don't think I have ever checked whether additional data volumes are attached to the rescued instance; I thought they were, but perhaps not.
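By "configured with images_type=rbd" I mean a nova.conf on the compute node roughly like this (the pool name is just an example):

  [libvirt]
  images_type = rbd
  images_rbd_pool = vms
  images_rbd_ceph_conf = /etc/ceph/ceph.conf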
[1] https://docs.openstack.org/nova/latest/user/rescue.html
Zitat von Metehan Cinar <meteccinar@gmail.com>:
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue?
We are currently rolling back by creating a new volume and a new instance from the volume snapshot/backup.
How can we roll back without replacing the instance?
Do you have any ideas? If so, we would appreciate hearing them.
Thanks for your answers.
Hi Sean, Zitat von smooney@redhat.com:
On Thu, 2024-08-22 at 10:38 +0000, Eugen Block wrote:
I wonder if this could be a misunderstanding about the word "rescue". Maybe I'm the one who didn't understand it properly, but in the past we used the rescue mode to make instances bootable again if the bootloader hadn't been properly regenerated after an upgrade, or other issues. So we used 'openstack server rescue <UUID>' to get into a chroot environment and "rescue" the machine. Only the instance's root disk (ephemeral, not volume) is attached (+ optional config-drive) to the instance, the docs [1] contain this note:
Rescuing a volume-backed instance is not supported with this mode.
I never considered it as a method to restore an instance from backup. I assume OP is asking how to restore an instance from backup, but I'm not really sure.
Rescue, as you pointed out, is not intended for restoring from a backup; it is intended to boot from a rescue disk like Knoppix so that you can run fdisk or similar utilities to fix the filesystem.
thanks for the confirmation.
Rescuing volume-backed instances has been supported since Ussuri: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
I know at least one operator used that functionality to fix Windows guests after the recent CrowdStrike issues, so that functionality does work.
Interesting, because I just retried that in a Victoria Cloud (haven't in a long time) and it fails with this error: Unable to rescue instance Details Instance bf0a04eb-27ae-4979-86b4-8b5522ede1de cannot be rescued: Cannot rescue a volume-backed instance (HTTP 400)
For libvirt configured with images_type=rbd, rescue should just work like any other "local" storage, but I'll admit I don't think I have ever checked whether additional data volumes are attached to the rescued instance.
I thought they were, but perhaps not.
I just tried it with another instance with 2 attached volumes (root disk is ephemeral); the attached volumes are not available during rescue mode. We use Ceph as the backend for all services.
[1] https://docs.openstack.org/nova/latest/user/rescue.html
Zitat von Metehan Cinar <meteccinar@gmail.com>:
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue?
We are currently rolling back by creating a new volume and a new instance from the volume snapshot/backup.
How can we roll back without replacing the instance?
Do you have any ideas? If so, we would appreciate hearing them.
Thanks for your answers.
On Fri, 2024-08-23 at 06:09 +0000, Eugen Block wrote:
Hi Sean,
Zitat von smooney@redhat.com:
On Thu, 2024-08-22 at 10:38 +0000, Eugen Block wrote:
I wonder if this could be a misunderstanding about the word "rescue". Maybe I'm the one who didn't understand it properly, but in the past we used the rescue mode to make instances bootable again if the bootloader hadn't been properly regenerated after an upgrade, or other issues. So we used 'openstack server rescue <UUID>' to get into a chroot environment and "rescue" the machine. Only the instance's root disk (ephemeral, not volume) is attached (+ optional config-drive) to the instance, the docs [1] contain this note:
Rescuing a volume-backed instance is not supported with this mode.
I never considered it as a method to restore an instance from backup. I assume OP is asking how to restore an instance from backup, but I'm not really sure.
Rescue, as you pointed out, is not intended for restoring from a backup; it is intended to boot from a rescue disk like Knoppix so that you can run fdisk or similar utilities to fix the filesystem.
thanks for the confirmation.
Rescuing volume-backed instances has been supported since Ussuri: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
I know at least one operator used that functionality to fix Windows guests after the recent CrowdStrike issues, so that functionality does work.
Interesting, because I just retried that in a Victoria Cloud (haven't in a long time) and it fails with this error:
Unable to rescue instance Details Instance bf0a04eb-27ae-4979-86b4-8b5522ede1de cannot be rescued: Cannot rescue a volume-backed instance (HTTP 400)
You have to use the relevant microversion (2.87) or later, and our docs have a slight bug: to rescue a boot-from-volume instance you must configure the stable device rescue feature on the rescue image.
In our docs we say you have to set hw_rescue_bus or hw_rescue_device, when it actually should say that hw_rescue_bus should always be set on the rescue image and hw_rescue_device should be set if it is an ISO. I.e. hw_rescue_bus=usb hw_rescue_device=cdrom if the rescue image is an ISO, but hw_rescue_bus=usb is enough if the image is, say, a qcow. https://docs.openstack.org/nova/latest/user/rescue.html#stable-device-instan... That dependency is briefly called out in https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi... The stable rescue device feature was added in the same release (https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...); that's what adds hw_rescue_bus and hw_rescue_device.
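Concretely, that means something like the following (image and server names are placeholders):

  # ISO rescue image: set both properties
  openstack image set --property hw_rescue_bus=usb \
      --property hw_rescue_device=cdrom <rescue-iso-image>
  # qcow2 rescue image: hw_rescue_bus alone is enough
  openstack image set --property hw_rescue_bus=usb <rescue-qcow2-image>
  # then rescue the boot-from-volume instance with microversion 2.87 or later
  openstack --os-compute-api-version 2.87 server rescue \
      --image <rescue-image> <server>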
For libvirt configured with images_type=rbd, rescue should just work like any other "local" storage, but I'll admit I don't think I have ever checked whether additional data volumes are attached to the rescued instance.
I thought they were, but perhaps not.
I just tried it with another instance with 2 attached volumes (root disk is ephemeral),
For the record, "(root disk is ephemeral)" is wrong. In Nova, an ephemeral disk is an additional Nova-provisioned disk that is separate from the root disk. I know what you mean, but I really dislike it when people refer to the root disk of a Nova instance as "ephemeral" when they mean it is a Nova-created disk rather than a Cinder volume. The Nova root disk is no more or less ephemeral than a Cinder volume; both have a lifetime tied to the API object used to create them. If you delete a Nova instance you delete its corresponding storage, and if you delete a Cinder volume you delete its corresponding storage. Cinder storage is not by definition any more resilient than Nova storage: if you are using a SAN or another storage server, it has no HA guarantees beyond simple RAID. Ceph is the exception in that it implicitly has fault tolerance (unless you configure it for replica 1), but you can also back Nova with Ceph, so that is a wash. So from my point of view Cinder volumes are just as ephemeral as Nova storage, and since we actually have a thing called ephemeral_disk in the flavor API and it is something different from the root disk, I would really like to stop using the term "ephemeral" to refer to the root disk incorrectly.
the attached volumes are not available during rescue mode.
Ack. I'm not sure if that should be unexpected or not. It's perhaps something we could make configurable as part of the rescue call. I'm not sure why we chose not to attach them; I had assumed they would be attached but just not mounted automatically, so you could mount them if needed. I guess the logic was that you can use rescue to make the VM bootable normally, and then boot into the instance normally after unrescue to fix any issue with the volume.
You can also just detach the volume and attach it to another VM to fix it, but I can see adding a --with-volumes option for rescue in the future if there were a need for that.
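The manual route today is roughly (volume and server names are placeholders):

  # move the data volume to a healthy helper VM and repair it there
  openstack server remove volume <broken-vm> <data-volume>
  openstack server add volume <helper-vm> <data-volume>
  # once repaired, move it back
  openstack server remove volume <helper-vm> <data-volume>
  openstack server add volume <broken-vm> <data-volume>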
We use Ceph as the backend for all services.
[1] https://docs.openstack.org/nova/latest/user/rescue.html
Zitat von Metehan Cinar <meteccinar@gmail.com>:
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue?
We are currently rolling back by creating a new volume and a new instance from the volume snapshot/backup.
How can we roll back without replacing the instance?
Do you have any ideas? If so, we would appreciate hearing them.
Thanks for your answers.
Zitat von smooney@redhat.com:
On Fri, 2024-08-23 at 06:09 +0000, Eugen Block wrote:
Hi Sean,
Zitat von smooney@redhat.com:
On Thu, 2024-08-22 at 10:38 +0000, Eugen Block wrote:
I wonder if this could be a misunderstanding about the word "rescue". Maybe I'm the one who didn't understand it properly, but in the past we used the rescue mode to make instances bootable again if the bootloader hadn't been properly regenerated after an upgrade, or other issues. So we used 'openstack server rescue <UUID>' to get into a chroot environment and "rescue" the machine. Only the instance's root disk (ephemeral, not volume) is attached (+ optional config-drive) to the instance, the docs [1] contain this note:
Rescuing a volume-backed instance is not supported with this mode.
I never considered it as a method to restore an instance from backup. I assume OP is asking how to restore an instance from backup, but I'm not really sure.
Rescue, as you pointed out, is not intended for restoring from a backup; it is intended to boot from a rescue disk like Knoppix so that you can run fdisk or similar utilities to fix the filesystem.
thanks for the confirmation.
Rescuing volume-backed instances has been supported since Ussuri:
https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
I know at least one operator used that functionality to fix Windows guests after the recent CrowdStrike issues, so that functionality does work.
Interesting, because I just retried that in a Victoria Cloud (haven't in a long time) and it fails with this error:
Unable to rescue instance Details Instance bf0a04eb-27ae-4979-86b4-8b5522ede1de cannot be rescued: Cannot rescue a volume-backed instance (HTTP 400)
You have to use the relevant microversion (2.87) or later, and our docs have a slight bug: to rescue a boot-from-volume instance you must configure the stable device rescue feature on the rescue image.
In our docs we say you have to set hw_rescue_bus or hw_rescue_device, when it actually should say that hw_rescue_bus should always be set on the rescue image and hw_rescue_device should be set if it is an ISO.
I.e. hw_rescue_bus=usb hw_rescue_device=cdrom if the rescue image is an ISO, but hw_rescue_bus=usb is enough if the image is, say, a qcow.
https://docs.openstack.org/nova/latest/user/rescue.html#stable-device-instan...
That dependency is briefly called out in https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
The stable rescue device feature was added in the same release (https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...); that's what adds hw_rescue_bus and hw_rescue_device.
Thanks for pointing that out, I will read through the docs and see if I get it right.
For libvirt configured with images_type=rbd, rescue should just work like any other "local" storage, but I'll admit I don't think I have ever checked whether additional data volumes are attached to the rescued instance.
I thought they were, but perhaps not.
I just tried it with another instance with 2 attached volumes (root disk is ephemeral),
For the record, "(root disk is ephemeral)" is wrong. In Nova, an ephemeral disk is an additional Nova-provisioned disk that is separate from the root disk.
I know what you mean, but I really dislike it when people refer to the root disk of a Nova instance as "ephemeral" when they mean it is a Nova-created disk rather than a Cinder volume.
The Nova root disk is no more or less ephemeral than a Cinder volume; both have a lifetime tied to the API object used to create them.
If you delete a Nova instance you delete its corresponding storage, and if you delete a Cinder volume you delete its corresponding storage. Cinder storage is not by definition any more resilient than Nova storage: if you are using a SAN or another storage server, it has no HA guarantees beyond simple RAID. Ceph is the exception in that it implicitly has fault tolerance (unless you configure it for replica 1), but you can also back Nova with Ceph, so that is a wash.
So from my point of view Cinder volumes are just as ephemeral as Nova storage, and since we actually have a thing called ephemeral_disk in the flavor API and it is something different from the root disk, I would really like to stop using the term "ephemeral" to refer to the root disk incorrectly.
I understand, I will try my best to not use it like that anymore. I learned that in my very early days of dealing with openstack, so please forgive my ignorance. :-) I guess I'll have to read what "real" ephemeral disks are then... But anyway, thanks for pointing that out.
the attached volumes are not available during rescue mode.
Ack. I'm not sure if that should be unexpected or not. It's perhaps something we could make configurable as part of the rescue call. I'm not sure why we chose not to attach them; I had assumed they would be attached but just not mounted automatically, so you could mount them if needed. I guess the logic was that you can use rescue to make the VM bootable normally, and then boot into the instance normally after unrescue to fix any issue with the volume.
You can also just detach the volume and attach it to another VM to fix it, but I can see adding a --with-volumes option for rescue in the future if there were a need for that.
It's not unusual (for us) to have VMs with multiple disks that were attached during their lifetime to grow the volume group(s) when disk space runs out. What we usually do is attach the volumes to different machines to fix things. Fortunately it doesn't happen often, so the workaround has been sufficient for us. But it would definitely be beneficial to have the entire VM in rescue mode, including all volumes. I feel like I hijacked this thread, but hopefully it helps other users as well to get some clarification.
We use Ceph as the backend for all services.
[1] https://docs.openstack.org/nova/latest/user/rescue.html
Zitat von Metehan Cinar <meteccinar@gmail.com>:
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue?
We are currently rolling back by creating a new volume and a new instance from the volume snapshot/backup.
How can we roll back without replacing the instance?
Do you have any ideas? If so, we would appreciate hearing them.
Thanks for your answers.
Oh nice, just to confirm what you stated earlier, if I specify --property hw_rescue_device=disk --property hw_rescue_bus=virtio for my rescue image (other options probably work as well), I do see all attached volumes. :-) Awesome, thanks for your valuable input, Sean! I know I'm repeating myself, but you're incredibly helpful! Zitat von Eugen Block <eblock@nde.ag>:
Zitat von smooney@redhat.com:
On Fri, 2024-08-23 at 06:09 +0000, Eugen Block wrote:
Hi Sean,
Zitat von smooney@redhat.com:
On Thu, 2024-08-22 at 10:38 +0000, Eugen Block wrote:
I wonder if this could be a misunderstanding about the word "rescue". Maybe I'm the one who didn't understand it properly, but in the past we used the rescue mode to make instances bootable again if the bootloader hadn't been properly regenerated after an upgrade, or other issues. So we used 'openstack server rescue <UUID>' to get into a chroot environment and "rescue" the machine. Only the instance's root disk (ephemeral, not volume) is attached (+ optional config-drive) to the instance, the docs [1] contain this note:
Rescuing a volume-backed instance is not supported with this mode.
I never considered it as a method to restore an instance from backup. I assume OP is asking how to restore an instance from backup, but I'm not really sure.
Rescue, as you pointed out, is not intended for restoring from a backup; it is intended to boot from a rescue disk like Knoppix so that you can run fdisk or similar utilities to fix the filesystem.
thanks for the confirmation.
Rescuing volume-backed instances has been supported since Ussuri: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
I know at least one operator used that functionality to fix Windows guests after the recent CrowdStrike issues, so that functionality does work.
Interesting, because I just retried that in a Victoria Cloud (haven't in a long time) and it fails with this error:
Unable to rescue instance Details Instance bf0a04eb-27ae-4979-86b4-8b5522ede1de cannot be rescued: Cannot rescue a volume-backed instance (HTTP 400)
You have to use the relevant microversion (2.87) or later, and our docs have a slight bug: to rescue a boot-from-volume instance you must configure the stable device rescue feature on the rescue image.
In our docs we say you have to set hw_rescue_bus or hw_rescue_device, when it actually should say that hw_rescue_bus should always be set on the rescue image and hw_rescue_device should be set if it is an ISO.
I.e. hw_rescue_bus=usb hw_rescue_device=cdrom if the rescue image is an ISO, but hw_rescue_bus=usb is enough if the image is, say, a qcow.
https://docs.openstack.org/nova/latest/user/rescue.html#stable-device-instan...
That dependency is briefly called out in https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
The stable rescue device feature was added in the same release (https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...); that's what adds hw_rescue_bus and hw_rescue_device.
Thanks for pointing that out, I will read through the docs and see if I get it right.
For libvirt configured with images_type=rbd, rescue should just work like any other "local" storage, but I'll admit I don't think I have ever checked whether additional data volumes are attached to the rescued instance.
I thought they were, but perhaps not.
I just tried it with another instance with 2 attached volumes (root disk is ephemeral),
For the record, "(root disk is ephemeral)" is wrong. In Nova, an ephemeral disk is an additional Nova-provisioned disk that is separate from the root disk.
I know what you mean, but I really dislike it when people refer to the root disk of a Nova instance as "ephemeral" when they mean it is a Nova-created disk rather than a Cinder volume.
The Nova root disk is no more or less ephemeral than a Cinder volume; both have a lifetime tied to the API object used to create them.
If you delete a Nova instance you delete its corresponding storage, and if you delete a Cinder volume you delete its corresponding storage. Cinder storage is not by definition any more resilient than Nova storage: if you are using a SAN or another storage server, it has no HA guarantees beyond simple RAID. Ceph is the exception in that it implicitly has fault tolerance (unless you configure it for replica 1), but you can also back Nova with Ceph, so that is a wash.
So from my point of view Cinder volumes are just as ephemeral as Nova storage, and since we actually have a thing called ephemeral_disk in the flavor API and it is something different from the root disk, I would really like to stop using the term "ephemeral" to refer to the root disk incorrectly.
I understand, I will try my best to not use it like that anymore. I learned that in my very early days of dealing with openstack, so please forgive my ignorance. :-) I guess I'll have to read what "real" ephemeral disks are then... But anyway, thanks for pointing that out.
the attached volumes are not available during rescue mode.
Ack. I'm not sure if that should be unexpected or not. It's perhaps something we could make configurable as part of the rescue call. I'm not sure why we chose not to attach them; I had assumed they would be attached but just not mounted automatically, so you could mount them if needed. I guess the logic was that you can use rescue to make the VM bootable normally, and then boot into the instance normally after unrescue to fix any issue with the volume.
You can also just detach the volume and attach it to another VM to fix it, but I can see adding a --with-volumes option for rescue in the future if there were a need for that.
It's not unusual (for us) to have VMs with multiple disks that were attached during their lifetime to grow the volume group(s) when disk space runs out. What we usually do is attach the volumes to different machines to fix things. Fortunately it doesn't happen often, so the workaround has been sufficient for us. But it would definitely be beneficial to have the entire VM in rescue mode, including all volumes.
I feel like I hijacked this thread, but hopefully it helps other users as well to get some clarification.
We use Ceph as the backend for all services.
[1] https://docs.openstack.org/nova/latest/user/rescue.html
Zitat von Metehan Cinar <meteccinar@gmail.com>:
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue?
We are currently rolling back by creating a new volume and a new instance from the volume snapshot/backup.
How can we roll back without replacing the instance?
Do you have any ideas? If so, we would appreciate hearing them.
Thanks for your answers.
participants (5)
- Eugen Block
- meteccinar@gmail.com
- Metehan Cinar
- smooney@redhat.com
- Tobias Urdin