Oh nice, just to confirm what you stated earlier: if I specify --property hw_rescue_device=disk --property hw_rescue_bus=virtio for my rescue image (other options probably work as well), I do see all attached volumes. :-) Awesome, thanks for your valuable input, Sean! I know I'm repeating myself, but you're incredibly helpful! Quoting Eugen Block <eblock@nde.ag>:
Quoting smooney@redhat.com:
On Fri, 2024-08-23 at 06:09 +0000, Eugen Block wrote:
Hi Sean,
Quoting smooney@redhat.com:
On Thu, 2024-08-22 at 10:38 +0000, Eugen Block wrote:
I wonder if this could be a misunderstanding about the word "rescue". Maybe I'm the one who didn't understand it properly, but in the past we used the rescue mode to make instances bootable again if the bootloader hadn't been properly regenerated after an upgrade, or to fix similar issues. So we used 'openstack server rescue <UUID>' to get into a chroot environment and "rescue" the machine. Only the instance's root disk (ephemeral, not volume) is attached (+ optional config-drive) to the instance, and the docs [1] contain this note:
Rescuing a volume-backed instance is not supported with this mode.
I never considered it as a method to restore an instance from backup. I assume OP is asking how to restore an instance from backup, but I'm not really sure.
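Roughly, the workflow we used looked like this (the UUID is a placeholder, and the exact repair steps depend on the guest):

    # boot the instance from a rescue image; the original root disk
    # is attached as a secondary device
    openstack server rescue <UUID>

    # inside the rescue system: mount the original root disk, chroot into it,
    # regenerate/reinstall the bootloader, then unmount again

    # switch the instance back to its normal root disk
    openstack server unrescue <UUID>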
Rescue, as you pointed out, is not intended for restoring from a backup. It's intended to be used to boot from a rescue disk like Knoppix so that you can run fdisk or similar utils to fix the filesystem.
Thanks for the confirmation.
Rescuing a volume-backed instance has been supported since Ussuri: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
I know at least one operator used that functionality to fix Windows guests after the recent CrowdStrike issues, so that functionality does work.
Interesting, because I just retried that in a Victoria cloud (I hadn't in a long time) and it fails with this error:
Unable to rescue instance. Details: Instance bf0a04eb-27ae-4979-86b4-8b5522ede1de cannot be rescued: Cannot rescue a volume-backed instance (HTTP 400)
You have to use the relevant microversion (2.87) or later, and our docs have a slight bug: to rescue a boot-from-volume instance you must configure the stable device rescue feature on the rescue image.
In our docs we say you have to set hw_rescue_bus or hw_rescue_device, when it actually should say that hw_rescue_bus should always be set on the rescue image and hw_rescue_device should be set if it's an ISO.
I.e. hw_rescue_bus=usb hw_rescue_device=cdrom if the rescue image is an ISO, but hw_rescue_bus=usb is enough if the image is, say, a qcow.
https://docs.openstack.org/nova/latest/user/rescue.html#stable-device-instan...
That dependency is briefly called out in https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
The stable rescue device feature was added in the same release, https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi... -- that's what adds hw_rescue_bus and hw_rescue_device.
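To make that concrete, roughly something like this (image and server names are placeholders; the bus/device values depend on your rescue image):

    # qcow2 rescue image: setting the rescue bus is enough
    openstack image set --property hw_rescue_bus=virtio rescue-image

    # ISO rescue image: also mark the rescue device as a cdrom
    openstack image set --property hw_rescue_bus=usb --property hw_rescue_device=cdrom rescue-iso

    # rescuing a boot-from-volume instance requires microversion 2.87 or later
    openstack --os-compute-api-version 2.87 server rescue --image rescue-image <server>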
Thanks for pointing that out, I will read through the docs and see if I get it right.
For libvirt configured with images_type=rbd, rescue should just work like any other "local" storage, but I'll admit I don't think I have ever checked whether additional data volumes are attached to the rescued instance.
I thought they were, but perhaps not.
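For reference, that setup is configured on the compute nodes roughly like this (pool and user names are only examples, adjust to your deployment):

    [libvirt]
    images_type = rbd
    images_rbd_pool = vms
    images_rbd_ceph_conf = /etc/ceph/ceph.conf
    rbd_user = cinder
    rbd_secret_uuid = <libvirt secret uuid>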
I just tried it with another instance with 2 attached volumes (root disk is ephemeral),
For the record, "(root disk is ephemeral)" is wrong. In nova, an ephemeral disk is an additional nova-provisioned disk that is separate from the root disk.
I know what you mean, but I really hate it when people refer to the root disk of nova instances as ephemeral when they mean it's a nova-created volume/disk, not a cinder volume.
The nova root disk is no more or less ephemeral than a cinder volume; both have a lifetime tied to the API object used to create them.
If you delete a nova instance you delete its corresponding storage; if you delete a cinder volume you delete its corresponding storage. Cinder storage is not by definition any more resilient than nova storage. If you're using a SAN or other storage server, it has no HA guarantees beyond that of simple RAID; Ceph is the exception, where it implicitly has fault tolerance unless you configure it for replica 1, but you can also back nova with Ceph, so that's a wash.
So from my point of view, cinder volumes are just as ephemeral as nova storage, and since we actually have a thing in our API called ephemeral_disk in the flavor API, and it's something different from the root disk, I really would like to stop using the term ephemeral to refer to the root disk incorrectly.
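To illustrate the distinction, a flavor can define such an extra ephemeral disk separately from the root disk; a quick example (sizes are arbitrary):

    # 20 GB root disk plus a separate 10 GB ephemeral data disk
    openstack flavor create --vcpus 2 --ram 4096 --disk 20 --ephemeral 10 example.flavor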
I understand, I will try my best not to use it like that anymore. I learned that in my very early days of dealing with OpenStack, so please forgive my ignorance. :-) I guess I'll have to read up on what "real" ephemeral disks are then... But anyway, thanks for pointing that out.
the attached volumes are not available during rescue mode.
Ack. I'm not sure if that should be unexpected or not. It's perhaps something we could make configurable as part of the rescue call. I'm not sure why we chose not to attach them; I had assumed they would be attached but just not mounted automatically, so you could mount them if needed. I guess the logic was that you can use rescue to make the VM bootable normally, and then boot into the instance normally after unrescue to fix any issue with the volume.
You can also just detach the volume and attach it to another VM to fix it, but I could see having a --with-volumes option when doing rescue in the future if there was a need for that.
It's not unusual (for us) to have VMs with multiple disks which were attached during their lifetime to increase the volume group(s) if disk space is running out. What we usually do is attach the volumes to different machines to fix stuff. But fortunately, it doesn't happen often, so the workaround has been sufficient for us. Still, it would definitely be beneficial to have the entire VM in rescue mode, including all volumes.
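For anyone curious, that workaround is roughly the following (server and volume names are placeholders):

    # detach the data volume from the affected instance
    openstack server remove volume <server> <volume>

    # attach it to a helper instance and repair the filesystem there
    openstack server add volume <helper-server> <volume>

    # afterwards, move it back
    openstack server remove volume <helper-server> <volume>
    openstack server add volume <server> <volume>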
I feel like I hijacked this thread, but hopefully the clarification helps other users as well.
We use Ceph as the backend for all services.
[1] https://docs.openstack.org/nova/latest/user/rescue.html
Quoting Metehan Cinar <meteccinar@gmail.com>:
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue?
We are currently rolling back by creating a new volume and a new instance from a volume snapshot/backup (roughly as sketched below).
How can we roll back without replacing the instance?
Do you have any ideas? If so, we look forward to hearing them.
Thanks in advance for any answers.
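For reference, the rollback workflow described above (new volume from a snapshot, new instance from that volume) looks roughly like this; names, sizes, flavor and network are placeholders:

    # create a new volume from an existing snapshot
    openstack volume create --snapshot <snapshot> --size 50 restored-volume

    # boot a replacement instance from the restored volume
    openstack server create --volume restored-volume --flavor <flavor> --network <network> restored-instance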