On Fri, 2024-08-23 at 06:09 +0000, Eugen Block wrote:
Hi Sean,
Quoting smooney@redhat.com:
On Thu, 2024-08-22 at 10:38 +0000, Eugen Block wrote:
I wonder if this could be a misunderstanding about the word "rescue". Maybe I'm the one who didn't understand it properly, but in the past we used rescue mode to make instances bootable again, e.g. if the bootloader hadn't been properly regenerated after an upgrade, or to fix other issues. So we used 'openstack server rescue <UUID>' to get into a chroot environment and "rescue" the machine. Only the instance's root disk (ephemeral, not volume) plus an optional config drive is attached to the instance, and the docs [1] contain this note:
Rescuing a volume-backed instance is not supported with this mode.
I never considered it as a method to restore an instance from backup. I assume OP is asking how to restore an instance from backup, but I'm not really sure.
Rescue, as you pointed out, is not intended for restoring from a backup. It's intended to be used to boot from a rescue disk like Knoppix so that you can run fdisk or similar utils to fix the filesystem.
Thanks for the confirmation.
Rescuing a volume-backed instance has been supported since Ussuri: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
I know at least one operator used that functionality to fix Windows guests after the recent CrowdStrike issues, so that functionality does work.
Interesting, because I just retried that in a Victoria cloud (I hadn't in a long time) and it fails with this error:
Unable to rescue instance
Details: Instance bf0a04eb-27ae-4979-86b4-8b5522ede1de cannot be rescued: Cannot rescue a volume-backed instance (HTTP 400)

You have to use the relevant microversion (2.87) or later, and our docs have a slight bug: to rescue a boot-from-volume instance you must configure the stable device rescue feature on the rescue image.
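For example, something along these lines should work (the rescue image name and server UUID are just placeholders):

  openstack --os-compute-api-version 2.87 server rescue --image <rescue-image> <server-uuid>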
In our docs we say you have to set hw_rescue_bus or hw_rescue_device, when it actually should say that hw_rescue_bus should always be set on the rescue image and hw_rescue_device should be set if it's an ISO, i.e. hw_rescue_bus=usb and hw_rescue_device=cdrom if the rescue image is an ISO, but hw_rescue_bus=usb is enough if the image is, say, a qcow.
https://docs.openstack.org/nova/latest/user/rescue.html#stable-device-instan...
That dependency is briefly called out in https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
The stable device rescue feature was added in the same release: https://specs.openstack.org/openstack/nova-specs/specs/ussuri/implemented/vi...
That's what adds hw_rescue_bus and hw_rescue_device.
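For an ISO rescue image that would look roughly like this (the image name is a placeholder):

  openstack image set --property hw_rescue_bus=usb --property hw_rescue_device=cdrom <rescue-image>

and for a qcow rescue image setting only hw_rescue_bus=usb should be enough.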
For libvirt configured with images_type=rbd, rescue should just work like any other "local" storage, but I'll admit I don't think I have ever checked whether additional data volumes are attached to the rescued instance.
I thought they were, but perhaps not.
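For reference, the setting I mean lives in nova.conf on the compute host, roughly:

  [libvirt]
  images_type = rbd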
I just tried it with another instance with 2 attached volumes (root disk is ephemeral),
For the record, "(root disk is ephemeral)" is wrong. In nova, an ephemeral disk is an additional nova-provisioned disk that is separate from the root disk. I know what you mean, but I really hate when people refer to the root disk of a nova instance as ephemeral when they mean it's a nova-created volume/disk rather than a cinder volume.

The nova root disk is no more or less ephemeral than a cinder volume; both have a lifetime tied to the API object used to create them. If you delete a nova instance you delete its corresponding storage; if you delete a cinder volume you delete its corresponding storage. Cinder storage is not by definition any more resilient than nova storage: if you're using a SAN or another storage server, it has no HA guarantees beyond those of simple RAID. Ceph is the exception in that it implicitly has fault tolerance (unless you configure it for replica 1), but you can also back nova with Ceph, so that's a wash.

So from my point of view cinder volumes are just as ephemeral as nova storage, and since we actually have a thing in our flavor API called an ephemeral disk, and it's something different from the root disk, I really would like to stop using the term ephemeral to refer to the root disk incorrectly.
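To illustrate the distinction (the flavor name and sizes are just examples):

  openstack flavor create --vcpus 2 --ram 2048 --disk 20 --ephemeral 10 example.flavor

gives instances a 20 GB root disk plus a separate 10 GB ephemeral disk; only the latter is what nova's API actually calls ephemeral.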
the attached volumes are not available during rescue mode.

Ack. I'm not sure whether that should be unexpected or not; it's perhaps something we could make configurable as part of the rescue call. I'm not sure why we chose not to attach them; I had assumed they would be attached but just not mounted automatically, so you could mount them if needed. I guess the logic was that you can use rescue to make the VM bootable normally, and then boot into the instance normally after unrescue to fix any issues with the volume.
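i.e. roughly (the UUID is a placeholder):

  openstack server unrescue <server-uuid>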
You can also just detach the volume and attach it to another VM to fix it, but I can see having a --with-volumes option when doing rescue in the future if there was a need for that.
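For example, roughly (server and volume names are placeholders):

  openstack server remove volume <broken-server> <volume>
  openstack server add volume <helper-server> <volume>

then fix things from the helper VM and move the volume back afterwards.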
We use Ceph as the backend for all services.
[1] https://docs.openstack.org/nova/latest/user/rescue.html
Quoting Metehan Cinar <meteccinar@gmail.com>:
How do other OpenStack users rescue their instances with attached volumes? What is the best practice for this issue?
We are currently rolling it back by creating a new volume and a new instance from a volume snapshot/backup.
How can we roll back without replacing the instance?
Do you have any ideas? If so, we would be happy to hear them.
Thanks for your answers.