On Thu, 2023-03-16 at 13:35 +0100, Dmitriy Rabotyagov wrote:
Maybe I'm missing something, but what are the reasons you would want to rebuild an instance without ... rebuilding it?
I think it might be a way of rescheduling the VM to another compute node without a long-lasting shelve/unshelve, when you don't need to change the flavor. So a kind of self-service: when users detect some weirdness, they can attempt to reschedule to another compute node on their own before bothering the tech team.
rebuild is __not__ a move operation. The current special case is a hack to allow image metadata properties to be updated for an existing VM, but it will not reschedule the VM to another host. We talked about this at a past PTG, where I proposed adding a recreate API. I do not think we should ever make rebuild a move operation, but we could support a new API to enable the VM to be recreated (keeping its data) on a new host with updated flavor/image extra specs based on the current value of either. I really wish we could remove the current rebuild behavior, but when we discussed doing that before we decided it would break too many people.
On Wed, 15 Mar 2023 at 19:57, Dan Smith <dms@danplanet.com> wrote:
We have users who use 'rebuild' on volume-booted servers before Nova microversion 2.93, relying on the behavior that it keeps the volume as is. And they would like to keep doing this even after the OpenStack distro moves to an (at least) Zed base (sometime in the future).
Maybe I'm missing something, but what are the reasons you would want to rebuild an instance without ... rebuilding it?
I assume it's because you want to redefine the metadata or name or something. There's a reason why those things are not easily mutable today, and why we had a lot of discussion on how to make user metadata mutable on an existing instance in the last cycle. However, I would really suggest that we not override "recreate the thing" to "maybe recreate the thing or just update a few fields". Instead, for things we think really should be mutable on a server at runtime, we should probably just do that.
Imagine if the way you changed permissions recursively was to run 'rm -Rf --no-delete-just-change-ownership'. That would be kinda crazy, but that is (IMHO) what "recreate but don't just change $name" means to a user.
As a naive user, it seems to me both behaviors make sense. I can easily imagine use cases for rebuild with and without reimaging.
I think that's because you're already familiar with the difference. For users not already in that mindset, I think it probably seems very weird that rebuild is destructive in one case and not the other.
Then there are a few hypothetical situations like: a) Rebuild gets a new api feature (in a new microversion) which can never be combined with the do-not-reimage behavior. b) Rebuild may have a bug, whose fix requires a microversion bump. This again can never be combined with the old behavior.
What do you think, are these concerns purely theoretical or real? If we would like to keep having rebuild without reimaging, can we rely on the old microversion indefinitely? Alternatively shall we propose and implement a nova spec to explicitly expose the choice in the rebuild api (just to express the idea: osc server rebuild --reimage|--no-reimage)?
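To make the "rely on the old microversion" option concrete, here is a minimal sketch of a rebuild request pinned below 2.93. It only builds the request and does not send it. The helper name `build_rebuild_request` and the placeholder IDs are hypothetical; the action path, the `rebuild` body key, and the `OpenStack-API-Version` header are as I understand the compute API, so check them against the API reference before relying on this.

```python
# Sketch only: construct (but do not send) a Nova rebuild request pinned
# to a microversion older than 2.93. build_rebuild_request is a
# hypothetical helper, not part of any OpenStack client library.

def build_rebuild_request(server_id, image_ref, microversion="2.92"):
    """Return a dict describing the HTTP request for a server rebuild."""
    major, minor = (int(part) for part in microversion.split("."))
    if (major, minor) >= (2, 93):
        # Per the thread, 2.93 is where volume-backed rebuild behavior changes.
        raise ValueError("microversion must be below 2.93 for the old behavior")
    return {
        "method": "POST",
        "path": f"/servers/{server_id}/action",
        "headers": {
            # Microversion header form accepted by the compute API.
            "OpenStack-API-Version": f"compute {microversion}",
            "Content-Type": "application/json",
        },
        "json": {"rebuild": {"imageRef": image_ref}},
    }


# Placeholder identifiers, for illustration only.
request = build_rebuild_request("SERVER_ID", "IMAGE_ID")
print(request["headers"]["OpenStack-API-Version"])
```

The guard makes the concern in (a) and (b) visible: any feature or fix that lands at 2.93 or later cannot be requested through this pinned path.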
I'm not opposed to challenging the use cases in a spec, for sure.
I really want to know what the use-case is for "rebuild but not really". And also what "rebuild" means to a user if --no-reimage is passed. What's being rebuilt? The docs[0] for the API say very clearly:
"This operation recreates the root disk of the server."
That was a lie for volume-backed instances for technical reasons. It was a bug, not a feature.
I also strongly believe that if we're going to add a "but not really" flag, it needs to apply to volume-backed and regular instances identically. Because that's what the change here was doing - unifying the behavior for a single API operation. Going the other direction does not seem useful to me.
--Dan
[0] https://docs.openstack.org/api-ref/compute/?expanded=rebuild-server-rebuild-...