On 22 Apr 2024, at 21:19, smooney@redhat.com wrote:
if you have a server on host A and that host has a failure,
when the operator evacuates it to host B there is a period of time
where a user could issue api actions like stop that will be queued for
host A.
when the operator restores host A (e.g. replaces the power supply)
and the service on that host starts up, it may dequeue a message that was sent before
the evacuate was done.
nova needs to check, when it dequeues a message, that the instance is actively
managed on this host, and avoid processing requests for instances it no longer manages.
in this case it should avoid doing an instance.save to record that the instance should be powered off.
The last section very accurately describes the issue: we process messages even if we
no longer manage the instance.
We also handle the power state and save it to the database, which makes the destination
host that the instance was evacuated to erroneously power off the instance, causing
an “outage” on that instance again after the fact.
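
To illustrate the kind of check described above, here is a minimal Python sketch. It is not Nova's actual code: the Instance class, its fields, and handle_stop are hypothetical names made up for this example, and the only assumption is that the instance record carries the name of the host that currently manages it.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    """Hypothetical stand-in for an instance record (not Nova's object)."""
    uuid: str
    host: str                     # host that currently manages the instance
    power_state: str = "running"
    saved: bool = False           # set when save() is called, for illustration

    def save(self):
        # Stand-in for persisting the record to the database.
        self.saved = True


def handle_stop(instance, this_host):
    """Process a dequeued 'stop' request only if this host still manages
    the instance. Returns True if the request was processed."""
    if instance.host != this_host:
        # Stale message: the instance was evacuated elsewhere, so do not
        # touch its power state and do not save anything.
        return False
    instance.power_state = "shutdown"
    instance.save()
    return True
```

With this guard, a stop request that was queued for host A before the evacuation would be dropped when host A comes back, instead of being recorded against the instance now running on host B.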
Best regards
Tobias