[Openstack-operators] [openstack-dev] [nova] Is there any reason to exclude originally failed build hosts during live migration?
melwittt at gmail.com
Wed Sep 20 20:15:27 UTC 2017
On Wed, 20 Sep 2017 13:47:18 -0500, Matt Riedemann wrote:
> Presumably there was a good reason why the instance failed to build on a
> host originally, but that could be for any number of reasons: resource
> claim failed during a race, configuration issues, etc. Since we don't
> really know what originally happened, it seems reasonable to not exclude
> originally attempted build targets since the scheduler filters should
> still validate them during live migration (this is all assuming you're
> not using the 'force' flag with live migration - and if you are, all
> bets are off).
Yeah, I think that because an original build failure could have been
caused by a failed claim during a race or a config issue, or could simply
have happened a very long time ago, we shouldn't keep excluding those
hosts forever.
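
To make the idea concrete, here is a minimal sketch in Python. It is not
nova code: the RequestSpec class, the failed_build_hosts field, and the
candidate_hosts function are all hypothetical names made up for
illustration, and the filter is a stand-in for the real scheduler filters.
It only shows the behavior being proposed, assuming the exclusion list is
honored while retrying the original build and ignored for later moves.

```python
from dataclasses import dataclass, field


@dataclass
class RequestSpec:
    """Hypothetical stand-in for a scheduling request (not nova's class)."""
    # Hosts that failed during the original build attempt (assumed field).
    failed_build_hosts: set = field(default_factory=set)


def candidate_hosts(all_hosts, spec, passes_filters, is_initial_build):
    """Return the hosts the scheduler may choose from.

    Hosts from the original failed build are skipped only while retrying
    that build. For a later move operation every host is considered again
    and must still pass the filters.
    """
    excluded = spec.failed_build_hosts if is_initial_build else set()
    return [h for h in all_hosts if h not in excluded and passes_filters(h)]


spec = RequestSpec(failed_build_hosts={"host1"})
hosts = ["host1", "host2", "host3"]


def has_capacity(host):
    # Pretend host3 is full, so the filter rejects it in both cases.
    return host != "host3"


build_retry = candidate_hosts(hosts, spec, has_capacity, is_initial_build=True)
live_migration = candidate_hosts(hosts, spec, has_capacity, is_initial_build=False)
```

In this sketch host1 is skipped for the build retry but is eligible again
for the live migration, while host3 is rejected by the filter either way.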
> If people agree with doing this fix, then we also have to consider
> making a similar fix for other move operations like cold migrate,
> evacuate and unshelve. However, out of those other move operations, only
> cold migrate attempts any retries. If evacuate or unshelve fail on the
> target host, there is no retry.
I agree with doing that fix for all of the move operations.