[openstack-dev] [nova] Is there any reason to exclude originally failed build hosts during live migration?

melanie witt melwittt at gmail.com
Wed Sep 20 20:15:27 UTC 2017


On Wed, 20 Sep 2017 13:47:18 -0500, Matt Riedemann wrote:
> Presumably there was a good reason why the instance failed to build on a 
> host originally, but that could be for any number of reasons: resource 
> claim failed during a race, configuration issues, etc. Since we don't 
> really know what originally happened, it seems reasonable to not exclude 
> originally attempted build targets since the scheduler filters should 
> still validate them during live migration (this is all assuming you're 
> not using the 'force' flag with live migration - and if you are, all 
> bets are off).

Yeah, I think that because the original build failure could have been a 
failed claim during a race or a config issue, or could simply have 
happened a very long time ago, we shouldn't continue to exclude those 
hosts forever.
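
To make the difference concrete, here is a rough sketch of the two 
behaviors. This is not nova's actual scheduler code: the function name, 
its parameters and the host names are made up for illustration, and the 
real scheduler filters would still run on whatever candidates remain.

```python
def select_candidates(all_hosts, source_host, failed_build_hosts,
                      attempted_hosts, exclude_build_failures=False):
    """Return hosts that may be offered as live migration targets.

    Hypothetical helper, for illustration only (not a nova API).
    """
    # The source host and hosts already tried during *this* migration
    # are always skipped.
    excluded = {source_host} | set(attempted_hosts)
    # Behavior under discussion: also skip hosts that failed during the
    # original build, however long ago that was.
    if exclude_build_failures:
        excluded |= set(failed_build_hosts)
    # Preserve the input ordering of the remaining hosts.
    return [host for host in all_hosts if host not in excluded]


hosts = ["host1", "host2", "host3"]
# Proposed behavior: host1 failed the original build but is a candidate again.
proposed = select_candidates(hosts, "host3", ["host1"], [])
# Behavior being questioned: host1 stays excluded forever.
questioned = select_candidates(hosts, "host3", ["host1"], [],
                               exclude_build_failures=True)
```

With the default, host1 gets another chance and it is up to the filters 
to reject it if it is still unsuitable.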

> If people agree with doing this fix, then we also have to consider 
> making a similar fix for other move operations like cold migrate, 
> evacuate and unshelve. However, out of those other move operations, only 
> cold migrate attempts any retries. If evacuate or unshelve fail on the 
> target host, there is no retry.

I agree with doing that fix for all of the move operations.

-melanie


