All VMs fail when --max exceeds available resources

Matt Riedemann mriedemos at gmail.com
Wed Nov 20 23:51:51 UTC 2019


On 11/20/2019 5:16 PM, Albert Braden wrote:
> The expected result (that I was seeing last week) is that, if my cluster has capacity for 4 VMs and I use --max 5, 4 will go active and 1 will go to error. This week all 5 are going to error. I can still build 4 VMs of that flavor, one at a time, or use --max 4, but if I use --max 5, then all 5 will fail. If I use smaller VMs, the --max numbers get bigger but I still see the same symptom.
> 
> The --max thing is pretty useful and we use it a lot; it allows us to use up the cluster without knowing exactly how much space we have.

OK so I think you're hitting this with the NoValidHost error:

https://github.com/openstack/nova/blob/18.0.0/nova/conductor/manager.py#L1209

And that's putting all of the instances into ERROR status even though 4 
out of the 5 did successfully allocate resources in the scheduler. The 
scheduler would have rolled back the allocations here if it couldn't fit 
everything:

https://github.com/openstack/nova/blob/18.0.0/nova/scheduler/filter_scheduler.py#L276
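
To make that all-or-nothing behavior concrete, here is a toy model in 
Python. It is only a sketch, not nova's actual code: the names schedule, 
build_instances and free_slots are made up for illustration, and real 
scheduling claims resources through placement rather than counting slots.

```python
# Simplified model only; NOT nova's code. All names here are hypothetical.
class NoValidHost(Exception):
    """Stand-in for the error raised when a request cannot be placed."""


def schedule(requested, free_slots):
    """Claim one slot per instance; undo every claim if any cannot fit."""
    claimed = []
    for name in requested:
        if free_slots - len(claimed) <= 0:
            claimed.clear()  # roll back the allocations made so far
            raise NoValidHost(f"no room for {name}")
        claimed.append(name)
    return claimed


def build_instances(requested, free_slots):
    """Conductor-like step: a scheduling failure marks the whole batch ERROR."""
    try:
        placed = schedule(requested, free_slots)
    except NoValidHost:
        return {name: "ERROR" for name in requested}
    return {name: "ACTIVE" for name in placed}


# Five instances requested, room for four: every instance ends up in ERROR.
statuses = build_instances([f"vm-{i}" for i in range(5)], free_slots=4)
print(statuses)
```

In this model, asking for 4 instances with 4 free slots makes all of them 
ACTIVE, while asking for 5 fails the whole batch, which matches the 
symptom you describe.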

Which release did you say the --max 5 scenario worked in, where 4 
instances would be built successfully and the remaining one would go to 
ERROR status? I'm just trying to figure out where/when the regression 
in behavior occurred.

-- 

Thanks,

Matt


