On 11/20/2019 5:16 PM, Albert Braden wrote:
The expected result (that I was seeing last week) is that, if my cluster has capacity for 4 VMs and I use --max 5, 4 will go active and 1 will go to error. This week all 5 are going to error. I can still build 4 VMs of that flavor, one at a time, or use --max 4, but if I use --max 5, then all 5 will fail. If I use smaller VMs, the --max numbers get bigger but I still see the same symptom.
The --max thing is pretty useful and we use it a lot; it allows us to use up the cluster without knowing exactly how much space we have.
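The two behaviors described above can be contrasted with a small toy model. This is only a sketch and not Nova code: the function names and the idea of modelling cluster capacity as a single integer are assumptions made for illustration.

```python
def build_partial(capacity, requested):
    """Behavior seen last week: build what fits, error the remainder."""
    active = min(capacity, requested)
    return ["ACTIVE"] * active + ["ERROR"] * (requested - active)


def build_all_or_nothing(capacity, requested):
    """Behavior seen this week: if the whole batch does not fit, all fail."""
    if requested > capacity:
        return ["ERROR"] * requested
    return ["ACTIVE"] * requested


# Cluster with room for 4 VMs of the flavor, request made with --max 5.
print(build_partial(4, 5))
print(build_all_or_nothing(4, 5))
# The same cluster with --max 4 succeeds under either model.
print(build_all_or_nothing(4, 4))
```

Under the first model a --max request can safely overshoot the free capacity; under the second the request has to match the remaining capacity exactly, which is what makes --max less useful for filling a cluster.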
OK so I think you're hitting this with the NoValidHost error:

https://github.com/openstack/nova/blob/18.0.0/nova/conductor/manager.py#L120...

And that's putting all of the instances into ERROR status even though 4 out of the 5 did successfully allocate resources in the scheduler.

The scheduler would have rolled back the allocations here if it couldn't fit everything:

https://github.com/openstack/nova/blob/18.0.0/nova/scheduler/filter_schedule...

Which release did you say that the --max 5 scenario worked where 4 would be successfully built but the remaining one would go to ERROR status? I'm just trying to figure out where/when the regression in behavior occurred.

--

Thanks,

Matt