If I create 20 VMs at once, at least one of them fails with “Exceeded maximum number of retries.” When I look at the logs, I see that the scheduler sent the VM to a host that doesn’t have enough CPU: “Free vcpu 14.00 VCPU < requested 16 VCPU.”
https://paste.fedoraproject.org/paste/6N3wcDzlbNQgj6hRApHiDQ
I thought this must be caused by a race condition between schedulers, so I stopped the scheduler and conductor on 2 of the controllers and then created 20 more VMs. Now I see logs only on controller 3, and some of the failures now say “Unable to establish connection to <LB>”, but I still see the single scheduler sending VMs to a host that lacks resources: “Free vcpu 14.00 VCPU < requested 16 VCPU.”
https://paste.fedoraproject.org/paste/lGlVpfbB9C19mMzrWQcHCQ
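For reference, this is how I read the “Free vcpu” message that shows up in both runs. It is only a sketch under my own assumption that the check is a plain limit-minus-used comparison. The function name and the host numbers (a limit of 64 vCPUs with 50 already in use) are made up for illustration and are not taken from Nova's code or from my logs.

```python
def vcpu_claim_error(limit, used, requested):
    """Return an error string if the request does not fit, else None.

    Hypothetical helper: assumes free capacity is simply limit - used.
    """
    free = limit - used
    if requested > free:
        return "Free vcpu %.2f VCPU < requested %d VCPU" % (free, requested)
    return None


# Hypothetical host: limit of 64 vCPUs, 50 already in use, so 14 are free.
# A 16-vCPU flavor does not fit and the message matches the one in my logs.
print(vcpu_claim_error(64, 50, 16))

# With only 48 in use, the same request fits and no error is returned.
print(vcpu_claim_error(64, 48, 16))
```

If that reading is right, the host had 14 vCPUs free at the moment my 16-vCPU instance landed on it, which is why I suspect the placement decision was made against out-of-date usage numbers.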
I’m looking at my nova.conf but don’t see anything misconfigured. My filters are pretty standard:
enabled_filters = RetryFilter,AvailabilityZoneFilter,CoreFilter,RamFilter,ComputeFilter,ComputeCapabilitiesFilter,ImagePropertiesFilter,ServerGroupAntiAffinityFilter,ServerGroupAffinityFilter,DifferentHostFilter,SameHostFilter
What should I be looking for here? Why would a single scheduler send a VM to a host that is too full? We have lots of compute hosts that are not full:
https://paste.fedoraproject.org/paste/6SX9pQ4V1KnWfQkVnfoHOw
This is the command line I used:
openstack server create --flavor s1.16cx120g --image QSC-P-CentOS6.6-19P1-v4 --network vg-network --max 20 alberttestB