[openstack-dev] [Nova] Does Nova really need an SQL database?

Chris Friesen chris.friesen at windriver.com
Tue Nov 19 20:18:16 UTC 2013


On 11/19/2013 01:51 PM, Clint Byrum wrote:
> Excerpts from Chris Friesen's message of 2013-11-19 11:37:02 -0800:
>> On 11/19/2013 12:35 PM, Clint Byrum wrote:
>>
>>> Each scheduler process can own a different set of resources. If they
>>> each grab instance requests in a round-robin fashion, then they will
>>> fill their resources up in a relatively well balanced way until one
>>> scheduler's resources are exhausted. At that time it should bow out of
>>> taking new instances. If it can't fit a request in, it should kick the
>>> request out for retry on another scheduler.
>>>
>>> In this way, they only need to be in sync in that they need a way to
>>> agree on who owns which resources. A distributed hash table that gets
>>> refreshed whenever schedulers come and go would be fine for that.
>>
>> That has some potential, but at high occupancy you could end up refusing
>> to schedule something because no one scheduler has sufficient resources
>> even if the cluster as a whole does.
>>
>
> I'm not sure what you mean here. What resource spans multiple compute
> hosts?

Imagine the cluster is running close to full occupancy, and each
scheduler has room for 40 more instances.  Now I come along and issue a
single request to boot 50 instances.  The cluster as a whole has room
for that, but no individual scheduler does.
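
As a rough sketch of that arithmetic (the three schedulers and their
names are made up for illustration; only the 40 and 50 come from the
example above):

```python
# Hypothetical: free instance slots owned by each scheduler.
# The number of schedulers (three) is an assumption for illustration.
free_slots = {"sched-a": 40, "sched-b": 40, "sched-c": 40}


def fits_on_one_scheduler(free_slots, wanted):
    """True if any single scheduler can place the whole request itself."""
    return any(free >= wanted for free in free_slots.values())


def fits_in_cluster(free_slots, wanted):
    """True if the cluster as a whole has enough free slots."""
    return sum(free_slots.values()) >= wanted


request = 50
print(fits_on_one_scheduler(free_slots, request))  # False
print(fits_in_cluster(free_slots, request))        # True
```

With partitioned ownership the request gets refused even though the
cluster has 120 free slots in total.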

>> This gets worse once you start factoring in things like heat and
>> instance groups that will want to schedule whole sets of resources
>> (instances, IP addresses, network links, cinder volumes, etc.) at once
>> with constraints on where they can be placed relative to each other.

> Actually that is rather simple. Such requests have to be serialized
> into a work-flow. So if you say "give me 2 instances in 2 different
> locations" then you allocate 1 instance, and then another one with
> 'not_in_location(1)' as a condition.

Actually, you don't want to serialize it; you want to hand the whole set
of resource requests and constraints to the scheduler all at once.

If you do them one at a time, then early decisions made with 
less-than-complete knowledge can result in later scheduling requests 
failing due to being unable to meet constraints, even if there are 
actually sufficient resources in the cluster.

The "VM ensembles" document at
https://docs.google.com/document/d/1bAMtkaIFn4ZSMqqsXjs_riXofuRvApa--qo4UTwsmhw/edit?pli=1 
has a good example of how one-at-a-time scheduling can cause spurious 
failures.
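
To make the failure mode concrete, here is a toy sketch of my own (it
is not taken from that document).  The hosts, VM names, the greedy
"most free slots first" policy and the brute-force search are all
assumptions chosen for illustration, not how Nova's scheduler works.

```python
from itertools import product

# Hypothetical cluster: free slots per host.
hosts = {"host-a": 2, "host-b": 1}
vms = ["vm1", "vm2", "vm3"]
# Hypothetical affinity constraint: vm2 and vm3 must share a host.
same_host = [("vm2", "vm3")]


def valid(placement, hosts, same_host):
    """Check capacity and affinity for a (possibly partial) placement."""
    used = {}
    for host in placement.values():
        used[host] = used.get(host, 0) + 1
    if any(used[h] > hosts[h] for h in used):
        return False
    return all(placement[a] == placement[b]
               for a, b in same_host
               if a in placement and b in placement)


def schedule_one_at_a_time(vms, hosts, same_host):
    """Place each VM greedily, knowing only the VMs placed so far."""
    placement = {}
    free = dict(hosts)
    for vm in vms:
        # Try the emptiest host first (an assumed policy).
        for host in sorted(free, key=free.get, reverse=True):
            trial = dict(placement, **{vm: host})
            if valid(trial, hosts, same_host):
                placement = trial
                free[host] -= 1
                break
        else:
            return None  # no host can take this VM
    return placement


def schedule_all_at_once(vms, hosts, same_host):
    """Search over complete placements with all constraints visible."""
    for combo in product(hosts, repeat=len(vms)):
        placement = dict(zip(vms, combo))
        if valid(placement, hosts, same_host):
            return placement
    return None


print(schedule_one_at_a_time(vms, hosts, same_host))
print(schedule_all_at_once(vms, hosts, same_host))
```

The one-at-a-time version puts vm1 and vm2 on host-a and then has
nowhere to put vm3, so it returns None.  The all-at-once version finds
the placement with vm1 on host-b and vm2/vm3 together on host-a.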

And if you're handing the whole set of requests to a scheduler all at 
once, then you want the scheduler to have access to as many resources as 
possible so that it has the highest likelihood of being able to satisfy 
the request given the constraints.

Chris


