[openstack-dev] [Nova][Scheduler] Policy Based Scheduler and Solver Scheduler
Yathiraj Udupi (yudupi)
yudupi at cisco.com
Tue Feb 11 16:18:15 UTC 2014
I thought I would add some more points about the Solver Scheduler to this
conversation.
Think of SolverScheduler as a placement decision engine that gives an
optimal solution for the specified request based on the information
available at a specific point in time. The request could potentially be a
set of instances of the same flavor or of different flavors (assuming we
eventually support scheduler APIs that can provide this).
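For concreteness, a single mixed-flavor request might look something like
the following. This is purely illustrative; no existing scheduler API
accepts such a request today:

    # Purely illustrative request shape; not an existing scheduler API.
    # Maps a requested instance name to its flavor.
    request = {
        "web-1": "m1.small",
        "web-2": "m1.small",
        "db-1": "m1.large",
    }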
Once the optimal placement decision is known, we need to allocate the
resources. Currently, Nova supports the final allocation of resources
(resource provisioning) only one instance at a time. I definitely agree we
will have more success in allocating all the requested instances if there
is support for reserving the spots as soon as a placement decision is
taken by the SolverScheduler. This is something to explore adding to Nova,
using either a local service or an external one (Climate is worth
exploring here); a rough sketch of the flow follows.
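Here is a minimal sketch of what such a decide-reserve-provision flow
might look like. All of the names (solver, reservations, boot) are
hypothetical stand-ins, not existing Nova or Climate APIs:

    # Hypothetical flow; solver, reservations and boot are illustrative
    # stand-ins, not existing Nova or Climate APIs.
    def place_reserve_boot(solver, reservations, boot, requests):
        """requests: {instance_id: flavor}; returns {instance_id: token}."""
        # 1. Decision phase: solve placement for the whole request at once.
        placement = solver.solve(requests)  # {instance_id: host}

        # 2. Reservation phase: claim each chosen spot right away, so a
        #    competing request cannot consume the capacity (similar in
        #    spirit to the Claim object in nova/compute/claims.py).
        tokens = {iid: reservations.claim(host, requests[iid])
                  for iid, host in placement.items()}

        # 3. Provisioning phase: boot each instance against its claim.
        for iid, token in tokens.items():
            boot(iid, reservation=token)
        return tokens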
If the required set of instances is known, then irrespective of whether it
belongs to one instance group or to several, you always get a better
solution by sending all of it to the solver scheduler in one shot: the
constraint solving then happens over the whole set at once, and it tells
you whether the entire set of instances can feasibly be placed given the
existing resource capacity.
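As a minimal, self-contained illustration of that one-shot feasibility
check (this is not the SolverScheduler code itself; I am just using the
PuLP library and the numbers from Khanh-Toan's two-VM example quoted
below):

    import pulp

    hosts = {"host1": 50, "host2": 30}  # free RAM (GB) per host
    vms = {"small": 20, "big": 40}      # RAM demand (GB) per VM

    prob = pulp.LpProblem("placement", pulp.LpMinimize)
    # x[u][v] = 1 if VM u is hosted on host v, 0 if not.
    x = pulp.LpVariable.dicts("x", (vms, hosts), cat="Binary")

    # Dummy objective: we only care about feasibility here.
    prob += pulp.lpSum(x[u][v] for u in vms for v in hosts)

    # Every VM must land on exactly one host.
    for u in vms:
        prob += pulp.lpSum(x[u][v] for v in hosts) == 1

    # Per-host capacity: placed VMs cannot exceed the free RAM.
    for v in hosts:
        prob += pulp.lpSum(vms[u] * x[u][v] for u in vms) <= hosts[v]

    prob.solve()
    print(pulp.LpStatus[prob.status])  # "Optimal" => placeable in one shot
    for u in vms:
        for v in hosts:
            if x[u][v].value() == 1:
                print(u, "->", v)

Solved in one shot, this places the big VM on host1 and the small VM on
host2, whereas placing them one at a time can leave no room for the big
VM, as the example quoted below shows.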
If partial instantiation of a subset of the instances is acceptable, then
it makes sense to add support for retrying one instance group at a time
when the entire request is not feasible (the sketch after the next
paragraph includes such a fallback).
To add another point about the instance group API implementation for
Icehouse: it was decided at the Hong Kong summit to initially support only
flat instance groups, without nesting. Hence, if an application requires a
big topology of instances, those instances can easily belong to multiple
instance groups, and if you want the entire application requirement to be
satisfied, the instances from all of those flat groups should be requested
in a single shot to the solver scheduler. Also, additional work is
required to add new scheduler APIs that support requesting instance groups
of multiple flavors.
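In sketch form, that could look like the following (group_api and solver
are hypothetical stand-ins; only the flat membership relationship is
assumed from the Icehouse plan):

    # Hypothetical sketch; group_api and solver are illustrative
    # stand-ins, not existing Nova APIs.
    def solve_application(solver, group_api, group_names):
        # Gather every member of several flat instance groups and place
        # the whole application topology in a single solver request.
        members = [inst
                   for name in group_names
                   for inst in group_api.get(name).members]
        placement = solver.solve(members)
        if placement is None:
            # The whole topology is infeasible: fall back to solving one
            # flat group at a time, accepting partial instantiation.
            placement = {}
            for name in group_names:
                partial = solver.solve(group_api.get(name).members)
                placement.update(partial or {})
        return placement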
I think I have reiterated some of the points Chris mentions below. But
yes, as I stated earlier in this thread, we need to separate the
decision-making phase from the initial request making and from the final
allocation/provisioning (or orchestration). Between these phases, a
reservation phase following the decision making will add extra guarantees
that the placed instances can actually be allocated.
Thanks,
Yathi.
On 2/11/14, 7:09 AM, "Chris Friesen" <chris.friesen at windriver.com> wrote:
>On 02/11/2014 03:21 AM, Khanh-Toan Tran wrote:
>>> Second, there is nothing wrong with booting the instances (or
>>> instantiating other resources) as separate commands as long as we
>>> support some kind of reservation token.
>>
>> I'm not sure what a reservation token would do. Is it some kind of way
>> of informing the scheduler that the resources will not be initiated
>> until later?
>
>Like a restaurant reservation, it would "claim" the resources for use by
>someone at a later date. That way nobody else can use them.
>
>That way the scheduler would be responsible for determining where the
>resource should be allocated from, and getting a reservation for that
>resource. It would not have anything to do with actually instantiating
>the instance/volume/etc.
>
>> Let's consider the following example:
>>
>> A user wants to create 2 VMs, a small one with 20 GB RAM and a big one
>> with 40 GB RAM, in a datacenter consisting of 2 hosts: one with 50 GB
>> RAM left and another with 30 GB RAM left, using the Filter Scheduler's
>> default RamWeigher.
>>
>> If we pass the demand as two commands, there is a chance that the small
>> VM arrives first. RamWeigher will put it on the 50 GB RAM host, which
>> will be reduced to 30 GB RAM. Then, when the big VM request arrives,
>> there will be no space left to host it. As a result, the whole demand
>> fails.
>>
>> Now if we can pass the two VMs in one command, SolverScheduler can put
>> their constraints all together into one big LP as follows (x_uv = 1 if
>> VM u is hosted on host v, 0 if not):
>
>Yes. So what I'm suggesting is that we schedule the two VMs as one call
>to the SolverScheduler. The scheduler then gets reservations for the
>necessary resources and returns them to the caller. This would be sort
>of like the existing Claim object in nova/compute/claims.py but
>generalized somewhat to other resources as well.
>
>The caller could then boot each instance separately (passing the
>appropriate reservation/claim along with the boot request). Because the
>caller has a reservation, the core code would know it doesn't need to
>schedule or allocate resources; that's already been done.
>
>The advantage of this is that the scheduling and resource allocation are
>done separately from the instantiation. The instantiation API could
>remain basically as-is, except for supporting an optional reservation
>token.
>
>> That responds to your first point, too. If we don't mind that some VMs
>> are placed and some are not (e.g. they belong to different apps), then
>> it's OK to pass them to the scheduler without an Instance Group.
>> However, if the VMs belong together (i.e. to one app), then we have to
>> put them into an Instance Group.
>
>When I think of an "Instance Group", I think of
>https://blueprints.launchpad.net/nova/+spec/instance-group-api-extension.
>Fundamentally, "Instance Groups" describes a runtime relationship
>between different instances.
>
>The scheduler doesn't necessarily care about a runtime relationship,
>it's just trying to allocate resources efficiently.
>
>In the above example, there is no need for those two instances to
>necessarily be part of an Instance Group--we just want to schedule them
>both at the same time to give the scheduler a better chance of fitting
>them both.
>
>More generally, the more instances I want to start up the more
>beneficial it can be to pass them all to the scheduler at once in order
>to give the scheduler more information. Those instances could be parts
>of completely independent Instance Groups, or not part of an Instance
>Group at all...the scheduler can still do a better job if it has more
>information to work with.
>
>Chris
>
>_______________________________________________
>OpenStack-dev mailing list
>OpenStack-dev at lists.openstack.org
>http://lists.openstack.org/cgi-bin/mailman/listinfo/openstack-dev