[openstack-dev] [heat] [scheduler] Bringing things together for Icehouse

Debojyoti Dutta ddutta at gmail.com
Tue Sep 24 23:40:14 UTC 2013


Joining the party late :)

I think there have been a lot of interesting ideas around holistic
scheduling over the last few summits. However, there seems to be no
clear agreement on 1) where it should be specified and implemented, 2)
what the specifications look like (VRT, policies, templates, etc.), and
3) how to scale such implementations given the complexity.

In order to tackle this seemingly complex problem, why don't we get
together during the summit (and do homework beforehand) and converge
on the above, independent of where we implement it? Maybe we implement
it in a separate scheduling layer that is independent of existing
services, so that it touches fewer things.

Mike: agree that we should be specifying a VRT, constraints, etc. and
matching them via a constrained optimization ... all this is music to
my ears. However, I feel we will just make it harder to get it done if
we don't simplify the problem, as long as simplifying does not close
off where we want to end up. At the end of the day, in order to place
resources efficiently you need a demand vector/matrix and you match it
against the available resource vector/matrix ... so why not have an
abstract resource placement layer that hides details except for the
quantities that are really needed? That way it can be used inside Heat
or it can be used standalone ....
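
To make the demand/resource matching concrete, here is a minimal
sketch, not code from any OpenStack project. The function name
place(), the (vcpus, ram_gb, disk_gb) tuple layout, and the objective
(use as few hosts as possible) are all assumptions chosen for
illustration. It brute-forces every assignment, so it only works for
toy sizes.

```python
from itertools import product


def place(demands, capacities):
    """Return a host index per request, or None if nothing fits.

    demands[i]    = (vcpus, ram_gb, disk_gb) wanted by request i
    capacities[j] = (vcpus, ram_gb, disk_gb) free on host j
    Hypothetical objective: use as few hosts as possible.
    """
    best, best_hosts_used = None, None
    for assignment in product(range(len(capacities)), repeat=len(demands)):
        # Sum the demand that this assignment puts on each host.
        used = [[0] * len(cap) for cap in capacities]
        for request, host in enumerate(assignment):
            for k, amount in enumerate(demands[request]):
                used[host][k] += amount
        # Feasible only if no host is over capacity in any dimension.
        fits = all(
            u <= c
            for used_row, cap_row in zip(used, capacities)
            for u, c in zip(used_row, cap_row)
        )
        if fits:
            hosts_used = len(set(assignment))
            if best is None or hosts_used < best_hosts_used:
                best, best_hosts_used = list(assignment), hosts_used
    return best


demands = [(2, 4, 10), (2, 4, 10), (4, 8, 20)]
capacities = [(4, 8, 40), (4, 8, 40)]
print(place(demands, capacities))  # [0, 0, 1]
```

The placement layer only ever sees the two matrices; it does not need
to know whether a row came from a Heat template or a direct API call.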

To that effect, if you look at the BP
https://blueprints.launchpad.net/nova/+spec/solver-scheduler and the
associated code (WIP), it's an attempt to do the same within the
current Nova framework. One thing we could do is to have a layer that
abstracts the VRT and constraints so that a simple optimization
framework could then make the decisions, instead of hand-crafted
algorithms that are harder to extend (e.g. the entire suite of
scheduler filters that exist today).
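
As a rough illustration of why a declarative layer is easier to extend,
the sketch below treats each constraint as a predicate over a whole
placement and hands the set to one generic solver. It is not the
solver-scheduler blueprint's code; solve(), capacity(), anti_affinity()
and hosts_used() are hypothetical names, and the exhaustive search
stands in for a real optimization library.

```python
from itertools import product


def solve(num_requests, num_hosts, constraints, cost):
    """Return the cheapest placement satisfying every constraint, or None."""
    feasible = (
        placement
        for placement in product(range(num_hosts), repeat=num_requests)
        if all(check(placement) for check in constraints)
    )
    return min(feasible, key=cost, default=None)


def capacity(demand, free):
    """Constraint: total demand placed on each host must fit in free[host]."""
    def check(placement):
        load = [0] * len(free)
        for request, host in enumerate(placement):
            load[host] += demand[request]
        return all(l <= f for l, f in zip(load, free))
    return check


def anti_affinity(a, b):
    """Constraint: requests a and b must land on different hosts."""
    return lambda placement: placement[a] != placement[b]


def hosts_used(placement):
    """Cost: number of distinct hosts the placement touches."""
    return len(set(placement))


demand = [2, 2, 2]   # e.g. vCPUs per request
free = [4, 4, 4]     # e.g. free vCPUs per host
rules = [capacity(demand, free), anti_affinity(0, 1)]
print(solve(3, 3, rules, hosts_used))  # (0, 1, 0)
```

Adding a new policy means appending one more predicate to the list; the
solver itself does not change.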

debo

On Tue, Sep 24, 2013 at 7:01 AM, Zane Bitter <zbitter at redhat.com> wrote:
> On 24/09/13 05:31, Mike Spreitzer wrote:
>>
>> I was not trying to raise issues of geographic dispersion and other
>> higher level structures, I think the issues I am trying to raise are
>> relevant even without them.  This is not to deny the importance, or
>> relevance, of higher levels of structure.  But I would like to first
>> respond to the discussion that I think is relevant even without them.
>>
>> I think it is valuable for OpenStack to have a place for holistic
>> infrastructure scheduling.  I am not the only one to argue for this, but
>> I will give some use cases.  Consider Hadoop, which stresses the path
>> between Compute and Block Storage.  In the usual way of deploying and
>> configuring Hadoop, you want each data node to be using directly
>> attached storage.  You could address this by scheduling one of those two
>> services first, and then the second with constraints from the first ---
>> but the decisions made by the first could paint the second into a
>> corner.  It is better to be able to schedule both jointly.  Also
>> consider another approach to Hadoop, in which the block storage is
>> provided by a bank of storage appliances that is equidistant (in
>> networking terms) from all the Compute.  In this case the Storage and
>> Compute scheduling decisions have no strong interaction --- but the
>> Compute scheduling can interact with the network (you do not want to
>> place Compute in a way that overloads part of the network).
>
>
> Thanks for writing this up, it's very helpful for figuring out what you mean
> by a 'holistic' scheduler.
>
> I don't yet see how this could be considered in-scope for the Orchestration
> program, which uses only the public APIs of other services.
>
> To take the first example, wouldn't your holistic scheduler effectively have
> to reserve a compute instance and some directly attached block storage prior
> to actually creating them? Have you considered Climate rather than Heat as
> an integration point?
>
>
>> Once a holistic infrastructure scheduler has made its decisions, there
>> is then a need for infrastructure orchestration.  The infrastructure
>> orchestration function is logically downstream from holistic scheduling.
>
>
> I agree that it's necessarily 'downstream' (in the sense of happening
> afterwards). I'd hesitate to use the word 'logically', since I think by its
> very nature a holistic scheduler introduces dependencies between services
> that were intended to be _logically_ independent.
>
>
>>   I do not favor creating a new and alternate way of doing
>> infrastructure orchestration in this position.  Rather I think it makes
>> sense to use essentially today's heat engine.
>>
>> Today Heat is the only thing that takes a holistic view of
>> patterns/topologies/templates, and there are various pressures to expand
>> the mission of Heat.  A marquee expansion is to take on software
>> orchestration.  I think holistic infrastructure scheduling should be
>> downstream from the preparatory stage of software orchestration (the
>> other stage of software orchestration is the run-time action in and
>> supporting the resources themselves).  There are other pressures to
>> expand the mission of Heat too.  This leads to conflicting usages for
>> the word "heat": it can mean the infrastructure orchestration function
>> that is the main job of today's heat engine, or it can mean the full
>> expanded mission (whatever you think that should be).  I have been
>> mainly using "heat" in that latter sense, but I do not really want to
>> argue over naming of bits and assemblies of functionality.  Call them
>> whatever you want.  I am more interested in getting a useful arrangement
>> of functionality.  I have updated my picture at
>>
>> https://docs.google.com/drawings/d/1Y_yyIpql5_cdC8116XrBHzn6GfP_g0NHTTG_W4o0R9U
>> --- do you agree that the arrangement of functionality makes sense?
>
>
> Candidly, no.
>
> As proposed, the software configs contain directives like 'hosted_on:
> server_name'. (I don't know that I'm a huge fan of this design, but I don't
> think the exact details are relevant in this context.) There's no
> non-trivial processing in the preparatory stage of software orchestration
> that would require it to be performed before scheduling could occur.
>
> Let's make sure we distinguish between doing holistic scheduling, which
> requires a priori knowledge of the resources to be created, and automatic
> scheduling, which requires psychic knowledge of the user's mind. (Did the
> user want to optimise for performance or availability? How would you infer
> that from the template?) There's nothing that happens while preparing the
> software configurations that's necessary for the former nor sufficient for
> the latter.
>
> cheers,
> Zane.
>
>



-- 
-Debo~


