Open Stack

Tue Sep 24 14:01:26 UTC 2013

On 24/09/13 05:31, Mike Spreitzer wrote:
> I was not trying to raise issues of geographic dispersion and other
> higher level structures, I think the issues I am trying to raise are
> relevant even without them.  This is not to deny the importance, or
> relevance, of higher levels of structure.  But I would like to first
> respond to the discussion that I think is relevant even without them.
>
> I think it is valuable for OpenStack to have a place for holistic
> infrastructure scheduling.  I am not the only one to argue for this, but
> I will give some use cases.  Consider Hadoop, which stresses the path
> between Compute and Block Storage.  In the usual way of deploying and
> configuring Hadoop, you want each data node to be using directly
> attached storage.  You could address this by scheduling one of those two
> services first, and then the second with constraints from the first ---
> but the decisions made by the first could paint the second into a
> corner.  It is better to be able to schedule both jointly.  Also
> consider another approach to Hadoop, in which the block storage is
> provided by a bank of storage appliances that is equidistant (in
> networking terms) from all the Compute.  In this case the Storage and
> Compute scheduling decisions have no strong interaction --- but the
> Compute scheduling can interact with the network (you do not want to
> place Compute in a way that overloads part of the network).

Thanks for writing this up, it's very helpful for figuring out what you 
mean by a 'holistic' scheduler.
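
To make the first use case concrete, here is a toy sketch of how scheduling compute first can paint storage into a corner, while considering both together finds a valid placement. The host names, capacities, and function names are all made up for illustration; nothing here is an actual Nova, Cinder, or Heat API.

```python
from itertools import product

# Hypothetical hosts: free compute slots and free directly attached disks.
HOSTS = {
    "host-a": {"cpu": 1, "disk": 0},
    "host-b": {"cpu": 1, "disk": 1},
}


def schedule_sequentially(hosts):
    """Pick a compute host first, then look for storage on that same host."""
    compute_host = next(h for h, free in hosts.items() if free["cpu"] > 0)
    storage_ok = hosts[compute_host]["disk"] > 0
    return compute_host if storage_ok else None


def schedule_jointly(hosts):
    """Consider compute and storage together, requiring co-location."""
    for compute_host, storage_host in product(hosts, repeat=2):
        if (compute_host == storage_host
                and hosts[compute_host]["cpu"] > 0
                and hosts[storage_host]["disk"] > 0):
            return compute_host
    return None


print(schedule_sequentially(HOSTS))  # None: host-a was chosen, but has no disk
print(schedule_jointly(HOSTS))       # host-b
```

The sequential version fails only because the first decision was made without knowledge of the second constraint, which is the "painted into a corner" case described above.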

I don't yet see how this could be considered in-scope for the 
Orchestration program, which uses only the public APIs of other services.

To take the first example, wouldn't your holistic scheduler effectively 
have to reserve a compute instance and some directly attached block 
storage prior to actually creating them? Have you considered Climate 
rather than Heat as an integration point?

> Once a holistic infrastructure scheduler has made its decisions, there
> is then a need for infrastructure orchestration.  The infrastructure
> orchestration function is logically downstream from holistic scheduling.

I agree that it's necessarily 'downstream' (in the sense of happening 
afterwards). I'd hesitate to use the word 'logically', since I think by 
its very nature a holistic scheduler introduces dependencies between 
services that were intended to be _logically_ independent.

>   I do not favor creating a new and alternate way of doing
> infrastructure orchestration in this position.  Rather I think it makes
> sense to use essentially today's heat engine.
>
> Today Heat is the only thing that takes a holistic view of
> patterns/topologies/templates, and there are various pressures to expand
> the mission of Heat.  A marquee expansion is to take on software
> orchestration.  I think holistic infrastructure scheduling should be
> downstream from the preparatory stage of software orchestration (the
> other stage of software orchestration is the run-time action in and
> supporting the resources themselves).  There are other pressures to
> expand the mission of Heat too.  This leads to conflicting usages for
> the word "heat": it can mean the infrastructure orchestration function
> that is the main job of today's heat engine, or it can mean the full
> expanded mission (whatever you think that should be).  I have been
> mainly using "heat" in that latter sense, but I do not really want to
> argue over naming of bits and assemblies of functionality.  Call them
> whatever you want.  I am more interested in getting a useful arrangement
> of functionality.  I have updated my picture at
> https://docs.google.com/drawings/d/1Y_yyIpql5_cdC8116XrBHzn6GfP_g0NHTTG_W4o0R9U
> --- do you agree that the arrangement of functionality makes sense?

Candidly, no.

As proposed, the software configs contain directives like 'hosted_on: 
server_name'. (I don't know that I'm a huge fan of this design, but I 
don't think the exact details are relevant in this context.) There's no 
non-trivial processing in the preparatory stage of software 
orchestration that would require it to be performed before scheduling 
could occur.
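
A minimal sketch of that point, assuming a simplified, made-up template structure (this is not actual Heat template syntax, and the resource names, placement values, and helper functions are hypothetical): the set of resources a scheduler needs is known without processing the software configs, and a 'hosted_on' directive can be resolved by a plain lookup once placement is done.

```python
# Hypothetical, simplified template; not actual Heat template syntax.
template = {
    "servers": {
        "web_server": {"flavor": "m1.small"},
        "db_server": {"flavor": "m1.large"},
    },
    "software_configs": {
        "wordpress": {"hosted_on": "web_server"},
        "mysql": {"hosted_on": "db_server"},
    },
}


def resources_to_schedule(tmpl):
    """The scheduler's input: infrastructure resources only."""
    return sorted(tmpl["servers"])


def resolve_hosting(tmpl, placement):
    """After scheduling, map each config to the host its server landed on."""
    return {
        name: placement[cfg["hosted_on"]]
        for name, cfg in tmpl["software_configs"].items()
    }


# Made-up scheduler output, for illustration only.
placement = {"web_server": "host-a", "db_server": "host-b"}

print(resources_to_schedule(template))
print(resolve_hosting(template, placement))
```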

Let's make sure we distinguish between doing holistic scheduling, which 
requires a priori knowledge of the resources to be created, and 
automatic scheduling, which requires psychic knowledge of the user's 
mind. (Did the user want to optimise for performance or availability? 
How would you infer that from the template?) There's nothing that 
happens while preparing the software configurations that's necessary for 
the former nor sufficient for the latter.

cheers,
Zane.

[openstack-dev] [heat] [scheduler] Bringing things together for Icehouse
