Open Stack

Mon Jul 20 21:30:53 UTC 2015

On 07/20/2015 02:04 PM, Clint Byrum wrote:
> Excerpts from Chris Friesen's message of 2015-07-20 12:17:29 -0700:

>> Some questions:
>>
>> 1) Could you elaborate a bit on how this would work?  I don't quite understand
>> how you would handle a request for booting an instance with a certain set of
>> resources--would you queue up a message for each resource?
>>
>
> Please be concrete on what you mean by resource.
>
> I'm suggesting that if you only have flavors, which have cpu, ram, disk,
> and rx/tx ratios, then each flavor is a queue. That's the easiest problem
> to solve. Then if you have a single special thing that can only have one
> VM per host (let's say, a PCI pass-through thing), then that's another
> iteration of each flavor. So assuming 3 flavors:
>
> 1=tiny cpu=1,ram=1024m,disk=5gb,rxtx=1
> 2=medium cpu=2,ram=4096m,disk=100gb,rxtx=2
> 3=large cpu=8,ram=16384m,disk=200gb,rxtx=2
>
> This means you have these queues:
>
> reserve
> release
> compute,cpu=1,ram=1024m,disk=5gb,rxtx=1,pci=1
> compute,cpu=1,ram=1024m,disk=5gb,rxtx=1
> compute,cpu=2,ram=4096m,disk=100gb,rxtx=2,pci=1
> compute,cpu=2,ram=4096m,disk=100gb,rxtx=2
> compute,cpu=8,ram=16384m,disk=200gb,rxtx=2,pci=1
> compute,cpu=8,ram=16384m,disk=200gb,rxtx=2
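
The queue list above can be derived mechanically from the flavor table. The following is only a sketch of that naming scheme: `FLAVORS`, `queue_names`, and the `extras` parameter are hypothetical names chosen for illustration, RAM is written with an `m` suffix throughout, and nothing here is an existing Nova or Gearman API.

```python
from itertools import product

# Hypothetical flavor table mirroring the three flavors listed above.
FLAVORS = {
    "tiny": {"cpu": 1, "ram": "1024m", "disk": "5gb", "rxtx": 1},
    "medium": {"cpu": 2, "ram": "4096m", "disk": "100gb", "rxtx": 2},
    "large": {"cpu": 8, "ram": "16384m", "disk": "200gb", "rxtx": 2},
}


def queue_names(flavors, extras=("pci",)):
    """Return the control queues plus one queue per flavor and extras subset."""
    names = ["reserve", "release"]
    for spec in flavors.values():
        base = "compute," + ",".join(f"{key}={value}" for key, value in spec.items())
        # Each extra is either requested (=1) or absent, giving
        # 2 ** len(extras) variants of every flavor.
        for flags in product((True, False), repeat=len(extras)):
            suffix = "".join(f",{extra}=1" for extra, on in zip(extras, flags) if on)
            names.append(base + suffix)
    return names


for name in queue_names(FLAVORS):
    print(name)
```

Each additional entry in `extras` doubles the number of compute queues, which is where the growth discussed further down comes from.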

<snip>

> Now, I've made this argument in the past, and people have pointed out
> that the permutations can get into the tens of thousands very easily
> if you start adding lots of dimensions and/or flavors. I suggest that
> is no big deal, but maybe I'm biased because I have done something like
> that in Gearman and it was, in fact, no big deal.

Yeah, that's what I was worried about.  We have things that can be specified per 
flavor, and things that can be specified per image, and things that can be 
specified per instance, and they all multiply together.
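
As a rough illustration of how those dimensions multiply, here is a back-of-the-envelope count. The numbers are invented purely for the example and `queue_count` is a hypothetical helper, not a measurement of any real deployment.

```python
import math


def queue_count(num_flavors, option_counts):
    """Number of distinct queues when every dimension combines freely."""
    return num_flavors * math.prod(option_counts)


# Hypothetical sizes: 10 flavors and three extra dimensions
# with 4, 3 and 2 possible values each.
print(queue_count(10, [4, 3, 2]))           # 240

# Adding three more dimensions (5, 5 and 4 values) reaches the
# tens of thousands mentioned above.
print(queue_count(10, [4, 3, 2, 5, 5, 4]))  # 24000
```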

>> 2) How would it handle stuff like weight functions where you could have multiple
>> compute nodes that *could* satisfy the requirement but some of them would be
>> "better" than others by some arbitrary criteria.
>>
>
> Can you provide a concrete example? Feels like I'm asking for a straw
> man to be built. ;)

Well, as an example, we have a cluster that is aimed at high-performance network
processing, so all else being equal they will choose the compute node with
the least network traffic.  You might also try to pack instances together for
power efficiency (allowing you to turn off unused compute nodes), or choose the 
compute node that results in the tightest packing (to minimize unused resources).
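
To make "better by some arbitrary criteria" concrete, here is a minimal sketch of two such weighing policies. The `Host` fields, the function names, and the choice of RAM as the only packed resource are all assumptions for illustration; this is not Nova's weigher interface.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    net_mbps: float    # current network traffic on the host
    free_ram_mb: int   # RAM not yet allocated to instances


def least_network_traffic(host, request_ram_mb):
    # Higher score is better, so negate the traffic.
    return -host.net_mbps


def tightest_packing(host, request_ram_mb):
    # Prefer the host with the least RAM left over after placement.
    return -(host.free_ram_mb - request_ram_mb)


def pick_host(hosts, request_ram_mb, weigher):
    """Filter hosts that can fit the request, then take the best-weighted one."""
    candidates = [h for h in hosts if h.free_ram_mb >= request_ram_mb]
    if not candidates:
        return None
    return max(candidates, key=lambda h: weigher(h, request_ram_mb))


hosts = [Host("a", 900.0, 8192), Host("b", 100.0, 16384), Host("c", 50.0, 2048)]
print(pick_host(hosts, 4096, least_network_traffic).name)
print(pick_host(hosts, 4096, tightest_packing).name)
```

Both policies need a view of every candidate host at once in order to compare them, which is the part that is not obvious in a model where whichever node pulls the message from the queue first wins.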

>> 3) The biggest improvement I'd like to see is in group scheduling.  Suppose I
>> want to schedule multiple instances, each with their own resource requirements,
>> but also with interdependency between them (these ones on the same node, these
>> ones not on the same node, these ones with this provider network, etc.)  The
>> scheduler could then look at the whole request all at once and optimize it
>> rather than looking at each piece separately.  That could also allow relocating
>> multiple instances that want to be co-located on the same compute node.
>>
>
> So, if the grouping is arbitrary, then there's no way to pre-calculate the
> group size, I agree. I am loath to pursue something like this though, as I
> don't really think this is the kind of optimization that cloud workloads
> should be built on top of. If you need two processes to have low latency,
> why not just boot a bigger machine and do it all in one VM? There are a
> few reasons I can think of, but I wonder how many are in the general
> case?

It's a fair question. :)  I honestly don't know...I was just thinking that we 
allow the expression of affinity/anti-affinity policies via server groups, but 
the scheduler doesn't really do a good job of actually scheduling those groups.
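
To show what looking at the whole request at once could mean, here is a deliberately naive sketch that places a group of instances together while honoring same-host and different-host constraints. The function name, the RAM-only capacity model, and the brute-force search are assumptions for illustration and do not reflect how the Nova scheduler handles server groups.

```python
from itertools import product


def place_group(instances, hosts, same_host=(), different_host=()):
    """Find one placement for the whole group, or return None.

    instances: {instance_name: ram_mb}, hosts: {host_name: free_ram_mb}.
    same_host / different_host: pairs of instance names.
    """
    names = list(instances)
    # Brute force: try every assignment of instances to hosts.
    for combo in product(hosts, repeat=len(names)):
        placement = dict(zip(names, combo))
        used = {}
        for inst, host in placement.items():
            used[host] = used.get(host, 0) + instances[inst]
        if any(used[h] > hosts[h] for h in used):
            continue  # some host would be over capacity
        if any(placement[a] != placement[b] for a, b in same_host):
            continue  # affinity violated
        if any(placement[a] == placement[b] for a, b in different_host):
            continue  # anti-affinity violated
        return placement
    return None


print(place_group(
    {"web": 2048, "cache": 2048, "db": 4096},
    {"h1": 4096, "h2": 4096},
    same_host=[("web", "cache")],
    different_host=[("web", "db")],
))
```

The search is exponential in the group size, so a real scheduler would need something smarter; the sketch only shows that the constraints are evaluated against the group as a unit rather than one instance at a time.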

Chris

[openstack-dev] [nova] Proposal for an Experiment