[openstack-dev] [nova] [placement] aggregates associated with multiple resource providers
Jay Pipes
jaypipes at gmail.com
Tue May 31 14:34:32 UTC 2016
On 05/30/2016 11:22 PM, Cheng, Yingxin wrote:
> Hi, cdent:
>
> This problem arises because the RT (resource tracker) only knows to
> consume the DISK resource on its host, but it doesn't know exactly
> which resource provider to place the consumption against. That is to
> say, the RT still needs to *find* the correct resource provider in
> step 4. Step 4 is what ultimately causes the problem you've
> encountered: "the RT can find two resource providers providing
> DISK_GB, but it doesn't know which one is right".
>
> The problem is: the RT has to choose one resource provider when it
> finds multiple of them in step 4. However, the scheduler should
> already know which resource provider to choose when it makes its
> placement decision, and it doesn't send that information to the
> compute nodes either. That is to say, there is a missing step in the
> generic-resource-pools blueprint: we should "improve the filter
> scheduler so that it can make correct decisions with generic
> resource pools", and the scheduler should tell the compute node RT
> not only about the resource consumptions against the compute-node
> resource provider, but also where to consume shared resources, i.e.
> the related resource-provider IDs.
Well, that is the problem with not having the scheduler actually do the
claiming of resources on a provider. :(
At this time, the compute node (specifically, its resource tracker) is
the thing that actually claims the resources in a request, against the
resource inventories it knows about for itself.
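To make that concrete, here is a minimal sketch in plain Python
(invented data structures, not nova's actual models) of the ambiguity
the RT runs into: it knows it must consume DISK_GB, but two providers
associated with its aggregate both expose DISK_GB inventory, and
nothing in the request says which one to pick:

from dataclasses import dataclass
from typing import Dict, List

@dataclass
class Provider:
    uuid: str
    name: str
    inventories: Dict[str, int]  # resource class -> total capacity

providers = [
    Provider("cn-uuid", "compute-node-1",
             {"VCPU": 16, "MEMORY_MB": 32768, "DISK_GB": 100}),
    Provider("pool-uuid", "shared-storage-pool", {"DISK_GB": 10000}),
]

def candidates_for(resource_class: str,
                   providers: List[Provider]) -> List[Provider]:
    """Every provider with inventory in the given resource class."""
    return [p for p in providers if resource_class in p.inventories]

# The RT wants to claim 20 DISK_GB... but against which provider?
matches = candidates_for("DISK_GB", providers)
assert len(matches) == 2  # both providers match: the ambiguity above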
This is why, even though the scheduler "makes a placement decision"
about things like which NUMA cell/node a workload will be placed on
[1], that decision is promptly forgotten and ignored, and the compute
node makes an entirely independent decision [2] when it claims NUMA
topology resources after receiving the instance request containing
NUMA topology requirements. :(
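A toy illustration of that pattern (hypothetical names, not nova
code): the scheduler computes a fit, only a pass/fail verdict survives
the trip to the compute node, and the claim then recomputes the fit
against host state that may have changed in the meantime:

def fit_workload(host_cells, requested):
    """Pick the first cell with enough free capacity (a stand-in for
    the real NUMA fitting logic)."""
    for cell_id, free in host_cells.items():
        if free >= requested:
            return cell_id
    return None

host_cells = {0: 4, 1: 8}  # NUMA cell -> free units of some resource

# Scheduler side: a decision is made...
scheduler_choice = fit_workload(host_cells, requested=6)  # -> cell 1

# ...but only "this host passes" survives; the chosen cell does not.
host_passes = scheduler_choice is not None

# Compute node side: the claim re-runs the fitting from scratch. If
# host state changed in between, it may pick a different cell or fail.
host_cells[1] = 2  # capacity consumed by a racing instance
claim_choice = fit_workload(host_cells, requested=6)  # -> None!

This is exactly the kind of divergence that having the scheduler do
the claim would close.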
Is this silly, and should the scheduler, IMHO, *actually* do the
claiming of resources on a provider? Yes; see [3], which still needs a
spec pushed.
Is this going to change any time soon? Unfortunately, no.
Unfortunately, a compute node currently isn't aware that it may be
consuming resources from a shared storage pool. That is what Step #4
is all about: making the compute node aware that it is using a shared
storage pool, if indeed it is. I'll answer Chris' email directly with
more details.
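As a rough sketch of the kind of lookup Step #4 enables (the helper
names here are invented, not the placement API): once the compute node
knows which aggregates it belongs to and which providers share
resources through them, it can route the DISK_GB claim to the shared
pool and claim everything else against itself:

AGGREGATE_MAP = {
    # aggregate uuid -> provider uuids associated with it
    "agg-1": {"cn-uuid", "pool-uuid"},
}

SHARED_PROVIDERS = {
    # provider uuid -> resource classes it shares via the aggregate
    "pool-uuid": {"DISK_GB"},
}

def provider_for(resource_class, my_uuid, my_aggregates):
    """Claim from a shared provider in one of our aggregates when the
    resource class is shared there; otherwise claim locally."""
    for agg in my_aggregates:
        for provider in AGGREGATE_MAP.get(agg, ()):
            if resource_class in SHARED_PROVIDERS.get(provider, ()):
                return provider
    return my_uuid

assert provider_for("DISK_GB", "cn-uuid", ["agg-1"]) == "pool-uuid"
assert provider_for("VCPU", "cn-uuid", ["agg-1"]) == "cn-uuid"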
Best,
-jay
[1] https://github.com/openstack/nova/blob/83cd67cd89ba58243d85db8e82485bda6fd00fde/nova/scheduler/filters/numa_topology_filter.py#L81
[2] https://github.com/openstack/nova/blob/83cd67cd89ba58243d85db8e82485bda6fd00fde/nova/compute/claims.py#L215
[3] https://blueprints.launchpad.net/nova/+spec/resource-providers-scheduler-claims